⚠ This page is served via a proxy. Original site: https://github.com
This service does not collect credentials or authentication data.
Skip to content

Conversation

@PastelStorm
Copy link
Contributor

@PastelStorm PastelStorm commented Feb 8, 2026

Note

Medium Risk
Touches core build/release infrastructure (dependency resolution, CI, Docker publishing, multi-arch images), so misconfigurations could break installs or releases despite minimal runtime code changes.

Overview
Migrates packaging and dependency management to uv by introducing pyproject.toml (+ dependency groups for lint/test) and uv.lock, removing the requirements/*.in|*.txt workflow, and updating Dependabot and the dependabot release workflow to bump via uv lock --upgrade.

Modernizes CI/Docker: all GitHub Actions jobs move to self-hosted runners, read Python from the new .python-version, and install/sync deps via uv with caching; Docker build/publish re-enables multi-arch (amd64+arm64) builds and manifests, and the main Dockerfile switches to bundling a pinned uv binary, installing deps via uv sync, and fetching pandoc directly.

Test reliability updates: Docker smoke tests are parallelized (one container per xdist worker/port) with container cleanup and retry logic for curl-based parallel-mode checks; OCR tests loosen brittle assertions to tolerate tesseract output variation. Also bumps version to 0.1.0 and removes obsolete config/files (e.g., setup.cfg, old Rocky Linux Dockerfile, pipeline version YAML).

Written by Cursor Bugbot for commit 9d13a9e. This will update automatically on new commits. Configure here.

@socket-security
Copy link

socket-security bot commented Feb 8, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Addedgithub/​astral-sh/​setup-uv@​e58605a9b6da7c637471fab8847a5e5a6b8df08199100100100100

View full report

@socket-security
Copy link

socket-security bot commented Feb 8, 2026

All alerts resolved. Learn more about Socket for GitHub.

This PR previously contained dependency changes with security issues that have been resolved, removed, or ignored.

View full report

- name: Install uv
uses: astral-sh/setup-uv@v5
with:
python-version: ${{ env.PYTHON_VERSION }}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Smoke test fails: bare pytest call after venv activation removed

High Severity

The old CI workflows ran source .venv/bin/activate before make docker-test, putting pytest on PATH. This PR removes that activation and uses uv sync instead, but docker-smoke-test.sh (line 83) still calls bare pytest — not uv run pytest. Since uv sync creates a .venv without activating it, and setup-uv only adds uv to PATH, pytest won't be found. Every other pytest invocation in the Makefile was correctly updated to use uv run, but this script was missed.

Additional Locations (1)

Fix in Cursor Fix in Web

@aballman
Copy link

@claude review thoroughly and examine edge cases and security considerations as well as any breaking changes

@claude
Copy link

claude bot commented Feb 10, 2026

Claude finished @aballman's task —— View job


Thorough Review of uv Migration PR ✅

I've completed a comprehensive review of this uv migration PR, examining security considerations, edge cases, and potential breaking changes.

Review Checklist:

  • Read repository guidelines and setup instructions
  • Analyze pyproject.toml configuration and dependency changes
  • Review GitHub Actions workflow modifications
  • Examine Dockerfile changes and security implications
  • Validate Makefile and build process updates
  • Address identified pytest issue in smoke test
  • Check for edge cases in dependency resolution
  • Assess potential breaking changes
  • Provide final security and compatibility assessment

🔒 Security Assessment: GOOD

Dependency Supply Chain: Socket security scan shows clean results with high scores (99-100) for the new astral-sh/setup-uv action
Container Security: Maintains use of Chainguard's hardened wolfi-base image
Binary Downloads: Pandoc downloads from official GitHub releases with proper architecture detection
No Secrets Exposure: No sensitive data in configuration files

⚠️ Issues Found

1. RESOLVED: pytest Issue in Smoke Test

Status: ✅ Already Fixed
The Cursor review flagged docker-smoke-test.sh line 83 for using bare pytest, but this was already corrected to use uv run pytest in commit a0a07d3.

2. Medium: Dependency Resolution Changes

Impact: Functional behavior changes possible

  • Starlette pinned to 0.41.2: Good - prevents middleware breaking changes
  • pdfminer-six upgraded to 20260107: Includes performance fixes but could change PDF parsing behavior
  • Version bump 0.0.93 → 0.1.0: Semantic versioning suggests potential API changes

3. Low: Build Reproducibility

Impact: CI/deployment reliability

  • Frozen lock file (uv.lock) ensures reproducible builds ✅
  • uv sync --frozen prevents accidental upgrades ✅
  • Cache dependency on uv.lock properly configured ✅

🚨 Potential Breaking Changes

Runtime Environment

  • Python 3.12 requirement: May break deployments on older Python versions
  • uv binary dependency: Adds new runtime requirement for container builds
  • Pandoc installation method: Changed from APK to binary download (version 3.9)

Build Process

  • Makefile targets: All updated to use uv run - requires uv installation
  • CI workflows: Removed virtualenv activation, relies on uv project management
  • Docker layers: Changed dependency installation method affects image caching

Development Workflow

  • Removed files: requirements/*.{in,txt}, setup.cfg - breaks workflows expecting these
  • New commands: uv sync --extra test replaces pip install -r requirements/test.txt
  • Lock file: Developers must use uv lock instead of pip-compile

🎯 Edge Cases Identified

Dependency Resolution

  • Constraint conflicts: uv.constraint-dependencies could conflict with transitive deps
  • Platform differences: Lock file includes markers for win32/emscripten but mainly Linux-focused
  • Optional dependencies: [all-docs] extra could pull unexpected packages

Container Environments

  • Multi-stage builds: uv binary copied from different image could have version mismatches
  • Architecture support: ARM64 support temporarily disabled, could cause issues if re-enabled
  • Path assumptions: .local directory structure differs from venv approach

CI/CD Pipeline

  • Cache invalidation: Changes to uv.lock invalidate all caches
  • Parallel jobs: Multiple jobs using same lock file could have race conditions
  • Rollback scenarios: Reverting would require reinstating requirements files

✅ Security & Quality Positives

  1. Deterministic builds: uv.lock with SHA256 hashes ensures reproducibility
  2. Faster installs: uv's resolver is significantly faster than pip
  3. Better error messages: uv provides clearer dependency conflict resolution
  4. Security scanning: Socket analysis shows all dependencies are clean
  5. Proper tooling: Ruff replaces black/flake8 with better performance

📋 Recommendations

Before Merge

  1. Test rollback procedure: Ensure you can revert if issues arise
  2. Update documentation: README and deployment guides need uv installation steps
  3. Verify ARM64 path: Understand why ARM64 is disabled and timeline for re-enabling

Post-Merge Monitoring

  1. Watch for parsing differences: pdfminer-six upgrade could affect PDF processing
  2. Monitor build times: uv should be faster, but verify CI performance
  3. Check error reporting: New dependency resolution might surface different issues

🎯 Final Assessment

Overall Risk: MEDIUM
This is a well-executed migration with proper attention to reproducibility and security. The main risks are operational (deployment processes need updating) rather than security concerns. The identified pytest issue was already resolved, and the dependency management improvements outweigh the migration complexity.

Recommendation: APPROVE with monitoring - This modernizes the build system appropriately while maintaining security posture.

@aballman
Copy link

aballman commented Feb 10, 2026

Verify ARM64 path: Understand why ARM64 is disabled and timeline for re-enabling

Anything I can do to help with this? We have ARM runners available, both Github hosted and self-hosted.

@aballman
Copy link

It would be cool to have this project use a .python-version file instead of defining the PYTHON_VERSION in workflow steps that require it. That way you don't end up with inconsistent versions passed to workflows in the future, since it would be just one place to update

- name: Set up Python ${{ env.PYTHON_VERSION }}
run: uv python install ${{ env.PYTHON_VERSION }}
- name: Install lint dependencies
run: uv sync --only-group lint --frozen
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI lint mypy loses all third-party type information

Medium Severity

The CI lint job uses uv sync --only-group lint --frozen, which only installs ruff, mypy, and types-requests — none of the project's runtime dependencies. When mypy then runs with --ignore-missing-imports, it silently skips all imports from fastapi, unstructured, pypdf, pandas, etc. Since the source code in prepline_general/api/ heavily depends on these packages (many of which ship with inline types / py.typed), mypy effectively checks almost nothing meaningful. The previous CI setup installed all dependencies before linting, giving mypy full access to third-party type information. This trade-off isn't explicitly documented and significantly weakens the type-safety net.

Additional Locations (1)

Fix in Cursor Fix in Web

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

line-length = 100
exclude = [
"prepline_*/api",
]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ruff exclude scope broader than original flake8 exclude

Low Severity

The exclude list at the [tool.ruff] level applies to both ruff check (linting) and ruff format (formatting). The old setup only excluded prepline_*/api from flake8 linting, while black still enforced formatting on that directory. Placing the exclude under [tool.ruff.lint] instead of [tool.ruff] would preserve the old behavior where the API directory is excluded from linting but still format-checked.

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants