-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Revert internals destruction and add test for internals recreation #5972
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
@XuehaiPan Is your plan to work on adding a new unit test here after you have the fix? — Short-term the fix is most important, long-term the test will be more important. I could try to use some LLM to generate a new test, based on what you provided under 5958. Do you think that'd be useful? |
|
It might be better to mostly revert #5958 if the behavior it has is not what we want. The problem described is expected (but undesirable) behavior from that PR. It is not a use-after-free. It's also possible for One option is to go back to something more like the first commit on #5958, before I went into releasing internals itself. The original goal of that PR was to get rid of the leaks of the |
|
I think releasing some part of the internal but keeping the internal leaked would be a safest choice. |
@XuehaiPan did you see my question above (https://github.com/pybind/pybind11/pull/5972#issuecomment-3786298624)? Subinterpreter support added a lot of complexity; if we don’t capture these issues in unit tests, pybind11 will regress over time. I’m very skeptical of fixes that don’t start from a failing unit test. |
Those leaks are currently dwarfed by CPython leaks documented here (pure‑C reproducer): Until those are fixed in CPython, or we find a workaround that sidesteps them, the savings from #5958 are marginal. If reverting #5958 is what it takes to ship v3.0.2, that seems like a reasonable short‑term compromise. |
This makes sense to me because most users only use the main interpreter. The leak will be freed anyway on program shutdown, while the correctness is more important.
I am trying to add a repro. It is hard to find a consistent repro for the order bugs since there is no guarantee. |
I added a test that fails on the master branch. But I don't think it can test all cases since there is no guarantee for the GC order. |
|
This looks great at first glance. I'll start understanding/reviewing this now with LLM assistance. |
|
Below is the result of me discussing this PR with Cursor (GPT-5.2 Codex High). I didn't study the code changes in this PR in detail. In my mind, these are the most important goals:
High level approach: pare this PR back, to more-or-less roll back 5958, but keep as much of the new unit test as possible. Based on that, Cursor came up with the Options A, B, C below. @XuehaiPan @b-pass, could you please scrutinize Cursor's analysis, does it make sense? Based on the analysis, Option A looks best to me. WDYT? PR 5972 internals version bump considerations (v3.0.2)Executive summaryFor v3.0.2 you want no internals version bump and no new ABI breaks, but you also want to
Nothing else in 5972 inherently requires a bump (no struct layout changes or ABI‑critical type So the decision point is:
Below are the minimal patch sets and recommendations. Root cause (why the crash happens)PR 5958 introduced What in 5972 is related to the crash fixCore behavioral change (the shutdown‑safety fix)These changes allow pybind11 types to hold a reference to the internals capsules, keeping them
This prevents Non‑essential or unrelated changesThese are not required for the crash fix and add risk for a patch release:
Mixed v3.0.1 / v3.0.2 (no bump) analysisIf you remove the bump, v3.0.1 and v3.0.2 modules will share internals (same capsule key). Implications:
Minimal patch sets for v3.0.2 (no bump)Option A (safest for mixed environments, no bump)Fully avoid destroying internals at shutdown (i.e., revert the risky part of 5958):
Pros
Cons
This is the only way to guarantee safety in mixed 3.0.1/3.0.2 processes without a bump. Option B (partial fix, no bump)Keep the shutdown behavior from 5958, but add the capsule‑ref protection from 5972: Keep
Drop
Pros
Cons
Option C (post‑release, with bump)After v3.0.2, keep the shutdown‑safety fix and bump internals version so old modules do not Recommended course (for v3.0.2)Given the requirement of full ABI compatibility and avoiding regressions for mixed builds, the
Then, after v3.0.2, do the larger cleanup:
Notes on test strategyThe current new test in 5972 uses a weakref callback to confirm the capsule is still alive at type
This directly targets the failure mode without enforcing a particular lifetime mechanism. Final answer to “what in 5972 requires an internals bump?”Strictly speaking, nothing in 5972 requires a bump. There is no ABI‑level change to the |
This reverts commit 8f25a25.
Prevent re-creation of internals after destruction during interpreter shutdown. If pybind11 code runs after internals have been destroyed, fail early with a clear error message instead of silently creating new empty internals that would cause type lookup failures. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
This PR is now ready for code review. |
|
This is a very complex PR. In an attempt to understand it, I worked with Cursor (GPT-5.2 Codex High) on a draft comprehensive PR description (below). This all makes sense to me high-level. @XuehaiPan @b-pass could you please check for correctness? @XuehaiPan could you please add the corrected version to the actual PR description? I'll look in more detail after the comprehensive PR description is posted. DRAFT comprehensive PR descriptionSummaryThis PR changes pybind11’s interpreter‑shutdown behavior to avoid failures when Importantly, the internals version bump has been reverted; the default Motivation / backgroundPR #5958 introduced Runtime behavior changes (net effect vs
|
rwgk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple very easy suggestions for changes to comments.
Description
Fixes https://github.com/pybind/pybind11/pull/5958/changes#r2717645230
Suggested changelog entry:
📚 Documentation preview 📚: https://pybind11--5972.org.readthedocs.build/