@xiangfu0 xiangfu0 commented Feb 1, 2026

Motivation

  • Provide a config toggle to enable or disable dimension-table upsert/dedup logic so clusters can opt into queryable-doc-id filtering and upsert behavior for dimension tables.
  • Ensure upsert-related processing (computing/applying per-segment queryable doc id bitmaps and enabling segment upsert state) is only performed when the feature is explicitly enabled.

Description

  • Added an enableUpsert boolean to DimensionTableConfig (JSON property enableUpsert) and exposed isUpsertEnabled() in pinot-spi.
  • Read the new flag in DimensionTableDataManager and gate upsert-related logic behind _enableUpsert, including using queryable-doc-id snapshots when sizing/iterating segments and applying per-segment bitmaps.
  • Introduced a small RecordLocation type and helper methods applyQueryableDocIdsForRecordLocations, applyQueryableDocIdsForLookupTable, applyQueryableDocIdsToSegments, and getQueryableDocIdsSnapshot in DimensionTableDataManager to compute and apply per-segment MutableRoaringBitmap sets and call ImmutableSegmentImpl.enableUpsert(...) when appropriate.
  • Updated all test and helper call sites that construct DimensionTableConfig to pass the new flag. Added an integration test, testDimensionTableUpsertSelection, that creates a small OFFLINE upsert dimension table and asserts deduplicated selection/count results, and a unit test, testLookupRespectsQueryableDocIds, that verifies lookups respect queryable doc ids when upsert is enabled.
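
For reviewers, the gating and per-segment bitmap idea described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: `DimensionUpsertSketch`, `latestLocations`, and `queryableDocIds` are hypothetical names, `java.util.BitSet` stands in for `MutableRoaringBitmap`, segments are modeled as plain lists of primary keys, and the "later segment and later doc id wins" rule is an assumption about how duplicates are resolved. Treating every doc as queryable when the flag is off is likewise a modeling choice here; the PR itself simply skips upsert processing in that case.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DimensionUpsertSketch {
  /** Hypothetical stand-in for RecordLocation: the segment and doc id holding the latest row for a key. */
  static final class RecordLocation {
    final int segmentIndex;
    final int docId;

    RecordLocation(int segmentIndex, int docId) {
      this.segmentIndex = segmentIndex;
      this.docId = docId;
    }
  }

  /**
   * Each segment is a list of primary keys; the list index is the doc id.
   * Assumption: rows seen later (later segment, then later doc id) replace earlier ones.
   */
  static Map<String, RecordLocation> latestLocations(List<List<String>> segments) {
    Map<String, RecordLocation> latest = new HashMap<>();
    for (int s = 0; s < segments.size(); s++) {
      List<String> keys = segments.get(s);
      for (int d = 0; d < keys.size(); d++) {
        latest.put(keys.get(d), new RecordLocation(s, d));
      }
    }
    return latest;
  }

  /** Returns one bitmap per segment marking which doc ids are queryable. */
  static List<BitSet> queryableDocIds(List<List<String>> segments, boolean enableUpsert) {
    List<BitSet> bitmaps = new ArrayList<>();
    for (List<String> keys : segments) {
      BitSet bitmap = new BitSet(keys.size());
      if (!enableUpsert) {
        // Flag off: no dedup, so every doc id stays visible in this model.
        bitmap.set(0, keys.size());
      }
      bitmaps.add(bitmap);
    }
    if (enableUpsert) {
      // Flag on: only the latest location of each primary key is queryable.
      for (RecordLocation loc : latestLocations(segments).values()) {
        bitmaps.get(loc.segmentIndex).set(loc.docId);
      }
    }
    return bitmaps;
  }

  public static void main(String[] args) {
    List<List<String>> segments = List.of(
        List.of("a", "b", "a"),   // segment 0: key "a" appears twice
        List.of("b", "c"));       // segment 1: key "b" overrides segment 0
    System.out.println(queryableDocIds(segments, true));   // prints [{2}, {0, 1}]
    System.out.println(queryableDocIds(segments, false));  // prints [{0, 1, 2}, {0, 1}]
  }
}
```

With the flag on, only doc 2 of the first segment survives (the later "a"), because "b" is superseded by the second segment. With the flag off, all five rows remain visible.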

Testing

  • No automated test suites (mvn/CI) were executed as part of this change.
  • Added tests: MultiStageEngineIntegrationTest.testDimensionTableUpsertSelection (integration) and DimensionTableDataManagerTest.testLookupRespectsQueryableDocIds (unit). Neither was run as part of this change.
  • Existing test usages and benchmark helpers were updated to pass the new config parameter where needed, and imports were adjusted accordingly.

Codex Task

