Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add locking logic to get consistent table view for upsert tables #12976

Merged
merged 6 commits into from
May 22, 2024

Conversation

klsince
Copy link
Contributor

@klsince klsince commented Apr 19, 2024

This PR tries to support consistent table view for querying upsert tables. The consistency is at table partition level, which can contain many segments. Today, updates that involve two segments' bitmaps are not atomic, thus causing queries to see inconsistent table view, e.g. to return less than expected PKs.

The high level idea is either to synchronize the query threads and upsert threads (including consuming thread or HelixTaskExecutor threads) with locks for queries to get a consistent set of segments' validDocIds bitmaps; or let upsert threads keep a consistent set (i.e. upsert view) of bitmaps and refresh the view regularly. Both modes are added in this PR, as they can be wired up using one R/W lock.

Added a new upsert config consistencyMode, which can be NONE, SYNC or SNAPSHOT

  1. by default, the feature is disabled, i.e. NONE
  2. for SYNC mode, the upsert threads take the WLock when the upsert involves two segments' bitmaps; and the query threads take the RLock when getting bitmaps for all its selected segments. This is mode is best for data freshness, but queries may block data ingestion. Mainly for low qps or low ingestion cases.
  3. for SNAPSHOT mode, the query threads don't need to take lock when getting bitmaps. The query threads access a copy of bitmaps that are kept updated by upsert thread periodically. In this mode, if data freshness is concerned, the query can specify a query option as called upsertViewFreshnessMs (e.g. 3s) to set its tolerance on data freshness, and the query thread can fresh the bitmap copies immediately if they are not fresh enough. Mainly high qps and high ingestion cases.

@klsince klsince force-pushed the consistent_view_segment_context branch from 5e6acce to de05748 Compare April 19, 2024 21:28
@codecov-commenter
Copy link

codecov-commenter commented Apr 19, 2024

Codecov Report

Attention: Patch coverage is 74.21875% with 33 lines in your changes are missing coverage. Please review.

Project coverage is 62.24%. Comparing base (59551e4) to head (97e79e7).
Report is 478 commits behind head on master.

Files Patch % Lines
...cal/upsert/BasePartitionUpsertMetadataManager.java 68.53% 15 Missing and 13 partials ⚠️
...e/pinot/common/utils/config/QueryOptionsUtils.java 0.00% 2 Missing ⚠️
...ata/manager/realtime/RealtimeTableDataManager.java 0.00% 1 Missing ⚠️
...t/ConcurrentMapPartitionUpsertMetadataManager.java 88.88% 1 Missing ⚠️
...psert/ConcurrentMapTableUpsertMetadataManager.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #12976      +/-   ##
============================================
+ Coverage     61.75%   62.24%   +0.49%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2529      +93     
  Lines        133233   138375    +5142     
  Branches      20636    21416     +780     
============================================
+ Hits          82274    86132    +3858     
- Misses        44911    45803     +892     
- Partials       6048     6440     +392     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration <0.01% <0.00%> (-0.01%) ⬇️
integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration2 0.00% <0.00%> (ø)
java-11 62.21% <74.21%> (+0.50%) ⬆️
java-21 62.13% <74.21%> (+0.50%) ⬆️
skip-bytebuffers-false 62.22% <74.21%> (+0.48%) ⬆️
skip-bytebuffers-true 62.11% <74.21%> (+34.39%) ⬆️
temurin 62.24% <74.21%> (+0.49%) ⬆️
unittests 62.24% <74.21%> (+0.49%) ⬆️
unittests1 46.69% <7.81%> (-0.20%) ⬇️
unittests2 27.94% <66.40%> (+0.20%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@tibrewalpratik17 tibrewalpratik17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @klsince for the fix as discussed in #12667.

@klsince klsince force-pushed the consistent_view_segment_context branch from de05748 to 1c5c503 Compare May 15, 2024 18:36
@klsince klsince changed the title [WIP] add locking logic to get consistent table view for upsert tables Add locking logic to get consistent table view for upsert tables May 15, 2024
@klsince klsince requested a review from Jackie-Jiang May 15, 2024 18:37
@klsince klsince force-pushed the consistent_view_segment_context branch from 5662b28 to 2746ddd Compare May 15, 2024 21:30
@Jackie-Jiang Jackie-Jiang added release-notes Referenced by PRs that need attention when compiling the next release notes Configuration Config changes (addition/deletion/change in behavior) labels May 18, 2024
@klsince klsince force-pushed the consistent_view_segment_context branch from 2746ddd to 9caf3c7 Compare May 20, 2024 22:44
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@klsince klsince force-pushed the consistent_view_segment_context branch from 9caf3c7 to 97e79e7 Compare May 21, 2024 23:18
@klsince klsince merged commit 429bb7a into apache:master May 22, 2024
19 checks passed
@klsince klsince deleted the consistent_view_segment_context branch May 22, 2024 03:59
gortiz pushed a commit to gortiz/pinot that referenced this pull request Jun 14, 2024
…che#12976)

* added upsert config ConsistencyMode as the feature flag
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Configuration Config changes (addition/deletion/change in behavior) enhancement release-notes Referenced by PRs that need attention when compiling the next release notes upsert
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants