-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bugfix: prevent background merges on the realtime lucene index #13050
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #13050 +/- ##
============================================
+ Coverage 61.75% 62.17% +0.41%
+ Complexity 207 198 -9
============================================
Files 2436 2502 +66
Lines 133233 136658 +3425
Branches 20636 21179 +543
============================================
+ Hits 82274 84961 +2687
- Misses 44911 45397 +486
- Partials 6048 6300 +252
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
I saw this test being flaky. Do you think it is related?
|
I don't think it's related, but is probably caused by this behavior: https://julien.ponge.org/blog/not-all-java-single-threaded-executors-are-created-equal-a-java-finalizer-horror-story/ Did you see the flakiness is Java 21? It should be fixed there, but could misbehave in the suites run with 11/17. Do you mind if I put the fix into another PR since it's not related to the fix for the reuse feature, or want me to bundle it into this one? (there's a couple other nits I'd rather bundle together if no preference). |
A general question: if we turn off merge, will it result in too many Lucene segments for realtime part and cause query perf issues? |
In theory it could, but in practice Pinot's typical segment sizes prevent this (>5-10GB segments isn't normal), and we don't see a query perf regression with even a small 10MB buffer size and 1.5GB segment size. FDs are probably more of a concern for systems w/ low limits, but I think can be addressed if needed w/ lucene buffer size/compound file format/Pinot segment size tuning - all of which have existing configs. |
#12744 changed the Lucene index build process to commit the realtime Lucene index, and reuse it in the immutable segment.
.commit()
is called at the beginning of segment conversion, and the Lucene index directory is copied to another location. SinceIndexWriter
can merge segments in the background even without insertions/updates to the index, these merges non-deterministically can trigger aFileNotFoundException
:By using
NoMergeScheduler
, the async merges will never occur. This patch was tested in our staging/prod env and the intermittent segment build failures disappeared.tag:
bugfix