fix race condition in `ScalingThreadPoolExecutor` #13360

itschrispeck · 2024-06-11T03:01:04Z

I introduced ScalingThreadPoolExecutor previously to provide an autoscaling thread pool, used to prevent interrupts from corrupting the realtime Lucene index.

The race condition arises because the logic relied on `_executor.getPoolSize()` and `_executor.getActiveCount()`, and the latter can lag behind the real number of currently idle threads. When that happened, a task would be queued and not executed. This PR changes the implementation slightly to track idle threads by overriding the two methods that `ThreadPoolExecutor.getTask()` may call. `getTask()` is always executed by an idle thread to pick up the next task.
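
To illustrate the idea, here is a minimal sketch of idle-thread tracking in the work queue. It is not the code in this PR: the names `IdleTrackingPoolSketch`, `IdleTrackingQueue`, `forceOffer`, `newPool`, and `demo` are hypothetical, and it assumes the two overridden methods are `BlockingQueue.take()` and `BlockingQueue.poll(long, TimeUnit)`, which are the calls `getTask()` makes in OpenJDK.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the Pinot implementation.
public class IdleTrackingPoolSketch {

  /** Queue that counts workers currently waiting in take()/poll(timeout), i.e. idle workers. */
  static class IdleTrackingQueue extends LinkedBlockingQueue<Runnable> {
    private final AtomicInteger _idleThreads = new AtomicInteger();

    @Override
    public boolean offer(Runnable task) {
      // Queue the task only if a worker is waiting for one. Returning false
      // makes ThreadPoolExecutor.execute() try to start a new worker instead.
      return _idleThreads.get() > 0 && super.offer(task);
    }

    /** Bypasses the idle check; used once the pool is already at its maximum size. */
    boolean forceOffer(Runnable task) {
      return super.offer(task);
    }

    @Override
    public Runnable take() throws InterruptedException {
      _idleThreads.incrementAndGet();
      try {
        return super.take();
      } finally {
        _idleThreads.decrementAndGet();
      }
    }

    @Override
    public Runnable poll(long timeout, TimeUnit unit) throws InterruptedException {
      _idleThreads.incrementAndGet();
      try {
        return super.poll(timeout, unit);
      } finally {
        _idleThreads.decrementAndGet();
      }
    }
  }

  static ThreadPoolExecutor newPool(int minThreads, int maxThreads, long keepAliveMs) {
    IdleTrackingQueue queue = new IdleTrackingQueue();
    return new ThreadPoolExecutor(minThreads, maxThreads, keepAliveMs, TimeUnit.MILLISECONDS, queue,
        (task, executor) -> {
          // Reached only when offer() returned false and no new worker could be added.
          if (executor.isShutdown() || !queue.forceOffer(task)) {
            throw new RejectedExecutionException("pool is shut down");
          }
        });
  }

  /** Submits blocking tasks and returns the largest pool size seen, or -1 on failure. */
  static int demo() {
    int tasks = 4;
    ThreadPoolExecutor pool = newPool(0, tasks, 1_000);
    CountDownLatch started = new CountDownLatch(tasks);
    CountDownLatch release = new CountDownLatch(1);
    try {
      for (int i = 0; i < tasks; i++) {
        pool.execute(() -> {
          started.countDown();
          try {
            release.await();
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        });
      }
      boolean allStarted = started.await(5, TimeUnit.SECONDS);
      release.countDown();
      pool.shutdown();
      pool.awaitTermination(5, TimeUnit.SECONDS);
      return allStarted ? pool.getLargestPoolSize() : -1;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return -1;
    }
  }

  public static void main(String[] args) {
    System.out.println("largest pool size: " + demo());
  }
}
```

Because the counter is updated by the worker thread itself around its blocking queue call, it does not depend on the executor's own bookkeeping such as `getActiveCount()`. The sketch is only meant to show the shape of the approach and makes no claim to cover every interleaving.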

In theory the bug could cause sporadic timeouts for searches against the realtime Lucene index, though it is hard to reproduce. I came across this bug while trying to use ScalingThreadPoolExecutor for another feature.

For testing, unit tests should cover this logic. The race condition is easily reproducible when the added unit test is used against the old implementation.

suggested tag: bugfix

codecov-commenter · 2024-06-11T03:37:40Z

Codecov Report

Attention: Patch coverage is 50.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 62.11%. Comparing base (59551e4) to head (47cd276).
Report is 608 commits behind head on master.

Files	Patch %	Lines
.../pinot/common/utils/ScalingThreadPoolExecutor.java	50.00%	3 Missing and 1 partial ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #13360      +/-   ##
============================================
+ Coverage     61.75%   62.11%   +0.36%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2548     +112     
  Lines        133233   139957    +6724     
  Branches      20636    21729    +1093     
============================================
+ Hits          82274    86938    +4664     
- Misses        44911    46432    +1521     
- Partials       6048     6587     +539

Flag	Coverage Δ
custom-integration1	`<0.01% <0.00%> (-0.01%)`	⬇️
integration	`<0.01% <0.00%> (-0.01%)`	⬇️
integration1	`<0.01% <0.00%> (-0.01%)`	⬇️
integration2	`0.00% <0.00%> (ø)`
java-11	`62.06% <50.00%> (+0.36%)`	⬆️
java-21	`61.99% <50.00%> (+0.37%)`	⬆️
skip-bytebuffers-false	`62.10% <50.00%> (+0.35%)`	⬆️
skip-bytebuffers-true	`61.95% <50.00%> (+34.22%)`	⬆️
temurin	`62.11% <50.00%> (+0.36%)`	⬆️
unittests	`62.11% <50.00%> (+0.36%)`	⬆️
unittests1	`46.70% <50.00%> (-0.20%)`	⬇️
unittests2	`27.71% <0.00%> (-0.02%)`	⬇️

Flags with carried forward coverage won't be shown.

pinot-common/src/main/java/org/apache/pinot/common/utils/ScalingThreadPoolExecutor.java

fix race condition

98eb469

Jackie-Jiang added the bugfix label Jun 11, 2024

Jackie-Jiang reviewed Jun 11, 2024

itschrispeck added 2 commits June 11, 2024 11:18

address comments

61b8bc7

typo

47cd276

Jackie-Jiang approved these changes Jun 11, 2024

Jackie-Jiang merged commit 36ce140 into apache:master Jun 11, 2024
18 of 20 checks passed
