
Improve Retention Manager Segment Lineage Clean Up #13232

Merged

Conversation

Contributor

@aishikbh aishikbh commented May 26, 2024

Resolves #13171

Issue

When many segment uploads are happening on the controller, segment lineage cleanup and segment deletion fail very frequently. The segment upload path serializes its ideal state update in PinotHelixResourceManager.assignTableSegment using a synchronized block that grabs a lock on the table. The retention manager, however, does not grab the table lock while updating the ideal state during segment deletion, which causes many failures.

Solution

  • Serialize the ideal state update and segment delete path with a synchronized block that grabs the lock on the table.
  • Change the default retry policy of segment lineage cleanup from exponential backoff to random delay.

* Changed the default retry policy of segment lineage cleanup from exponential backoff to random delay, as this has proven more successful in certain cases.

* The ideal state update in the delete segment operation is now done while holding the table lock, for better resilience (see the sketch below).
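
A minimal sketch of the locking idea, with illustrative class and method names (not the actual PinotHelixResourceManager code): both the upload path and the retention manager's delete path synchronize on the same per-table lock object before touching the ideal state.

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PerTableIdealStateLockSketch {
  // One lock object per table, shared by the upload and delete paths.
  private final Map<String, Object> _tableLocks = new ConcurrentHashMap<>();

  private Object getTableLock(String tableNameWithType) {
    return _tableLocks.computeIfAbsent(tableNameWithType, k -> new Object());
  }

  // Upload path: already serialized per table (as assignTableSegment does).
  public void assignSegment(String tableNameWithType, String segmentName) {
    synchronized (getTableLock(tableNameWithType)) {
      updateIdealState(tableNameWithType);
    }
  }

  // Delete path: with this PR it grabs the same table lock, so it no longer
  // races the upload path on the ideal state.
  public void deleteSegments(String tableNameWithType, List<String> segmentsToDelete) {
    synchronized (getTableLock(tableNameWithType)) {
      updateIdealState(tableNameWithType);
    }
  }

  private void updateIdealState(String tableNameWithType) {
    // Placeholder for the Helix ideal state read-modify-write.
  }
}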
@codecov-commenter

codecov-commenter commented May 26, 2024

Codecov Report

Attention: Patch coverage is 62.50000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 62.14%. Comparing base (59551e4) to head (948d6cd).
Report is 499 commits behind head on master.

Files Patch % Lines
...ntroller/helix/core/PinotHelixResourceManager.java 57.14% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #13232      +/-   ##
============================================
+ Coverage     61.75%   62.14%   +0.39%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2534      +98     
  Lines        133233   139011    +5778     
  Branches      20636    21537     +901     
============================================
+ Hits          82274    86388    +4114     
- Misses        44911    46159    +1248     
- Partials       6048     6464     +416     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration <0.01% <0.00%> (-0.01%) ⬇️
integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration2 0.00% <0.00%> (ø)
java-11 62.11% <62.50%> (+0.41%) ⬆️
java-21 62.02% <62.50%> (+0.40%) ⬆️
skip-bytebuffers-false 62.13% <62.50%> (+0.39%) ⬆️
skip-bytebuffers-true 62.00% <62.50%> (+34.27%) ⬆️
temurin 62.14% <62.50%> (+0.39%) ⬆️
unittests 62.14% <62.50%> (+0.39%) ⬆️
unittests1 46.66% <ø> (-0.23%) ⬇️
unittests2 27.81% <62.50%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown.

@aishikbh aishikbh marked this pull request as ready for review May 27, 2024 07:33
Contributor

@Jackie-Jiang Jackie-Jiang left a comment


@@ -55,7 +55,7 @@
  */
 public class RetentionManager extends ControllerPeriodicTask<Void> {
   public static final long OLD_LLC_SEGMENTS_RETENTION_IN_MILLIS = TimeUnit.DAYS.toMillis(5L);
-  private static final RetryPolicy DEFAULT_RETRY_POLICY = RetryPolicies.exponentialBackoffRetryPolicy(5, 1000L, 2.0f);
+  private static final RetryPolicy DEFAULT_RETRY_POLICY = RetryPolicies.randomDelayRetryPolicy(20, 100L, 200L);
Contributor


This is quite aggressive. Not sure about its side effects.

Contributor Author

@aishikbh aishikbh May 29, 2024


The reason for changing the retry policy is simply to make it more resilient, along with serializing the ideal state update during segment deletion by the retention manager.

Currently, when a lot of segments are being uploaded and the retention manager is trying to delete segments, the retry is applied 5 times using exponential backoff and still fails.

For example:

2024/05/16 01:59:39.041 ERROR [RetentionManager] [pool-20-thread-4] Failed to clean up the segment lineage. (tableName = tableXXX_OFFLINE)
org.apache.pinot.spi.utils.retry.AttemptsExceededException: Operation failed after 5 attempts
        at org.apache.pinot.spi.utils.retry.BaseRetryPolicy.attempt(BaseRetryPolicy.java:65) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.retention.RetentionManager.manageSegmentLineageCleanupForTable(RetentionManager.java:195) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.retention.RetentionManager.processTable(RetentionMan
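
For context, a rough sketch of the retry pattern seen in the log above, using the retry classes from org.apache.pinot.spi.utils.retry (the callable body and table name are placeholders, not the actual RetentionManager code):

import org.apache.pinot.spi.utils.retry.RetryPolicies;
import org.apache.pinot.spi.utils.retry.RetryPolicy;

public class LineageCleanupRetrySketch {
  public static void main(String[] args) throws Exception {
    // Old default: 5 attempts with exponential backoff starting at 1s.
    RetryPolicy retryPolicy = RetryPolicies.exponentialBackoffRetryPolicy(5, 1000L, 2.0f);
    // attempt(...) keeps retrying until the callable returns true and throws
    // AttemptsExceededException (as in the log above) once the attempts run out.
    retryPolicy.attempt(() -> tryCleanupSegmentLineage("tableXXX_OFFLINE"));
  }

  private static boolean tryCleanupSegmentLineage(String tableNameWithType) {
    // Placeholder: return false when the lineage/ideal state update loses the
    // race against a concurrent segment upload, true on success.
    return true;
  }
}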

Contributor Author

@aishikbh aishikbh May 29, 2024


Okay, on further inspection I do not see much value in adding random delays with the current parameters, since the maximum wait time will be 200*20 = 4000ms, whereas the total minimum delay now with exponential backoff is 2000 + 4000 + 8000 + 16000 + 32000 = 62000ms, and the update is failing even then.

I think increasing the number of attempts with exponential backoff should be sufficient to make it a bit more resilient. What do you think @Jackie-Jiang @snleee?

Contributor

@snleee snleee Jun 1, 2024


Main motivation:

  • Both the retention manager and segment upload were trying to update the ideal state.
  • The retention manager was updating the ideal state without the lock and used a simple exponential back-off.
  • The segment upload side updates the ideal state using DEFAULT_TABLE_IDEALSTATES_UPDATE_RETRY_POLICY = RetryPolicies.randomDelayRetryPolicy(20, 100L, 200L);
  • The segment upload side updates the ideal state while holding the lock.

Observation

  • The retention manager failed a lot of the time after 5 attempts, while most of the ideal state updates via segment upload went through.

Goal

  • We would like to make the retention manager's ideal state update as robust as the segment upload path.

I thought about this briefly. I feel that as long as we hold the lock, the retry logic may not matter too much. @Jackie-Jiang what do you think?

Contributor


Merging it and we can see how it works. Holding the lock itself might not be enough because there is no guarantee which controller is handling the segment upload.

@Jackie-Jiang Jackie-Jiang merged commit 5da68d7 into apache:master Jun 9, 2024
19 checks passed
gortiz pushed a commit to gortiz/pinot that referenced this pull request Jun 14, 2024