handle overflow for MutableOffHeapByteArrayStore buffer starting size #13215

Merged (3 commits) on Jun 8, 2024

itschrispeck
Collaborator

We've seen some state transition failures from OFFLINE -> CONSUMING with the stack trace below. It looks like overflow of _startSize isn't handled, so this change handles it.

It also treats 0 as an overflow case during buffer expansion, since that would hit the same precondition check.

java.lang.IllegalArgumentException: Illegal memory allocation -1710837370 for segment table__1__876__20240523T2008Z column table__1__876__20240523T2008Z:col.dict
  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:135)
  at org.apache.pinot.segment.local.io.readerwriter.RealtimeIndexOffHeapMemoryManager.allocate(RealtimeIndexOffHeapMemoryManager.java:78)
  at org.apache.pinot.segment.local.io.writer.impl.MutableOffHeapByteArrayStore$Buffer.<init>(MutableOffHeapByteArrayStore.java:99)
  at org.apache.pinot.segment.local.io.writer.impl.MutableOffHeapByteArrayStore.expand(MutableOffHeapByteArrayStore.java:193)
  at org.apache.pinot.segment.local.io.writer.impl.MutableOffHeapByteArrayStore.<init>(MutableOffHeapByteArrayStore.java:182)
  at org.apache.pinot.segment.local.realtime.impl.dictionary.StringOffHeapMutableDictionary.<init>(StringOffHeapMutableDictionary.java:45)
  at org.apache.pinot.segment.local.realtime.impl.dictionary.MutableDictionaryFactory.getMutableDictionary(MutableDictionaryFactory.java:48)
  at org.apache.pinot.segment.local.segment.index.dictionary.DictionaryIndexType.createMutableDictionary(DictionaryIndexType.java:449)
  at org.apache.pinot.segment.local.indexsegment.mutable.MutableSegmentImpl.<init>(MutableSegmentImpl.java:321) 
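
For readers following along, the overflow handling described above can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: the class name, `MAX_BUFFER_SIZE`, `nextSize`, and the per-entry overhead of `Integer.BYTES` are assumptions made for the example, and inputs are assumed to be non-negative. Only the `getStartSize(int numArrays, int avgArrayLen)` signature appears in this conversation.

```java
final class StartSizeSketch {
  // Hypothetical upper bound for a single buffer; the real store may use another limit.
  static final int MAX_BUFFER_SIZE = Integer.MAX_VALUE;

  // Estimates the first buffer's size in 64-bit arithmetic, then clamps it to the int range.
  // Assumes numArrays and avgArrayLen are non-negative.
  static int getStartSize(int numArrays, int avgArrayLen) {
    // Integer.BYTES per entry is an assumed bookkeeping overhead, for illustration only.
    long estimated = (long) numArrays * ((long) avgArrayLen + Integer.BYTES);
    return (int) Math.min(estimated, (long) MAX_BUFFER_SIZE);
  }

  // Doubles the size for the next buffer; a zero or negative result is treated as overflow.
  static int nextSize(int currentSize) {
    int doubled = currentSize * 2;
    return doubled <= 0 ? MAX_BUFFER_SIZE : doubled;
  }

  public static void main(String[] args) {
    System.out.println(getStartSize(1_000, 16));
    System.out.println(getStartSize(Integer.MAX_VALUE, 1_024));
    System.out.println(nextSize(1 << 30));
  }
}
```

Doing the multiplication in `long` means the product of two non-negative `int` values cannot wrap, so the clamp always sees the true estimate before it is narrowed back to `int`.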

tag: bugfix

Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Having 2GB as start size doesn't look correct. Can you check the high level logic and see if this is expected? Seems like we are trying to use one single buffer to hold everything?

@codecov-commenter

codecov-commenter commented May 23, 2024

Codecov Report

Attention: Patch coverage is 80.00%, with 1 line in your changes missing coverage. Please review.

Project coverage is 62.03%. Comparing base (59551e4) to head (a421c37).
Report is 536 commits behind head on master.

Files Patch % Lines
...l/io/writer/impl/MutableOffHeapByteArrayStore.java 80.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #13215      +/-   ##
============================================
+ Coverage     61.75%   62.03%   +0.28%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2534      +98     
  Lines        133233   139321    +6088     
  Branches      20636    21535     +899     
============================================
+ Hits          82274    86434    +4160     
- Misses        44911    46401    +1490     
- Partials       6048     6486     +438     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration <0.01% <0.00%> (-0.01%) ⬇️
integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration2 0.00% <0.00%> (ø)
java-11 27.75% <80.00%> (-33.96%) ⬇️
java-21 62.04% <80.00%> (+0.41%) ⬆️
skip-bytebuffers-false 62.02% <80.00%> (+0.27%) ⬆️
skip-bytebuffers-true 61.99% <80.00%> (+34.26%) ⬆️
temurin 62.03% <80.00%> (+0.28%) ⬆️
unittests 62.03% <80.00%> (+0.28%) ⬆️
unittests1 46.55% <0.00%> (-0.34%) ⬇️
unittests2 27.76% <80.00%> (+0.03%) ⬆️


@itschrispeck
Collaborator Author

itschrispeck commented May 24, 2024

> Having 2GB as start size doesn't look correct. Can you check the high level logic and see if this is expected? Seems like we are trying to use one single buffer to hold everything?

Looks like we're hitting an edge case. The contributing factors are:

  1. MV columns are always dictionary encoded in the mutable segment
  2. We have an extremely large MV raw column generated by SchemaConformingTransformerV2
  3. The column is text indexed, so we use the noRawDataForTextIndex config and the final segment is not nearly as large

Together they can result in the estimated size based on RealtimeSegmentStatsHistory being extremely large even though our target segment size is ~1.2G.
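
To make the failure mode concrete, here is a small sketch of how a large estimate can wrap to a negative `int`, which is the kind of value rejected in the stack trace above. The class, method names, and input numbers are hypothetical and chosen only for illustration; they are not taken from RealtimeSegmentStatsHistory or from the affected table.

```java
final class EstimateOverflowDemo {
  // 32-bit multiplication: silently wraps when the product exceeds Integer.MAX_VALUE.
  static int naiveSize(int cardinality, int avgLength) {
    return cardinality * avgLength;
  }

  // 64-bit multiplication: keeps the true product for any pair of int inputs.
  static long exactSize(int cardinality, int avgLength) {
    return (long) cardinality * avgLength;
  }

  public static void main(String[] args) {
    int cardinality = 30_000_000; // hypothetical estimated number of values
    int avgLength = 100;          // hypothetical average value length in bytes
    System.out.println(naiveSize(cardinality, avgLength)); // -1294967296 (wrapped)
    System.out.println(exactSize(cardinality, avgLength)); // 3000000000
  }
}
```

A true size of about 3 GB does not fit in a signed 32-bit value, so the `int` product comes out negative and any non-negative-size precondition on the allocation fails.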

I think the generic solution is to allow MV columns to be raw encoded even in the mutable segment. We could also use noRawDataForTextIndex to avoid storing the values entirely since the mutable index is now converted instead of rebuilding from scratch.

I'm not sure either of these should be in the scope of this PR. What do you think?

@@ -170,15 +170,20 @@ public void close()
   private final int _startSize;

   @VisibleForTesting
-  public int getStartSize() {
-    return _startSize;
+  public static int getStartSize(int numArrays, int avgArrayLen) {
Collaborator Author

made static to avoid initializing huge buffers in unit tests

Contributor

@Jackie-Jiang Jackie-Jiang left a comment

We don't need to mix the scope of this PR with the general solution. The fix here is valid.

@Jackie-Jiang Jackie-Jiang merged commit 9b75bff into apache:master Jun 8, 2024
17 of 19 checks passed
gortiz pushed a commit to gortiz/pinot that referenced this pull request Jun 14, 2024