-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use ArrayList instead of LinkedList in SortOperator #12783
Conversation
8be975b
to
ee9fc85
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #12783 +/- ##
============================================
+ Coverage 61.75% 62.16% +0.41%
+ Complexity 207 198 -9
============================================
Files 2436 2502 +66
Lines 133233 136586 +3353
Branches 20636 21145 +509
============================================
+ Hits 82274 84908 +2634
- Misses 44911 45401 +490
- Partials 6048 6277 +229
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/SortOperator.java
Outdated
Show resolved
Hide resolved
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/SortOperator.java
Outdated
Show resolved
Hide resolved
Theoretically, I am aligned on leveraging ArrayList for memory efficiency and cache friendliness if we know size upfront. Are you aware of any tools that can be used to test cache improvements ? Have you done some tests for the memory part via VisualVM / Yourkit ? PS - I am ok with the PR. Just curious in general about tools that can be used to test such PRs in future. |
We can easily create a JMH benchmark that tests list vs array list in this kind of code. I think it is not worth to create a JMH benchmark that test the operator itself because it would be quite complex. I'm not aware of other tools, but each linked list node contains a reference to its value and a reference to the next node in the link. That means we need at least 8 bytes (assuming compressed pointers) for each node. In theory these nodes may be spread in the heap, although my experience tell me that they are almost always to be one after the other in the heap because we allocate them by the same thread with no allocations in the middle. Future GCs may move them, but sounds pretty unlikely. While LinkedList requires 8 bytes per element that will probably but not for sure be contiguous in memory, Array list requires just 4 bytes per element that are going to be contiguous in memory for sure. Therefore cache usage is always better with ArrayList than with LinkedList. |
I've applied the suggested changes. Please take a look. |
A small PR that changes SortOperator to buffer entries in an ArrayList instead of a LinkedList.
In general LinkedList performance is horrible, even in cases when theoretically (by Big-O) they are fine, usually the performance cost is worse than ArrayList due to memory amplification and cache issues. In the specific case this PR is changing, there was no actual reason to use LinkedList apart from a slightly nicer code. But given the size of the final result is well know, it was very easy to change it to an ArrayList implementation where the ArrayList is initialized to all values being null and then set values in the reverse direction. Amortized BigO cost is still linear, but locallity and allocation should be quite better in this implementation.
A secondary but difficult to prove improvement is related to megamorphic calls. As we mostly always use ArrayList in our code, the JIT can generate code that assumes that. Sometimes we need to use different lists, but if we can avoid that we theoretically can produce better code.
Probably the performance cost will be negligible in most cases and some of the actual cases (like the megamorphic calls) cannot be easilly benchmarked with JMH, so I'm not adding these benchmarks in this PR.