-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds AGGREGATE_CASE_TO_FILTER rule #12643
Conversation
Thanks to the test that was failing I've discovered the issue reported as #12647. I've changed the test to meet the actual semantics we are following right now, although we should fix that in the future |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #12643 +/- ##
============================================
- Coverage 61.75% 61.46% -0.29%
+ Complexity 207 198 -9
============================================
Files 2436 2452 +16
Lines 133233 133840 +607
Branches 20636 20766 +130
============================================
- Hits 82274 82265 -9
- Misses 44911 45413 +502
- Partials 6048 6162 +114
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
nice thanks!
Ie
can be transformed in
But
Should not be transformed because the ELSE case corresponds to a potential +1 in the count that cannot be expressed with the filter. It's implemented here: Just want to make sure I'll check myself. |
Could confirm that the use cases I mentioned above work thanks
passes thanks |
@cyrilou242 Given you created these tests... can you add them in a PR? Anyway remember that there is a bug related to FILTER expressions. See #12647 |
This PR adds a rule that transforms queries like
SELECT SUM(CASE WHEN col1 = 'a' THEN 1 ELSE 0 END) FROM a
intoSELECT SUM(1) FILTER (WHERE col1 = 'a') from a
.A test is added to verify explain is compatible, although in that case more optimizations are applied and we ended up executing something like
SELECT COUNT(col1) from a where col1 = 'a'
We think this optimization should be effective in most of the cases, specially applying indexes, although it may be some situations where the result is worse than expected because
FILTER
is sometimes slower thanCASE
. We plan to add in a future an option or hint to indicate Pinot to do not apply this (and other) optimizations.