Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add protobuf codegen decoder #12980

Merged
merged 2 commits into from
May 17, 2024
Merged

Add protobuf codegen decoder #12980

merged 2 commits into from
May 17, 2024

Conversation

rseetham
Copy link
Contributor

@rseetham rseetham commented Apr 22, 2024

resolves #12679

Design doc: https://docs.google.com/document/d/186mVLISgYEmljla3-IHrFNJ2dGMJ21qJ-MWFtsx-ayI/

Test resources have the generated code we expect for the different test proto files.

@codecov-commenter
Copy link

codecov-commenter commented Apr 22, 2024

Codecov Report

Attention: Patch coverage is 0% with 324 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (59551e4) to head (6ed7642).
Report is 1278 commits behind head on master.

Files with missing lines Patch % Lines
...n/inputformat/protobuf/codegen/MessageCodeGen.java 0.00% 178 Missing ⚠️
...ugin/inputformat/protobuf/ProtoBufSchemaUtils.java 0.00% 54 Missing ⚠️
...not/plugin/inputformat/protobuf/ProtoBufUtils.java 0.00% 54 Missing ⚠️
...format/protobuf/ProtoBufCodeGenMessageDecoder.java 0.00% 38 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (59551e4) and HEAD (6ed7642). Click for more details.

HEAD has 43 uploads less than BASE
Flag BASE (59551e4) HEAD (6ed7642)
integration 7 3
temurin 12 3
java-21 7 2
skip-bytebuffers-true 3 1
skip-bytebuffers-false 7 2
unittests 5 0
unittests1 2 0
java-11 5 1
unittests2 3 0
integration1 2 0
custom-integration1 2 0
Additional details and impacted files
@@              Coverage Diff              @@
##             master   #12980       +/-   ##
=============================================
- Coverage     61.75%    0.00%   -61.76%     
=============================================
  Files          2436     2443        +7     
  Lines        133233   134535     +1302     
  Branches      20636    20824      +188     
=============================================
- Hits          82274        0    -82274     
- Misses        44911   134535    +89624     
+ Partials       6048        0     -6048     
Flag Coverage Δ
custom-integration1 ?
integration 0.00% <0.00%> (-0.01%) ⬇️
integration1 ?
integration2 0.00% <0.00%> (ø)
java-11 0.00% <0.00%> (-61.71%) ⬇️
java-21 0.00% <0.00%> (-61.63%) ⬇️
skip-bytebuffers-false 0.00% <0.00%> (-61.75%) ⬇️
skip-bytebuffers-true 0.00% <0.00%> (-27.73%) ⬇️
temurin 0.00% <0.00%> (-61.76%) ⬇️
unittests ?
unittests1 ?
unittests2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rseetham rseetham force-pushed the protbuf_codegen branch 2 times, most recently from 4615a6c to b64cfa1 Compare April 22, 2024 15:08
@rseetham rseetham marked this pull request as ready for review April 22, 2024 16:57
@rseetham rseetham changed the title Add protobuf codegen decoder with unit tests Add protobuf codegen decoder Apr 24, 2024
@rseetham
Copy link
Contributor Author

@KKcorps Could you take a look.

@chenboat
Copy link
Contributor

chenboat commented May 1, 2024

Can you link the overall design doc for this in the summary section?

public static final String NULLABLE_DOUBLE_FIELD = "nullable_double_field";
public static final String NULLABLE_FLOAT_FIELD = "nullable_float_field";
public static final String NULLABLE_BOOL_FIELD = "nullable_bool_field";
public static final String NULLABLE_BYTES_FIELD = "nullable_bytes_field";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are the supported data types if pinot doesn't support the type natively.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See: We support all the protobuf descriptor types at present in the version we are using. If in the future new types get added, the codegen will skip that filed and log an error.
https://github.com/apache/pinot/pull/12980/files#diff-84564c2ff3bb09e23abe31143340f4e5a2aea0613588e1131311cb3ea17d4a5eR159-R177

@rseetham rseetham force-pushed the protbuf_codegen branch 4 times, most recently from 003bf3f to 8bf1d6d Compare May 10, 2024 18:34
Descriptors.FieldDescriptor keyFd =
fd.getMessageType().findFieldByName("key");
Descriptors.FieldDescriptor valueFd =
fd.getMessageType().findFieldByName("value");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

}
}

private static void addFieldToPinotSchema(Schema pinotSchema, FieldSpec.DataType dataType, String name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this code re-used from somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is from AvroUtils and JsonUtils. They both have this function:
AvroUtils
JsonUtils

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add it to a common package and use it. I can do that in a follow up pr.

@rseetham rseetham force-pushed the protbuf_codegen branch 2 times, most recently from 4ce5008 to 2ccb920 Compare May 13, 2024 17:21
Copy link
Contributor

@deemoliu deemoliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. can you ask for another stamp from Ting Chen

@rseetham rseetham force-pushed the protbuf_codegen branch 4 times, most recently from 80bad7c to 47b279b Compare May 15, 2024 15:54
@chenboat chenboat merged commit 5489d5a into apache:master May 17, 2024
22 checks passed
"format" : "1:MILLISECONDS:EPOCH",
"granularity" : "1:MILLISECONDS"
} ]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(format) Please add a extra empty line

@@ -50,6 +50,11 @@
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
</dependency>
<dependency>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any specific reason why we need to override this version? This can potentially cause version conflict.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Only pinot-query-planner is using the org.codehaus.janino dependency. I's also using 3.1.6 version. There's no common version so I specified it here as well

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have pinned its version to 3.1.12 in #12724. Maybe you didn't pull the latest code when merging this

gortiz pushed a commit to gortiz/pinot that referenced this pull request Jun 14, 2024
* Add protobuf codegen decoder with unit tests

* Add utils functions to convert protobuf schema to pinot schema with support for complex type schema
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support decoding protobuf input using code generation to call native methods
5 participants