Enhance Kinesis consumer #12806

Jackie-Jiang · 2024-04-07T23:36:04Z

Do not use a separate thread to fetch Kinesis records (this can fix the potential race condition)
Cache the shard iterator
Return each message batch immediately instead of combining multiple batches (the timeout is ignored)
Change the default max records per fetch to 10,000 (Kinesis default)
Remove some unused dependencies

codecov-commenter · 2024-04-08T00:11:37Z

Codecov Report

Attention: Patch coverage is 72.72727%, with 12 lines in your changes missing coverage. Please review.

Project coverage is 62.24%. Comparing base (59551e4) to head (a8a02e6).
Report is 465 commits behind head on master.

Files	Patch %	Lines
...e/pinot/plugin/stream/kinesis/KinesisConsumer.java	72.50%	8 Missing and 3 partials ⚠️
.../stream/kinesis/KinesisStreamMetadataProvider.java	50.00%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #12806      +/-   ##
============================================
+ Coverage     61.75%   62.24%   +0.49%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2527      +91     
  Lines        133233   138410    +5177     
  Branches      20636    21400     +764     
============================================
+ Hits          82274    86152    +3878     
- Misses        44911    45836     +925     
- Partials       6048     6422     +374

Flag	Coverage Δ
custom-integration1	`<0.01% <0.00%> (-0.01%)`	⬇️
integration	`<0.01% <0.00%> (-0.01%)`	⬇️
integration1	`<0.01% <0.00%> (-0.01%)`	⬇️
integration2	`0.00% <0.00%> (ø)`
java-11	`62.19% <72.72%> (+0.48%)`	⬆️
java-21	`62.12% <72.72%> (+0.50%)`	⬆️
skip-bytebuffers-false	`62.23% <72.72%> (+0.48%)`	⬆️
skip-bytebuffers-true	`62.08% <72.72%> (+34.36%)`	⬆️
temurin	`62.24% <72.72%> (+0.49%)`	⬆️
unittests	`62.23% <72.72%> (+0.49%)`	⬆️
unittests1	`46.73% <100.00%> (-0.16%)`	⬇️
unittests2	`27.93% <68.18%> (+0.20%)`	⬆️

swaminathanmanish · 2024-04-08T21:34:37Z

...tion/pinot-kinesis/src/main/java/org/apache/pinot/plugin/stream/kinesis/KinesisConsumer.java

+
+    // NOTE: Kinesis enforces a limit of 5 getRecords request per second on each shard from AWS end, beyond which we
+    //       start getting ProvisionedThroughputExceededException. Rate limit the requests to avoid this.
+    long currentTimeMs = System.currentTimeMillis();


Do we need our own custom rate limiter here? Does the Kinesis client provide an option to handle this, so that we don't have to carry this logic ourselves?

I didn't find one in the Kinesis client. It seems it will just throw LimitExceededException.
The RPS limit is currently configured on the Pinot side though, so I guess it makes sense to rate limit on the Pinot side.

swaminathanmanish · 2024-04-08T21:43:48Z

...tion/pinot-kinesis/src/main/java/org/apache/pinot/plugin/stream/kinesis/KinesisConsumer.java

    } else {
-      LOGGER.warn(message + ": " + throwable.getMessage());
+      // TODO: Revisit this logic to see if we always miss the first message when consuming from a new shard


Could you add more explanation for this? Why would we miss the first message?

swaminathanmanish · 2024-04-08T21:46:13Z

...tion/pinot-kinesis/src/main/java/org/apache/pinot/plugin/stream/kinesis/KinesisConsumer.java

+    // Read records
+    GetRecordsRequest getRecordRequest =
+        GetRecordsRequest.builder().shardIterator(shardIterator).limit(_config.getNumMaxRecordsToFetch()).build();
+    GetRecordsResponse getRecordsResponse = _kinesisClient.getRecords(getRecordRequest);


This can be empty, right, even if the stream has some data, given how Kinesis works? We'll return a response even if it's empty.

We need a test to verify the behavior. The consumer can handle an empty message batch, but the consumption lag might be set to 0 because it thinks there are no more messages. Added a TODO to revisit.

Good point. We can also add a metric to track this when it happens.
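
One possible way to keep an empty batch from being read as "caught up" is to check it against the lag the service reports. The Kinesis GetRecords response carries a MillisBehindLatest value; the sketch below assumes the caller passes that value in. `EmptyBatchTracker` and its counters are hypothetical names for illustration, not part of this PR, and the counters stand in for whatever metric would actually be emitted.

```java
import java.util.List;

// Hypothetical helper, not part of this PR: decides whether an empty batch
// really means "caught up", and counts empty batches so they can be reported.
final class EmptyBatchTracker {
  private long _numEmptyBatches;
  private long _numEmptyBatchesWhileBehind;

  // Returns true only when the batch is empty AND the reported lag is zero.
  // millisBehindLatest is assumed to come from the GetRecords response.
  boolean isCaughtUp(List<?> records, long millisBehindLatest) {
    if (!records.isEmpty()) {
      return false;
    }
    _numEmptyBatches++;
    if (millisBehindLatest > 0) {
      // Empty batch although the shard still has data ahead of this position.
      _numEmptyBatchesWhileBehind++;
      return false;
    }
    return true;
  }

  long getNumEmptyBatches() {
    return _numEmptyBatches;
  }

  long getNumEmptyBatchesWhileBehind() {
    return _numEmptyBatchesWhileBehind;
  }
}
```

With something like this, the consumption lag would only be reset to 0 when `isCaughtUp` returns true, and the second counter would show how often Kinesis returns an empty batch while data is still pending.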

swaminathanmanish

LGTM other than clarifications.

swaminathanmanish · 2024-04-09T00:14:24Z

...tion/pinot-kinesis/src/main/java/org/apache/pinot/plugin/stream/kinesis/KinesisConsumer.java

+    // Read records
+    GetRecordsRequest getRecordRequest =
+        GetRecordsRequest.builder().shardIterator(shardIterator).limit(_config.getNumMaxRecordsToFetch()).build();
+    GetRecordsResponse getRecordsResponse = _kinesisClient.getRecords(getRecordRequest);


Good point. We can also add a metric to track this when it happens.

swaminathanmanish · 2024-04-09T00:21:43Z

...tion/pinot-kinesis/src/main/java/org/apache/pinot/plugin/stream/kinesis/KinesisConsumer.java

+   * Kinesis enforces a limit of 5 getRecords request per second on each shard from AWS end, beyond which we start
+   * getting {@link ProvisionedThroughputExceededException}. Rate limit the requests to avoid this.
+   */
+  private void rateLimitRequests() {


Thanks for creating a separate method. I guess since this is a special kind of rate limiter that needs to block until we are ready to fetch again, we cannot leverage off-the-shelf ones like Guava's.

If Kinesis has a limit, don't we need to adhere to that limit? So does getRpsLimit() need to match the Kinesis limit?

The Kinesis limit is not very straightforward, so I guess we need to iterate on this to get the best settings.

swaminathanmanish · 2024-04-09T00:23:02Z

...tion/pinot-kinesis/src/main/java/org/apache/pinot/plugin/stream/kinesis/KinesisConsumer.java

+    long currentTimeMs = System.currentTimeMillis();
+    int currentTimeSeconds = (int) TimeUnit.MILLISECONDS.toSeconds(currentTimeMs);
+    if (currentTimeSeconds == _currentSecond) {
+      if (_numRequestsInCurrentSecond == _config.getRpsLimit()) {


This can be done later. A log.info or metric would help debug if rate limiting becomes an issue.
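
For reference, the per-second check in the diff above can be sketched in isolation roughly as follows. This is a minimal sketch under stated assumptions, not the merged code: the class name `FixedWindowRateLimiter`, the split into `tryAcquire` and `acquire`, and the throttle counter (standing in for the suggested log line or metric) are all made up for illustration.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a blocking, per-second request limiter.
// Not thread-safe: it assumes a single consumer thread calls it.
final class FixedWindowRateLimiter {
  private final int _rpsLimit;
  private long _currentSecond = -1;
  private int _numRequestsInCurrentSecond;
  private long _numThrottled;

  FixedWindowRateLimiter(int rpsLimit) {
    if (rpsLimit <= 0) {
      throw new IllegalArgumentException("rpsLimit must be positive");
    }
    _rpsLimit = rpsLimit;
  }

  // Returns 0 and counts the request if it fits in the current second;
  // otherwise returns the milliseconds left until the next second starts.
  long tryAcquire(long nowMs) {
    long second = TimeUnit.MILLISECONDS.toSeconds(nowMs);
    if (second != _currentSecond) {
      // A new one-second window has started, so reset the counter.
      _currentSecond = second;
      _numRequestsInCurrentSecond = 0;
    }
    if (_numRequestsInCurrentSecond < _rpsLimit) {
      _numRequestsInCurrentSecond++;
      return 0;
    }
    _numThrottled++;
    return TimeUnit.SECONDS.toMillis(second + 1) - nowMs;
  }

  // Blocks the calling thread until a request is allowed.
  void acquire() throws InterruptedException {
    long waitMs;
    while ((waitMs = tryAcquire(System.currentTimeMillis())) > 0) {
      Thread.sleep(waitMs);
    }
  }

  // Number of times a request had to wait; could back a log line or metric.
  long getNumThrottled() {
    return _numThrottled;
  }
}
```

Taking the time as a parameter in `tryAcquire` keeps the window logic testable without sleeping; `acquire` is the blocking variant a consumer would call before each fetch.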

KKcorps · 2024-04-30T11:15:56Z

...tion/pinot-kinesis/src/main/java/org/apache/pinot/plugin/stream/kinesis/KinesisConsumer.java

+    // Get the shard iterator
+    String shardIterator;
+    if (startSequenceNumber.equals(_nextStartSequenceNumber)) {
+      shardIterator = _nextShardIterator;


We will need to handle the case here where nextShardIterator has expired (a shard iterator has a time limit of 5 minutes).
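
One way to cover the expiry case is to remember when the iterator was cached and fall back to fetching a fresh one once it is too old. The sketch below is an assumption-laden illustration, not what this PR implements: `CachedShardIterator`, `MAX_AGE_MS` (a 4-minute safety margin under the 5-minute limit), and the `fetchIterator` callback are hypothetical. An alternative would be to catch the Kinesis ExpiredIteratorException on the read and re-fetch then.

```java
import java.util.function.Function;

// Hypothetical sketch: caches the next shard iterator together with the time it
// was obtained, and refuses to reuse it once it is close to the 5-minute expiry.
final class CachedShardIterator {
  // Assumption: reuse only within 4 minutes to stay safely under the limit.
  static final long MAX_AGE_MS = 4 * 60 * 1000L;

  private String _nextStartSequenceNumber;
  private String _nextShardIterator;
  private long _cachedAtMs;

  // Returns the cached iterator when it matches the requested start sequence
  // number and is still fresh; otherwise asks fetchIterator for a new one.
  String get(String startSequenceNumber, long nowMs, Function<String, String> fetchIterator) {
    boolean matches = startSequenceNumber.equals(_nextStartSequenceNumber);
    boolean fresh = nowMs - _cachedAtMs < MAX_AGE_MS;
    if (matches && fresh && _nextShardIterator != null) {
      return _nextShardIterator;
    }
    return fetchIterator.apply(startSequenceNumber);
  }

  // Records the iterator returned by the latest read, stamped with the current time.
  void update(String nextStartSequenceNumber, String nextShardIterator, long nowMs) {
    _nextStartSequenceNumber = nextStartSequenceNumber;
    _nextShardIterator = nextShardIterator;
    _cachedAtMs = nowMs;
  }
}
```

In a real consumer, `fetchIterator` would wrap the client call that requests a shard iterator for the given sequence number, and `update` would be called after each successful read.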

KKcorps

LGTM! Have tested it out and the changes work well for kinesis.

* Enhance Kinesis consumer
* Simplify the handling
* Address comments

Jackie-Jiang added enhancement ingestion bugfix refactor labels Apr 7, 2024

Jackie-Jiang requested a review from KKcorps April 7, 2024 23:36

Jackie-Jiang added the real-time label Apr 7, 2024

Jackie-Jiang removed the enhancement label Apr 8, 2024

Jackie-Jiang force-pushed the enhance_kinesis_consumer branch 2 times, most recently from cbb2faf to d588079 Compare April 8, 2024 18:27

swaminathanmanish reviewed Apr 8, 2024

Jackie-Jiang force-pushed the enhance_kinesis_consumer branch from d588079 to 597e8ee Compare April 8, 2024 22:24

swaminathanmanish reviewed Apr 9, 2024

Jackie-Jiang force-pushed the enhance_kinesis_consumer branch from 597e8ee to 1f5d462 Compare April 9, 2024 22:05

KKcorps mentioned this pull request Apr 26, 2024

Move offset validation logic to consumer classes #13015

Merged

KKcorps reviewed Apr 30, 2024

Jackie-Jiang force-pushed the enhance_kinesis_consumer branch from 1f5d462 to bd15dac Compare May 3, 2024 23:42

Jackie-Jiang force-pushed the enhance_kinesis_consumer branch from bd15dac to 4d71bf3 Compare May 11, 2024 05:50

Jackie-Jiang mentioned this pull request May 18, 2024

Add support to track if offset needs to honoured or not in Kinesis #13112

Open

Jackie-Jiang added 3 commits May 18, 2024 16:31

Enhance Kinesis consumer

3b323ae

Simplify the handling

c89b64c

Address comments

a8a02e6

Jackie-Jiang force-pushed the enhance_kinesis_consumer branch from 4d71bf3 to a8a02e6 Compare May 18, 2024 23:31

KKcorps approved these changes May 20, 2024

KKcorps merged commit 9e1246d into apache:master May 20, 2024
21 checks passed

Jackie-Jiang deleted the enhance_kinesis_consumer branch May 20, 2024 18:27

gortiz pushed a commit to gortiz/pinot that referenced this pull request Jun 14, 2024

Enhance Kinesis consumer (apache#12806)

dc8eb66

* Enhance Kinesis consumer
* Simplify the handling
* Address comments

Jackie-Jiang mentioned this pull request Nov 15, 2024

[flakytest] timeout in fork most likely on RealtimeKinesisIntegrationTest #11135

Closed
