[Bug] Redelivering messages doesn't take dispatcherMaxReadSizeBytes into account in Shared and Key_Shared subscriptions #23505

lhotari · 2024-10-23T06:35:13Z

Search before asking

I searched in the issues and found nothing similar.

Read release policy

I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.

Version

all released versions

Minimal reproduce step

Problem description:
In the Shared subscription, messages get added to the replay queue when a consumer disconnects. In Key_Shared subscription, the replay queue is also used when messages cannot be dispatched to a target consumer due to insufficient permits or when the hash is blocked.
There's a problem in the current implementation: the dispatcherMaxReadSizeBytes (default 5MB) setting isn't taken into account in these reads. As a result, consumers will receive a large batch of messages at once if the reads succeed. The exact implications aren't fully known at this time. However, ignoring the dispatcherMaxReadSizeBytes setting goes against the design: the setting helps each dispatcher make smaller, incremental progress so that all active dispatchers in the broker are serviced one by one as fairly as possible.
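
To illustrate the kind of limit described above, here is a minimal sketch of capping a replay read by a byte budget. It is not Pulsar's implementation: the class ReplayReadLimiter, the method entriesToRead, and the idea of using per-entry size estimates are hypothetical and exist only for illustration. The sketch also assumes that the first entry is always read, so that a single oversized entry cannot stall redelivery.

```java
// Hypothetical helper, not part of Pulsar: shows one way a byte budget such as
// dispatcherMaxReadSizeBytes could bound how many replay entries are read at once.
final class ReplayReadLimiter {

    private ReplayReadLimiter() {
    }

    /**
     * Returns how many leading entries of a replay batch fit into the byte budget.
     *
     * @param estimatedSizes   assumed per-entry size estimates in bytes, in replay order
     * @param maxEntries       upper bound on the number of entries to read
     * @param maxReadSizeBytes byte budget for a single read
     */
    static int entriesToRead(long[] estimatedSizes, int maxEntries, long maxReadSizeBytes) {
        int count = 0;
        long bytes = 0;
        for (long size : estimatedSizes) {
            if (count >= maxEntries) {
                break;
            }
            // Assumption of this sketch: always allow the first entry so that
            // redelivery keeps making progress even if one entry exceeds the budget.
            if (count > 0 && bytes + size > maxReadSizeBytes) {
                break;
            }
            bytes += size;
            count++;
        }
        return count;
    }
}
```

With the 5MB default mentioned above, a call such as ReplayReadLimiter.entriesToRead(sizes, 100, 5L * 1024 * 1024) would stop adding entries once the cumulative estimate would exceed the budget, leaving the remaining entries in the replay queue for a later read.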

What did you expect to see?

That dispatcherMaxReadSizeBytes is used when redelivering messages.

What did you see instead?

dispatcherMaxReadSizeBytes is ignored.

Anything else?

No response

Are you willing to submit a PR?

I'm willing to submit a PR!

ZhaoGuorui666 · 2024-10-23T12:57:31Z

Hi lhotari,

I have been looking into this issue today and noticed that you have self-assigned it. I would like to discuss the following points with you:

Where is the dispatcherMaxReadSizeBytes limit applied in the code?

Before inserting into the replay queue?
Within the readMoreEntries() method? (However, I noticed that the calculateToRead() method already imposes a limit at the beginning)

Looking forward to your PR and learning from your approach to resolving this issue.

Thank you!

lhotari · 2024-10-23T13:09:35Z

@ZhaoGuorui666 I'll share details, possibly tomorrow. Just curious, are you facing this issue?

ZhaoGuorui666 · 2024-10-23T13:17:41Z

No, I'm just trying to solve an issue and want to contribute to Pulsar. I'm just starting out, so I'll take a look at your previous PRs and learn from them.

lhotari · 2024-10-23T13:49:05Z

@ZhaoGuorui666 One good way to start making valuable contributions is to fix flaky tests. We have plenty of them: https://github.com/apache/pulsar/issues?q=is%3Aissue+is%3Aopen+flaky . In many cases, there could also be a production code issue that is causing the flakiness. You'll learn a lot about Pulsar while addressing flaky tests too. Usually you can reproduce a flaky test by temporarily replacing @Test with @Test(invocationCount = 20) so that the same method gets run multiple times. In certain cases, other methods are required for reproducing locally.

lhotari · 2024-10-23T13:51:58Z

@ZhaoGuorui666 To prioritize flaky tests, you can check one of the recent reports in https://github.com/lhotari/pulsar-flakes. I triggered new runs to get reports of the most flaky tests in the last 2 weeks (runs: https://github.com/lhotari/pulsar-flakes/actions).

lhotari · 2024-10-23T13:53:43Z

List of flaky tests to address: https://github.com/lhotari/pulsar-flakes/tree/master/2024-10-23-14d-master

ZhaoGuorui666 · 2024-10-23T14:05:28Z

Thank you for your suggestion. This method sounds very helpful.

lhotari added the type/bug label Oct 23, 2024

lhotari added this to the 4.1.0 milestone Oct 23, 2024

lhotari self-assigned this Oct 23, 2024

lhotari mentioned this issue Nov 7, 2024

[fix][broker] Fix reading entries failed due to max in-flight reading #23524

Closed

4 tasks
