Search before asking
I searched in the issues and found nothing similar.
Read release policy
I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.
Version
Linux, java 21.0.4, from apachepulsar/pulsar@sha256:51e92bc45ba495b9753585b12625579ddaeea8e02d08aecd336ceff26e1099f1
environment:zookeeper.version=3.9.2-e454e8c7283100c7caec6dcae2bc82aaecb63023, built on 2024-02-12 20:59 UTC
Pulsar Broker service; version: '4.0.0' Git Revision 92448d5
Client is also at 4.0.0
Minimal reproduce step
Create a partitioned topic with 5 partitions.
Add a subscription and 5 Key_Shared consumers, running in parallel.
Start a producer producing messages with 20 different keys.
Increase the number of partitions to 10.
Restart the producer.
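For reference, a minimal sketch of these steps with the 4.0.0 Java client and admin API could look like the following; the service URLs, topic name, subscription name, and message counts are placeholders rather than the exact reproducer I ran:

import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.client.api.*;

public class KeySharedRepartitionRepro {
    public static void main(String[] args) throws Exception {
        String topic = "persistent://public/default/key-shared-repartition"; // placeholder topic

        PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080")   // assumed admin endpoint
                .build();
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")     // assumed broker endpoint
                .build();

        // 1. Create a partitioned topic with 5 partitions.
        admin.topics().createPartitionedTopic(topic, 5);

        // 2. Attach 5 Key_Shared consumers to the same subscription.
        for (int i = 0; i < 5; i++) {
            final int id = i;
            client.newConsumer(Schema.STRING)
                    .topic(topic)
                    .subscriptionName("repro-sub")
                    .subscriptionType(SubscriptionType.Key_Shared)
                    .consumerName("consumer-" + id)
                    .messageListener((c, msg) -> {
                        System.out.printf("[Consumer %d] received %s (key %s)%n",
                                id, msg.getValue(), msg.getKey());
                        c.acknowledgeAsync(msg);
                    })
                    .subscribe();
        }

        // 3. Produce messages spread across 20 different keys.
        Producer<String> producer = client.newProducer(Schema.STRING).topic(topic).create();
        for (int i = 0; i < 1000; i++) {
            producer.newMessage().key("key-" + (i % 20)).value("MSG " + i).send();
        }
        producer.close();

        // 4. Increase the number of partitions to 10.
        admin.topics().updatePartitionedTopic(topic, 10);

        // 5. "Restart" the producer and keep publishing the same keys.
        Producer<String> producer2 = client.newProducer(Schema.STRING).topic(topic).create();
        for (int i = 1000; i < 2000; i++) {
            producer2.newMessage().key("key-" + (i % 20)).value("MSG " + i).send();
        }
        producer2.close();

        Thread.sleep(10_000); // give consumers time to drain before exiting
        client.close();
        admin.close();
    }
}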
What did you expect to see?
Each consumer should be in charge of a portion of the key space; once a key has been attributed to a consumer, that assignment should remain stable.
What did you see instead?
The following sequence:
[Consumer 4] In charge of key 7 # first time Consumer 4 sees key 7
[Consumer 4] received MSG 7.1 (key 7)
[Consumer 4] received MSG 7.2 (key 7)
[Consumer 4] received MSG 7.3 (key 7)
...
<--- REPARTITION --->
<--- PRODUCER RESTART --->
[Consumer 4] received MSG 7.10 (key 7)
[Consumer 3] In charge of key 7 # oops... consumer 4 is still ongoing
[Consumer 3] received MSG 7.11 (key 7)
[Consumer 3] received MSG 7.12 (key 7)
[Consumer 3] received MSG 7.13 (key 7)
[Consumer 4] received MSG 7.14 (key 7)
[Consumer 4] received MSG 7.15 (key 7)
...
Anything else?
As @lhotari pointed out, this could be solved by offering a more advanced automated procedure for re-partitioning. While shutting down all producers and waiting for all messages to be consumed would solve the problem, I wonder how applicable this is in real life...
Are you willing to submit a PR?
I'm willing to submit a PR!
This isn't a bug, since Pulsar doesn't guarantee that message order is retained across re-partitioning.
This issue isn't directly related to Key_Shared. When the number of partitions changes, the assignment of keys to specific partitions changes as well. Before changing the number of partitions, you first have to stop all producers and consume all remaining messages. After that, you stop the consumers. Then you can change the number of partitions, start the consumers, and finally the producers.
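To make that concrete, here is a hedged sketch of the manual procedure using the Java admin/client API; the method name, topic, and collections are illustrative, not an existing helper:

import java.util.List;
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.Producer;

public class SafeRepartition {
    /**
     * Manual re-partitioning procedure: stop producers, drain and stop
     * consumers, update partitions, then restart consumers before producers.
     */
    static void repartition(PulsarAdmin admin,
                            String topic,
                            int newPartitions,
                            List<Producer<?>> producers,
                            List<Consumer<?>> consumers) throws Exception {
        // 1. Stop all producers so no new messages arrive.
        for (Producer<?> p : producers) {
            p.close();
        }

        // 2. Wait until the subscription backlog is fully consumed
        //    (e.g. by polling admin.topics().getPartitionedStats(topic, false)),
        //    then stop the consumers.
        for (Consumer<?> c : consumers) {
            c.close();
        }

        // 3. Change the number of partitions.
        admin.topics().updatePartitionedTopic(topic, newPartitions);

        // 4. Restart the consumers first, then the producers, so the new
        //    key-to-partition mapping is in place before publishing resumes.
    }
}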
This issue report could be changed into an enhancement request about key-ordered message delivery during re-partitioning. An initial way to address it would be to document how to handle such cases, as described above. Beyond that, there could be advanced support for handling this in an automated way.
A broker-side feature for pausing message delivery (dispatching) and message publishing would be useful for building an automated solution that maintains key ordering during re-partitioning. Such a pause feature has been discussed on the Apache Pulsar Slack.