This document provides a preview of new features in Apache Pulsar 2.5.0, including transactional streaming, sticky consumers, batch receiving, and namespace change events. It also discusses messaging semantics like at least once, at most once, and effectively once delivery. Transactional streaming allows atomic multi-topic publishes and acknowledgments. Sticky consumers improve partitioning for key-based topics. Batch receiving allows consuming messages in batches. Namespace change events provide notifications of namespace changes.
5. Messaging semantics - 4
Limitations in effectively once
1. Only works when producing to a single partition
2. Only works when producing a single message
3. Only works when consuming from a single partition
4. Consumers are required to store the message ID and state for restoring
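Limitation 4 can be illustrated with a minimal in-memory sketch. It assumes message IDs are plain increasing `long` values and that the consumer's state is a running sum; both are simplifications for illustration, and `RestoreSketch` and `Snapshot` are hypothetical names, not Pulsar classes. The consumer persists the last applied message ID together with its state, so that redelivered messages can be skipped after a restore.

```java
/** Toy model of a consumer that stores message ID plus state; not Pulsar client code. */
class RestoreSketch {

    /** What the consumer must persist: last applied message ID and the derived state. */
    static final class Snapshot {
        final long lastMessageId;
        final long sum;

        Snapshot(long lastMessageId, long sum) {
            this.lastMessageId = lastMessageId;
            this.sum = sum;
        }
    }

    /** Applies {id, value} pairs to the snapshot, skipping IDs that were already applied. */
    static Snapshot apply(Snapshot s, long[][] messages) {
        long last = s.lastMessageId;
        long sum = s.sum;
        for (long[] m : messages) {
            if (m[0] <= last) {
                continue; // redelivered after a crash: already reflected in the state
            }
            sum += m[1];
            last = m[0];
        }
        return new Snapshot(last, sum);
    }

    public static void main(String[] args) {
        Snapshot s = apply(new Snapshot(-1, 0), new long[][] {{0, 10}, {1, 20}});
        // After a crash, message 1 is redelivered along with the new message 2.
        s = apply(s, new long[][] {{1, 20}, {2, 5}});
        System.out.println(s.lastMessageId + " " + s.sum); // prints "2 35"
    }
}
```

Without the stored message ID, the redelivered message would be added a second time and the state would no longer be effectively once.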
6. Streaming processing - 1
[Diagram: Topic-1 (A), f(A), Topic-2 (B)]
1. Receive message A from Topic-1 and do some processing
7. Streaming processing - 2
[Diagram: Topic-1 (A), f(A), Topic-2 (B)]
2. Write the result message B to Topic-2
8. Streaming processing - 3
[Diagram: Topic-1 (A), f(A), Topic-2 (B)]
3. Get the send response from Topic-2
How to handle a send response timeout or a consumer/function crash?
Ack message A = at most once
Nack message A = at least once
9. Streaming processing - 4
[Diagram: Topic-1 (A), f(A), Topic-2 (B)]
4. Ack message A
How to handle a failed ack or a consumer/function crash?
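The trade-off in steps 3 and 4 can be reproduced with a small in-memory simulation. This is a sketch under stated assumptions, not Pulsar client code: `DeliverySemanticsSketch`, `OnTimeout`, and `run` are hypothetical names, Topic-2 is a plain list, and the first send attempt always appears to time out (either B was never stored, or B was stored and only the response was lost).

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory stand-in for the Topic-1, f(A), Topic-2 pipeline; not Pulsar client code. */
class DeliverySemanticsSketch {

    /** What the function does with message A when the send of B times out. */
    enum OnTimeout { ACK, NACK }

    /**
     * Processes message A until it is acknowledged and returns the contents of Topic-2.
     * firstSendStored says whether the broker actually stored B on the timed-out attempt.
     */
    static List<String> run(OnTimeout policy, boolean firstSendStored) {
        List<String> topic2 = new ArrayList<>();
        boolean done = false;
        int attempt = 0;
        while (!done) {
            attempt++;
            String b = "f(A)";                  // step 1: receive A and process it
            boolean timedOut = (attempt == 1);  // only the first send response is lost
            if (!timedOut || firstSendStored) {
                topic2.add(b);                  // step 2: B reaches Topic-2
            }
            if (!timedOut || policy == OnTimeout.ACK) {
                done = true;                    // step 4: A is acked and never seen again
            }
            // Otherwise A was nacked, so it is redelivered and the loop runs again.
        }
        return topic2;
    }

    public static void main(String[] args) {
        System.out.println(run(OnTimeout.ACK, false));  // prints "[]"
        System.out.println(run(OnTimeout.NACK, true));  // prints "[f(A), f(A)]"
    }
}
```

Acking on timeout can lose B entirely (at most once), while nacking can write B twice (at least once). Neither policy alone gives exactly one B in every case.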
10. Transactional streaming semantics
1. Atomic multi-topic publish and acknowledge
2. A message is dispatched to only one consumer until the transaction aborts
3. Only committed messages can be read by consumers
READ_COMMITTED
https://github.com/apache/pulsar/wiki/PIP-31%3A-Transaction-Support
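Points 1 and 3 can be sketched with a toy in-memory model. This is an illustration only and does not use or mirror the Pulsar transaction API described in PIP-31: `TxnSketch`, `Txn`, `readCommitted`, and `isAcked` are hypothetical names, and point 2 (exclusive dispatch until abort) is not modeled. Publishes and acks are buffered inside the transaction and become visible together on commit, or are discarded together on abort.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of atomic multi-topic publish and acknowledge; not the Pulsar API. */
class TxnSketch {
    // Committed messages per topic: all that a READ_COMMITTED consumer can see.
    private final Map<String, List<String>> committed = new HashMap<>();
    // Messages acknowledged by committed transactions.
    private final List<String> acked = new ArrayList<>();

    /** Buffers publishes and acks until commit() or abort(). */
    class Txn {
        private final Map<String, List<String>> pendingPublishes = new HashMap<>();
        private final List<String> pendingAcks = new ArrayList<>();

        void publish(String topic, String msg) {
            pendingPublishes.computeIfAbsent(topic, t -> new ArrayList<>()).add(msg);
        }

        void ack(String msg) {
            pendingAcks.add(msg);
        }

        /** Makes every buffered publish and ack visible in one step. */
        void commit() {
            pendingPublishes.forEach((topic, msgs) ->
                    committed.computeIfAbsent(topic, t -> new ArrayList<>()).addAll(msgs));
            acked.addAll(pendingAcks);
            clear();
        }

        /** Drops every buffered publish and ack. */
        void abort() {
            clear();
        }

        private void clear() {
            pendingPublishes.clear();
            pendingAcks.clear();
        }
    }

    Txn begin() {
        return new Txn();
    }

    List<String> readCommitted(String topic) {
        return new ArrayList<>(committed.getOrDefault(topic, new ArrayList<>()));
    }

    boolean isAcked(String msg) {
        return acked.contains(msg);
    }
}
```

For the streaming example, the function would publish B to Topic-2 and ack A inside one transaction: before commit a reader of Topic-2 sees nothing and A is still unacknowledged, and after commit both effects appear together.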
22. Compatibility strategy evolution
Backward
version 2 can read version 1
version 1 can read version 0
version 2 may not be able to read version 0
Backward Transitive
version 2 can read version 1
version 1 can read version 0
version 2 can read version 0
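The difference between the two strategies can be sketched with a deliberately simplified compatibility rule. The rule used here (a reader can read a writer's data if every reader field without a default also exists in the writer's schema) is an assumption for illustration and is much coarser than real Avro schema resolution; `CompatSketch` and `ToySchema` are hypothetical names.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy backward vs. backward-transitive check; far simpler than Avro's real rules. */
class CompatSketch {

    /** Toy schema: fields split into those without and with a default value. */
    static final class ToySchema {
        final Set<String> required = new HashSet<>();
        final Set<String> defaulted = new HashSet<>();

        ToySchema required(String field) {
            required.add(field);
            return this;
        }

        ToySchema defaulted(String field) {
            defaulted.add(field);
            return this;
        }

        boolean has(String field) {
            return required.contains(field) || defaulted.contains(field);
        }
    }

    /** Simplified rule: every reader field without a default must exist in the writer. */
    static boolean canRead(ToySchema reader, ToySchema writer) {
        for (String field : reader.required) {
            if (!writer.has(field)) {
                return false;
            }
        }
        return true;
    }

    /** Backward: the candidate only has to read the latest existing version. */
    static boolean backward(ToySchema candidate, List<ToySchema> history) {
        return history.isEmpty() || canRead(candidate, history.get(history.size() - 1));
    }

    /** Backward transitive: the candidate has to read every existing version. */
    static boolean backwardTransitive(ToySchema candidate, List<ToySchema> history) {
        for (ToySchema old : history) {
            if (!canRead(candidate, old)) {
                return false;
            }
        }
        return true;
    }
}
```

As a usage example, take version 0 with a required `name`, version 1 adding `age` with a default, and version 2 making `age` required. Version 2 passes the backward check against the history of versions 0 and 1 because it only has to read version 1, but it fails the backward transitive check because version 0 has no `age` field.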
23. Evolution of the situation
Version 1
class Person {
    @Nullable
    String name;
}
Version 2
class Person {
    String name;
}
Version 3
class Person {
    @Nullable
    @AvroDefault("\"Zhang San\"")
    String name;
}
[Diagram: arrows between the three versions labeled "Can read", "Can read", and "Can't read"]
24. Compatibility check
Separate schema compatibility checker for producer and consumer
Producer: check whether the schema already exists
Consumer
isAllowAutoUpdateSchema = false
25. Upgrade order
Different strategies call for different upgrade orders
BACKWARD, BACKWARD_TRANSITIVE: upgrade consumers first
FORWARD, FORWARD_TRANSITIVE: upgrade producers first
FULL, FULL_TRANSITIVE: any order
26. Produce messages with different schemas
// The producer's default schema is V1Data; the consumer reads with V2Data.
Producer<V1Data> p = pulsarClient.newProducer(Schema.AVRO(V1Data.class))
        .topic(topic).create();
Consumer<V2Data> c = pulsarClient.newConsumer(Schema.AVRO(V2Data.class))
        .topic(topic)
        .subscriptionName("sub1").subscribe();
// Send with the default schema, then override the schema per message.
p.newMessage().value(data1).send();
p.newMessage(Schema.AVRO(V2Data.class)).value(data2).send();
p.newMessage(Schema.AVRO(V1Data.class)).value(data3).send();
// The consumer receives all three messages typed as V2Data.
Message<V2Data> msg1 = c.receive();
V2Data msg1Value = msg1.getValue();
Message<V2Data> msg2 = c.receive();
Message<V2Data> msg3 = c.receive();
V2Data msg3Value = msg3.getValue();
41. Ordering
Guaranteed ordering
Multi-tenancy
A single cluster can support many tenants and use cases
High throughput
Can reach 1.8 M messages/s in a single partition
Durability
Data replicated and synced to disk
Geo-replication
Out-of-the-box support for geographically distributed applications
Unified messaging model
Supports both streaming and queuing
Delivery guarantees
At least once, at most once, and effectively once
Low latency
Low publish latency of 5 ms
Highly scalable & available
Can support millions of topics
HA
KoP Now