[Enhancement] Optimize Pulsar Java client zlib compression performance on Java 11+ by passing direct buffers #23586
Labels
type/enhancement
Search before asking
Motivation
Here's an example of CompressionCodecZLib, which has several opportunities for optimization:
pulsar/pulsar-common/src/main/java/org/apache/pulsar/common/compression/CompressionCodecZLib.java
Lines 60 to 85 in 82237d3
Solution
The java.util.zip.Deflater class has provided methods for using ByteBuffer input and output since Java 11. On Java 11+, the code could be optimized by using them.
Since the Pulsar Java client supports Java 8+, using the ByteBuffer methods would require reflection (unless a multi-release jar file is used with separate classes for Java 8 and Java 11). There's an example of reflection used in a different situation in BookKeeper's Java9IntHash class.
Regarding performance on Java 11+, the first problem is that the current code uses a heap buffer for the compressed buffer. A direct buffer would be a better fit when using the ByteBuffer methods with Deflater.
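To make the Java 11+ path concrete, here is a minimal sketch that deflates from a direct ByteBuffer into a direct ByteBuffer using the ByteBuffer overloads of Deflater and Inflater. It is not the Pulsar implementation: the class name DirectBufferZlibSketch, the method names, and the output sizing are assumptions made for illustration, and it calls the Java 11 methods directly rather than through reflection or a multi-release jar.

```java
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch class; requires Java 11+ for the ByteBuffer overloads.
public class DirectBufferZlibSketch {

    // Compresses the remaining bytes of `input` into a new direct buffer.
    // The position of `input` is advanced as the Deflater consumes it.
    static ByteBuffer compress(ByteBuffer input) {
        int len = input.remaining();
        // Assumed, deliberately generous upper bound for the zlib output size.
        ByteBuffer output = ByteBuffer.allocateDirect(len + (len >> 3) + (len >> 6) + 64);
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input); // ByteBuffer overload, Java 11+
            deflater.finish();
            while (!deflater.finished()) {
                int written = deflater.deflate(output); // ByteBuffer overload, Java 11+
                if (written == 0 && !output.hasRemaining()) {
                    throw new IllegalStateException("output buffer too small");
                }
            }
        } finally {
            deflater.end(); // release native zlib resources
        }
        output.flip();
        return output;
    }

    // Decompresses `compressed` into a new direct buffer of the known uncompressed size.
    static ByteBuffer decompress(ByteBuffer compressed, int uncompressedLength) {
        ByteBuffer output = ByteBuffer.allocateDirect(uncompressedLength);
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed); // ByteBuffer overload, Java 11+
            while (!inflater.finished() && output.hasRemaining()) {
                int read = inflater.inflate(output); // ByteBuffer overload, Java 11+
                if (read == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt zlib data", e);
        } finally {
            inflater.end();
        }
        output.flip();
        return output;
    }
}
```

With direct buffers on both sides, the byte[] staging that a heap-based codec goes through is not needed in application code. A real codec would also pool or reuse the Deflater and the output buffer rather than allocating them per call.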
For Netty ByteBuf input, it's possible to achieve zero copy in most cases by using Netty ByteBuf's nioBuffer method. Note that using the nioBuffer method will cause copies when the Netty ByteBuf input is a CompositeByteBuf. Netty doesn't have a good way to achieve zero copy for CompositeByteBuf input. In BookKeeper, there's a solution for checksum calculation in the https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/util/ByteBufVisitor.java class, which can visit all buffer parts to avoid extra copies. A similar solution would be applicable to compression.
Alternatives
No response
Anything else?
No response
Are you willing to submit a PR?