Building Zhaopin’s enterprise
Event Center on Apache Pulsar
Penghui Li
(@lipenghui6)
Jia Zhai
(@Jia_Zhai)
Zhaopin.com
Zhaopin.com is the biggest online recruitment service
provider in China
Zhaopin.com provides job seekers with a comprehensive resume service, the latest
employment and career development information, and in-depth
online job search for positions throughout China.
Zhaopin.com provides professional HR services to over 2.2 million clients, and its
average daily pageviews exceed 68 million.
Who we are
❏ Penghui Li
❏ Tech lead of Infrastructure
team at Zhaopin
❏ 5+ years of experience in
messaging and microservices
❏ Apache Pulsar Committer
Who we are
❏ Jia Zhai
❏ Pulsar PMC Member / Committer
❏ BookKeeper PMC Member /
Committer
❏ Founding engineer at
StreamNative
Agenda
❏ Why build an Event Center
❏ Why Apache Pulsar
❏ Apache Pulsar at Zhaopin
❏ Streaming Platform
❏ Zhaopin’s contributions to Apache Pulsar
Why build an Event Center
Data Silos -> Unified Platform
Data Silos
❏ High Maintenance Cost
❏ Extremely hard to scale data across
teams
❏ Inconsistency between data silos
❏ Doesn’t scale
❏ No consistent SLA
Pain Points
To Enterprises
MSMQ
Data Processing
Kafka
To End Users
RabbitMQ
Unification - MQService
Problems Solved
❏ Simplified Operations
❏ Scale-out Service
❏ High Availability
Problems Unsolved
❏ Keep messages for longer period
❏ Data rewind
❏ Order guarantee
Unification - MQService
Online Services
MQService
Data Processing
Kafka
Why build an Event Center
Why Apache Pulsar
Pulsar == Messaging + Storage
Why Apache Pulsar?
Flexible Pub/Sub Messaging
backed by scalable log storage
Why Apache Pulsar / Multi Tenancy
Why Apache Pulsar / Queuing + Streaming
Why Apache Pulsar / Cloud Native Architecture
Apache Pulsar at Zhaopin
20+ core services, 20 billion events/day
Unification - MQService
Problems Solved
❏ No Data Silos
❏ Queue + Streaming
❏ Disaster Recovery
❏ Infinite Stream Storage (via Tiered Storage)
❏ Data rewind
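The combination of queuing, streaming, and data rewind on one copy of the data can be illustrated with a toy model. This is a minimal in-memory sketch, not the Pulsar client API: the `Topic` class, the `share` helper, and the subscription names are hypothetical, and the round-robin split only approximates how a shared subscription spreads messages across consumers.

```python
from collections import defaultdict


class Topic:
    """Toy in-memory model of a topic as an append-only log (not the Pulsar API)."""

    def __init__(self):
        self.log = []       # retained messages, in publish order
        self.cursors = {}   # subscription name -> index of the next unread message

    def publish(self, message):
        self.log.append(message)

    def subscribe(self, name):
        # In this model a new subscription starts at the beginning of the log.
        self.cursors.setdefault(name, 0)

    def read(self, name):
        """Return all unread messages for a subscription and advance its cursor."""
        start = self.cursors[name]
        self.cursors[name] = len(self.log)
        return self.log[start:]

    def rewind(self, name, position=0):
        """Move a subscription's cursor back so earlier messages are read again."""
        self.cursors[name] = position


def share(messages, consumers):
    """Queue-style delivery: spread messages round-robin across consumers."""
    assignment = defaultdict(list)
    for i, message in enumerate(messages):
        assignment[consumers[i % len(consumers)]].append(message)
    return dict(assignment)


topic = Topic()
for n in range(6):
    topic.publish(f"event-{n}")

topic.subscribe("analytics")   # streaming: one reader, strict order
topic.subscribe("workers")     # queuing: messages split across consumers

stream_view = topic.read("analytics")
queue_view = share(topic.read("workers"), ["worker-a", "worker-b"])

topic.rewind("analytics", position=2)   # replay from the third event
replayed = topic.read("analytics")
```

Both subscriptions read the same stored log, so there is no second copy to keep consistent, and a rewind only moves a cursor rather than republishing data.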
Milestones
Core Metrics
❏ 50+ Namespaces
❏ 5000+ Topics
❏ 20+ billion events/day
❏ 5TB storage per day
❏ 20+ core services
System Metrics
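To put these figures in perspective, the daily totals can be converted into average rates. The sketch below is back-of-the-envelope arithmetic only: it assumes decimal terabytes, uses the lower bound of "20+ billion", and ignores peak load; the slides do not say whether the 5TB figure includes replicas.

```python
SECONDS_PER_DAY = 24 * 60 * 60

events_per_day = 20_000_000_000       # "20+ billion events/day" (lower bound)
storage_bytes_per_day = 5 * 10**12    # "5TB storage per day", assuming decimal TB

# Average throughput if events were spread evenly over the day.
avg_events_per_second = events_per_day / SECONDS_PER_DAY

# Average stored bytes per event implied by the two totals.
avg_bytes_per_event = storage_bytes_per_day / events_per_day

print(f"average rate: {avg_events_per_second:,.0f} events/s")
print(f"average stored size: {avg_bytes_per_event:.0f} bytes/event")
```

Under these assumptions the averages come out to roughly 231,000 events per second and about 250 bytes stored per event.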
Pulsar at Zhaopin
❏ One copy of data, single source-of-truth
❏ Don’t worry about data consistency between RabbitMQ and
Kafka
❏ Multi-tenancy makes topic management easier
❏ Strong data durability allows us to stop worrying about
message loss
Event Streaming Platform
Beyond Pub/Sub Messaging
Event Streaming Platform
❏ Pulsar Functions: lightweight computing
❏ Flink: streaming-first, unified data processing
❏ Pulsar SQL (Presto): interactive queries on both historical and
real-time data
More details
Coming soon at the next ApacheCon :-)
Contribute to Apache Pulsar
The Apache Way
Zhaopin’s Contributions to Apache Pulsar
❏ Client Interceptors
❏ Dead Letter Topic
❏ Time Partitioned Message Tracker
❏ Service Url Provider
❏ Key_Shared Subscription
❏ Pulsar SQL Improvements
❏ Multi-versions Schema Support
❏ HDFS Offloader
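Of the contributions above, Key_Shared Subscription addresses the ordering requirement: messages with the same key go to the same consumer, so per-key order is preserved while different keys are processed in parallel. The sketch below illustrates only that idea. It is not Pulsar's implementation: `pick_consumer` and `dispatch` are hypothetical helpers, and the CRC32-modulo mapping is a stand-in for whatever hashing scheme the broker actually uses.

```python
import zlib
from collections import defaultdict


def pick_consumer(key, consumers):
    """Map a message key to one consumer with a stable hash (illustrative only)."""
    return consumers[zlib.crc32(key.encode("utf-8")) % len(consumers)]


def dispatch(messages, consumers):
    """Deliver (key, payload) pairs so each key always lands on the same consumer."""
    delivered = defaultdict(list)
    for key, payload in messages:
        delivered[pick_consumer(key, consumers)].append((key, payload))
    return dict(delivered)


consumers = ["consumer-1", "consumer-2", "consumer-3"]
messages = [
    ("user-1", "created"), ("user-2", "created"),
    ("user-1", "updated"), ("user-3", "created"),
    ("user-2", "deleted"), ("user-1", "deleted"),
]
delivered = dispatch(messages, consumers)

for consumer, batch in sorted(delivered.items()):
    print(consumer, batch)
```

A plain modulo mapping reshuffles keys whenever the consumer list changes, which is one reason a real broker would need a more careful assignment strategy than this sketch.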
Community
❏ Pulsar Website: https://pulsar.apache.org
❏ Twitter: @apache_pulsar / @streamnativeio
❏ Slack: https://apache-pulsar.herokuapp.com
❏ Mailing Lists
dev@pulsar.apache.org, users@pulsar.apache.org
❏ GitHub
https://github.com/apache/pulsar
❏ Medium
https://medium.com/streamnative
Thanks!
