Pulsar Architectural Patterns for CI/CD
Every pattern shown here has been developed and implemented with my
team at Overstock
Email: dbost@overstock.com
Twitter: DevinBost
LinkedIn: https://www.linkedin.com/in/devinbost/
By Devin Bost, Senior Data Engineer at Overstock
Data-Driven CI/CD Automation for Pulsar Function Flows and Pub/Sub
+
Includes on-prem, AWS, and GCP architectures
Legend & Referenced Technologies
Pulsar Beam
Pulsar Topic
AWS CodePipeline
Pulsar Brokers
Kubernetes
Golang
Amazon S3
CouchDB
ReactJS
Docker
AWS IAM
GCP Cloud Build
GCP IAM
GCP Cloud Storage
Google Cloud Functions
Pulsar Function
Flink Job
Sonatype Nexus
Data + Contract = Modular design
Reusable functions
Might need to manually satisfy the contract at first, until you can get to where the data is originated
Build tool, Artifact Storage
Build data, storage data
(1), (2)
Filter to artifact data
Store
Push to gatekeeping system
Push to deployment pipeline for desired environment
{
  "type": "function",
  "artifactPathOrUrl": "http://path-to-artifact/example-ignite-function-1.0.1-20200125.003935-3-jar-with-dependencies.jar",
  "tenant": "exampleTenant",
  "namespace": "exampleNamespace",
  "name": "exampleIgniteFunction-backfill",
  "className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction",
  "userConfig": {
    "username": "igniteUser",
    "password": "exampleHashedPass",
    "cache_name": "example-ignite-cache-backfill",
    "hosts_with_ports": "igniteserver1.domain.com:10800, igniteserver2.domain.com:10800, igniteserver3.domain.com:10800, igniteserver4.domain.com:10800"
  },
  "inputs": [
    "persistent://feeds/exampleProject/data-to-dump-into-ignite-backfill"
  ],
  "output": "persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite-backfill",
  "logTopic": "persistent://public/default/function-log-topic-backfill"
}
Using the Java Admin API to consume from a Pulsar topic
Pulsar REST
Admin API
Consumer/Producer
{
  "type": "function",
  "artifactPathOrUrl": "http://path-to-artifact/example-ignite-function-1.0.1-20200125.003935-3-jar-with-dependencies.jar",
  "tenant": "exampleTenant",
  "namespace": "exampleNamespace",
  "name": "exampleIgniteFunction",
  "className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction",
  "userConfig": {
    "username": "igniteUser",
    "password": "exampleHashedPass",
    "cache_name": "example-ignite-cache",
    "hosts_with_ports": "igniteserver1.domain.com:10800, igniteserver2.domain.com:10800, igniteserver3.domain.com:10800, igniteserver4.domain.com:10800"
  },
  "inputs": [
    "persistent://feeds/exampleProject/data-to-dump-into-ignite"
  ],
  "output": "persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite",
  "logTopic": "persistent://public/default/function-log-topic"
}
Pulsar Brokers
via Java
Admin API
More direct, faster, cleaner, and half the code volume
Pulsar REST
Admin API
Consumer/Producer
Pulsar Brokers
Higher-availability option
Consumer/Producer
Consumer/Producer
Consumer/Producer
Pulsar REST
Admin API
Pulsar Brokers
via Java Admin API
via Java Admin API
via Java Admin API
Fast-deploy
Pulsar REST
Admin API
Pulsar Brokers
Or, as a Pulsar function
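A fast-deploy step has to turn a function config like the one shown earlier into a call against the Pulsar Functions admin REST API. The following is a minimal sketch in Python, not the fast-deploy implementation from this deck. It assumes the `/admin/v3/functions/{tenant}/{namespace}/{name}` endpoint with `url` and `functionConfig` form fields; the helper name `build_deploy_request`, the `admin_base_url` parameter, and the choice to strip `type` and `artifactPathOrUrl` from the config are illustrative assumptions. It only builds the request parts and does not send anything.

```python
import json

# Assumption: these keys are routing metadata of the deck's config format,
# not fields of Pulsar's own function config, so they are stripped out.
NON_FUNCTION_FIELDS = {"type", "artifactPathOrUrl"}

REQUIRED_FIELDS = ("tenant", "namespace", "name", "className", "artifactPathOrUrl")


def build_deploy_request(config: dict, admin_base_url: str) -> dict:
    """Build the URL and form fields for a create-function call (hypothetical helper)."""
    for field in REQUIRED_FIELDS:
        if field not in config:
            raise ValueError(f"missing required field: {field}")

    url = "{}/admin/v3/functions/{}/{}/{}".format(
        admin_base_url.rstrip("/"),
        config["tenant"],
        config["namespace"],
        config["name"],
    )
    function_config = {
        key: value for key, value in config.items() if key not in NON_FUNCTION_FIELDS
    }
    return {
        "url": url,
        "form": {
            # The artifact is referenced by URL rather than uploaded as bytes.
            "url": config["artifactPathOrUrl"],
            "functionConfig": json.dumps(function_config, sort_keys=True),
        },
    }
```

An HTTP client would then send `form` as a multipart POST to `url`. Keeping request construction separate from sending makes the step easy to unit test without a broker.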
Deploy to test Deploy to prod
fast-deploy-go
Test Pulsar REST Admin API Prod Pulsar REST Admin API
fast-deploy-go
Router
The Router Function
Router’s Function Config specifies a key in the message, such as “environment”, along with a tenant and namespace name.
The router then gets the value of this key in the message and creates a destination topic name from the value.
{
"type": "function",
"artifactPathOrUrl": "http://pulsar/reusable-functions/generic-router-function-1.0.1-8-jar-with-dependencies.jar",
"tenant": "ops",
"namespace": "deployment",
"name": "pubSubConfigDeploymentRouter",
"className": "com.yourcompany.pulsar.functions.GenericRouterFunction",
"userConfig": {
"key": "environment",
"tenant": "ops",
"namespace" : "deployment-automation"
},
"inputs": [
"persistent://ops/deployment/pre-deployment-configs-output"
],
"logTopic": "persistent://ops/deployment/pubSubConfigDeploymentRouter-log"
}
Creates /ops/deployment-automation/[environment]
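The routing rule described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the `GenericRouterFunction` source: the function name `destination_topic` is hypothetical, and the `persistent://` prefix and the rejection of values containing `/` are assumptions added here.

```python
def destination_topic(message: dict, user_config: dict) -> str:
    """Derive the destination topic from the value of the configured key."""
    key = user_config["key"]
    if key not in message:
        raise KeyError(f"message has no routing key: {key}")

    value = str(message[key])
    # Assumption: refuse values that would change the tenant/namespace path.
    if not value or "/" in value:
        raise ValueError(f"unsafe routing value: {value!r}")

    return "persistent://{}/{}/{}".format(
        user_config["tenant"], user_config["namespace"], value
    )


router_config = {"key": "environment", "tenant": "ops", "namespace": "deployment-automation"}
print(destination_topic({"environment": "test", "configs": []}, router_config))
```

Because the key, tenant, and namespace all come from `userConfig`, the same artifact can be deployed several times with different configs to route on different fields.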
The Router Function
The key is configurable. Here the router’s function config uses “generator-type” instead of “environment”, so the destination topic name is built from the value of that field.
{
"type": "function",
"artifactPathOrUrl": "http://pulsar/reusable-functions/generic-router-function-1.0.1-8-jar-with-dependencies.jar",
"tenant": "ops",
"namespace": "deployment",
"name": "pubSubConfigDeploymentRouter",
"className": "com.yourcompany.pulsar.functions.GenericRouterFunction",
"userConfig": {
"key": "generator-type",
"tenant": "ops",
"namespace" : "deployment-automation"
},
"inputs": [
"persistent://ops/deployment/pre-deployment-configs-output"
],
"logTopic": "persistent://ops/deployment/pubSubConfigDeploymentRouter-log"
}
The Router Function
Router’s Function Config specifies a key in the message, such as “environment”, along with a tenant and namespace name.
The router then gets the value of this key in the message and creates a destination topic name from the value.
{
"environment": "test",
"configs": [{
"type": "function",
"artifactPathOrUrl": "http://repo-name/project-name/example-ignite-function-1.0.1-3-jar-with-dependencies.jar",
"tenant": "exampleTenant",
"namespace": "exampleNamespace",
"name": "exampleIgniteFunction",
"className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction",
"inputs": [
"persistent://exampleTenant/exampleNamespace/data-to-dump-into-ignite"
],
"output": "persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite",
"logTopic": "persistent://public/default/function-log-topic"
}]
}
From the message above, the router creates the topic /ops/deployment-automation/test and routes the message there.
A single message can carry several function configs:
{
  "environment": "test",
  "configs": [
    {
      "type": "function",
      "artifactPathOrUrl": "http://repo-name/project-name/example-ignite-function-1.0.1-3-jar-with-dependencies.jar",
      "tenant": "exampleTenant",
      "namespace": "exampleNamespace",
      "name": "exampleIgniteFunction",
      "className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction",
      "inputs": [
        "persistent://exampleTenant/exampleNamespace/data-to-dump-into-ignite"
      ],
      "output": "persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite",
      "logTopic": "persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite-log"
    },
    {
      "type": "function",
      "artifactPathOrUrl": "http://repo-name/project-name/example-filter-function-1.0.0-7-jar-with-dependencies.jar",
      "tenant": "exampleTenant",
      "namespace": "exampleNamespace",
      "name": "exampleFilterFunction",
      "className": "com.yourcompany.pulsar.functions.ExampleFilterFunction",
      "inputs": [
        "persistent://feeds/exampleProject/raw-data"
      ],
      "output": "persistent://exampleTenant/exampleNamespace/data-to-dump-into-ignite",
      "logTopic": "persistent://exampleTenant/exampleNamespace/data-to-dump-into-ignite-log"
    }
  ]
}
WebHook
Filter/Transform
Build System, Storage
Build/storage data
Get our artifact URL (and any necessary metadata, if applicable)
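The Filter/Transform step reduces a build or storage event to just the artifact URL and whatever metadata is needed downstream. The deck notes this was done in Scala; the sketch below uses Python purely for illustration. Every event field name (`status`, `artifactUrl`, `project`, `version`, `commit`) is a hypothetical placeholder, since real webhook payloads differ per build system.

```python
from typing import Optional

# Hypothetical metadata fields worth keeping, if the event carries them.
METADATA_FIELDS = ("project", "version", "commit")


def to_artifact_message(event: dict) -> Optional[dict]:
    """Return a slim artifact message, or None if the event should be filtered out."""
    artifact_url = event.get("artifactUrl")
    # Filter: ignore failed builds and events that do not reference an artifact.
    if event.get("status") != "SUCCESS" or not artifact_url:
        return None

    # Transform: keep only the artifact URL and selected metadata.
    return {
        "artifactPathOrUrl": artifact_url,
        "metadata": {k: event[k] for k in METADATA_FIELDS if k in event},
    }
```

Returning `None` for filtered events mirrors how a Pulsar function that returns nothing publishes no output message.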
AWS CodePipeline S3
Github Web Hook (1)
(2)
Passes metadata and reference to S3 artifact
Pulsar Beam
or equivalent HTTP Endpoint for Pulsar
Pulsar Brokers
Granting access to download artifacts in S3
. . .
Write JSON to Pulsar
Github Web Hook
(2)
Passes metadata and reference to S3 artifact
Pulsar Beam
or equivalent HTTP Endpoint for Pulsar
Pulsar Brokers
Granting access to download artifacts in S3
. . .
Write JSON to Pulsar
GCP Cloud Build
GCP IAM
(1)
Build System
Storage
Build/storage data
Get our
artifact URL
(and any
necessary
metadata, if
applicable)
Option 1 - Basic function CI/CD flow
WebHook
Filter/Transform (this was best done in Scala)
Download artifact to store in CouchDB
Synchronous artifact download/upload: (1), (2)
You could do the download asynchronously at a different point in the flow, but then you will need to ensure it’s fully downloaded before pushing the deployment from the UI.
Security checking logic, such as package vulnerability checks
UI Tool
Pull to get all data
Push for real-time updates: Server-Sent Events (SSEs)
Router
Deploy to test: fast-deploy-go, Test Pulsar REST Admin API
Deploy to prod: fast-deploy-go, Prod Pulsar REST Admin API
Option 2 - more advanced function CI/CD flow for reusable functions
WebHook
Filter/Transform (this was best done in Scala)
Download artifact to store in CouchDB
Synchronous artifact download/upload: (1), (2)
You could do the download asynchronously at a different point in the flow, but then you will need to ensure it’s fully downloaded before pushing the deployment from the UI.
Query to get all places where the artifact has been used. Enrich the JSON with this data.
Update configs to use new artifact
(1) Update configs in CouchDB by writing as staged
Synchronously stage changes in DB. (Add to stage set.)
Once staged configs are approved, push into test or prod environments
UI Tool
Pull to get all data
(2) Push for real-time updates: Server-Sent Events (SSEs)
Router
Deploy to test: fast-deploy-go, Test Pulsar REST Admin API
Deploy to prod: fast-deploy-go, Prod Pulsar REST Admin API
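The "find every place the artifact is used, then stage updated configs" step can be sketched as follows. This is a simplified Python illustration under stated assumptions: an in-memory list stands in for the CouchDB query, matching is by exact old artifact URL, and the `status` field with the value `"staged"` is a hypothetical way to mark staged documents.

```python
import copy


def stage_artifact_update(configs: list, old_artifact_url: str, new_artifact_url: str) -> list:
    """Return staged copies of every config that uses the old artifact."""
    staged = []
    for config in configs:
        if config.get("artifactPathOrUrl") != old_artifact_url:
            continue  # this function does not use the artifact being replaced
        updated = copy.deepcopy(config)  # never mutate the deployed config
        updated["artifactPathOrUrl"] = new_artifact_url
        updated["status"] = "staged"  # hypothetical marker; approval happens later
        staged.append(updated)
    return staged
```

Working on copies keeps the currently deployed configs untouched until the staged set is approved and pushed to test or prod.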
Option 3 - more advanced function CI/CD flow for reusable functions with more decoupling from DB
WebHook
Filter/Transform (this was best done in Scala)
Download artifact to store in CouchDB
Synchronous artifact download/upload: (1), (2)
You could do the download asynchronously at a different point in the flow, but then you will need to ensure it’s fully downloaded before pushing the deployment from the UI.
Query to get all places where the artifact has been used. Enrich the JSON with this data.
Update configs to use new artifact
(1) Update configs in CouchDB by writing as staged
Synchronously stage changes in DB. (Add to stage set.)
UI Tool
(2) Push for real-time updates: Server-Sent Events (SSEs)
Pass command (in a JSON object with any additional parameters), e.g. “merge-stage-sets”, “commit-staged-to-test”, “commit-staged-to-prod”, “un-stage”, “rollback”, “get-all-data”, etc.
(1) Synchronously execute CouchDB command
(2) Return result
Be careful to avoid creating security risks with how you implement this
Router
Deploy to test: fast-deploy-go, Test Pulsar REST Admin API
Deploy to prod: fast-deploy-go, Prod Pulsar REST Admin API
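One way to limit the security risk of passing commands through a topic is to accept only an explicit allowlist and never forward raw input to the database. The sketch below is an assumption-laden illustration in Python: the command names come from the slide, but the message shape (`command` and `params` keys) and the helper name `parse_command` are hypothetical.

```python
import json

# Command names taken from the slide; anything else is rejected.
ALLOWED_COMMANDS = frozenset({
    "merge-stage-sets",
    "commit-staged-to-test",
    "commit-staged-to-prod",
    "un-stage",
    "rollback",
    "get-all-data",
})


def parse_command(raw: str) -> dict:
    """Validate a JSON command message before anything touches the database."""
    message = json.loads(raw)
    if not isinstance(message, dict):
        raise ValueError("command message must be a JSON object")

    command = message.get("command")
    if command not in ALLOWED_COMMANDS:
        raise ValueError(f"unsupported command: {command!r}")

    params = message.get("params", {})
    if not isinstance(params, dict):
        raise ValueError("params must be a JSON object")

    return {"command": command, "params": params}
```

The handler that executes the command would map each allowed name to a fixed, parameterized database operation instead of building queries from message content.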
What about Pub/Sub?
User
Request new topic for feed
SNOW Request (SNOW = ServiceNow). Could be modified to use custom UI instead.
Request datasource
Approval Gate: ACL approver, DataEng
Saves back to SNOW table (workflow is triggered on write)
Populates template for configs for request ID
Note: One request ID represents all configs produced by this template
Generate function configs: tap function configs, validation function configs, passthrough function configs
Generate role configs
Generate token configs
Add into single JSON array of function configs
Be sure to pass the request ID with each JSON object to allow all configs to be joined to the user request after deployment!
Router
Router removes the routing envelope since it won’t be needed downstream
Fast-Deploy: report functions deployed for topic
Role Generator: report roles created for topic
Token Generator: report tokens created for topic
Note: We created the token generator as a producer/consumer due to a lack of available API to generate tokens. So, we needed to use the Pulsar CLI, which meant that we needed a disk location to save the token.
Flink keyBy request ID, window with 60 second timeout
Save configs of what was created
Check if all required objects were created or if anything is missing. Report any problems to DataEng. Else, notify user that their topic is ready and provide them with the tokens and connection details.
Notification function that sends Email, UI, and/or Slack notification.
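The completeness check after the keyed window can be illustrated without Flink. The sketch below is plain Python standing in for the keyBy-and-window step: it groups reports by request ID and lists what is missing. The field names `requestId` and `reportType`, and the three report types, are assumptions made for illustration; the 60 second timeout itself is not modeled.

```python
from collections import defaultdict

# Assumption: one report per generator is expected for every request.
REQUIRED_REPORT_TYPES = frozenset({"functions", "roles", "tokens"})


def find_incomplete_requests(reports: list) -> dict:
    """Group reports by request ID and return the missing report types per request."""
    seen = defaultdict(set)
    for report in reports:
        seen[report["requestId"]].add(report["reportType"])

    incomplete = {}
    for request_id, report_types in seen.items():
        missing = REQUIRED_REPORT_TYPES - report_types
        if missing:
            incomplete[request_id] = sorted(missing)
    return incomplete
```

Requests absent from the result are complete and can trigger the user notification; the others would be reported to DataEng. This is why every generated config must carry the request ID.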
Request datasource (fields submitted with the request):
{
  "project": "<team-or-project-or-category>",
  "name": "<name-of-the-datasource>",
  "backfill": <true or false>
}
Topic, Passthrough Topic, Backfill-Topic, Backfill-Passthrough, SourceTap (x2), ValidationTap
/feeds/[project]/[name]
/[project]/ingest/[name]
/[project]/ingest/[name]-backfill
/[project]/ingest/[name]-Passthrough
/[project]/ingest/[name]-backfill-Passthrough
/discovery/taps/[project]_[name]-SourceTap
/discovery/taps/[project]_[name]-backfill-SourceTap
/validation/tap/[project]_[name]-FeedTap
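The naming patterns above are regular enough to generate from the request. The following Python sketch expands a request into the listed paths. It assumes that the `-backfill` entries are only created when `backfill` is true, which the slide does not state explicitly, and the function name `resource_paths` is hypothetical.

```python
def resource_paths(request: dict) -> list:
    """Expand a datasource request into the paths named on the slide."""
    project = request["project"]
    name = request["name"]

    paths = [
        f"/feeds/{project}/{name}",
        f"/{project}/ingest/{name}",
        f"/{project}/ingest/{name}-Passthrough",
        f"/discovery/taps/{project}_{name}-SourceTap",
        f"/validation/tap/{project}_{name}-FeedTap",
    ]
    # Assumption: backfill resources exist only when the request asks for them.
    if request.get("backfill"):
        paths += [
            f"/{project}/ingest/{name}-backfill",
            f"/{project}/ingest/{name}-backfill-Passthrough",
            f"/discovery/taps/{project}_{name}-backfill-SourceTap",
        ]
    return paths


for path in resource_paths({"project": "exampleProject", "name": "orders", "backfill": True}):
    print(path)
```

Deriving every name from `project` and `name` keeps the generated function, role, and token configs consistent with each other for a given request ID.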
Why Streaming and Pulsar – Ammunition for the Business Case: https://www.youtube.com/watch?v=qsz-FruOGoo&feature=youtu.be
Performance Architecture Deep Dive:
https://streamnative.io/whitepaper/taking-a-deep-dive-into-apache-pulsar-architecture-for-performance-tuning/
How Pulsar works: https://jack-vanlightly.com/blog/2018/10/2/understanding-how-apache-pulsar-works
2020 Apache Pulsar User Survey: https://streamnative.io/whitepaper/sn-apache-pulsar-user-survey-report-2020/
Basics of Pulsar architecture: https://www.youtube.com/watch?v=vlU9UegYab8&feature=youtu.be
Common Pulsar Architectural Patterns: https://www.youtube.com/watch?v=pmaCG1SHAW8&feature=youtu.be
(my most popular video yet!)
You can learn more about Pulsar Beam here: https://kafkaesque.io/introducing-pulsar-beam-http-for-apache-pulsar/
Questions?

More Related Content

Pulsar Architectural Patterns for CI/CD Automation and Self-Service_Devin Bost

  • 1. Pulsar Architectural Patterns for CI/CD Every pattern shown here has been developed and implemented with my team at Overstock Email: [email protected] Twitter: DevinBost LinkedIn: https://www.linkedin.com/in/devinbost/ By Devin Bost, Senior Data Engineer at Overstock Data-Driven CI/CD Automation for Pulsar Function Flows and Pub/Sub + Includes on-prem, AWS, and GCP architectures
  • 2. Legend & Referenced Technologies Pulsar Beam Pulsar Topic AWS CodePipeline Pulsar Brokers Kubernetes Golang Amazon S3 CouchDB ReactJS Docker AWS IAM GCP Cloud Build GCP IAM GCP Cloud Storage Google Cloud Functions Pulsar Function Flink Job Sonotype Nexus
  • 15. Data + Contact = Modular design +
  • 22. Might need to manually satisfy contract at first
  • 23. Might need to manually satisfy contract at first
  • 24. Might need to manually satisfy contract at firstUntil you can get to where the data is originated
  • 25. Build tool Artifact Storage Build data Build tool Artifact Storage Storage data (1) (2) Filter to artifact data Store Filter to artifact data Store Push to gate keeping system Push to gate keeping system Push to deployment pipeline for desired environment Push to deployment pipeline for desired environment
  • 28. { "type": "function", "artifactPathOrUrl": "http://path-to-artifact/example-ignite-function-1.0.1-20200125.003935-3- jar-with-dependencies.jar", "tenant": "exampleTenant", "namespace": "exampleNamespace", "name": "exampleIgniteFunction-backfill”, "className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction", "userConfig": { "username": "igniteUser", "password": "exampleHashedPass", "cache_name": "example-ignite-cache-backfill”, "hosts_with_ports": "igniteserver1.domain.com:10800, igniteserver2.domain.com:10800, igniteserver3.domain.com:10800, igniteserver4.domain.com:10800 }, "inputs": [ "persistent://feeds/exampleProject/data-to-dump-into-ignite-backfill” ], "output": "persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite-backfill”, "logTopic": "persistent://public/default/function-log-topic-backfill” }
  • 29. Using the Java Admin API to consume from a Pulsar topic Pulsar REST Admin API Consumer/Producer { "type": "function", "artifactPathOrUrl": "http://path-to-artifact/example-ignite- function-1.0.1-20200125.003935-3-jar-with-dependencies.jar", "tenant": "exampleTenant", "namespace": "exampleNamespace", "name": "exampleIgniteFunction", "className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction", "userConfig": { "username": "igniteUser", "password": "exampleHashedPass", "cache_name": "example-ignite-cache", "hosts_with_ports": "igniteserver1.domain.com:10800, igniteserver2.domain.com:10800, igniteserver3.domain.com:10800, igniteserver4.domain.com:10800 }, "inputs": [ "persistent://feeds/exampleProject/data-to-dump-into-ignite" ], "output": "persistent://exampleTenant/exampleNamespace/data- enriched-from-ignite", "logTopic": "persistent://public/default/function-log-topic" } Pulsar Brokers via Java Admin API
  • 30. More direct, faster, cleaner, and half the code volume Pulsar REST Admin API Consumer/Producer { "type": "function", "artifactPathOrUrl": "http://path-to-artifact/example-ignite- function-1.0.1-20200125.003935-3-jar-with-dependencies.jar", "tenant": "exampleTenant", "namespace": "exampleNamespace", "name": "exampleIgniteFunction", "className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction", "userConfig": { "username": "igniteUser", "password": "exampleHashedPass", "cache_name": "example-ignite-cache", "hosts_with_ports": "igniteserver1.domain.com:10800, igniteserver2.domain.com:10800, igniteserver3.domain.com:10800, igniteserver4.domain.com:10800 }, "inputs": [ "persistent://feeds/exampleProject/data-to-dump-into-ignite" ], "output": "persistent://exampleTenant/exampleNamespace/data- enriched-from-ignite", "logTopic": "persistent://public/default/function-log-topic" } Pulsar Brokers
  • 31. Higher-availability option Consumer/Producer Consumer/Producer Consumer/Producer Pulsar REST Admin API { "type": "function", "artifactPathOrUrl": "http://path-to-artifact/example-ignite- function-1.0.1-20200125.003935-3-jar-with-dependencies.jar", "tenant": "exampleTenant", "namespace": "exampleNamespace", "name": "exampleIgniteFunction", "className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction", "userConfig": { "username": "igniteUser", "password": "exampleHashedPass", "cache_name": "example-ignite-cache", "hosts_with_ports": "igniteserver1.domain.com:10800, igniteserver2.domain.com:10800, igniteserver3.domain.com:10800, igniteserver4.domain.com:10800 }, "inputs": [ "persistent://feeds/exampleProject/data-to-dump-into-ignite" ], "output": "persistent://exampleTenant/exampleNamespace/data- enriched-from-ignite", "logTopic": "persistent://public/default/function-log-topic" } Pulsar Brokers via Java Admin API via Java Admin API via Java Admin API
  • 32. Fast-deploy Pulsar REST Admin API { "type": "function", "artifactPathOrUrl": "http://path-to-artifact/example-ignite- function-1.0.1-20200125.003935-3-jar-with-dependencies.jar", "tenant": "exampleTenant", "namespace": "exampleNamespace", "name": "exampleIgniteFunction", "className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction", "userConfig": { "username": "igniteUser", "password": "exampleHashedPass", "cache_name": "example-ignite-cache", "hosts_with_ports": "igniteserver1.domain.com:10800, igniteserver2.domain.com:10800, igniteserver3.domain.com:10800, igniteserver4.domain.com:10800 }, "inputs": [ "persistent://feeds/exampleProject/data-to-dump-into-ignite" ], "output": "persistent://exampleTenant/exampleNamespace/data- enriched-from-ignite", "logTopic": "persistent://public/default/function-log-topic" } Pulsar Brokers Or, as a Pulsar function
  • 33. Deploy to test Deploy to prod fast-deploy-go Test Pulsar REST Admin API Prod Pulsar REST Admin API fast-deploy-go Router
  • 34. The Router Function Router’s Function Config specifies a key in the message, such as “environment”, along with a tenant and namespace name. The router then gets the value of this key in the message and creates a destination topic name from the value. { "type": "function", "artifactPathOrUrl": "http://pulsar/reusable-functions/generic-router-function-1.0.1-8-jar-with-dependencies.jar", "tenant": "ops", "namespace": "deployment", "name": "pubSubConfigDeploymentRouter", "className": "com.yourcompany.pulsar.functions.GenericRouterFunction", "userConfig": { "key": "environment", "tenant": "ops", "namespace" : "deployment-automation" }, "inputs": [ "persistent://ops/deployment/pre-deployment-configs-output" ], "logTopic": "persistent://ops/deployment/pubSubConfigDeploymentRouter-log" } Creates /ops/deployment-automation/[environment]
  • 36. The Router Function. The router’s function config specifies a key in the message, such as “environment”, along with a tenant and namespace name. The router then gets the value of this key in the message and creates a destination topic name from the value. Creates /ops/deployment-automation/[environment] { "type": "function", "artifactPathOrUrl": "http://pulsar/reusable-functions/generic-router-function-1.0.1-8-jar-with-dependencies.jar", "tenant": "ops", "namespace": "deployment", "name": "pubSubConfigDeploymentRouter", "className": "com.yourcompany.pulsar.functions.GenericRouterFunction", "userConfig": { "key": "generator-type", "tenant": "ops", "namespace": "deployment-automation" }, "inputs": [ "persistent://ops/deployment/pre-deployment-configs-output" ], "logTopic": "persistent://ops/deployment/pubSubConfigDeploymentRouter-log" }
  • 37. The Router Function. The router’s function config specifies a key in the message, such as “environment”, along with a tenant and namespace name. The router then gets the value of this key in the message and creates a destination topic name from the value. { "environment": "test", "configs": [{ "type": "function", "artifactPathOrUrl": "http://repo-name/project-name/example-ignite-function-1.0.1-3-jar-with-dependencies.jar", "tenant": "exampleTenant", "namespace": "exampleNamespace", "name": "exampleIgniteFunction", "className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction", "inputs": [ "persistent://exampleTenant/exampleNamespace/data-to-dump-into-ignite" ], "output": "persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite", "logTopic": "persistent://public/default/function-log-topic" }] } From this message, the router creates /ops/deployment-automation/test and routes the message there.
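The routing rule described on these slides can be sketched in a few lines. This is a plain Python illustration rather than the Java `GenericRouterFunction` itself, whose source is not shown in the deck. The function name `route` and the `persistent://` prefix on the destination are assumptions; the `userConfig` keys match the config shown on the slides.

```python
import json


def route(message_json, user_config):
    """Return (destination_topic, message) for one incoming message."""
    message = json.loads(message_json)
    key = user_config["key"]
    if key not in message:
        raise KeyError("routing key {!r} missing from message".format(key))
    # The topic name is built from the configured tenant and namespace plus
    # the value found under the routing key in the message.
    topic = "persistent://{}/{}/{}".format(
        user_config["tenant"], user_config["namespace"], message[key]
    )
    return topic, message


router_user_config = {
    "key": "environment",
    "tenant": "ops",
    "namespace": "deployment-automation",
}
incoming = json.dumps({"environment": "test", "configs": []})

destination, routed = route(incoming, router_user_config)
print(destination)
```

Because the key is read from `userConfig`, the same artifact can route on "environment" in one deployment and on "generator-type" in another without a code change.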
  • 38. { "environment": "test", "configs": [{ "type": "function", "artifactPathOrUrl": "http://repo-name/project-name/example-ignite-function-1.0.1-3-jar-with-dependencies.jar", "tenant": "exampleTenant", "namespace": "exampleNamespace", "name": "exampleIgniteFunction", "className": "com.yourcompany.pulsar.functions.ExampleIgniteFunction", "inputs": [ "persistent://exampleTenant/exampleNamespace/data-to-dump-into-ignite" ], "output": "persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite", "logTopic": "persistent://exampleTenant/exampleNamespace/data-enriched-from-ignite-log" }, { "type": "function", "artifactPathOrUrl": "http://repo-name/project-name/example-filter-function-1.0.0-7-jar-with-dependencies.jar", "tenant": "exampleTenant", "namespace": "exampleNamespace", "name": "exampleFilterFunction", "className": "com.yourcompany.pulsar.functions.ExampleFilterFunction", "inputs": [ "persistent://feeds/exampleProject/raw-data" ], "output": "persistent://exampleTenant/exampleNamespace/data-to-dump-into-ignite", "logTopic": "persistent://exampleTenant/exampleNamespace/data-to-dump-into-ignite-log" } ] }
  • 40. Option 1 - Basic function CI/CD flow: Synchronous Artifact Download/Upload (1) (2) Push for real-time updates Pull to get all data UI Tool Server-Sent Events (SSE)
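For the "push for real-time updates" path, a minimal sketch of how one update could be framed as a Server-Sent Event before it is written to the UI's open HTTP response. The function name `format_sse` and the event name `config-updated` are hypothetical; the `id:`, `event:`, and `data:` field syntax and the blank-line terminator come from the SSE wire format.

```python
import json


def format_sse(data, event=None, event_id=None):
    """Serialize one update as a Server-Sent Events frame."""
    lines = []
    if event_id is not None:
        lines.append("id: {}".format(event_id))
    if event is not None:
        lines.append("event: {}".format(event))
    payload = json.dumps(data, sort_keys=True)
    # Each payload line gets its own "data:" field.
    for line in payload.splitlines():
        lines.append("data: {}".format(line))
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


frame = format_sse({"name": "exampleIgniteFunction"}, event="config-updated")
print(frame, end="")
```

The "pull to get all data" path would remain an ordinary request/response call; SSE only covers the incremental updates pushed after the initial load.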
  • 41. Option 2 - more advanced function CI/CD flow for reusable functions: Server-Sent Events (SSE) UI Tool Synchronous Artifact Download/Upload (1) (2) Query to get all places where the artifact has been used. Enrich the JSON with this data. Update configs to use new artifact (1) Update configs in CouchDB by writing as staged Once staged configs are approved, push into test or prod environments Synchronously stage changes in DB. (Add to stage set.) (2) Push for real-time updates Pull to get all data
  • 42. Option 3 - more advanced function CI/CD flow for reusable functions with more decoupling from DB: Server-Sent Events (SSE) UI Tool Synchronous Artifact Download/Upload (1) (2) Query to get all places where the artifact has been used. Enrich the JSON with this data. Update configs to use new artifact (1) Update configs in CouchDB by writing as staged Synchronously stage changes in DB. (Add to stage set.) (2) Push for real-time updates Pass command Synchronously execute CouchDB command Be careful to avoid creating security risks with how you implement this e.g. “merge-stage-sets”, “commit-staged-to-test”, “commit-staged-to-prod”, “un-stage”, “rollback”, “get-all-data”, etc. (in a JSON object with any additional parameters) (1) (2) Return result
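One way to act on the security warning in Option 3 is to accept only a fixed set of command names and never pass caller-supplied text straight through to the database. The sketch below assumes a JSON shape of `{"command": ..., "params": {...}}` and uses hypothetical names (`ALLOWED_COMMANDS`, `parse_command`, `handle`); the command names themselves are the ones listed on the slide. The handlers are stand-ins; no CouchDB call is made.

```python
import json

# Command names taken from the slide; anything else is rejected.
ALLOWED_COMMANDS = {
    "merge-stage-sets",
    "commit-staged-to-test",
    "commit-staged-to-prod",
    "un-stage",
    "rollback",
    "get-all-data",
}


def parse_command(raw_json):
    """Validate a command message and return (command, params)."""
    request = json.loads(raw_json)
    if not isinstance(request, dict):
        raise ValueError("command message must be a JSON object")
    command = request.get("command")
    if not isinstance(command, str) or command not in ALLOWED_COMMANDS:
        raise ValueError("unsupported command: {!r}".format(command))
    params = request.get("params", {})
    if not isinstance(params, dict):
        raise ValueError("params must be a JSON object")
    return command, params


def handle(raw_json, handlers):
    """Run the handler for a validated command and wrap the result."""
    try:
        command, params = parse_command(raw_json)
    except ValueError as err:  # also covers malformed JSON
        return {"ok": False, "error": str(err)}
    handler = handlers.get(command)
    if handler is None:
        return {"ok": False, "error": "no handler registered for " + command}
    return {"ok": True, "command": command, "result": handler(params)}


# Stand-in handler for illustration only.
handlers = {"un-stage": lambda params: "un-staged " + params["id"]}

accepted = handle('{"command": "un-stage", "params": {"id": "cfg-1"}}', handlers)
rejected = handle('{"command": "drop-database"}', handlers)
print(accepted)
print(rejected)
```

Authentication and per-command authorization would still be needed on top of this; the allow-list only narrows what a caller can ask for.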
  • 43. Build System Storage Get our artifact URL (and any necessary metadata, if applicable) WebHook Filter/Transform
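A minimal sketch of the filter/transform step that sits behind the webhook. Every field name in the incoming event (`status`, `artifactUrl`, `project`, `commit`) is hypothetical, since the real payload depends on the build system. The slides note that the production version of this step was best done in Scala; Python is used here only for illustration.

```python
def transform_build_event(event):
    """Return a deployment message, or None if the event is filtered out."""
    # Filter: ignore builds that did not succeed.
    if event.get("status") != "SUCCEEDED":
        return None
    # Filter: ignore events that do not reference a jar artifact.
    artifact_url = event.get("artifactUrl")
    if not artifact_url or not artifact_url.endswith(".jar"):
        return None
    # Transform: keep only the artifact URL and the metadata needed downstream.
    return {
        "artifactPathOrUrl": artifact_url,
        "project": event.get("project"),
        "commit": event.get("commit"),
    }


good = transform_build_event({
    "status": "SUCCEEDED",
    "artifactUrl": "http://repo-name/project-name/example-filter-function-1.0.0-7-jar-with-dependencies.jar",
    "project": "project-name",
    "commit": "abc123",
})
bad = transform_build_event({"status": "FAILED"})
print(good)
print(bad)
```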
  • 44. Build System Storage Build/storage data Get our artifact URL (and any necessary metadata, if applicable) AWS CodePipeline S3 Github Web Hook (1) (2) Passes metadata and reference to S3 artifact Pulsar Beam or equivalent HTTP Endpoint for Pulsar Pulsar Brokers Granting access to download artifacts in S3 . . . Write JSON to Pulsar
  • 45. Github Web Hook (2) Passes metadata and reference to S3 artifact Pulsar Beam or equivalent HTTP Endpoint for Pulsar Pulsar Brokers Granting access to download artifacts in S3 . . . Write JSON to Pulsar GCP Cloud Build GCP IAM (1) Build System Storage Build/storage data Get our artifact URL (and any necessary metadata, if applicable)
  • 46. Filter/Transform This was best done in Scala You could do the download asynchronously at a different point in the flow, but then you will need to ensure it’s fully downloaded before pushing the deployment from the UI Synchronous Artifact Download/Upload (1) (2) Security checking logic, such as package vulnerability checks Option 1 - Basic function CI/CD flow Push for real-time updates Pull to get all data Deploy to test Deploy to prod fast-deploy-go Test Pulsar REST Admin API Prod Pulsar REST Admin API fast-deploy-go Router UI Tool Server-Sent Events (SSE) WebHook Download artifact to store in CouchDB
  • 47. Option 2 - more advanced function CI/CD flow for reusable functions Deploy to test Deploy to prod fast-deploy-go Test Pulsar REST Admin API Prod Pulsar REST Admin API fast-deploy-go Router Server-Sent Events (SSE) UI Tool You could do the download asynchronously at a different point in the flow, but then you will need to ensure it’s fully downloaded before pushing the deployment from the UI Synchronous Artifact Download/Upload (1) (2) Query to get all places where the artifact has been used. Enrich the JSON with this data. Update configs to use new artifact (1) Update configs in CouchDB by writing as staged Once staged configs are approved, push into test or prod environments Synchronously stage changes in DB. (Add to stage set.) (2) Push for real-time updates Pull to get all data Filter/Transform This was best done in Scala WebHook Download artifact to store in CouchDB
  • 48. Option 3 - more advanced function CI/CD flow for reusable functions with more decoupling from DB Deploy to test Deploy to prod fast-deploy-go Test Pulsar REST Admin API Prod Pulsar REST Admin API fast-deploy-go Router Server-Sent Events (SSE) UI Tool You could do the download asynchronously at a different point in the flow, but then you will need to ensure it’s fully downloaded before pushing the deployment from the UI Synchronous Artifact Download/Upload (1) (2) Query to get all places where the artifact has been used. Enrich the JSON with this data. Update configs to use new artifact (1) Update configs in CouchDB by writing as staged Synchronously stage changes in DB. (Add to stage set.) (2) Push for real-time updates Pass command Synchronously execute CouchDB command Be careful to avoid creating security risks with how you implement this e.g. “merge-stage-sets”, “commit-staged-to-test”, “commit-staged-to-prod”, “un-stage”, “rollback”, “get-all-data”, etc. (in a JSON object with any additional parameters) (1) (2) Return result Filter/Transform This was best done in Scala WebHook Download artifact to store in CouchDB
  • 51. User Request new topic for SNOW Request feed Request datasource Approval Gate ACL approver DataEng Saves back to SNOW table (workflow is triggered on write) Generate function configs Generate role configs Generate token configs Generate tap function configs Generate validation function configs Generate passthrough function configs SNOW = Service Now Fast-Deploy Report functions deployed for topic Role Generator Report roles created for topic Token Generator Report tokens created for topic Flink keyBy request ID window with 60 second timeout Save configs of what was created Add into single JSON array of function configs Router SNOW Request Could be modified to use custom UI instead Populates template for configs for request ID Be sure to pass the request ID with each JSON object to allow all configs to be joined to the user request after deployment! Note: One request ID represents all configs produced by this template Router removes the routing envelope since it won’t be needed downstream Note: We created the token generator as a producer/consumer due to a lack of available API to generate tokens. So, we needed to use the Pulsar CLI, which meant that we needed a disk location to save the token. Check if all required objects were created or if anything is missing. Report any problems to DataEng. Else, notify user that their topic is ready and provide them with the tokens and connection details. Notification function that sends Email, UI, and/or Slack notification.
  • 52. Request new topic for SNOW Request feed Request datasource Approval Gate ACL approver DataEng Saves back to SNOW table (workflow is triggered on write) SNOW = ServiceNow SNOW Request Could be modified to use custom UI instead User
  • 54. Topic Passthrough Topic Backfill-Topic Backfill-Passthrough ValidationTap /feeds/[project]/[name] SourceTap SourceTap /[project]/ingest/[name] /[project]/ingest/[name]-backfill /discovery/taps/[project]_[name]-SourceTap /validation/tap/[project]_[name]-FeedTap /discovery/taps/[project]_[name]-backfill-SourceTap /[project]/ingest/[name]-Passthrough /[project]/ingest/[name]-backfill-Passthrough
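The naming convention on this slide can be captured as a single function so that every generator derives topic names the same way. This is a sketch: the path templates are copied from the slide, but the dictionary keys (`feed`, `ingest`, and so on) are my own labels, because the flattened slide does not show which diagram label belongs to which path.

```python
def topic_names(project, name):
    """Expand the slide's path templates for one project and feed name."""
    templates = {
        "feed": "/feeds/{p}/{n}",
        "ingest": "/{p}/ingest/{n}",
        "ingest_backfill": "/{p}/ingest/{n}-backfill",
        "source_tap": "/discovery/taps/{p}_{n}-SourceTap",
        "feed_tap": "/validation/tap/{p}_{n}-FeedTap",
        "backfill_source_tap": "/discovery/taps/{p}_{n}-backfill-SourceTap",
        "passthrough": "/{p}/ingest/{n}-Passthrough",
        "backfill_passthrough": "/{p}/ingest/{n}-backfill-Passthrough",
    }
    return {label: t.format(p=project, n=name) for label, t in templates.items()}


names = topic_names("exampleProject", "clicks")
for label, path in sorted(names.items()):
    print(label, path)
```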
  • 55. Generate function configs Generate role configs Generate token configs Generate tap function configs Generate validation function configs Generate passthrough function configs Add into single JSON array of function configs Populates template for configs for request ID Be sure to pass the request ID with each JSON object to allow all configs to be joined to the user request after deployment! Note: One request ID represents all configs produced by this template
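A minimal sketch of the template-population step, assuming templates use the same `[project]` and `[name]` placeholders as the topic paths in this deck and that the request ID travels in a field called `requestId` (a hypothetical name). The helper names `fill` and `populate` are also hypothetical.

```python
def fill(node, values):
    """Recursively replace [placeholder] tokens in every string of a template."""
    if isinstance(node, str):
        for key, value in values.items():
            node = node.replace("[{}]".format(key), value)
        return node
    if isinstance(node, list):
        return [fill(item, values) for item in node]
    if isinstance(node, dict):
        return {k: fill(v, values) for k, v in node.items()}
    return node


def populate(templates, request_id, values):
    """Fill each template and stamp it with the request ID for later joining."""
    configs = []
    for template in templates:
        config = fill(template, values)
        config["requestId"] = request_id
        configs.append(config)
    return configs


templates = [
    {
        "type": "function",
        "name": "[project]_[name]-Passthrough",
        "inputs": ["persistent://[project]/ingest/[name]"],
        "output": "persistent://feeds/[project]/[name]",
    },
    {"type": "role", "name": "[project]_[name]-producer"},
]

configs = populate(templates, "REQ-1", {"project": "exampleProject", "name": "clicks"})
for config in configs:
    print(config)
```

Since `fill` builds new objects, the templates are left unmodified and can be reused for the next request.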
  • 56. Fast-Deploy Report functions deployed for topic Role Generator Report roles created for topic Token Generator Report tokens created for topic Flink keyBy request ID window with 60 second timeout Router Router removes the routing envelope since it won’t be needed downstream Note: We created the token generator as a producer/consumer due to a lack of available API to generate tokens. So, we needed to use the Pulsar CLI, which meant that we needed a disk location to save the token.
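The join on request ID can be illustrated without Flink. The sketch below is a plain Python simulation, not Flink code: it groups reports by a hypothetical `requestId` field and releases a group once 60 seconds have passed since its first report. Whether the production job measured the timeout from the first or the most recent report is not stated on the slide, so the first-report choice here is an assumption.

```python
class RequestJoiner:
    """Group deployment reports by request ID and emit them after a timeout."""

    def __init__(self, timeout_seconds=60):
        self.timeout = timeout_seconds
        self.open = {}  # request ID -> (deadline, list of reports)

    def add(self, report, now):
        """Buffer a report; the first report for a request starts its window."""
        request_id = report["requestId"]
        if request_id not in self.open:
            self.open[request_id] = (now + self.timeout, [])
        self.open[request_id][1].append(report)

    def flush(self, now):
        """Return and remove every group whose deadline has passed."""
        done = {
            request_id: reports
            for request_id, (deadline, reports) in self.open.items()
            if now >= deadline
        }
        for request_id in done:
            del self.open[request_id]
        return done


joiner = RequestJoiner()
joiner.add({"requestId": "REQ-1", "kind": "function"}, now=0)
joiner.add({"requestId": "REQ-1", "kind": "role"}, now=10)
joiner.add({"requestId": "REQ-1", "kind": "token"}, now=25)

early = joiner.flush(now=59)
joined = joiner.flush(now=60)
print(early)
print(joined)
```

Times are passed in explicitly rather than read from a clock, which keeps the sketch deterministic.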
  • 57. Save configs of what was created Check if all required objects were created or if anything is missing. Report any problems to DataEng. Else, notify user that their topic is ready and provide them with the tokens and connection details. Notification function that sends Email, UI, and/or Slack notification.
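The completeness check can be sketched as a set difference between what a request requires and what the reports say was created. The `kind` field, the `REQUIRED_KINDS` set, and the shape of the returned notification are assumptions for illustration.

```python
# Object kinds every request is assumed to need (hypothetical set).
REQUIRED_KINDS = {"function", "role", "token"}


def check_request(reports):
    """Decide who to notify based on which required objects were reported."""
    created = {report["kind"] for report in reports}
    missing = sorted(REQUIRED_KINDS - created)
    if missing:
        # Something was not created: escalate to the data engineering team.
        return {"notify": "DataEng", "status": "incomplete", "missing": missing}
    # Everything exists: the user can be told the topic is ready.
    return {"notify": "user", "status": "ready", "missing": []}


complete = check_request([{"kind": "function"}, {"kind": "role"}, {"kind": "token"}])
partial = check_request([{"kind": "function"}])
print(complete)
print(partial)
```

The notification function would then turn this result into an email, UI, or Slack message, adding the tokens and connection details in the "ready" case.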
  • 59. Why Streaming and Pulsar – Ammunition for the Business Case: https://www.youtube.com/watch?v=qsz-FruOGoo&feature=youtu.be Performance Architecture Deep Dive: https://streamnative.io/whitepaper/taking-a-deep-dive-into-apache-pulsar-architecture-for-performance-tuning/ How Pulsar works: https://jack-vanlightly.com/blog/2018/10/2/understanding-how-apache-pulsar-works 2020 Apache Pulsar User Survey: https://streamnative.io/whitepaper/sn-apache-pulsar-user-survey-report-2020/ Basics of Pulsar architecture: https://www.youtube.com/watch?v=vlU9UegYab8&feature=youtu.be Common Pulsar Architectural Patterns: https://www.youtube.com/watch?v=pmaCG1SHAW8&feature=youtu.be (my most popular video yet!) You can learn more about Pulsar Beam here: https://kafkaesque.io/introducing-pulsar-beam-http-for-apache-pulsar/