Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix few metric rules which were affected by the database prefix handling #13290

Merged
merged 2 commits into from
Jun 7, 2024

Conversation

shounakmk219
Copy link
Collaborator

@shounakmk219 shounakmk219 commented May 31, 2024

Description

This PR fixes few jmx metric rules that were broken due to the database prefix handling on metrics.

  1. moved a generic rule in server rules down the chain to avoid unintended matches
    name=\"pinot\\.server\\.(([^.]+)\\.)?([^.]*)_(OFFLINE|REALTIME)\\.(\\w+)\"
  2. Specific rules for few metrics which were showing up in database label due to wrong matches like
  • pinot.broker.requestSize.<tableName>
  • pinot.broker.adaptiveServerSelectorType
  • pinot.broker.adaptiveServerSelectorType.<type>
  • pinot.controller.dataDir.[exists, fileOpLatencyMs]
  • pinot.server.endToEndRealtimeIngestionDelayMs.<tableName>

labels

bugfix

@soumitra-st
Copy link
Contributor

Can you list the broken metrics and how they are fixed in this PR?

@shounakmk219
Copy link
Collaborator Author

Can you list the broken metrics and how they are fixed in this PR?

@soumitra-st Updated the description with metric details.

@soumitra-st
Copy link
Contributor

How shall we test it out? One possibility is to get the list of available metrics on broker/controller/server using curl localhost:8080 running 0.8.0, then deploy this image, do the same and compare. This is not trivial to do though.

Copy link
Contributor

@soumitra-st soumitra-st left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@soumitra-st
Copy link
Contributor

Looks good, deferring to @suddendust to double check.

@shounakmk219
Copy link
Collaborator Author

How shall we test it out? One possibility is to get the list of available metrics on broker/controller/server using curl localhost:8080 running 0.8.0, then deploy this image, do the same and compare. This is not trivial to do though.

Yeah it's a pain to test these changes. Right now just making sure no random string appear in table label.

@npawar
Copy link
Contributor

npawar commented May 31, 2024

Please do add testing done and screenshots of validated rules as they appear in jconsole / prometheus explore

@@ -73,6 +66,14 @@ rules:
table: "$1$3"
tableType: "$4"
partition: "$5"
- pattern: "\"org\\.apache\\.pinot\\.common\\.metrics\"<type=\"ServerMetrics\", name=\"pinot\\.server\\.endToEndRealtimeIngestionDelayMs\\.(([^.]+)\\.)?([^.]*)_(OFFLINE|REALTIME)\\.(\\w+)\"><>(\\w+)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

realtimeIngestionDelayMs is a different metric than this. Can you validate if that one is being successfully exported?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. realtimeIngestionDelayMs is handled by moving the metric rule earlier present on line 16 to line 175.
Will add screenshots on validation as well

@@ -112,6 +118,12 @@ rules:
- pattern: "\"org\\.apache\\.pinot\\.common\\.metrics\"<type=\"BrokerMetrics\", name=\"pinot\\.broker\\.routingTableUpdateTime\"><>(\\w+)"
name: "pinot_broker_routingTableUpdateTime_$1"
cache: true
- pattern: "\"org\\.apache\\.pinot\\.common\\.metrics\"<type=\"BrokerMetrics\", name=\"pinot\\.broker\\.adaptiveServerSelectorType\"><>(\\w+)"
name: "pinot_broker_adaptiveServerSelectorType_$1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please add lables here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its does not have any extracted info to add as a label. More of a static config info metric

name: "pinot_broker_adaptiveServerSelectorType_$1"
cache: true
- pattern: "\"org\\.apache\\.pinot\\.common\\.metrics\"<type=\"BrokerMetrics\", name=\"pinot\\.broker\\.adaptiveServerSelectorType\\.(\\w+)\"><>(\\w+)"
name: "pinot_broker_adaptiveServerSelectorType_$1_$2"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replied above

cache: true
labels:
database: "$2"
table: "$1$3"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to delimit $1 from $3?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$1 matches the whole databaseName. part hence no need to delimit explicitly.

cache: true
labels:
database: "$2"
table: "$1$3"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replied above

@shounakmk219
Copy link
Collaborator Author

Controller metrics validations:

image
image

Broker metrics validations

image
image

Server metrics validation

image
image

Copy link
Contributor

@suddendust suddendust left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it!

@KKcorps KKcorps merged commit fb64ec2 into apache:master Jun 7, 2024
2 checks passed
gortiz pushed a commit to gortiz/pinot that referenced this pull request Jun 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants