Monitoring and Metrics

Out of the box, Flink jobs that run in a Bring-Your-Own-Cloud (BYOC) workspace expose metrics by using:

JMX (Java Management Extensions)
Prometheus (HTTP endpoint scraping)

This page describes what is already configured in your BYOC deployment so you can plug the data into your own monitoring stack (for example, Prometheus + Grafana).
Setting up or operating Prometheus / Grafana itself is outside the scope of this documentation and remains entirely under your control.

What’s Pre-configured?

1. Pod-level Prometheus Annotations

Every Flink pod (JobManager and TaskManager) includes annotations that instruct a Prometheus scraper to collect metrics automatically:

YAML

annotations:
  prometheus.io/path: /metrics
  prometheus.io/port: "9999"
  prometheus.io/scrape: "true"

prometheus.io/path: The HTTP path where metrics are exposed (/metrics).
prometheus.io/port: The container port (9999) where the metrics endpoint listens.
prometheus.io/scrape: Indicates that the pod should be scraped (true).

Tip

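If you already run Prometheus in the same cluster with an annotation-based Kubernetes service-discovery scrape job, it can discover these pods automatically based on the annotations. The Prometheus Operator selects targets through PodMonitor or ServiceMonitor objects rather than through these annotations.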
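
Annotation-based discovery depends on a scrape job that reads the pod annotations through relabeling. The following is a minimal sketch of such a job for a self-managed prometheus.yml, not the configuration shipped with BYOC. The job name kubernetes-pods and the namespace and pod target labels are illustrative choices; adapt them to your setup.

```yaml
scrape_configs:
  - job_name: kubernetes-pods        # illustrative name, not mandated by BYOC
    kubernetes_sd_configs:
      - role: pod                    # discover every pod the Prometheus service account can list
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use prometheus.io/path (here /metrics) as the scrape path
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        regex: (.+)
        target_label: __metrics_path__
      # Rewrite the target address to use prometheus.io/port (here 9999)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      # Carry namespace and pod name over as labels for dashboards and alerts
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
```

With a job like this in place, the JobManager and TaskManager pods should appear as targets on the Prometheus targets page shortly after the Flink cluster starts.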

2. Baseline Flink Configuration

The following metric reporters are enabled by default in the Flink cluster configuration shipped with BYOC:

YAML

metrics.reporters: jmx:promappmgr

# JMX Reporter
metrics.reporter.jmx.factory.class: org.apache.flink.metrics.jmx.JMXReporterFactory
metrics.reporter.jmx.port: 10000-10240  # Port range for JMX

# Prometheus Reporter
metrics.reporter.promappmgr.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory

Reporter	Purpose	Where It Listens
JMX	For JVM-based monitoring tools or exporters.	Ports 10000–10240 on each pod.
Prometheus	Exposes metrics in the Prometheus text format on the HTTP endpoint defined by the pod annotations.	Port 9999 (/metrics).

Next Steps

Scrape the Metrics
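- Point your in-cluster Prometheus deployment at the Kubernetes namespace so that it detects pods with the prometheus.io/scrape: "true" annotation. If you use the Prometheus Operator, which selects targets through PodMonitor or ServiceMonitor objects rather than annotations, define one that matches the Flink pods. For more details, see the official Prometheus documentation.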
Visualize in Grafana
- Build your own dashboards using the Prometheus data source.
Define Alerts
- Define alert rules in Prometheus or Grafana Alerting to monitor job health (e.g., restart count, checkpoint failures, backpressure).
Export to Other Observability Tools
- This same endpoint can be scraped directly by hosted observability platforms. See the Datadog, New Relic, and Dynatrace documentation for their own Prometheus scraping setup.
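
As an illustration of the alerting step above, here is a hedged sketch of a Prometheus rule file covering restarts, checkpoint failures, and backpressure. The metric names follow the default naming of the Flink Prometheus reporter (flink_jobmanager_job_numRestarts, flink_jobmanager_job_numberOfFailedCheckpoints, flink_taskmanager_job_task_backPressuredTimeMsPerSecond) and are assumptions about your deployment, as are the job_name and task_name labels. Check the names against your /metrics output before use. The thresholds, durations, and severities are placeholders.

```yaml
groups:
  - name: flink-job-health             # illustrative group name
    rules:
      # Fires when a job has restarted at least once in the last 15 minutes
      - alert: FlinkJobRestarted
        expr: increase(flink_jobmanager_job_numRestarts[15m]) > 0
        labels:
          severity: warning
        annotations:
          summary: "Flink job {{ $labels.job_name }} restarted"

      # Fires when checkpoints keep failing over a 15-minute window
      - alert: FlinkCheckpointsFailing
        expr: increase(flink_jobmanager_job_numberOfFailedCheckpoints[15m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Checkpoints are failing for Flink job {{ $labels.job_name }}"

      # Fires when any task is backpressured for more than half of each second
      - alert: FlinkTaskBackpressured
        expr: max by (job_name, task_name) (flink_taskmanager_job_task_backPressuredTimeMsPerSecond) > 500
        for: 10m
        labels:
          severity: info
        annotations:
          summary: "Task {{ $labels.task_name }} of job {{ $labels.job_name }} is backpressured"
```

Load a file like this through the rule_files setting in prometheus.yml, or translate the expressions into Grafana Alerting rules if you manage alerts there.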

No additional configuration inside Ververica Cloud: Bring-Your-Own-Cloud is required. All metrics are emitted automatically once the Flink cluster starts.

Reference Links

Apache Flink: Metrics
Prometheus: Scrape Classes
