Autopilot Integration

When Autopilot is in ACTIVE mode for a deployment, the operator provides observability into its runtime scaling decisions through the ParallelismAligned condition.

Note

The Ververica Platform 2 operator had no Autopilot integration. It overwrote Autopilot's scaling decisions with the CR's parallelism on every reconcile cycle, effectively defeating Autopilot for operator-managed deployments. The Ververica Platform 3 operator resolves this.

How It Works

Autopilot operates at the Flink job runtime level — it does not modify the deployment spec. The operator always projects the CR's parallelism to the deployment spec, regardless of whether Autopilot is ACTIVE. This is safe because Autopilot's scaling decisions take effect at the Flink job level and are not in conflict with the deployment spec's parallelism value.

On every reconcile cycle, the operator checks whether Autopilot is in ACTIVE mode for the deployment. If it is, the operator resolves Autopilot's current runtime parallelism and compares it against the CR's parallelism to evaluate the ParallelismAligned condition.

The ParallelismAligned condition is a purely informational observability condition. It reports whether the CR's parallelism differs from Autopilot's runtime parallelism. See ParallelismAligned Condition for details.

During Scaling Operations

When Autopilot triggers a scaling operation, Ververica Platform temporarily changes the deployment state as part of the process (for example, RUNNING → CANCELLED → STARTING → RUNNING). During this time, Ververica Platform's spec.state may differ from the CR's desired RUNNING state.

The operator detects active Autopilot scaling operations and adjusts its behavior in three ways:

  1. State drift re-projection is suppressed. The operator does not re-project the CR's desired state, avoiding interference with Ververica Platform's job lifecycle. The SpecAligned condition reflects this as False/StateProjectionSuppressedAutopilot.
  2. Nonce operations are blocked. Savepoint and restart nonces are blocked to avoid racing with the scaling operation's job management. The NonceAligned condition reflects this as SavepointBlocked or RestartBlocked.
  3. Config changes still flow through. If you update non-state fields in the CR (for example, flinkConfiguration) during a scaling operation, the operator projects those changes normally.

The suppression and blocking are transient and end automatically when the scaling operation completes. A blocked nonce is not retried automatically, however: set a new nonce value after scaling completes (see Trigger Savepoints and Restarts).
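The behavior during a scaling operation can be sketched as a small decision function. This is an illustrative model only, not the operator's implementation: `ReconcilePlan`, `plan_reconcile`, and the string encoding of the pending nonce are hypothetical names chosen for the example. Only the condition reasons come from the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ReconcilePlan:
    """What one reconcile cycle would do (hypothetical type for illustration)."""
    project_state: bool    # re-project the CR's desired state?
    project_config: bool   # project non-state fields such as flinkConfiguration?
    execute_nonce: bool    # act on a pending savepoint or restart nonce?
    reasons: Dict[str, str] = field(default_factory=dict)  # condition -> reason


def plan_reconcile(scaling_in_progress: bool,
                   pending_nonce: Optional[str] = None) -> ReconcilePlan:
    """pending_nonce is "savepoint", "restart", or None (assumed encoding)."""
    if not scaling_in_progress:
        # No Autopilot scaling operation: everything is projected normally.
        return ReconcilePlan(True, True, pending_nonce is not None)

    # Simplification: the reason is recorded for any active scaling operation,
    # although the documentation ties it to suppressed state drift.
    reasons = {"SpecAligned": "StateProjectionSuppressedAutopilot"}
    if pending_nonce == "savepoint":
        reasons["NonceAligned"] = "SavepointBlocked"
    elif pending_nonce == "restart":
        reasons["NonceAligned"] = "RestartBlocked"

    # State projection and nonces are held back; config changes still flow.
    return ReconcilePlan(project_state=False, project_config=True,
                         execute_nonce=False, reasons=reasons)
```

For example, `plan_reconcile(True, "savepoint")` yields a plan that skips state projection and the savepoint but still projects configuration changes.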

ParallelismAligned Condition

This condition auto-heals. Unlike NonceAligned/SavepointBlocked (which requires a user action to retry), ParallelismAligned is re-evaluated on every reconcile cycle and updates automatically when the underlying state changes. No CR update is needed.

| Status | Reason | Meaning |
|---|---|---|
| True | Aligned | Autopilot is not active, has not scaled yet, the CR does not specify parallelism, or the CR's parallelism matches Autopilot's runtime parallelism. |
| False | AutopilotManaged | Autopilot is active and its runtime parallelism differs from what the CR specifies. This is expected behavior: Autopilot's scaling decisions take effect at the Flink job level. |

AutopilotManaged auto-heals when:

  • Autopilot adjusts the deployment's runtime parallelism to match the CR's value. The condition transitions to True/Aligned on the next cycle.
  • Autopilot is disabled. The condition transitions to True/Aligned on the next cycle.
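The evaluation described in this section can be written as a pure function of the current state, which is why the condition needs no CR update to recover. The following is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, and `None` is used here to represent "Autopilot has not scaled yet" and "the CR does not specify parallelism".

```python
from typing import Optional, Tuple

ALIGNED = ("True", "Aligned")
AUTOPILOT_MANAGED = ("False", "AutopilotManaged")


def evaluate_parallelism_aligned(autopilot_active: bool,
                                 runtime_parallelism: Optional[int],
                                 cr_parallelism: Optional[int]) -> Tuple[str, str]:
    """Return (status, reason) for the ParallelismAligned condition."""
    if not autopilot_active:
        return ALIGNED               # Autopilot is not active
    if runtime_parallelism is None:
        return ALIGNED               # Autopilot has not scaled yet
    if cr_parallelism is None:
        return ALIGNED               # the CR does not specify parallelism
    if runtime_parallelism == cr_parallelism:
        return ALIGNED               # both values match
    return AUTOPILOT_MANAGED         # Autopilot's runtime value differs
```

Because the result depends only on the inputs observed in the current cycle, calling the function again after Autopilot is disabled, or after the runtime parallelism returns to the CR's value, produces True/Aligned without any stored state.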

Fail-Open Behavior

If the Autopilot service is unreachable or returns an error, the operator treats Autopilot as not active and projects the CR's parallelism to Ververica Platform normally. This fail-open strategy avoids blocking reconciliation when Autopilot is temporarily unavailable.
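A fail-open check of this kind might look like the sketch below. It is not the operator's code: `autopilot_is_active` and the `fetch_mode` callable are hypothetical stand-ins for whatever client the operator uses to read Autopilot's mode, and the example assumes the mode is reported as a string.

```python
from typing import Callable


def autopilot_is_active(fetch_mode: Callable[[], str]) -> bool:
    """Return True only when Autopilot reports ACTIVE.

    fetch_mode is a hypothetical callable that queries the Autopilot service
    and raises an exception when the service is unreachable or returns an error.
    """
    try:
        return fetch_mode() == "ACTIVE"
    except Exception:
        # Fail open: an unavailable Autopilot is treated as not active,
        # so reconciliation continues instead of blocking.
        return False
```

The trade-off of failing open is that, while Autopilot is unreachable, the ParallelismAligned condition reports True/Aligned even if the runtime parallelism actually differs from the CR.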

Recommendations

  • Set parallelism in the CR even when using Autopilot. The CR's parallelism serves as the initial value before Autopilot takes over. Once Autopilot is ACTIVE and adjusts the runtime parallelism, the ParallelismAligned condition reports the difference for observability.
  • The operator does not enable or disable Autopilot. Manage Autopilot mode through the Ververica Platform UI or API. The operator only reads Autopilot's current mode — it never changes it.

Autopilot override events appear in the Ververica Platform Events tab when Autopilot's runtime parallelism differs from the CR's parallelism. See Observability for details.