Tune Deployment Resources Without Manual Intervention
You can safely and efficiently adjust deployment resources while jobs continue running. The Autopilot and Scheduled tuning modes optimize resource allocation (parallelism, TaskManager count, and memory) either automatically or on a schedule you define, without manual intervention. Updates take effect quickly, reducing the need for constant monitoring and adjustment.
Manual tuning requires evaluating metrics, configuring resources, and redeploying jobs, which can interrupt traffic or reduce throughput. Typical manual tasks include:
- Configuring resources when publishing a draft (parallelism, TaskManager count/size, memory/CPU).
- Adjusting resources while a deployment runs to maximize utilization.
- Responding to increases in backpressure or latency.
Automated tuning reduces these risks while improving resource utilization and deployment performance. You can:
- Adjust parallelism and resource allocation more rationally and consistently.
- Improve end-to-end throughput and reduce pipeline backpressure.
- Avoid resource waste during low demand and ensure capacity during peaks.
How It Works
Ververica manages resource updates differently depending on the tuning mode. The process has three main steps:
- Select a tuning mode: Disabled, Autopilot, or Scheduled.
- Configure parameters: Define limits, schedules, or strategies for resource changes.
- Apply changes: In automated modes, the system adjusts parallelism and memory allocation, uses incremental updates, and avoids redeploying unless necessary. In manual mode, you review the suggestions and apply changes yourself.
The result is more efficient resource utilization and stable performance.
Tuning Modes
Select the tuning mode that best meets your deployment's operational requirements. The table below summarizes each mode's use case, how it works, its benefits, and related references.
| Tuning Mode | Use Case | How it Works | Benefits | References |
|---|---|---|---|---|
| Disabled (Manual) | You want full control to review and apply resource changes yourself without automation. | The system provides optimization suggestions; you decide when and what to apply. | Full control with guidance. | — |
| Autopilot | Workloads are variable and require continuous optimization. | The system automatically adjusts resources based on live metrics. | Reduces manual effort and optimizes throughput, backpressure, and efficiency in real time. | See Default Tuning Actions and Autopilot Parameters. |
| Scheduled Mode | Traffic follows predictable patterns (daily or weekly peaks). | You define time-based resource plans. The system applies them at the configured times. | Proactive scaling; avoids thrashing and ensures capacity during known peaks. | See Run a Deployment Using Scheduled Mode. |
About Autopilot Mode
Autopilot automatically adjusts deployment resources based on real-time metrics. It detects load changes, optimizes parallelism and memory, and maintains performance without requiring manual intervention.
Autopilot supports two strategies:
- Stable Strategy: Prioritizes convergence and avoids unnecessary adjustments. Autopilot stops modifying parameters when either of the following conditions is met. Only a deployment restart resets these counters; editing Stable Strategy parameters does not.
  - No adjustments for 24 consecutive hours.
  - The deployment reaches a steady state for 72 hours.
- Adaptive Strategy: Continuously monitors resource usage and latency, adjusting more aggressively for dynamic workloads.
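The Stable Strategy stop conditions can be read as a simple predicate. The following is a minimal sketch of that logic, not Ververica's implementation: the function and parameter names are hypothetical, and only the 24-hour and 72-hour values come from the description above.

```python
from datetime import datetime, timedelta

# Thresholds taken from the description above; the names are hypothetical.
NO_ADJUSTMENT_WINDOW = timedelta(hours=24)
STEADY_STATE_WINDOW = timedelta(hours=72)


def stable_strategy_should_stop(now, last_adjustment, steady_since=None):
    """Return True once either Stable Strategy stop condition holds.

    last_adjustment: time of the most recent Autopilot adjustment, or of the
        last deployment restart (a restart resets the counters).
    steady_since: time the deployment entered a steady state, or None.
    """
    no_recent_adjustment = now - last_adjustment >= NO_ADJUSTMENT_WINDOW
    steady_long_enough = (
        steady_since is not None and now - steady_since >= STEADY_STATE_WINDOW
    )
    return no_recent_adjustment or steady_long_enough
```

In this model, a restart is represented by passing the restart time as `last_adjustment` and clearing `steady_since`.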
Example Scenario
A deployment uses 30 CUs and runs smoothly, but occasionally shows very low CPU and memory usage with no delay or backpressure at the source. With Autopilot enabled, the system scales resources down during low usage and scales them up when usage crosses predefined thresholds, without manual intervention.
Limits and Considerations
- Unaligned Checkpoints: Parallelism cannot be changed if Unaligned Checkpoints are enabled.
- Session Clusters: Not supported by Autopilot.
- Performance Bottlenecks: Not all bottlenecks are internal. Autopilot works best when traffic changes smoothly, no data skew exists, and throughput scales roughly linearly with parallelism. Otherwise, you may see failed parallelism changes, repeated restarts, degraded performance in UDSFs/UDAFs/UDTFs, or pressure on external systems when parallelism increases.
- Deployment Restarts: Autopilot may restart deployments when applying changes, briefly pausing processing.
- Trigger Interval: Autopilot evaluates every 10 minutes by default (configurable via cooldown.minutes).
- Manual Parallelism Configuration: If a DataStream deployment or custom SQL connector sets parallelism explicitly, Autopilot is disabled.
- Policy Timing: A new Autopilot policy cannot trigger within 30 minutes of a prior policy.
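Before enabling Autopilot, it can help to check a deployment against the limits above. The sketch below is purely illustrative: `DeploymentInfo` and its fields are hypothetical and do not correspond to any Ververica API; you would fill them in from your own knowledge of the deployment.

```python
from dataclasses import dataclass


@dataclass
class DeploymentInfo:
    # Hypothetical fields that mirror the limits listed above.
    unaligned_checkpoints: bool = False
    session_cluster: bool = False
    explicit_parallelism: bool = False  # set in DataStream code or a custom SQL connector


def autopilot_blockers(info):
    """Return the reasons Autopilot would be limited or unavailable."""
    blockers = []
    if info.unaligned_checkpoints:
        blockers.append("unaligned checkpoints: parallelism cannot be changed")
    if info.session_cluster:
        blockers.append("session cluster: not supported by Autopilot")
    if info.explicit_parallelism:
        blockers.append("explicit parallelism: Autopilot is disabled")
    return blockers
```

An empty list means none of the hard limits apply; the softer considerations (data skew, non-linear scaling, restarts) still need to be judged from metrics.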
Default Tuning Actions
When enabled, Autopilot adjusts resources based on live metrics.
Parallelism Adjustments
Autopilot optimizes throughput by changing parallelism:
- No change: If deployment delay remains below 60 s, parallelism stays unchanged.
- Scale up:
  - If delay exceeds 60 s and continues increasing for 3 minutes, parallelism increases (up to 2× current processing capacity, capped at 64 CUs).
  - If vertex processing time exceeds 80% for 6 minutes, parallelism increases to target ~50% slot utilization.
  - If average TaskManager CPU exceeds 80% for 6 minutes, parallelism increases to target ~50% CPU usage.
- Scale down:
  - If CPU utilization or vertex processing time stays below 20% for 24 hours, parallelism is reduced for efficiency.
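The default parallelism rules above can be summarized as a small decision function. This is a rough sketch under stated assumptions, not Autopilot's actual algorithm: the function names, the single shared duration inputs, and the order in which the rules are checked are all hypothetical. `target_parallelism` additionally assumes that load scales linearly with parallelism, which is the condition under which Autopilot is said to work best.

```python
import math


def parallelism_action(delay_s, delay_rising_min, busy_ratio, cpu_ratio,
                       high_load_min, low_load_hours):
    """Map live metrics to 'scale_up', 'scale_down', or 'no_change'.

    busy_ratio: vertex processing time ratio (0.0 to 1.0).
    cpu_ratio: average TaskManager CPU utilization (0.0 to 1.0).
    high_load_min / low_load_hours: how long the high or low load has lasted.
    """
    # Delay above 60 s that has kept increasing for 3 minutes.
    if delay_s > 60 and delay_rising_min >= 3:
        return "scale_up"
    # Processing time or CPU above 80% for 6 minutes.
    if high_load_min >= 6 and (busy_ratio > 0.8 or cpu_ratio > 0.8):
        return "scale_up"
    # Processing time or CPU below 20% for 24 hours.
    if low_load_hours >= 24 and (busy_ratio < 0.2 or cpu_ratio < 0.2):
        return "scale_down"
    return "no_change"


def target_parallelism(current, usage_ratio, target_ratio=0.5):
    """Estimate the parallelism that brings usage near target_ratio.

    Assumes usage is inversely proportional to parallelism.
    """
    return max(1, math.ceil(current * usage_ratio / target_ratio))
```

For example, a vertex running at parallelism 4 and 90% utilization would need a parallelism of about 8 to approach the 50% target under the linear-scaling assumption.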
Memory Optimization
Autopilot monitors memory to prevent instability:
- Scale up:
  - Increase JobManager memory on frequent GC/OOM (up to 16 GiB).
  - Increase TaskManager memory on GC/OOM/HeartBeatTimeout (up to 16 GiB).
  - Increase TaskManager memory when usage exceeds 95%.
- Scale down:
  - If TaskManager memory usage remains below 30% for 24 hours, reduce allocation (minimum 1.6 GiB).
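The TaskManager memory rules follow the same pattern. The sketch below is illustrative only: the function and argument names are hypothetical, and treating the 16 GiB and 1.6 GiB bounds as hard stops is an assumption based on the limits quoted above.

```python
MAX_MEMORY_GIB = 16.0      # scale-up ceiling from the rules above
MIN_TM_MEMORY_GIB = 1.6    # TaskManager scale-down floor from the rules above


def taskmanager_memory_action(usage_ratio, had_gc_oom_or_heartbeat_timeout,
                              low_usage_hours, current_gib):
    """Return 'scale_up', 'scale_down', or 'no_change' for TaskManager memory."""
    needs_more = had_gc_oom_or_heartbeat_timeout or usage_ratio > 0.95
    if needs_more and current_gib < MAX_MEMORY_GIB:
        return "scale_up"
    # Usage below 30% sustained for 24 hours, and still above the floor.
    if (usage_ratio < 0.30 and low_usage_hours >= 24
            and current_gib > MIN_TM_MEMORY_GIB):
        return "scale_down"
    return "no_change"
```

A TaskManager already at the ceiling stays unchanged in this model even if it reports GC or OOM problems, which signals that the bottleneck needs manual attention.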
Run a Deployment Using Autopilot
You can enable and configure Autopilot when starting a job or from Deployments → Resources.
- On the Dashboard, locate the workspace.
- Open the workspace Console.
- In the left navigation, click Deployments and select your deployment.
- Choose one:
  - Enable on an existing deployment: Resources tab → Autopilot Mode → toggle ON → Edit in Configurations.
  - Enable at job start: Click Start → choose Initial or Resume (see Starting Jobs) → toggle Configure Autopilot ON → set Resource Tuning Mode = Autopilot Mode.
- Select a strategy:
  - Stable Strategy: Minimizes start–stop effects and converges longer-cycle jobs quickly.
  - Adaptive Strategy: Prioritizes latency and utilization; reacts faster to metric changes.
- Edit parameters. See Autopilot Parameters.
- Click Save.
Autopilot Parameters
| Parameter | Description |
|---|---|
| Cooldown Minutes | Interval after an Autopilot-triggered restart before Autopilot evaluates again. |
| Max CPU | Maximum vCPUs allowed for the deployment. Defaults: Adaptive 64 cores, Stable 4 cores. |
| Max Memory | Maximum memory allowed. Defaults: Adaptive 256 GiB, Stable 16 GiB. |
| Max Delay | Maximum allowed source consumption delay (minutes). Default 1. Above this, Autopilot scales up by increasing parallelism or splitting chains. |
| mem.scale-up.interval | Minimum time between memory scale-ups. Default 6 minutes. |
| mem.scale-down.interval | Minimum time between memory scale-downs. Default 24 hours. |
| parallelism.scale.max | Max parallelism when increasing. Default -1 (no limit). |
| parallelism.scale.min | Min parallelism when decreasing. Default 1. |
| parallelism.scale.up.interval | Minimum time between parallelism scale-ups. Default 6 minutes (Stable & Adaptive). |
| parallelism.scale.down.interval | Minimum time between parallelism scale-downs. Defaults: Adaptive 24 hours, Stable 11 hours. |
| delay-detector.scale-up.threshold | Threshold for currentFetchEventTimeLag (ms). Default 1 ms. |
| slot-usage-detector.scale-up.threshold | If average vertex processing time ratio > 0.8 over the sample interval, scale up. |
| slot-usage-detector.scale-down.threshold | If average vertex processing time ratio < 0.2 over the sample interval, scale down. |
| slot-usage-detector.scale-up.sample-interval | Sampling window for idle/processing time averaging. Default 3 minutes. |
| resources.memory-scale-up.max | Max memory per TaskManager and JobManager when scaling up. Default 16 GiB. |
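To illustrate how parallelism.scale.min and parallelism.scale.max bound a proposed change, here is a minimal sketch. The helper is hypothetical and not part of any Ververica API; only the defaults (1 and -1, where -1 means no limit) come from the table above.

```python
def clamp_parallelism(proposed, scale_min=1, scale_max=-1):
    """Clamp a proposed parallelism to [scale_min, scale_max].

    scale_max == -1 means no upper limit, matching the table's default.
    """
    if scale_min < 1:
        raise ValueError("scale_min must be at least 1")
    if scale_max != -1 and scale_max < scale_min:
        raise ValueError("scale_max must be -1 or >= scale_min")
    result = max(proposed, scale_min)
    if scale_max != -1:
        result = min(result, scale_max)
    return result
```

With the defaults, a proposed value of 0 is raised to 1 and large values pass through unchanged; setting an explicit maximum is one way to protect external systems from a sudden jump in parallelism.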
About Scheduled Mode
Scheduled Mode is ideal when you know peak patterns in advance, such as big events (for example, Black Friday), recurring weekly spikes (for example, match days), or predictable daily cycles.
It also covers scenarios where Autopilot may be sub-optimal:
- Frequent jitter can cause repeated restarts under Autopilot; a schedule avoids thrashing.
- Slow trend changes may require multiple Autopilot iterations to converge; a schedule can jump directly to the right profile.
Example (Scheduled plan): If peak usage runs 09:00–19:00 and off-peak 19:00–09:00, create a plan to allocate 30 CUs during peak and 10 CUs off-peak. A plan can contain multiple time–resource mappings.
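The time-to-resource mapping in this example can be modeled as a lookup over time windows. The sketch below only illustrates the idea; the data structure and function name are hypothetical and do not reflect how plans are stored. It treats each window as start-inclusive and end-exclusive, and it handles windows that wrap past midnight.

```python
from datetime import time

# (start, end, compute units), taken from the example above.
PLAN = [
    (time(9, 0), time(19, 0), 30),   # peak
    (time(19, 0), time(9, 0), 10),   # off-peak, wraps past midnight
]


def cus_at(plan, t):
    """Return the CUs scheduled at time t, or None if no window matches."""
    for start, end, cus in plan:
        if start <= end:
            in_window = start <= t < end
        else:
            # Window crosses midnight, e.g. 19:00 to 09:00.
            in_window = t >= start or t < end
        if in_window:
            return cus
    return None
```

Here `cus_at(PLAN, time(12, 0))` yields the peak allocation and `cus_at(PLAN, time(3, 30))` yields the off-peak allocation.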
Run a Deployment Using Scheduled Mode
To use Scheduled Mode, create at least one plan and apply it to the deployment.
Create a Plan
Scheduled plans apply to all running jobs under the deployment once enabled.
- In the Ververica Cloud console, open Deployments → Resources.
- Click Scheduled Mode.
- In Resource Plans, click New Plan.
- Enter a Plan Name and set:
  - Trigger Period: No Repeat, Every Day, Every Week, or Every Month. For weekly/monthly, specify the effective time range.
  - Trigger Time: When the plan takes effect.
  - For other parameters, see Resources and Parameters.
- (Optional) Click New Resource Setting Period to add another time window with different resources.
- Click OK. The plan appears in Resource Plans.
Start a Job Using a Scheduled Plan
- Select the deployment.
- Click Start in the toolbar.
  Note: There must be at least one saved plan to apply at startup.
- In Start Job, choose Initial or Resume.
- Optionally set a start time.
- Toggle Configure Autopilot ON.
- Set Resource Tuning Mode = Scheduled Mode.
- Choose a plan from the list.
  Note: To create a new plan from here, select Create new scheduled plan to return to Resources and follow Create a Plan.
- Click Start.
Change the Applied Plan
Changing plans may restart the job.
- In Deployments → Resources, locate the applied plan.
- Click Stop Applying (either beside the plan entry or the main Stop Applying button).
- Click Apply on the new plan.
Edit an Existing Plan
Note: You cannot rename a plan (create a new one instead), and you cannot edit a plan that’s currently applied to a running job.
- Open Deployments → Resources.
- Click the plan name or Details.
- Click Edit, change parameters, and Save.
Delete a Plan
You cannot delete a plan that’s currently applied.
- In Deployments → Resources, find the plan.
- Click Delete, then OK to confirm.