Version: 2.13

Jobs

A Job resource represents a single Apache Flink® job. The Job resource is derived from a Deployment Template whenever Ververica Platform detects that a new Flink job is needed.

Overview

The Job resource is managed exclusively by Ververica Platform and cannot be created manually. Once the Job resource is created, its spec section is effectively immutable and is a point-in-time snapshot of the spec section of the owning Deployment resource.

The status.state field of the Job resource has the following states:

  • STARTING The Job is in the process of starting. This includes requesting resources from the Deployment Target and submitting the specified job to it.
  • STARTED The Job was successfully started.
  • TERMINATING The Job is in the process of terminating.
  • TERMINATED The Job has terminated.
  • FAILED The Deployment Target or Flink has reported an unrecoverable failure. Ververica Platform may report additional failure details.
  • FINISHED The Job has terminated and finished successfully, e.g. a finite streaming or batch job.

The state machine below shows the possible transitions of a Job's status:

[Image: Job status state machine]

Ververica Platform retains all Jobs that have reached a terminal state. This allows users to trace back the evolution of the Deployment specification, such as parallelism and other configuration parameters.

note

The Job resource status.state is not the same as the Flink job status since the Job resource state includes phases of acquiring resources from the Deployment Target.

Job Status

When a Job reaches the STARTED state, additional information about the Job is provided under status.started. This section of the Job API is considered experimental and no guarantees are made about its future availability.

kind: Job
status:
  state: STARTED
  started:
    # The Flink Job ID. Currently always equal to Job.metadata.id without
    # dashes, e.g. 3f1da3c8-6f22-430f-9200-eda014ea5319 becomes
    # 3f1da3c86f22430f9200eda014ea5319.
    flinkJobId: string
    # Time that the Job transitioned to the STARTED state.
    transitionTime: string
    # Time of last update to {status.started}.
    lastUpdateTime: string
    # The last observed number of Flink job restarts, including process
    # restarts. The number is a lower bound on the actual number of Flink
    # job restarts. On the happy path, it should be close to the actual
    # number. During failures, there are no guarantees except that it is
    # lower than or equal to the actual number of restarts.
    observedFlinkJobRestarts: number
    # The Flink JobStatus observed in the last probe. Either a Flink Job
    # status string (CREATED, RUNNING, FAILING, FAILED, CANCELLING,
    # CANCELED, FINISHED, RESTARTING, SUSPENDED, RECONCILING) or UNKNOWN
    # if the Flink API was not reachable.
    observedFlinkJobStatus: string
    # ***SQL Deployments only*** a list of all sink tables of this query
    # including schema and properties
    sinkTables:
      - catalogName: string
        databaseName: string
        name: string
        temporary: boolean
        columns:
          - name: string
            type: string
        properties: object
    # ***SQL Deployments only*** a list of all source tables of this query
    # including schema and properties
    sourceTables:
      - catalogName: string
        databaseName: string
        name: string
        temporary: boolean
        columns:
          - name: string
            type: string
        properties: object
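
The relationship between Job.metadata.id and flinkJobId described in the comments above can be sketched in Python. This is a minimal illustration, not part of the Ververica Platform API: the function name flink_job_id is hypothetical, and the mapping itself is only what the documentation states "currently" holds, so it should not be relied on as a stable contract.

```python
import uuid


def flink_job_id(job_metadata_id: str) -> str:
    """Derive the Flink Job ID from a Job's metadata.id by dropping the dashes.

    Hypothetical helper for illustration only.
    """
    # uuid.UUID validates the input and raises ValueError for malformed IDs;
    # .hex is the 32-character lowercase form without dashes.
    return uuid.UUID(job_metadata_id).hex


print(flink_job_id("3f1da3c8-6f22-430f-9200-eda014ea5319"))
# prints 3f1da3c86f22430f9200eda014ea5319
```

Parsing the ID with uuid.UUID instead of a plain string replacement means malformed IDs are rejected early rather than silently passed on.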

Job storage configuration

Ververica Platform maintains a record of deployment jobs in persistent storage, as well as in AppManager memory. You can view information about all the terminal jobs for a specific deployment. Terminal jobs are jobs in any of these Ververica Platform-specific states: FAILED, TERMINATED, or FINISHED.
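
As a small illustration of the terminal-state definition, the following Python sketch classifies a Job by its status.state field. It assumes the Job has already been loaded into a dictionary shaped like the Job resource (a status mapping containing a state string); the helper name is_terminal is hypothetical.

```python
# The three terminal states named in this section.
TERMINAL_STATES = frozenset({"FAILED", "TERMINATED", "FINISHED"})


def is_terminal(job: dict) -> bool:
    """Return True if the Job's status.state is one of the terminal states."""
    # Tolerate a missing or empty status section by falling back to "".
    state = (job.get("status") or {}).get("state") or ""
    return state.upper() in TERMINAL_STATES


jobs = [
    {"status": {"state": "STARTED"}},
    {"status": {"state": "FINISHED"}},
    {"status": {"state": "FAILED"}},
]
terminal_jobs = [job for job in jobs if is_terminal(job)]
print(len(terminal_jobs))  # prints 2
```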

By default, Ververica Platform stores an unlimited number of terminal jobs. However, you can configure the deletion of terminal jobs based on specific criteria such as time interval or the maximum number of jobs to be kept in history. This configuration can be done optionally during the installation of Ververica Platform.

Configuration

During the Ververica Platform installation, you can configure the following two optional properties:

  • vvp.job-service.max-number-jobs-in-terminal-state - defines the maximum number of terminal jobs per deployment that are always kept in the job history.
  • vvp.job-service.jobs-cleanup-interval - defines the duration of the cleanup interval. Specify the interval using the Spring Duration conversion format.

When these properties are provided, an AppManager scheduled task repeatedly searches for outdated deployment jobs and deletes them from both memory and persistent storage.

note

To change these parameters, Ververica Platform must be stopped and then started with the new configuration.
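
The effect of a limit on terminal jobs can be pictured with the Python sketch below. It is an illustration under stated assumptions, not the AppManager implementation: the function jobs_to_delete, the flat job dictionaries, and the ended timestamp field are all hypothetical, and the choice to delete the oldest terminal jobs first is an assumption that this section does not confirm.

```python
from datetime import datetime

TERMINAL_STATES = {"FAILED", "TERMINATED", "FINISHED"}


def jobs_to_delete(jobs: list, max_terminal_jobs: int) -> list:
    """Pick the terminal jobs of one deployment that exceed the limit.

    Assumption: the newest terminal jobs are kept and older ones are removed.
    """
    terminal = [job for job in jobs if job["state"] in TERMINAL_STATES]
    # Newest first, so everything past the limit is the oldest surplus.
    terminal.sort(key=lambda job: job["ended"], reverse=True)
    return terminal[max_terminal_jobs:]


history = [
    {"id": "a", "state": "FINISHED", "ended": datetime(2023, 1, 1)},
    {"id": "b", "state": "FAILED", "ended": datetime(2023, 1, 2)},
    {"id": "c", "state": "TERMINATED", "ended": datetime(2023, 1, 3)},
    {"id": "d", "state": "STARTED", "ended": datetime(2023, 1, 4)},
]
print([job["id"] for job in jobs_to_delete(history, 2)])  # prints ['a']
```

Jobs that are not in a terminal state, such as job d above, are never selected, which matches the description that only terminal jobs are subject to cleanup.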