Savepoints

A Savepoint resource points to a single savepoint in Apache Flink. Multiple Ververica Platform Savepoint resources can reference the same Flink savepoint.

Specification

The metadata.origin field of a Savepoint records how the Savepoint was created. It takes one of the following values:

  • USER_REQUEST: The Savepoint has been requested manually by a user through Ververica Platform.
  • SUSPEND: The Savepoint was requested when the corresponding Deployment was suspended.
  • COPIED: The Savepoint is a copy of another Savepoint resource. Both Savepoint resources point to the same physical Flink savepoint.

Ververica Platform does not track Flink savepoints that were created outside of Ververica Platform.

Attention

In order to use Ververica Platform features that rely on savepoints (such as stateful upgrades or suspending a Deployment), the Deployment must have the Flink configuration parameter state.savepoints.dir set in Deployment.spec.template.spec.flinkConfiguration.
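
As a minimal sketch of this setting, the fragment below places state.savepoints.dir under the path named above. The bucket URI is a hypothetical placeholder, not a value from this documentation; substitute a location that your Flink jobs can write to.

```yaml
# Fragment of a Deployment resource; only the fields relevant to savepoints are shown.
spec:
  template:
    spec:
      flinkConfiguration:
        # Hypothetical location; replace with your own savepoint storage.
        state.savepoints.dir: s3://example-bucket/savepoints
```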

Restoring From an Externalized Checkpoint

If you have configured your Flink application to externalize checkpoint metadata, you can manually configure Ververica Platform to use such a checkpoint for recovery.

For details on how to configure Flink for externalized checkpoints, refer to the Flink documentation about Externalized Checkpoints.

The following assumes that you already have an externalized checkpoint to resume from. To make this checkpoint available for recovery, create a Savepoint resource that points to it by posting the following request:

POST /api/v1/namespaces/{namespace}/savepoints
metadata:
  deploymentId: d072b06a-7818-4d3f-8a30-fcde28cfc69b
  annotations:
    com.daplatform.appmanager.controller.deployment.spec.version: ${deploymentSpecVersion}
spec:
  savepointLocation: s3://location-of-checkpoint/chk-19/
  flinkSavepointId: 00000000-0000-0000-0000-000000000000
status:
  state: COMPLETED

This creates a Savepoint resource for the Deployment identified by deploymentId and points it to the checkpoint at savepointLocation. You have to read the value for deploymentSpecVersion from Deployment.metadata.annotations."com.daplatform.appmanager.controller.deployment.spec.version" of the corresponding Deployment and set it as the same annotation on the posted Savepoint.
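
For orientation, the annotation sits in the metadata of the Deployment roughly as sketched below. The value "3" is a made-up placeholder for illustration only; use whatever value your Deployment actually carries wherever the request refers to ${deploymentSpecVersion}.

```yaml
# Fragment of the corresponding Deployment; the annotation value is illustrative.
metadata:
  annotations:
    # Copy this value into the same annotation on the posted Savepoint.
    com.daplatform.appmanager.controller.deployment.spec.version: "3"
```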

If the restore strategy of the Deployment is LATEST_SAVEPOINT, the newly created Savepoint is picked automatically for the next execution(s) of your Deployment. If the restore strategy is LATEST_STATE, the Savepoint is picked if it is also the most recent state when compared with the available checkpoints.
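
The sketch below shows how such a restore strategy might be declared on the Deployment. It assumes that the strategy is set through a spec.restoreStrategy.kind field; this field name is not stated in this section, so verify it against the Deployment reference for your Ververica Platform version.

```yaml
# Fragment of a Deployment resource.
# Assumption: the restore strategy is configured under spec.restoreStrategy.kind.
spec:
  restoreStrategy:
    # Use LATEST_STATE instead to also consider checkpoints.
    kind: LATEST_SAVEPOINT
```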

Note

You have to ensure that the provided savepointLocation is valid. An invalid location is not reported when the Savepoint resource is created; errors appear only at runtime, when a job tries to restore from this location.

Note

If the com.daplatform.appmanager.controller.deployment.spec.version annotation is missing, the Savepoint will not be used during restore.