Savepoints

A Savepoint resource points to a single savepoint in Apache Flink. Multiple Application Manager Savepoint resources can reference the same Flink savepoint.

Specification

There are different metadata.origin values for Savepoints:

  • USER_REQUEST: The savepoint has been requested manually by a user through Application Manager.
  • SUSPEND: The savepoint has been requested when the corresponding Deployment was suspended.
  • COPIED: The savepoint is a copy of another savepoint resource. Both savepoint resources point to the same physical Flink savepoint.

Application Manager only keeps track of Flink savepoints that were created through Application Manager.

Attention

In order to use Application Manager features that rely on savepoints (such as stateful upgrades or suspending a Deployment), the Deployment must have the Flink configuration parameter state.savepoints.dir set in Deployment.spec.template.spec.flinkConfiguration.
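As a rough illustration of where this setting lives, the following Python sketch checks a Deployment (represented as a plain dictionary) for a non-empty state.savepoints.dir entry. The helper name has_savepoint_dir, the example bucket path, and the reduced Deployment structure are assumptions made for this example; only the path Deployment.spec.template.spec.flinkConfiguration and the key state.savepoints.dir come from the text above.

```python
from typing import Any, Mapping

SAVEPOINT_DIR_KEY = "state.savepoints.dir"


def has_savepoint_dir(deployment: Mapping[str, Any]) -> bool:
    """Return True if the Deployment sets a non-empty state.savepoints.dir.

    Hypothetical helper: walks spec.template.spec.flinkConfiguration and
    treats missing or empty intermediate levels as "not configured".
    """
    node: Any = deployment
    for key in ("spec", "template", "spec", "flinkConfiguration"):
        node = node.get(key) or {}
    return bool(node.get(SAVEPOINT_DIR_KEY))


# Hypothetical Deployment, reduced to the fields relevant here.
example_deployment = {
    "spec": {
        "template": {
            "spec": {
                "flinkConfiguration": {
                    SAVEPOINT_DIR_KEY: "s3://example-bucket/savepoints",
                }
            }
        }
    }
}
```

A check like this could run before requesting a stateful upgrade or a suspend, so that a missing setting is noticed before the savepoint is needed.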

Restoring From an Externalized Checkpoint

If you have configured your Flink application to externalize checkpoint metadata, you can manually configure Application Manager to use such a checkpoint for recovery.

Please refer to the Flink documentation about Externalized Checkpoints for more details about how to configure Flink for externalized checkpoints.

The following assumes that you already have an externalized checkpoint to resume from. To make it available for recovery, create a Savepoint resource that points to the checkpoint:

POST /api/v1/namespaces/{namespace}/savepoints
metadata:
  deploymentId: d072b06a-7818-4d3f-8a30-fcde28cfc69b
  annotations:
    com.daplatform.appmanager.controller.deployment.spec.version: ${deploymentSpecVersion}
spec:
  savepointLocation: s3://location-of-checkpoint/chk-19/
  flinkSavepointId: 00000000-0000-0000-0000-000000000000
status:
  state: COMPLETED

This creates a Savepoint resource for the Deployment identified by deploymentId and points it to the checkpoint at savepointLocation. Replace ${deploymentSpecVersion} with the value of the annotation com.daplatform.appmanager.controller.deployment.spec.version, which you can read from Deployment.metadata.annotations of the corresponding Deployment.
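The request body can also be assembled programmatically. The sketch below is a minimal example, assuming the Deployment has already been fetched and parsed into a dictionary. The function name build_savepoint_body and its parameters are hypothetical, and the all-zero flinkSavepointId is simply copied from the example request above.

```python
from typing import Any, Dict, Mapping

SPEC_VERSION_ANNOTATION = (
    "com.daplatform.appmanager.controller.deployment.spec.version"
)
# Copied from the example request; not derived from the checkpoint.
PLACEHOLDER_FLINK_SAVEPOINT_ID = "00000000-0000-0000-0000-000000000000"


def build_savepoint_body(
    deployment: Mapping[str, Any],
    deployment_id: str,
    checkpoint_location: str,
) -> Dict[str, Any]:
    """Build the body for POST /api/v1/namespaces/{namespace}/savepoints.

    Hypothetical helper: copies the spec version annotation from the
    Deployment and fails early if the annotation is missing.
    """
    annotations = (deployment.get("metadata") or {}).get("annotations") or {}
    if SPEC_VERSION_ANNOTATION not in annotations:
        raise ValueError(
            f"Deployment has no {SPEC_VERSION_ANNOTATION} annotation"
        )
    return {
        "metadata": {
            "deploymentId": deployment_id,
            "annotations": {
                SPEC_VERSION_ANNOTATION: annotations[SPEC_VERSION_ANNOTATION],
            },
        },
        "spec": {
            "savepointLocation": checkpoint_location,
            "flinkSavepointId": PLACEHOLDER_FLINK_SAVEPOINT_ID,
        },
        "status": {"state": "COMPLETED"},
    }
```

Raising an error when the annotation is absent is a design choice for this sketch: a Savepoint posted without the annotation would be accepted but not used during restore, which is harder to notice.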

If the Deployment uses a restore strategy that restores from the latest savepoint, the newly created Savepoint is picked up automatically for the next execution(s) of your Deployment.

Note

You have to ensure that the provided savepointLocation is valid. If it is not, the problem only shows up as errors at runtime, in the job(s) that try to restore from this location.

Note

If the com.daplatform.appmanager.controller.deployment.spec.version annotation is missing, the Savepoint will not be used during restore.