Deployment Templates

Deployment Templates are specified as part of Deployment resources in spec.template. Templates are translated into Job resources when you set the desired state of a Deployment to running.

You can think of the relation between the Deployment spec and the template as follows:

  • The template specifies which Flink job is executed and how to execute it, including its configuration.
  • The Deployment spec defines how Job instances are managed over time, for instance how to execute upgrades or which Savepoint to restore from.

At the end of this page, you will find a full example that creates a deployment for a running Flink example job. The following sections will break down each part of the template in more detail.

Metadata and Spec

The template has two parts, metadata and spec. The metadata section currently only accepts optional annotations. The spec specifies which artifact to execute and how to execute it.

kind: Deployment
spec:
  template:
    metadata:
      ...
    spec:
      ...

Note

The resource snippets in the following sections are not valid on their own. Paste them under the spec section of the template, i.e. Deployment.spec.template.spec.

Artifacts

The artifact section of the template specifies which artifact to execute. Currently, there is a single kind of supported artifact, jar.

JAR Artifacts

JAR artifacts must package regular Flink jobs that are executed via the main method of their entry class.

artifact:
  kind: jar
  jarUri: https://artifacts.da-platform.com/flink-job.jar
  entryClass: com.daplatform.myjob.EntryPoint # Optional, if no Manifest entry
  mainArgs: --arg1=1 --arg2=2                   # Optional

Application Manager downloads the JAR artifact and then uploads it to the created Flink job cluster for execution. The container or system running Application Manager therefore needs access to the JAR location.

Note

Check out the Advanced Configuration page if your artifact storage uses TLS and serves a certificate signed by a non-public CA.

Note

Application Manager does not store JAR artifacts permanently. If you want to be able to go back to earlier versions of your Flink job, version your JARs properly and keep them in your artifact storage for as long as you may need to go back.

Custom Docker Images

By default, containers created for a Deployment will use the configured default Flink Docker images. You can override this on a per-Deployment basis as part of the artifact.

artifact:
  kind: jar
  jarUri: https://artifacts.da-platform.com/flink-job.jar
  flinkVersion: 1.6
  flinkImageRegistry: registry.platform.data-artisans.net
  flinkImageRepository: v1.2/flink
  flinkImageTag: 1.6.0-dap1-scala_2.11

Each flink* attribute is optional. If not provided, it will fall back to the configured default of your Application Manager installation.

You can specify the Flink image by digest if you prefix the flinkImageTag attribute with @, for instance flinkImageTag: "@sha256:...". Quote the value, because an unquoted YAML value cannot start with @.
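As a minimal sketch of pinning the image by digest, the snippet below uses a quoted value. Note that <digest> is only a placeholder for the digest of your own image, not a real value.

```yaml
artifact:
  kind: jar
  jarUri: https://artifacts.da-platform.com/flink-job.jar
  # Quoted on purpose: YAML reserves a leading @ in unquoted values.
  # <digest> is a placeholder; replace it with the digest of your image.
  flinkImageTag: "@sha256:<digest>"
```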

Parallelism, Number of TaskManagers, and Slots

You can specify the parallelism of your jobs via the parallelism key. By default, Application Manager creates as many TaskManager instances as the specified parallelism. You can override this behaviour via the numberOfTaskManagers key.

parallelism: 1
numberOfTaskManagers: 1 # Optional, defaults to parallelism

Each TaskManager will have Flink’s default setting for number of task slots (currently a single slot per TaskManager). Combining the configuration of parallelism and numberOfTaskManagers with the taskmanager.numberOfTaskSlots option of Flink gives you full flexibility in how to execute your jobs.

In the following snippet, we specify two TaskManager instances with four slots each and a parallelism of 8 for our jobs, so the 2 × 4 = 8 available slots match the parallelism.

parallelism: 8
numberOfTaskManagers: 2
flinkConfiguration:
  taskmanager.numberOfTaskSlots: 4
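
For comparison, the following sketch relies only on the defaults described in this section. With numberOfTaskManagers and taskmanager.numberOfTaskSlots omitted, a parallelism of 8 should result in eight TaskManager instances with one slot each.

```yaml
parallelism: 8
# numberOfTaskManagers omitted: defaults to parallelism (8 TaskManagers)
# taskmanager.numberOfTaskSlots omitted: Flink default (1 slot per TaskManager)
```

Both variants provide 8 slots in total. They differ in how many TaskManager containers are started and therefore in how the per-instance resources add up.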

For more details on Flink task slots, consult the official Flink documentation.

Compute Resources

You can configure the requested compute resources for CPU and memory via the resources key:

resources:
  jobmanager:
    cpu: 0.5
    memory: 500M
  taskmanager:
    cpu: 1
    memory: 2G

The keys jobmanager and taskmanager configure the respective Flink components. The values shown above are the defaults. You can override each value selectively.
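
As an illustration of a selective change, the sketch below raises only the TaskManager memory. The 4G figure is an arbitrary example value. The TaskManager CPU and both JobManager values are left out and should therefore keep the values shown above.

```yaml
resources:
  taskmanager:
    memory: 4G   # Only this value is changed; everything else uses the defaults
```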

Note that resources are configured per component instance. For example, if you use 10 TaskManager instances, Application Manager will try to acquire 10 times the configured taskmanager resources in total.
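
The per-instance accounting can be made concrete with a small worked example. The numbers below are illustrative, and the totals in the comments cover the TaskManager instances only.

```yaml
parallelism: 10
numberOfTaskManagers: 10
resources:
  taskmanager:
    cpu: 1        # Per TaskManager instance
    memory: 2G    # Per TaskManager instance
# Total requested for TaskManagers:
#   CPU:    10 x 1  = 10
#   memory: 10 x 2G = 20G
# The jobmanager resources are requested in addition to this.
```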

CPU

You can specify CPU as a decimal number, e.g. 1, 1.0, or 0.5.

Memory

You can specify memory as a decimal number with an optional memory unit as indicated by the suffix (ignoring case):

  • Gigabytes G, e.g. 1G or 1.5G
  • Megabytes M, e.g. 1000M or 1500M

If no unit is specified, the provided number is interpreted as bytes. Each memory unit is interpreted as a power of ten, e.g. 1K equals 1000 bytes.
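
To illustrate the unit rules, the following sketch requests the same amount of memory for both components using different notations. The values are chosen only for demonstration.

```yaml
resources:
  jobmanager:
    memory: 1.5G    # 1.5 x 10^9 = 1,500,000,000 bytes
  taskmanager:
    memory: 1500m   # Suffix case is ignored; 1500 x 10^6 = 1,500,000,000 bytes
# Without a suffix, the same amount would be written as 1500000000 (bytes).
```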

Note

The CPU resources you configure for your deployments are counted against your licensed resource quota (CPU Quota).

JVM Heap Size

Application Manager will respect the container memory cut-offs specified in flinkConfiguration similar to Apache Flink’s behaviour.

By default, the larger of the following two values is subtracted from the requested memory, and the remainder is set as the JVM heap size:

  • Cutoff Fraction: 25% of the configured memory
  • Minimum cutoff: 600 MiB

For a memory request of 2G, this means that 600 MiB will be cut off from the requested memory and the remainder is used as the JVM heap size (as 25% of 2G, i.e. 500M, is less than 600 MiB).

You can use containerized.heap-cutoff-min (default: 600) and containerized.heap-cutoff-ratio (default: 0.25) in flinkConfiguration to adjust these values. For more details, consult the official Flink documentation on Configuration.
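
To illustrate how the two options interact, the sketch below lowers the minimum cutoff for a TaskManager with 1G of memory. The values are illustrative assumptions, not recommendations. The arithmetic in the comments mixes decimal units (G, M) with MiB in the same way as the text above, so treat the results as approximate.

```yaml
resources:
  taskmanager:
    memory: 1G
flinkConfiguration:
  containerized.heap-cutoff-min: 300      # In MiB (default: 600)
  containerized.heap-cutoff-ratio: 0.25   # Default value, shown for clarity
# Ratio cutoff:   0.25 x 1G = 250M
# Minimum cutoff: 300 MiB (about 315M)
# Applied cutoff: the larger value, 300 MiB, leaving roughly 685M for the JVM heap
```

With the default minimum of 600 MiB (about 629M), the same 1G request would leave only about 371M for the heap, which is why small containers can benefit from a lower minimum.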

Logging

You can customize the default log4j logger configuration by setting log levels for your desired loggers.

logging:
  log4jLoggers:
    "": INFO            # Root log level
    com.company: DEBUG  # Log level of com.company

The default log level for org.apache.flink is INFO.
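
As a sketch, the default level for org.apache.flink could be changed in the same map. The WARN level and the checkpoint logger below are illustrative choices. That a more specific logger name takes precedence over its parent is standard log4j behaviour and is assumed to apply here as well.

```yaml
logging:
  log4jLoggers:
    org.apache.flink: WARN                       # Replaces the INFO default
    org.apache.flink.runtime.checkpoint: DEBUG   # More specific logger (illustrative)
```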

Kubernetes Options

It is possible to specify Kubernetes-specific options for a Deployment. These options will be forwarded to the pods created by Application Manager.

Please check out the Configure Kubernetes page for more details.

Full Example

kind: Deployment
apiVersion: v1
metadata:
  name: TopSpeedWindowing Example
  labels:
    env: testing
spec:
  state: running
  deploymentTargetId: 57b4c290-73ad-11e7-8cf7-a6006ad3dba0
  startFromSavepoint:
    kind: latest
  upgradeStrategy:
    kind: stateless
  template:
    spec:
      artifact:
        kind: jar
        jarUri: http://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming_2.11/1.4.0/flink-examples-streaming_2.11-1.4.0-TopSpeedWindowing.jar
        mainArgs: --windowSize 10 --windowUnit minutes
        entryClass: org.apache.flink.streaming.examples.windowing.TopSpeedWindowing
      parallelism: 1
      numberOfTaskManagers: 2
      resources:
        taskmanager:
          memory: 1.5g
      flinkConfiguration:
        taskmanager.numberOfTaskSlots: 1
        state.savepoints.dir: s3://flink/savepoints
      logging:
        log4jLoggers:
          org.apache.flink.streaming.examples: DEBUG

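You can copy this Deployment resource into a file called deployment and post it via curl. The command uses --data-binary because -d strips newlines when reading from a file, which would break the YAML:

$ curl -H 'Content-Type: application/yaml' -H 'Accept: application/yaml' --data-binary @deployment https://appmanager:8080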

Note

You have to adjust the state.savepoints.dir entry of the flinkConfiguration map to make savepoint-specific features work, e.g. suspending a deployment or triggering a savepoint.