Deployment Templates
Deployment Templates are specified as part of Deployment resources in spec.template. Templates are translated into Job resources when you set the desired state of a Deployment to running.
You can think of the relation between the Deployment spec and the template as follows:
- The template specifies which Flink job is executed and how to execute it, including its configuration.
- The Deployment spec defines how Job instances are managed over time, for instance how to execute upgrades or which Savepoint to restore from.
The template has two parts, metadata and spec.
* The metadata section currently only accepts optional annotations.
Annotations are key/value pairs used to provide additional information or configuration options.
* The spec specifies which artifact to execute and how to execute it.
```yaml
kind: Deployment
spec:
  template:
    metadata:
      ...
    spec:
      ...
```
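For instance, a template that sets an annotation in its metadata section might look as follows. The flink.security.ssl.enabled annotation is taken from the full example at the end of this page; which annotation keys are valid depends on your installation:

```yaml
template:
  metadata:
    annotations:
      flink.security.ssl.enabled: false
  spec:
    ...
```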
At the end of this page, you will find a full example that creates a deployment for a running Flink example job. The following sections will break down each part of the template in more detail.
The following sections are not valid on their own; paste them under the template's spec section, i.e. Deployment.spec.template.spec.
Artifacts
The artifact section of the template specifies which artifact to execute. Currently, there is a single kind of supported artifact, jar.
JAR Artifacts
JAR artifacts must package regular Flink jobs that are executed via the main method of their entry class.
```yaml
artifact:
  kind: jar
  jarUri: https://artifacts/flink-job.jar
  entryClass: com.daplatform.myjob.EntryPoint # Optional, if no Manifest entry
  mainArgs: --arg1=1 --arg2=2 # Optional
```
Application Manager downloads the JAR artifact and submits it to the created Flink cluster for execution; therefore, the container or system running Application Manager needs access to the JAR location.
Note
Check out the Advanced Configuration page if your artifact storage uses TLS and serves a certificate signed by a non-public CA.
Note
Application Manager does not store the JAR artifacts permanently. If you want to be able to go back to earlier versions of your Flink job, ensure that you are versioning the JARs properly, and that they are stored for the time you want to be able to go back.
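One way to keep earlier versions available is to encode the job version in the artifact path, so that each Deployment revision references an immutable JAR. The path below is a hypothetical illustration:

```yaml
artifact:
  kind: jar
  jarUri: https://artifacts/flink-job-1.2.0.jar # hypothetical versioned path; retain old JARs in your artifact storage
```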
Custom Docker Images
By default, containers created for a Deployment use the configured default Flink Docker images. You can override this on a per-Deployment basis as part of the artifact.
```yaml
artifact:
  kind: jar
  jarUri: https://artifacts/flink-job.jar
  flinkVersion: 1.6
  flinkImageRegistry: registry.platform.data-artisans.net
  flinkImageRepository: v1.3/flink
  flinkImageTag: 1.6.0-dap1-scala_2.11
```
Each flink* attribute is optional. If not provided, it will fall back to the configured default of your Application Manager installation.
You can specify the Flink image by digest by prefixing the flinkImageTag attribute with @, for instance flinkImageTag: @sha256:....
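For example, an artifact pinned to an image digest might look as follows. The digest value itself is elided, and it is quoted because YAML does not allow a plain scalar to begin with @:

```yaml
artifact:
  kind: jar
  jarUri: https://artifacts/flink-job.jar
  flinkImageTag: "@sha256:..." # quoted: @ cannot start an unquoted YAML scalar
```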
Flink Version
The flinkVersion attribute specifies which Flink version the artifact is executed with. If you don't specify an explicit version, Application Manager defaults to the latest supported version.
In addition, the provided flinkVersion is used to pick the default Flink image tag if no image tag is specified manually, e.g. 1.6.0-dap1-scala_2.11 for flinkVersion: 1.6.
Note
The specified flinkVersion and the Flink version of the deployed Docker image must match. When upgrading the flinkVersion of a Deployment, make sure to also update the flinkImageTag accordingly.
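For instance, when moving a Deployment to a newer Flink version, both attributes change together. The 1.7 version and tag below are hypothetical; use the tag that matches your installation:

```yaml
artifact:
  kind: jar
  jarUri: https://artifacts/flink-job.jar
  flinkVersion: 1.7                     # hypothetical upgrade target
  flinkImageTag: 1.7.0-dap1-scala_2.11  # hypothetical tag matching the new version
```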
Flink Configuration (flink-conf.yaml)
The Flink configuration flink-conf.yaml passed to a created Flink cluster is generated from the flinkConfiguration map. Each map entry is added to the Flink configuration as is.
```yaml
flinkConfiguration:
  key1: value1
  key2: value2
```
Flink options that are required for Application Manager to function correctly (such as certain ports) cannot be overwritten and will be added automatically.
Note
You must adjust the state.savepoints.dir entry of the flinkConfiguration map in order to enable savepoint-specific features, e.g. suspending a Deployment or triggering a savepoint.
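For example, the following entry points savepoints at an S3 bucket (the s3://flink/savepoints path is illustrative; substitute a storage location your cluster can reach):

```yaml
flinkConfiguration:
  state.savepoints.dir: s3://flink/savepoints # illustrative path; use your own storage location
```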
Parallelism, Number of TaskManagers, and Slots
You can specify the parallelism of your jobs via the parallelism key. By default, as many TaskManager instances are created as the specified parallelism. You can override this behavior via the numberOfTaskManagers key.
```yaml
parallelism: 1
numberOfTaskManagers: 1 # Optional, defaults to parallelism
```
Each TaskManager uses Flink's default setting for the number of task slots (currently a single slot per TaskManager). Combining parallelism and numberOfTaskManagers with Flink's taskmanager.numberOfTaskSlots option gives you full flexibility in how to execute your jobs.
In the following snippet, we specify two TaskManager instances with 4 slots each and a parallelism of 8 for our jobs.
```yaml
parallelism: 8
numberOfTaskManagers: 2
flinkConfiguration:
  taskmanager.numberOfTaskSlots: 4
```
For more details on Flink task slots, consult the official Flink documentation.
Compute Resources
You can configure the requested CPU and memory compute resources via the resources key:
```yaml
resources:
  jobmanager:
    cpu: 1
    memory: 1G
  taskmanager:
    cpu: 1
    memory: 2G
```
The keys jobmanager and taskmanager configure the respective Flink components. The values shown above are the defaults; you can override each value selectively.
Note that resources are configured per component instance, e.g. if you use 10 TaskManager instances, Application Manager will in total try to acquire 10 times the configured taskmanager resources.
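Because values can be overridden selectively, you can, for example, raise only the TaskManager memory and keep the remaining defaults (4G is an arbitrary illustrative value):

```yaml
resources:
  taskmanager:
    memory: 4G # all other values (taskmanager cpu, jobmanager cpu and memory) keep their defaults
```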
Memory
You can specify memory as a decimal number with an optional memory unit as indicated by the suffix (ignoring case):
- Gigabytes (G), e.g. 1G or 1.5G
- Megabytes (M), e.g. 1000M or 1500M
If no unit is specified, the provided number is interpreted as bytes. Each memory unit is interpreted as a power of ten, e.g. 1K equals 1000 bytes.
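Since units are decimal powers of ten, the following specifications request the same amount of memory:

```yaml
resources:
  taskmanager:
    memory: 1.5G         # 1.5 * 10^9 bytes
    # memory: 1500M      # same amount expressed in megabytes
    # memory: 1500000000 # same amount as a plain byte count
```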
Note
The CPU resources you configure for your deployments are counted against your licensed resource quota (CPU Quota).
JVM Heap Size
Application Manager respects the container memory cut-offs specified in flinkConfiguration, similar to Apache Flink's behavior.
The following table lists the default values of the minimum memory cutoff and the cutoff ratio for JobManager and TaskManager instances, respectively. Additionally, it shows the configuration keys used to adjust these values.
|             | Minimum cutoff | Cutoff ratio |
|---|---|---|
| JobManager  | 400 M (jobmanager.heap-cutoff-min: 400) | 25 % (jobmanager.heap-cutoff-ratio: 0.25) |
| TaskManager | 600 M (containerized.heap-cutoff-min: 600) | 25 % (containerized.heap-cutoff-ratio: 0.25) |
As an example, take a memory request of 2G for a TaskManager. This means that 600 M will be cut off from the configured heap (as 25% of 2G is less than 600 M).
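The cutoffs can be tuned through the same flinkConfiguration map using the keys from the table above. The sketch below lowers the TaskManager cutoff; the 300 M minimum and 0.15 ratio are arbitrary illustrative values:

```yaml
flinkConfiguration:
  containerized.heap-cutoff-min: 300    # hypothetical: lower the minimum cutoff to 300 M
  containerized.heap-cutoff-ratio: 0.15 # hypothetical: reserve 15% instead of the default 25%
```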
Logging
You can customize the default log4j logger configuration by setting log levels for your desired loggers.
```yaml
logging:
  log4jLoggers:
    "": INFO # Root log level
    com.company: DEBUG # Log level of com.company
```
The default log level for org.apache.flink is INFO.
Kubernetes Options
It is possible to specify Kubernetes-specific options for a Deployment. These options are forwarded to the pods created by Application Manager.
Please check out the Configure Kubernetes page for more details.
Full Example
```yaml
kind: Deployment
apiVersion: v1
metadata:
  name: TopSpeedWindowing Example
  labels:
    env: testing
spec:
  state: running
  deploymentTargetId: 57b4c290-73ad-11e7-8cf7-a6006ad3dba0
  restoreStrategy:
    kind: latest_savepoint
  upgradeStrategy:
    kind: stateless
  template:
    metadata:
      annotations:
        flink.security.ssl.enabled: false
    spec:
      artifact:
        kind: jar
        jarUri: http://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming_2.11/1.6.0/flink-examples-streaming_2.11-1.6.0-TopSpeedWindowing.jar
        mainArgs: --windowSize 10 --windowUnit minutes
        entryClass: org.apache.flink.streaming.examples.windowing.TopSpeedWindowing
      parallelism: 1
      numberOfTaskManagers: 2
      resources:
        taskmanager:
          memory: 1.5G
      flinkConfiguration:
        taskmanager.numberOfTaskSlots: 1
        state.savepoints.dir: s3://flink/savepoints
      logging:
        log4jLoggers:
          org.apache.flink.streaming.examples: DEBUG
```
You can copy-paste this Deployment resource into a file called deployment and post it via curl. Note the use of --data-binary, which preserves the newlines YAML requires; plain -d would strip them:

```shell
$ curl \
    -H 'Content-Type: application/yaml' \
    -H 'Accept: application/yaml' \
    --data-binary @deployment https://appmanager:8080
```