Getting Started

In this getting started guide, you will install Ververica Platform, integrate it with MinIO for Universal Blob Storage, and deploy your first Apache Flink® application using Ververica Platform.

Setting the Stage

Kubernetes

Ververica Platform runs on top of Kubernetes. To get started locally, we recommend using minikube, but any other Kubernetes cluster (1.11+) will work as well.

Minikube relies on virtualization support in your operating system as well as a hypervisor (e.g. VirtualBox). Please check the official installation guide for details.

Minikube on Mac OS (homebrew)

$ brew install kubectl minikube

Minikube on Windows (Chocolatey)

$ choco install kubernetes-cli minikube

Minikube on Linux

There are packages available for most distros and package managers. Please check the kubectl installation guide as well as the minikube installation guide for details.

Spinning up a Kubernetes Cluster

First, start minikube. The platform (including a small Apache Flink® application) requires at least 8G of memory and 4 CPUs.

$ minikube start --memory=8G --cpus=4

If this went well, continue by checking that all system pods are ready.

$ kubectl get pods -n kube-system

Depending on your exact version of minikube, the output should look similar to:

NAME                               READY   STATUS    RESTARTS  AGE
coredns-5644d7b6d9-56zhg           1/1     Running   1         2m
coredns-5644d7b6d9-fdnts           1/1     Running   1         2m
etcd-minikube                      1/1     Running   1         2m
kube-addon-manager-minikube        1/1     Running   1         2m
kube-apiserver-minikube            1/1     Running   1         2m
kube-controller-manager-minikube   1/1     Running   1         2m
kube-proxy-9w92r                   1/1     Running   1         2m
kube-scheduler-minikube            1/1     Running   1         2m
storage-provisioner                1/1     Running   2         2m

If all pods are running, you are good to go.
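
If you prefer not to poll manually, kubectl can also block until every system pod reports Ready (this assumes your kubectl is recent enough to support kubectl wait):

$ kubectl --namespace kube-system wait --for=condition=Ready pods --all --timeout=300s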

Helm

“Helm helps you manage Kubernetes applications — Helm Charts help you define, install, and upgrade even the most complex Kubernetes application.” - helm.sh

We distribute Ververica Platform as a Helm Chart. To install Helm, please follow the instructions in the official installation guide or use one of the one-liners below.

Helm on Mac OS (homebrew)

$ brew install helm

Helm on Windows (Chocolatey)

$ choco install kubernetes-helm

Helm on Linux

As before, there is a package available for most distros and package managers. For details check the official installation guide.
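
Once Helm is installed, you can check which major version you have, since the Tiller step below only applies to Helm 2. Helm 3 prints a single version line, whereas Helm 2 reports separate client and server (Tiller) versions (the server part will show an error until Tiller is set up):

$ helm version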

Setting Up Tiller

Note

Tiller is not required for Helm 3. If you are using Helm 3, you can skip this step.

Helm 2 requires a server-side component called Tiller. The commands below set up Tiller in the kube-system namespace with the required permissions to install Helm charts in this Kubernetes cluster.

# create service account
$ kubectl --namespace kube-system create serviceaccount tiller

# bind service account to "cluster-admin" role
$ kubectl create clusterrolebinding tiller \
    --clusterrole cluster-admin \
    --serviceaccount=kube-system:tiller

# initialize helm with previously created service account
$ helm init --service-account tiller

Please wait until the Tiller pod becomes ready before proceeding. The command helm list should return an empty list without any errors.
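
If you want a single command that blocks until Tiller is ready, you can watch the rollout of the tiller-deploy Deployment that helm init creates in kube-system:

$ kubectl --namespace kube-system rollout status deployment/tiller-deploy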

Setting Up the Playground

This guide is based on the Ververica Platform playground repository which contains scripts and Helm values files to make for a smooth getting-started experience. Please clone the repository before continuing; all commands below are meant to be executed from the repository root directory.

$ git clone --branch release-2.2 https://github.com/ververica/ververica-platform-playground.git
$ cd ververica-platform-playground

Anatomy of this Playground

For this playground, you will create two Kubernetes namespaces: vvp and vvp-jobs. vvp will host the control plane of Ververica Platform and other services, while the Apache Flink® jobs managed by the platform will run in the vvp-jobs namespace.

In addition to Ververica Platform, we will set up MinIO in the vvp namespace, which will be used for artifact storage and Apache Flink® checkpoints & savepoints (see Universal Blob Storage).
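
In values-vvp.yaml, this wiring essentially points the platform's blob storage at a bucket in MinIO. A rough sketch of the relevant section (illustrative only; the bucket name, endpoint, and exact keys shown here are assumptions, and the file shipped with the playground repository is authoritative):

vvp:
  blobStorage:
    # Bucket used for artifacts, checkpoints, and savepoints (assumed name)
    baseUri: s3://vvp
    s3:
      # In-cluster MinIO endpoint (assumed service name and port)
      endpoint: http://minio.vvp.svc:9000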

Artifacts

Installing the Components

TL;DR

You can skip all of the installation steps outlined below by running one of the following commands, depending on the edition you want to install:

# Community Edition
$ ./setup.sh --edition community

# Stream Edition
$ ./setup.sh --edition enterprise

Kubernetes Namespaces

Before installing any of the components, you need to create the Kubernetes namespaces vvp and vvp-jobs.

$ kubectl create namespace vvp
$ kubectl create namespace vvp-jobs

MinIO

Install MinIO with Helm, using the official Helm chart from the stable repository. The commands differ slightly between Helm 3 and Helm 2.

With Helm 3, first add the stable Helm repository if you have never done so:

$ helm repo add stable https://kubernetes-charts.storage.googleapis.com

Then install MinIO with:

$ helm --namespace vvp \
    install minio stable/minio \
    --values values-minio.yaml

With Helm 2, the stable repository is already preconfigured, so you can install MinIO directly:

$ helm install stable/minio \
    --name minio \
    --namespace vvp \
    --values values-minio.yaml
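
To verify the installation, you can check that the MinIO pod is running and, optionally, port-forward the MinIO service to browse it in your browser. The label selector and service name below assume the chart's defaults for a release named minio:

$ kubectl --namespace vvp get pods -l app=minio
$ kubectl --namespace vvp port-forward services/minio 9000:9000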

Ververica Platform

Next, install Ververica Platform using Helm. The required configuration differs slightly depending on the product edition you would like to install.

Ververica Platform Community Edition

With Helm 3:

$ helm repo add ververica https://charts.ververica.com
$ helm --namespace vvp \
    install vvp ververica/ververica-platform \
    --values values-vvp.yaml

With Helm 2:

$ helm repo add ververica https://charts.ververica.com
$ helm install ververica/ververica-platform \
    --name vvp \
    --namespace vvp \
    --values values-vvp.yaml

When running the command above, you will be asked to accept the Ververica Platform Community Edition license agreement. Please read it carefully and accept it by setting acceptCommunityEditionLicense to true.

With Helm 3:

$ helm --namespace vvp \
    install vvp ververica/ververica-platform \
    --values values-vvp.yaml \
    --set acceptCommunityEditionLicense=true

With Helm 2:

$ helm install ververica/ververica-platform \
    --name vvp \
    --namespace vvp \
    --values values-vvp.yaml \
    --set acceptCommunityEditionLicense=true

Ververica Platform Stream Edition

Before you can run Ververica Platform Stream Edition, you must add your license to a values file values-license.yaml under vvp.license.data. If you do not have a license yet, you can request a 30-day free trial license from the Ververica website.

The values-license.yaml file should look similar to:

### Provide Ververica Platform License (free trial: ververica.com/enterprise-trial)
vvp:
  license:
    data: {
      "kind": "License",
      "apiVersion": "v1",
      "metadata": {
        "id": "53b8cf22-1af2-44bd-a7ba-7420418f6572",
        "createdAt": "2020-02-21T12:56:52.407899Z",
        "annotations": {
          "signature": "<omitted>",
          "licenseSpec": "ewogICJsaWNlbnNlSWQiIDogIjUzYjhjZjIyLTFhZjItNDRiZC1hN2JhLTc0MjA0MThmNjU3MiIsCiAgImxpY2Vuc2VkVG8iIDogInRlc3QiLAogICJleHBpcmVzIiA6ICIyMDIwLTAzLTIyVDEyOjU2OjUxLjg3MzU1M1oiLAogICJwYXJhbXMiIDogewogICAgInF1b3RhLnR5cGUiIDogIlVOTElNSVRFRCIsCiAgICAidHJpYWwiIDogInRydWUiCiAgfQp9"
        }
      },
      "spec": {
        "licenseId": "53b8cf22-1af2-44bd-a7ba-7420418f6572",
        "licensedTo": "My Company Inc.",
        "expires": "2020-03-22T12:56:51.873553Z",
        "params": {
          "quota.type": "UNLIMITED",
          "trial": "true"
        }
      }
    }
Then install Ververica Platform, passing both values files.

With Helm 3:

$ helm repo add ververica https://charts.ververica.com
$ helm install vvp ververica/ververica-platform \
  --namespace vvp \
  --values values-vvp.yaml \
  --values values-license.yaml

With Helm 2:

$ helm repo add ververica https://charts.ververica.com
$ helm install ververica/ververica-platform \
  --name vvp \
  --namespace vvp \
  --values values-vvp.yaml \
  --values values-license.yaml

To access the web user interface or the REST API, set up a port forward to the Ververica Platform Kubernetes service:

$ kubectl --namespace vvp port-forward services/vvp-ververica-platform 8080:80

The web interface and the REST API are now both available at localhost:8080. The UI will show that you do not have any Deployments yet.
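
As a quick sanity check, you can also query the REST API through the same port forward. Since no Deployments exist yet, the returned list should be empty:

$ curl localhost:8080/api/v1/namespaces/default/deployments \
    -H "Accept: application/yaml"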

No Deployments Yet

Creating your First Deployment

Deployments are the core resource to manage Apache Flink® jobs within Ververica Platform. A Deployment specifies the desired state of an application and its configuration. At the same time, Ververica Platform tracks and reports each Deployment’s status and derives other resources from it. Whenever the Deployment specification is modified, Ververica Platform ensures that the running application eventually reflects this change.

Before you create your first Deployment, you need to create a Deployment Target. A Deployment Target links a Deployment to the Kubernetes namespace into which your Flink applications will be deployed. In this case, you can use the vvp-jobs namespace that you created earlier.

Choose Deployment Targets in the left side-bar and click Add Deployment Target. Give it the name vvp-jobs and point it to the vvp-jobs namespace.

Adding Deployment Target

Alternatively, POST vvp-resources/deployment_target.yaml to the Ververica Platform REST API to create the Deployment Target:

$ curl localhost:8080/api/v1/namespaces/default/deployment-targets \
    -H "Content-Type: application/yaml" \
    -H "Accept: application/yaml" \
    --data-binary @vvp-resources/deployment_target.yaml
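
For reference, the resource in vvp-resources/deployment_target.yaml boils down to roughly the following (a sketch; the file in the playground repository is authoritative):

kind: DeploymentTarget
apiVersion: v1
metadata:
  name: vvp-jobs
spec:
  kubernetes:
    # Kubernetes namespace the Flink jobs will be deployed into
    namespace: vvp-jobs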

Now, you can create your first Deployment.

Choose + Create Deployment in the left sidebar. For your first Deployment, we recommend using the Standard view.

Standard Form
  • Name: provide a name such as Top Speed Windowing

  • Deployment Target: select the Deployment Target that you just created.

  • Parallelism: set the parallelism to 1

  • Jar URI: provide a URI to the JAR containing your Flink program. If you do not have an artifact at hand, you can use

    https://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming_2.12/1.11.2/flink-examples-streaming_2.12-1.11.2-TopSpeedWindowing.jar.

    To provide your own JAR, you can switch to the Artifacts screen and upload your artifact directly to the platform, or point Ververica Platform to an externally stored artifact via http(s), as in the example above.

    Artifacts

Finally, click Create Deployment to start your Apache Flink® application.

Alternatively, to create the Deployment via the REST API, replace the value of spec.deploymentTargetId in vvp-resources/deployment.yaml with the value of the metadata.id field of the Deployment Target you just created. You can use the following GET request to list all Deployment Targets:

$ curl localhost:8080/api/v1/namespaces/default/deployment-targets \
    -H "Accept: application/yaml"

Afterwards, you can POST vvp-resources/deployment.yaml to the Ververica Platform REST API to create the Deployment:

$ curl localhost:8080/api/v1/namespaces/default/deployments \
    -H "Content-Type: application/yaml" \
    -H "Accept: application/yaml" \
    --data-binary @vvp-resources/deployment.yaml

Ververica Platform will now create a highly available Flink cluster that runs your Flink application in the vvp-jobs namespace. Checkpointing, savepoints, and Flink master failover have been configured automatically by the platform.

Once the Deployment has reached the RUNNING state, you can also check out the Flink UI.

Deployment Overview

One of the core features of Ververica Platform is application lifecycle management for stateful stream processing applications. As part of this, Ververica Platform takes care of migrating your distributed state consistently when you make changes to your application. For example, you can change your Deployment by changing the parallelism, i.e. rescaling your Flink job.

In the Deployment overview page, click Configure Deployment, change the parallelism to 2, and click Save Changes.

Standard Form

Alternatively, via the REST API, change the value of spec.template.parallelism in vvp-resources/deployment.yaml to 2. Then PATCH the existing Deployment with the changed resource. For this, you need the metadata.id of your Deployment. You can use the following GET request to list all Deployments:

$ curl localhost:8080/api/v1/namespaces/default/deployments \
    -H "Accept: application/yaml"

Afterwards, you can PATCH your Deployment with the modified version of vvp-resources/deployment.yaml to scale up the Deployment.

$ curl localhost:8080/api/v1/namespaces/default/deployments/{deploymentId} \
    -X PATCH \
    -H "Content-Type: application/yaml" \
    -H "Accept: application/yaml" \
    --data-binary @vvp-resources/deployment.yaml

Under the hood, Ververica Platform now performs an application upgrade according to the configured Upgrade and Restore Strategies. For your Deployment, these default to STATEFUL and LATEST_STATE, so Ververica Platform triggers a graceful shutdown of your Flink application while taking a consistent snapshot of its state via a savepoint, and then restarts the application from the latest snapshot, which is the one taken during shutdown. You can see a list of all past savepoints and retained checkpoints for this Deployment in the Snapshots tab.
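
For orientation, these strategies live in the Deployment specification roughly as follows (a sketch; see the Deployment reference for the full schema and further options):

spec:
  upgradeStrategy:
    kind: STATEFUL
  restoreStrategy:
    kind: LATEST_STATE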

Logging and Metrics Integrations

Ververica Platform can be integrated with logging and metrics collection and querying/visualization systems to help monitor and debug your Flink applications.

The setup.sh script included in the playground repository accepts flags --with-logging and --with-metrics that enable additional demo components for logging and metrics respectively.

Note

The --with-logging and --with-metrics flags can be used separately or together, and can be applied after the initial installation simply by running setup.sh again.

  • --with-logging installs Elasticsearch, Fluentd, and Kibana to collect, index, and provide a web interface for querying Flink application logs
  • --with-metrics installs Prometheus, a metrics collection and storage system, and Grafana, a time series visualization web application

This setup uses Global Deployment Defaults to ensure each Flink job is configured to use the built-in Prometheus metrics reporter and that each Kubernetes pod running Flink gets an annotation that makes it discoverable by the Prometheus server.
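
Conceptually, these defaults amount to a Flink metrics reporter setting plus a scrape annotation on the Flink pods, roughly as sketched below. This is illustrative only: the exact nesting is defined by values-vvp.yaml in the playground, and 9249 is Flink's default port for the Prometheus reporter.

flinkConfiguration:
  # Flink's built-in Prometheus metrics reporter
  metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
kubernetes:
  pods:
    annotations:
      # Standard annotations picked up by the Prometheus Kubernetes service discovery
      prometheus.io/scrape: "true"
      prometheus.io/port: "9249"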

Metrics

After installing or upgrading the platform using ./setup.sh --with-metrics, run the following command to port-forward Grafana:

$ kubectl --namespace vvp port-forward services/grafana 3000:80

Then, when viewing one of your Deployments in the web UI, click the Metrics button to be linked to a sample monitoring dashboard in Grafana for that Deployment. It may take a few minutes for metrics to appear.

To understand this setup, check out the following files:

  • values-prometheus.yaml: Configuration for the Prometheus Helm chart. This example uses the default configuration except for disabling components we don’t need.
  • values-grafana.yaml: Configuration for the Grafana Helm chart. A datasource and dashboard are preconfigured, and auth is disabled to make for a convenient demonstration.
  • values-vvp.yaml: See the globalDeploymentDefaults section for how the Prometheus metrics reporter and pod annotations are automatically configured for all Deployments.
  • values-vvp-add-metrics.yaml: This enables the Metrics button on a Deployment or Job in the web UI that links to Grafana.

Logging

After installing or upgrading the platform using ./setup.sh --with-logging, run the following command to port-forward Kibana:

$ kubectl --namespace vvp port-forward services/kibana 5601:5601

Then, when viewing one of your Deployments in the web UI, click the Logs button to be linked to Kibana with a pre-filled query to only show logs from that Deployment.

To understand this setup, check out the following files:

  • values-{elasticsearch,fluentd,kibana}.yaml: Configuration for the Elasticsearch, Fluentd, and Kibana Helm charts.
  • values-vvp-add-logging.yaml: This enables the Logs button on a Deployment or Job in the web UI that links to Kibana.

Next Steps

Now that you have a Ververica Platform instance and your first Deployment up and running, there are multiple areas you can look into to learn more about the platform.

Cleaning Up

Run the script ./teardown.sh to remove all applications installed with Helm in this tutorial and to delete the namespaces created in the first step.

Alternatively, do this manually with the following commands. The helm delete --purge syntax is for Helm 2; with Helm 3, use helm --namespace vvp uninstall <release> instead.

$ helm delete --purge vvp
$ helm delete --purge minio

# If installed with --with-metrics
$ helm delete --purge prometheus
$ helm delete --purge grafana

# If installed with --with-logging
$ helm delete --purge kibana
$ helm delete --purge fluentd
$ helm delete --purge elasticsearch

$ kubectl delete namespace vvp vvp-jobs