Ververica Unified Streaming Data Platform Product Updates
Product Updates are organized by release date and let you know what’s new, what’s improved, and what’s fixed in that release.
Every entry within a Product Update follows a consistent structure:
- What's in this update - a self-explanatory heading that describes the individual update.
- What it means to you - a description that explains the impact of the changes and why it matters.
- Where it lives - identifies what part or component of the tech layer the change applies to.
- Where you can use it - identifies the supported deployments to make it easy to know if the feature is available in your environment.
- Where to learn more - links to relevant documentation, examples, or tutorials.
This structure helps you quickly see what changed, why it matters, and where it applies, without needing to understand internal versioning or engineering details. For detailed engineering information, refer to the ChangeLogs.
Release: November 7, 2025
Compute Engine: VERA 4.1
Built on: Apache Flink® 1.20
Apache Flink 1.20 provides the underlying execution environment for both streaming and batch jobs, handling scheduling, state management, fault tolerance, and distributed processing. On top of this, the VERA 4.1 Compute Engine — Ververica’s proprietary distribution and management layer for Flink — simplifies operations and adds management tools, connectors, and integrations, while Flink continues to perform the actual computation.
Overview
In this release, the Ververica Unified Streaming Data Platform delivers new capabilities and major features across batch and streaming workloads.
Key updates focus on features that impact users directly, including:
- AI-powered integrations that deliver real-time insights with LLMs and Retrieval-Augmented Generation (RAG) on streaming data.
- Resource management improvements that allow fine-grained scheduling, autopilot tuning, and dynamic parameter updates for efficient, reliable workloads.
- YAML CDC capabilities that let you define CDC jobs declaratively in YAML configuration, simplifying job management and making jobs human readable and understandable to anyone on the team (engineers, data analysts, DevOps).
New Features
This release introduces capabilities that directly impact how you work with the Ververica Unified Streaming Data Platform, improving efficiency, performance, and usability across batch and streaming workloads.
Summary
| Feature | Description | Self-Managed (On-Prem) | BYOC | Managed Cloud |
|---|---|---|---|---|
| Batch Mode | Unified execution for batch workloads | ❌ | ✅ | ✅ |
| Real-Time RAG & AI | Real-time AI and LLM integration | ✅ | ✅ | ✅ |
| YAML CDC | Declarative CDC configuration | ✅ | ✅ | ✅ |
| Autopilot | Intelligent and scheduled job tuning | ✅ | ✅ | ✅ |
| Dynamic Parameter Update | Hot updates without restart | ✅ | ✅ | ✅ |
| Custom Partitioner | Advanced dimension-table joins | ✅ | ✅ | ✅ |
| Named Parameters in UDFs | Flexible SQL function calls | ✅ | ✅ | ✅ |
| New Connectors & Catalogs | New and enhanced connectors (Oracle, PostgreSQL, Milvus, Fluss, Paimon) and catalog integrations for seamless data access and schema management | ✅ | ✅ | ✅ |
| Resource Queue Management | Stability, fairness, and optimized resource utilization across mixed batch and streaming workloads running under the same workspace | ❌ | ✅ | ✅ |
Run Large-Scale Batch Jobs
You can now run large-scale, finite batch jobs efficiently using SQL, Python, or JARs, all within a unified Flink execution environment. Batch mode integrates seamlessly with catalogs for consistent schema and metadata management.
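For example, a historical backfill can be expressed as plain Flink SQL executed in batch mode. In this minimal sketch, the orders and daily_revenue tables are hypothetical; only the runtime-mode setting is batch-specific.

```sql
-- Switch the session to batch execution over bounded input.
SET 'execution.runtime-mode' = 'batch';

-- Backfill a daily aggregate from a bounded source table.
INSERT INTO daily_revenue
SELECT order_date, SUM(amount) AS revenue
FROM orders
GROUP BY order_date;
```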
Benefits:
- Supports end-to-end orchestration through resource queues, ensuring smooth coordination across complex data pipelines
- Reduces operational overhead
- Provides consistent schema and metadata handling
- Is ideal for analytics, historical backfills, and hybrid workloads that demand both performance and reliability
Supported Deployments: BYOC, Managed Cloud
Learn More: Batch Mode
Integrate with OpenAI Models
The Ververica Unified Streaming Data Platform integrates seamlessly with OpenAI models to bring advanced language intelligence into your data pipelines. It powers real-time Retrieval-Augmented Generation (RAG) directly on streaming data, enabling dynamic, context-aware insights as events happen, and lets you build intelligent, adaptive pipelines that combine live data processing with AI-driven inference for smarter, faster decision-making.
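As a rough sketch of what this enables, the example below assumes a CREATE MODEL / ML_PREDICT-style SQL surface for remote model inference; the exact DDL, options, and function names may differ, so treat it as illustrative and check the linked documentation. The support_tickets table and category output column are hypothetical.

```sql
-- Register a remote OpenAI model as a SQL-addressable resource
-- (illustrative options; consult the platform docs for the exact DDL).
CREATE MODEL ticket_classifier WITH (
  'provider' = 'openai',
  'endpoint' = 'https://api.openai.com/v1/chat/completions',
  'model' = 'gpt-4o-mini'
);

-- Score each streaming event with the model via a table-valued function.
SELECT ticket_id, category
FROM ML_PREDICT(
  TABLE support_tickets,
  MODEL ticket_classifier,
  DESCRIPTOR(ticket_text));
```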
Benefits:
- Performs data processing, model inference, and information retrieval within a single unified platform, significantly enhancing efficiency and simplifying workflows
- Supports calling remote AI models for text inference and real-time RAG embedding construction within Flink SQL, without additional programming or configuration
Supported Deployments: BYOC, Managed Cloud, Self-Managed (On-Prem)
Learn More: Real-Time AI and RAG with Ververica
Manage CDC Jobs
You can declaratively define CDC jobs with YAML configuration, so anyone on the team (engineers, data analysts, DevOps) can understand or update a job without needing to be a Flink expert. The feature reduces misconfigurations, saves time when scaling CDC across many tables or tenants, and encourages clean architecture and a clear division of responsibilities.
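As an illustration, a pipeline definition in the Flink CDC YAML style looks roughly like the sketch below; hostnames, credentials, and table patterns are placeholders, and the available source and sink types depend on your environment.

```yaml
# Pipeline sketch: sync all tables matching app_db.* from MySQL into Paimon.
source:
  type: mysql
  hostname: mysql-host
  port: 3306
  username: flink_cdc
  password: ${MYSQL_PASSWORD}
  tables: app_db.\.*

sink:
  type: paimon
  catalog.properties.warehouse: s3://my-bucket/paimon

pipeline:
  name: app_db-to-paimon
  parallelism: 2
```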
Benefits:
- Simplifies job management (human-readable, standardized)
- Reuses configuration across environments
- Allows Git-based version control and CI/CD integration
- Isolates environments with templating (e.g., Helm)
- Supports validation and linting tools
- Enables GitOps workflows for CDC jobs
Supported Deployments: BYOC, Managed Cloud, Self-Managed (On-Prem)
Learn More: Data Ingestion
Autopilot 2.0: Automatic Resource Tuning
Autopilot is an automatic tuning feature that simplifies resource adjustment to achieve optimal utilization. You can select the appropriate tuning mode:
- Intelligent Tuning: the system automatically reduces resource allocation when usage is low and increases it when usage rises above a defined threshold.
- Scheduled Tuning: resources are allocated based on specific time periods (e.g., peak/off-peak), with support for multiple sets of time-resource mappings.
Benefits:
- Optimizes job throughput and latency
- Improves resource utilization
- Provides global tuning for performance and cost efficiency
Supported Deployments: BYOC, Managed Cloud, Self-Managed (On-Prem)
Learn More: Tune Performance
Update Parameters Dynamically (Hot Update)
Dynamically updating the parameter configuration of an Apache Flink deployment lets changes take effect quickly without a full restart. This reduces business downtime caused by deployment startup and cancellation, and facilitates dynamic scaling of TaskManagers as well as checkpoint-based troubleshooting.
Benefits:
- Reduces downtime
- Simplifies job maintenance
Supported Deployments: BYOC, Managed Cloud, Self-Managed (On-Prem)
Learn More: Dynamic parameter updates
Customize Cache Policies
Most connectors allow you to specify the cache policy for JOIN operations on dimension tables (lookup joins). Different connectors support different cache policies; see the related connector documentation for details.
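For example, with the JDBC connector the cache policy is expressed as table options on the dimension table. This is a minimal sketch: option names vary across connectors and versions, and the orders and customers tables are hypothetical.

```sql
-- Dimension table with a bounded, time-limited lookup cache.
CREATE TABLE customers (
  customer_id BIGINT,
  tier STRING
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:postgresql://db-host:5432/crm',
  'table-name' = 'customers',
  'lookup.cache.max-rows' = '10000',
  'lookup.cache.ttl' = '10min'
);

-- Lookup join: each order probes the cached dimension table.
-- (o.proc_time is a processing-time attribute defined on orders.)
SELECT o.order_id, c.tier
FROM orders AS o
JOIN customers FOR SYSTEM_TIME AS OF o.proc_time AS c
  ON o.customer_id = c.customer_id;
```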
Benefits: Optimizes lookup joins on small or frequently queried dimension tables.
Supported Deployments: BYOC, Managed Cloud, Self-Managed (On-Prem)
Learn More: Dimension Table Joins
Use Named Parameters in UDFs
You can use named arguments when calling SQL UDFs. By default, parameter values must be passed to a function in the exact order in which the parameters are defined, and optional parameters cannot be omitted. When a function has many parameters, maintaining the correct order can be difficult. To improve usability and avoid errors, you can use named parameters to pass arguments in any order and specify only the parameters you need.
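For instance, given a hypothetical masking UDF with optional parameters, a named-argument call uses name => value syntax, so arguments can appear in any order and optional ones can be skipped entirely:

```sql
-- Positional call: every argument must be supplied, in declaration order.
SELECT mask_card(card_number, '*', 4) FROM payments;

-- Named call: arguments in any order; the optional mask character is omitted.
SELECT mask_card(visible_digits => 4, input => card_number) FROM payments;
```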
Benefits:
- Improves readability and safety
- Simplifies development and code maintenance
Supported Deployments: BYOC, Managed Cloud, Self-Managed (On-Prem)
Learn More: Named Parameters in User-Defined Functions (UDFs)
Manage Resources Across Mixed Workloads
This release addresses performance and resource allocation challenges in streaming lakehouse scenarios with a fine-grained resource scheduling system. This feature ensures stability, fairness, and optimized resource utilization across mixed batch and streaming workloads running under the same workspace.
Benefits:
- Resource Isolation. Separate tasks and users into distinct queues to prevent interference, avoiding performance drops or failures caused by competing workloads.
- Resource Limitation. Define resource quotas for queues to ensure fair sharing and prevent single workloads from consuming excessive compute or memory.
- Resource Scheduling. Automated scheduling of task execution and allocation based on queue priorities and workload demand.
Supported Deployments: BYOC, Managed Cloud
Learn More: Resource Allocation
Connectors & Catalogs
This release expands ecosystem connectivity with new and enhanced connectors and catalogs, making it easier to integrate Flink jobs with enterprise databases, streaming systems, and AI workloads.
New Integrations
- Oracle Connector & Catalog: Integrated Oracle Database support for both catalog and connector components. The Oracle connector enables reading from and writing to Oracle databases in both streaming and batch modes. Built as an extension of the JDBC connector, it inherits key configuration options and capabilities while adding Oracle-specific optimizations and encryption support.
- Milvus Connector: Added a new connector for Milvus, enabling Flink SQL jobs to store vector and scalar data with upsert support for AI and similarity search workloads.
- Fluss Connector & Catalog: Introduced Fluss integration for Apache Flink. The Fluss Catalog provides SQL-level support with a full mapping between Fluss and Flink tables, enabling seamless metadata management. It supports both source and sink functionality with ordered, durable event streams, configurable parallelism, offset management, and strong delivery guarantees.
- Protobuf Format: Introduced support for the Protobuf serialization format, as requested by enterprise customers.
Updates / Enhancements
- Kafka Catalog: Added integration with Confluent Schema Registry and AWS Glue Schema Registry for managing and validating Avro schemas. These registries ensure producers and consumers share consistent data structures, support schema evolution, and enable compatibility checks. Flink now supports direct interaction with both registries through the Kafka Schema Registry Catalog, allowing schema and subject metadata management from Flink SQL.
- PostgreSQL Connector & Catalog: Bundled the PostgreSQL JDBC connector and driver, enabling Flink SQL jobs on Ververica to connect to PostgreSQL databases. Supports three access modes: the Postgres connector (for reading, writing, and lookups using simple JDBC-style configs), the Postgres catalog (for instant, zero-DDL access to existing tables), and CDC (for continuous streaming). No extra JARs are needed; Ververica includes the required drivers and supports key features like parallel reads, sink buffering, and lookup caching. A minimal catalog sketch follows this list.
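The sketch below shows the zero-DDL catalog mode in the upstream Flink JDBC catalog style; Ververica's Postgres catalog options may differ slightly, and the host, database, and credentials are placeholders.

```sql
-- Register a catalog over an existing PostgreSQL database.
CREATE CATALOG pg WITH (
  'type' = 'jdbc',
  'base-url' = 'jdbc:postgresql://pg-host:5432',
  'default-database' = 'shop',
  'username' = 'flink',
  'password' = '${PG_PASSWORD}'
);

-- Query an existing Postgres table immediately; no CREATE TABLE needed.
-- (Tables in the default public schema can be referenced by name alone.)
SELECT * FROM pg.shop.orders;
```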
Learn More:
Security Fixes
This release addresses several critical CVEs and removes vulnerable dependencies to enhance platform security.
- Denial of Service via poisoned logback receiver data.
- Deserialization of untrusted data may allow remote code execution.
- Deserialization vulnerability in Jackson libraries.
- Stack-based buffer overflow due to deeply nested JSON input.
- DoS from excessive memory usage during decompression.
- Updated dependencies to reduce the overall dependency surface.
Bug Fixes
This release resolves the following issues:
- Classloading and deployment issues for both catalog-level and deployment-level UDFs. You can now reliably create and deploy User-Defined Functions (UDFs) without errors caused by classloading conflicts. Previously, a UDF might fail to load or behave inconsistently across different environments (catalog-level vs. deployment-level). This fix ensures smoother, more predictable execution of custom code.
- Paimon Connector: compatibility with Azure BlobStore. If you store or retrieve data in Azure BlobStore via the Paimon connector, you can now do so without errors or incompatibility issues. This improves stability and reliability when integrating with Azure cloud storage, reducing data pipeline failures.