ML_PREDICT

This topic describes how to use the ML_PREDICT function for AI inference in your Flink jobs.

Limitations

  • Supported in VERA Engine 4.1 or later.
  • The throughput of ML_PREDICT operators is subject to the rate limits of your model service provider. When rate limits are reached, the Flink job may experience backpressure at the ML_PREDICT operators, which can lead to timeout errors or job restarts.

Syntax

ML_PREDICT(TABLE table_name, MODEL model_name, DESCRIPTOR(input_column_names))

Input Parameters

Parameter | Data type | Description
TABLE table_name | TABLE | The input stream for model inference. This can be a physical table or a view.
MODEL model_name | MODEL | The name of the registered model.
DESCRIPTOR(input_column_names) | N/A | The columns in the input data used for model inference.
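
To illustrate how the three arguments fit together, the following sketch passes a view, rather than a physical table, to ML_PREDICT. The names reviews_src, recent_reviews, review_text, and my_sentiment_model are hypothetical. The sketch assumes a model registered with a single STRING input and a STRING output column named content.

```sql
-- Hypothetical view: drop rows with no text so they never reach the model.
CREATE TEMPORARY VIEW recent_reviews AS
SELECT id, review_text
FROM reviews_src            -- assumed existing source table
WHERE review_text IS NOT NULL;

-- TABLE: the view defined above.
-- MODEL: a model that is already registered (name assumed here).
-- DESCRIPTOR: the input column passed to the model.
SELECT id, review_text, content
FROM ML_PREDICT(
  TABLE recent_reviews,
  MODEL my_sentiment_model,
  DESCRIPTOR(review_text)
);
```

Filtering in the view before inference may also reduce the number of model calls, which can help when the provider's rate limits are a concern.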

Example

The following example registers a sentiment analysis model and uses it to predict sentiment categories for movie reviews.

1. Register the Model

CREATE TEMPORARY MODEL ai_analyze_sentiment
INPUT (input STRING)
OUTPUT (content STRING)
WITH (
'provider' = 'openai',
'endpoint' = '<your-endpoint>',
'apiKey' = '<your-key>',
'model' = 'gpt-5.1',
'system-prompt' = 'Classify the text below into one of the following labels: [positive, negative, neutral, mixed]. Output only the label.'
);

2. Prepare Test Data

CREATE TEMPORARY VIEW movie_comment(id, movie_name, user_comment, actual_label)
AS VALUES
(1, 'Silent Echo', 'A haunting story that kept me guessing until the end.', 'positive'),
(2, 'The Velvet Gate', 'Nothing special.', 'negative');

3. Run the Prediction

SELECT id, movie_name, content AS predict_label, actual_label
FROM ML_PREDICT(TABLE movie_comment, MODEL ai_analyze_sentiment, DESCRIPTOR(user_comment));

Output Results

In this example, the values in the predict_label column match those in the actual_label column, apart from letter case.

id | movie_name | predict_label | actual_label
1 | Silent Echo | POSITIVE | positive
2 | The Velvet Gate | NEGATIVE | negative
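
The predicted labels in the output above are uppercase while the actual labels are lowercase, so a direct equality check would report a mismatch. One possible way to compare them is to normalize case with the built-in LOWER function, as in the sketch below. The is_match column name is an assumption added for illustration.

```sql
-- Compare predicted and actual labels without regard to letter case.
-- is_match is a hypothetical column name, not part of the original example.
SELECT
  id,
  movie_name,
  content AS predict_label,
  actual_label,
  LOWER(content) = LOWER(actual_label) AS is_match
FROM ML_PREDICT(TABLE movie_comment, MODEL ai_analyze_sentiment, DESCRIPTOR(user_comment));
```

Model output can vary between runs and providers, so additional normalization, such as trimming whitespace, may be needed in practice.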