AI SQL Functions

VERA Engine 4.5 introduces AI SQL functions that let you call large language models (LLMs) directly from SQL. These functions use the LATERAL TABLE syntax and require a model created with CREATE TEMPORARY MODEL.

Supported Version

VERA Engine 4.5 or later.

Prerequisites

  • VERA Engine 4.5 or later.
  • A model registered in the current session with CREATE TEMPORARY MODEL. AI_EMBED requires an embedding model; all other functions require a completions model.

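The exact CREATE TEMPORARY MODEL options depend on your model provider and are not covered on this page; the sketch below is illustrative only, and the option names ('provider', 'endpoint', 'api-key') are assumptions rather than documented VERA syntax:

```sql
-- Illustrative sketch only: option names are assumptions, not documented VERA syntax.
-- Consult your model provider's configuration reference for the real options.
CREATE TEMPORARY MODEL my_model
WITH (
  'provider' = '...',   -- your model provider
  'endpoint' = '...',   -- completions endpoint (use an embedding endpoint for AI_EMBED)
  'api-key'  = '...'
);
```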
Syntax Pattern

All AI SQL functions follow this pattern:

SELECT col1, result_col
FROM source_table,
  LATERAL TABLE(AI_FUNCTION_NAME(
    MODEL => model_ref,
    INPUT => input_col
    [, additional_parameters]
  )) AS T(result_col [, ...]);

AI_CLASSIFY

Classifies input text into one of the specified labels.

Syntax:

LATERAL TABLE(AI_CLASSIFY(
  MODEL => model_ref,
  INPUT => text_col,
  LABELS => ARRAY['label1', 'label2', ...]
)) AS T(category, confidence)

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| MODEL | Model reference | The model to use for classification. |
| INPUT | STRING | The text to classify. |
| LABELS | ARRAY<STRING> | The set of labels to classify into. |

Output columns:

| Column | Type | Description |
| --- | --- | --- |
| category | STRING | The predicted label. |
| confidence | DOUBLE | Confidence score between 0 and 1. |

Example:

SELECT content, category, confidence
FROM articles,
  LATERAL TABLE(AI_CLASSIFY(
    MODEL => my_model,
    INPUT => content,
    LABELS => ARRAY['sports', 'politics', 'technology']
  )) AS T(category, confidence);
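Because confidence is an ordinary DOUBLE column, low-confidence predictions can be filtered with a plain WHERE clause. A sketch, reusing the tables and model from the example above (the 0.8 threshold is arbitrary):

```sql
-- Keep only classifications the model is reasonably confident about.
SELECT content, category
FROM articles,
  LATERAL TABLE(AI_CLASSIFY(
    MODEL => my_model,
    INPUT => content,
    LABELS => ARRAY['sports', 'politics', 'technology']
  )) AS T(category, confidence)
WHERE confidence >= 0.8;
```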

AI_SENTIMENT

Analyzes the sentiment of input text.

Syntax:

LATERAL TABLE(AI_SENTIMENT(
  MODEL => model_ref,
  INPUT => text_col
)) AS T(score, label, confidence)

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| MODEL | Model reference | The model to use for sentiment analysis. |
| INPUT | STRING | The text to analyze. |

Output columns:

| Column | Type | Description |
| --- | --- | --- |
| score | DOUBLE | Sentiment score. |
| label | STRING | Sentiment label (for example, positive, negative, neutral). |
| confidence | DOUBLE | Confidence score between 0 and 1. |

Example:

SELECT review_text, label, confidence
FROM reviews,
  LATERAL TABLE(AI_SENTIMENT(
    MODEL => my_model,
    INPUT => review_text
  )) AS T(score, label, confidence);
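Since the result of a LATERAL TABLE call is a regular table, standard aggregation applies. A sketch that counts reviews per sentiment label, reusing the tables and model from the example above:

```sql
-- Distribution of sentiment labels across all reviews.
SELECT label, COUNT(*) AS review_count, AVG(score) AS avg_score
FROM reviews,
  LATERAL TABLE(AI_SENTIMENT(
    MODEL => my_model,
    INPUT => review_text
  )) AS T(score, label, confidence)
GROUP BY label;
```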

AI_EXTRACT

Extracts structured fields from input text based on a JSON schema.

Syntax:

LATERAL TABLE(AI_EXTRACT(
  MODEL => model_ref,
  INPUT => text_col,
  EXTRACT_SCHEMA => '{"field": "type", ...}'
)) AS T(extracted_json)

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| MODEL | Model reference | The model to use for extraction. |
| INPUT | STRING | The text to extract from. |
| EXTRACT_SCHEMA | STRING | A JSON object defining field names and expected types. |

Output columns:

| Column | Type | Description |
| --- | --- | --- |
| extracted_json | STRING | A JSON string containing the extracted fields. |

Example:

SELECT body, extracted_json
FROM emails,
  LATERAL TABLE(AI_EXTRACT(
    MODEL => my_model,
    INPUT => body,
    EXTRACT_SCHEMA => '{"sender_name": "string", "order_id": "string"}'
  )) AS T(extracted_json);
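Because extracted_json is returned as a STRING, individual fields must be pulled out with a JSON accessor. A sketch assuming the ISO SQL standard JSON_VALUE function is available; substitute your engine's JSON accessor if it differs:

```sql
-- JSON_VALUE is the ISO SQL/JSON accessor; availability in VERA is an assumption.
SELECT
  JSON_VALUE(extracted_json, '$.sender_name') AS sender_name,
  JSON_VALUE(extracted_json, '$.order_id')    AS order_id
FROM emails,
  LATERAL TABLE(AI_EXTRACT(
    MODEL => my_model,
    INPUT => body,
    EXTRACT_SCHEMA => '{"sender_name": "string", "order_id": "string"}'
  )) AS T(extracted_json);
```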

AI_SUMMARIZE

Summarizes input text.

Syntax:

LATERAL TABLE(AI_SUMMARIZE(
  MODEL => model_ref,
  INPUT => text_col
  [, MAX_LENGTH => max_tokens]
)) AS T(summary)

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| MODEL | Model reference | Yes | The model to use for summarization. |
| INPUT | STRING | Yes | The text to summarize. |
| MAX_LENGTH | INT | No | Maximum length of the summary, in tokens. |

Output columns:

| Column | Type | Description |
| --- | --- | --- |
| summary | STRING | The generated summary. |

Note: The language of the summary depends on the model's default behavior. To control the output language, include a language instruction in your input, for example CONCAT('Summarize in English: ', text_col).

Example:

SELECT article_id, summary
FROM articles,
  LATERAL TABLE(AI_SUMMARIZE(
    MODEL => my_model,
    INPUT => CONCAT('Summarize in English: ', content),
    MAX_LENGTH => 100
  )) AS T(summary);

AI_EMBED

Generates a vector embedding for input text.

Syntax:

LATERAL TABLE(AI_EMBED(
  MODEL => model_ref,
  INPUT => text_col
)) AS T(embedding)

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| MODEL | Model reference | The embedding model to use. |
| INPUT | STRING | The text to embed. |

Output columns:

| Column | Type | Description |
| --- | --- | --- |
| embedding | ARRAY<FLOAT> | The vector embedding of the input text. |

Note: AI_EMBED requires a model configured for embeddings, not completions. In your CREATE TEMPORARY MODEL statement, use an embedding model endpoint.

Example:

SELECT doc_id, embedding
FROM documents,
  LATERAL TABLE(AI_EMBED(
    MODEL => my_embed_model,
    INPUT => content
  )) AS T(embedding);
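Embeddings are typically computed once and materialized for reuse rather than regenerated per query. A sketch assuming a pre-created target table doc_embeddings(doc_id, embedding) — the table name and schema are hypothetical:

```sql
-- Materialize embeddings into a hypothetical doc_embeddings table for later reuse.
INSERT INTO doc_embeddings
SELECT doc_id, embedding
FROM documents,
  LATERAL TABLE(AI_EMBED(
    MODEL => my_embed_model,
    INPUT => content
  )) AS T(embedding);
```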

AI_TRANSLATE

Translates input text from a source language to a target language.

Syntax:

LATERAL TABLE(AI_TRANSLATE(
  MODEL => model_ref,
  INPUT => text_col,
  SOURCE_LANG => source_language,
  TARGET_LANG => target_language
)) AS T(translated_text, detected_language)

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| MODEL | Model reference | The model to use for translation. |
| INPUT | STRING | The text to translate. |
| SOURCE_LANG | STRING | Source language code (for example, 'en'). Use 'auto' to detect the language automatically. |
| TARGET_LANG | STRING | Target language code (for example, 'fr'). |

Output columns:

| Column | Type | Description |
| --- | --- | --- |
| translated_text | STRING | The translated text. |
| detected_language | STRING | The detected source language. Populated when SOURCE_LANG is 'auto'. |

Example:

SELECT message_id, translated_text, detected_language
FROM support_messages,
  LATERAL TABLE(AI_TRANSLATE(
    MODEL => my_model,
    INPUT => message_body,
    SOURCE_LANG => 'auto',
    TARGET_LANG => 'en'
  )) AS tr(translated_text, detected_language);
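LATERAL TABLE calls can be chained in one query, with a later call consuming a column produced by an earlier one. A sketch that translates each message to English and then runs sentiment analysis on the translation (both models are the ones defined earlier on this page):

```sql
-- Translate first, then feed the translated text into AI_SENTIMENT.
SELECT m.message_id, tr.translated_text, s.label
FROM support_messages AS m,
  LATERAL TABLE(AI_TRANSLATE(
    MODEL => my_model,
    INPUT => m.message_body,
    SOURCE_LANG => 'auto',
    TARGET_LANG => 'en'
  )) AS tr(translated_text, detected_language),
  LATERAL TABLE(AI_SENTIMENT(
    MODEL => my_model,
    INPUT => tr.translated_text
  )) AS s(score, label, confidence);
```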

AI_MASK

Masks personally identifiable information (PII) or other specified entities in input text.

Syntax:

LATERAL TABLE(AI_MASK(
  MODEL => model_ref,
  INPUT => text_col,
  MASK_ENTITIES => ARRAY['entity1', 'entity2', ...]
)) AS T(masked_text, detected_entities)

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| MODEL | Model reference | The model to use for masking. |
| INPUT | STRING | The text to mask. |
| MASK_ENTITIES | ARRAY<STRING> | List of entity types to mask (for example, 'name', 'email', 'phone'). |

Output columns:

| Column | Type | Description |
| --- | --- | --- |
| masked_text | STRING | The input text with specified entities replaced by placeholders. |
| detected_entities | STRING | A JSON string listing the detected entities and their positions. |

Example:

SELECT record_id, masked_text
FROM customer_records,
  LATERAL TABLE(AI_MASK(
    MODEL => my_model,
    INPUT => record_text,
    MASK_ENTITIES => ARRAY['name', 'email', 'phone']
  )) AS T(masked_text, detected_entities);
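A common pattern is masking PII before records leave a restricted zone. A sketch assuming a pre-created target table masked_records(record_id, masked_text) — the table name and schema are hypothetical:

```sql
-- Write masked copies into a hypothetical masked_records table for downstream use.
INSERT INTO masked_records
SELECT record_id, masked_text
FROM customer_records,
  LATERAL TABLE(AI_MASK(
    MODEL => my_model,
    INPUT => record_text,
    MASK_ENTITIES => ARRAY['name', 'email', 'phone']
  )) AS T(masked_text, detected_entities);
```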

Limitations

  • Requires VERA Engine 4.5 or later.
  • All AI SQL functions use LATERAL TABLE syntax and cannot be used as scalar expressions.
  • AI_EMBED requires an embedding model; all other functions require a completions model.
  • Latency and throughput depend on the model provider's API response time.