Generate vector embeddings from text, text arrays, or tokens using OpenAI’s embedding models. Embeddings are numerical representations of text that capture semantic meaning, making them ideal for semantic search, recommendations, and clustering.

Samples

Generate an embedding from text

Create a vector embedding for a single piece of text:
SELECT ai.openai_embed(
    'text-embedding-ada-002',
    'PostgreSQL is a powerful database'
);

Generate embeddings for multiple texts

Embed several texts in one batch call rather than calling the function once per text:
SELECT ai.openai_embed(
    'text-embedding-ada-002',
    array[
        'PostgreSQL is a powerful database',
        'TimescaleDB extends PostgreSQL for time-series',
        'pgai brings AI capabilities to PostgreSQL'
    ]
);

Specify embedding dimensions

Set the length of the output vector with dimensions. Only some models support this argument (for example, text-embedding-3-small):
SELECT ai.openai_embed(
    'text-embedding-3-small',
    'PostgreSQL is a powerful database',
    dimensions => 768
);
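To confirm that the requested size was applied, the length of the returned vector can be inspected. This is a minimal sketch that assumes the pgvector extension is installed, since it relies on pgvector's vector_dims function; the input text is the one from the sample above.

```sql
-- Ask for a 768-dimensional embedding and report the length of the result.
-- vector_dims() is provided by pgvector, not by pgai.
SELECT vector_dims(
    ai.openai_embed(
        'text-embedding-3-small',
        'PostgreSQL is a powerful database',
        dimensions => 768
    )
) AS embedding_length;
```

The value reported here should match the size declared on any vector column that will store these embeddings.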

Use pre-tokenized input

Provide tokens directly instead of text:
SELECT ai.openai_embed(
    'text-embedding-ada-002',
    array[1820, 25977, 46840, 23874, 389, 264, 2579, 58466]
);

Store embeddings in a table

Generate and store embeddings for your data:
UPDATE documents
SET embedding = ai.openai_embed(
    'text-embedding-ada-002',
    content
)
WHERE embedding IS NULL;
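The UPDATE above assumes a documents table with a content column and an embedding column. The following is a sketch of one possible table definition and a nearest-neighbor query over it, not the definitive schema. The id column, the query text, and the query_embedding name are illustrative assumptions, and pgvector is assumed to be installed. The vector(1536) size matches the output length of text-embedding-ada-002.

```sql
-- Hypothetical schema matching the UPDATE above. Only content and embedding
-- appear in the original sample; id is added for illustration.
CREATE TABLE IF NOT EXISTS documents (
    id        BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    content   TEXT NOT NULL,
    embedding vector(1536)  -- text-embedding-ada-002 returns 1536 dimensions
);

-- Semantic search: embed the query text once with the same model, then
-- order rows by cosine distance (pgvector's <=> operator), closest first.
WITH query_embedding AS MATERIALIZED (
    SELECT ai.openai_embed(
        'text-embedding-ada-002',
        'How do I store time-series data?'
    ) AS embedding
)
SELECT d.id, d.content
FROM documents AS d
CROSS JOIN query_embedding AS q
WHERE d.embedding IS NOT NULL
ORDER BY d.embedding <=> q.embedding
LIMIT 5;
```

Computing the query embedding in a materialized CTE keeps the API call to a single request regardless of how many rows are compared.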

Arguments

| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| model | TEXT | - | Yes | The OpenAI embedding model to use (e.g., text-embedding-ada-002, text-embedding-3-small) |
| input_text | TEXT | - | Yes, one of three | Single text input to embed (use this OR input_texts OR input_tokens) |
| input_texts | TEXT[] | - | Yes, one of three | Array of text inputs to embed in a batch |
| input_tokens | INT[] | - | Yes, one of three | Pre-tokenized input as an array of token IDs |
| api_key | TEXT | NULL | No | OpenAI API key. If not provided, uses the ai.openai_api_key setting |
| api_key_name | TEXT | NULL | No | Name of the secret containing the API key |
| dimensions | INT | NULL | No | Number of dimensions for the output embedding (only supported by some models) |
| openai_user | TEXT | NULL | No | Unique identifier for the end user, used for abuse monitoring |
| encoding_format | TEXT | NULL | No | Format for the embeddings (float or base64) |
| extra_headers | JSONB | NULL | No | Additional HTTP headers to include in the API request |
| extra_query | JSONB | NULL | No | Additional query parameters for the API request |
| verbose | BOOLEAN | FALSE | No | Enable verbose logging for debugging |
| client_config | JSONB | NULL | No | Advanced client configuration options |
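As a sketch of how the optional arguments might be combined, the example below sets the API key through the ai.openai_api_key setting and then passes a few optional arguments by name. The key value and the end-user identifier are placeholders, and the choice of 256 dimensions is an assumption for illustration only.

```sql
-- Set the key for the current session. Calls that omit api_key fall back
-- to the ai.openai_api_key setting (placeholder value shown).
SELECT set_config('ai.openai_api_key', 'sk-your-key-here', false);

-- Pass the model and input positionally, then optional arguments by name.
SELECT ai.openai_embed(
    'text-embedding-3-small',
    'PostgreSQL is a powerful database',
    dimensions  => 256,
    openai_user => 'user-1234',  -- hypothetical end-user identifier
    verbose     => true
);
```

Named notation lets a call skip any optional arguments it does not need, so the order of the named arguments does not matter.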

Returns

For single text input:
  • vector: A pgvector-compatible vector containing the embedding
For array input:
  • TABLE(index INT, embedding vector): A table with an index and embedding for each input text
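Because array input returns a table, it can be convenient to call the function in the FROM clause and read the two columns directly. This is a minimal sketch using the column names listed above; the alias e is arbitrary, and index is quoted only as a precaution because it is also an SQL keyword.

```sql
-- Call the array form as a table source: one row per input text.
SELECT e."index", e.embedding
FROM ai.openai_embed(
    'text-embedding-ada-002',
    array[
        'PostgreSQL is a powerful database',
        'pgai brings AI capabilities to PostgreSQL'
    ]
) AS e
ORDER BY e."index";
```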