A vectorizer generates and manages LLM embeddings for your table data and automatically keeps them synchronized with the source data.

What is a vectorizer?

A vectorizer automates the entire embedding workflow:
  • Automated embedding generation: Create embeddings for table data automatically
  • Automatic synchronization: Triggers keep embeddings in sync with source data
  • Background processing: Async processing minimizes impact on database operations
  • Scalability: Batch processing handles large datasets efficiently
  • Highly configurable: Customize embedding models, chunking, formatting, indexing, and scheduling

Key features

  • Multiple AI providers: OpenAI, Ollama, Cohere, Voyage AI, and LiteLLM support
  • Efficient storage: Embeddings live in a separate table or in a column on the source table, with indexing suited to similarity search
  • View creation: Automatic views join source data with embeddings
  • Access control: Fine-grained permissions for vectorizer objects
  • Monitoring: Built-in tools to track queue status and performance

Quick start

Create a basic vectorizer

SELECT ai.create_vectorizer(
    'blog.posts'::regclass,
    embedding => ai.embedding_openai('text-embedding-3-small', 768),
    chunking => ai.chunking_character_text_splitter(512)
);

Table destination (separate embeddings table)

SELECT ai.create_vectorizer(
    'website.blog'::regclass,
    destination => ai.destination_table(
        target_table => 'blog_embeddings_store',
        view_name => 'blog_embeddings'
    ),
    loading => ai.loading_column('content'),
    embedding => ai.embedding_ollama('nomic-embed-text', 768),
    chunking => ai.chunking_character_text_splitter(128, 10)
);
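
Once the vectorizer above has processed its queue, the view can be queried like any other relation. The following is a minimal sketch of a similarity search, not a definitive recipe. It assumes the pgai extension's ai.ollama_embed() function is installed and can reach the same Ollama instance, that the view exposes chunk and embedding columns, and that the search text is purely illustrative.

```sql
-- Sketch: find the five chunks closest to a search phrase.
-- Assumptions: ai.ollama_embed() is available, and the view created above
-- exposes "chunk" and "embedding" columns. Schema-qualify the view
-- (for example website.blog_embeddings) if it is not on your search_path.
SELECT
    chunk,
    -- <=> is pgvector's cosine distance operator; smaller means more similar.
    embedding <=> ai.ollama_embed('nomic-embed-text', 'how to speed up slow queries') AS distance
FROM blog_embeddings
ORDER BY distance
LIMIT 5;
```

The query must embed the search text with the same model the vectorizer uses (nomic-embed-text here), otherwise the distances are not meaningful.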

Column destination (embedding in source table)

SELECT ai.create_vectorizer(
    'products'::regclass,
    destination => ai.destination_column('description_embedding'),
    loading => ai.loading_column('description'),
    embedding => ai.embedding_openai('text-embedding-3-small', 768),
    chunking => ai.chunking_none()  -- Required for column destination
);
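
With a column destination, the embedding sits next to the source row, so a search needs no view or join. The sketch below assumes the pgai extension's ai.openai_embed() function is installed with an API key configured, and the search text is illustrative only.

```sql
-- Sketch: rank products by similarity to a search phrase.
-- Assumptions: ai.openai_embed() is available and an OpenAI API key is
-- configured for the session or database.
SELECT
    description,
    -- Request 768 dimensions to match the vectorizer configuration above.
    description_embedding <=> ai.openai_embed(
        'text-embedding-3-small',
        'waterproof hiking boots',
        dimensions => 768
    ) AS distance
FROM products
WHERE description_embedding IS NOT NULL  -- skip rows not yet embedded
ORDER BY distance
LIMIT 5;
```

The IS NOT NULL filter matters because embeddings are generated in the background, so recently inserted rows may not have a value yet.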

Configuration functions

Core functions

Destination configuration

Loading configuration

Parsing configuration

Chunking configuration

Embedding configuration

Formatting configuration

Indexing configuration

Scheduling configuration

Processing configuration

Access control

Management functions

Monitoring
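
As a rough illustration of the monitoring tools, the queries below check queue depth and recent failures. They assume the ai.vectorizer_status view, the ai.vectorizer_queue_pending() function, and the ai.vectorizer_errors view are present in your installed pgai version. The vectorizer id 1 is a placeholder, so substitute the id returned by ai.create_vectorizer.

```sql
-- Sketch: one row per vectorizer, including how many items are still pending.
SELECT * FROM ai.vectorizer_status;

-- Sketch: pending queue size for a single vectorizer (id 1 is a placeholder).
SELECT ai.vectorizer_queue_pending(1);

-- Sketch: errors recorded for that vectorizer, assuming the view has an id column.
SELECT * FROM ai.vectorizer_errors WHERE id = 1;
```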