Create a StreamingDiskANN index on a vector column for high-performance similarity search with optional label-based filtering.
  • Create indexes for fast approximate nearest neighbor search
  • Configure index build-time parameters for optimal performance
  • Enable label-based filtering for precise vector search
  • Choose storage layouts for memory optimization or plain storage
Note that:
  • The index build process can be memory-intensive. Consider increasing maintenance_work_mem:
    SET maintenance_work_mem = '2GB';
    
  • Label values must be within the smallint range (-32768 to 32767)
  • Creating indexes on UNLOGGED tables is not currently supported
  • Null vectors are not indexed; null labels are treated as empty arrays
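
The notes above can be combined into a short setup sketch. The table and column names follow the samples on this page; the `id` and `contents` columns and the 1536-dimension vector size are assumptions made only for illustration.

```sql
-- Assumes the pgvectorscale extension is available; CASCADE also installs pgvector.
CREATE EXTENSION IF NOT EXISTS vectorscale CASCADE;

-- Hypothetical table. It is a regular (logged) table because
-- indexes on UNLOGGED tables are not currently supported.
CREATE TABLE IF NOT EXISTS document_embedding (
    id        BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    contents  TEXT,
    embedding VECTOR(1536),  -- rows with a NULL embedding are not indexed
    labels    SMALLINT[]     -- values must fit in -32768..32767; NULL is treated as an empty array
);

-- Raise the memory budget for this session only, build the index, then restore the default.
SET maintenance_work_mem = '2GB';

CREATE INDEX document_embedding_idx ON document_embedding
USING diskann (embedding vector_cosine_ops, labels);

RESET maintenance_work_mem;
```

Using `SET` followed by `RESET` keeps the higher limit scoped to the session that builds the index, so other connections keep the server default.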

Samples

Basic index creation

CREATE INDEX document_embedding_idx ON document_embedding
USING diskann (embedding vector_cosine_ops);

With custom parameters

CREATE INDEX document_embedding_idx ON document_embedding
USING diskann (embedding vector_cosine_ops)
WITH (num_neighbors = 50, search_list_size = 100);

With label-based filtering

CREATE INDEX document_embedding_idx ON document_embedding
USING diskann (embedding vector_cosine_ops, labels);
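
A label-aware index is typically queried by combining an array-overlap filter with a distance ordering. The following is a minimal sketch, assuming the `&&` overlap operator is the predicate the index uses for label filtering and that the table has an `id` column (hypothetical). The prepared-statement name is also illustrative.

```sql
-- Prepare a query that takes the search vector as a parameter.
PREPARE nearest_with_labels(vector) AS
SELECT id, embedding <=> $1 AS distance
FROM document_embedding
WHERE labels && ARRAY[1, 3]::smallint[]  -- keep rows carrying label 1 or label 3
ORDER BY embedding <=> $1                -- cosine distance, matching vector_cosine_ops
LIMIT 10;
```

Run it with `EXECUTE nearest_with_labels('[...]')`, passing a vector literal with the same number of dimensions as the `embedding` column. The explicit `::smallint[]` cast keeps the array type aligned with the labels column.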

With storage layout

CREATE INDEX document_embedding_idx ON document_embedding
USING diskann (embedding vector_cosine_ops)
WITH (storage_layout = 'memory_optimized');

Syntax

CREATE INDEX index_name ON table_name
USING diskann (embedding_column distance_ops [, labels_column])
[WITH (parameter = value, ...)];

Distance operators

Operator            Description             Use with
vector_cosine_ops   Cosine distance (<=>)   Normalized embeddings
vector_l2_ops       L2 distance (<->)       Euclidean distance
vector_ip_ops       Inner product (<#>)     Dot product (not compatible with plain storage)
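
As with other PostgreSQL index types, an index can only serve a query whose `ORDER BY` uses the distance operator that matches the operator class chosen at build time. The sketch below assumes an index built with `vector_cosine_ops` and a hypothetical `id` column.

```sql
-- Served by an index built with vector_cosine_ops.
PREPARE nearest_cosine(vector) AS
SELECT id
FROM document_embedding
ORDER BY embedding <=> $1
LIMIT 10;

-- An index built with vector_l2_ops would instead serve: ORDER BY embedding <-> $1
-- An index built with vector_ip_ops would instead serve: ORDER BY embedding <#> $1
```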

Build-time parameters

Parameter                Type    Default                                   Description
storage_layout           text    memory_optimized                          memory_optimized uses Statistical Binary Quantization; plain stores uncompressed data
num_neighbors            int     50                                        Maximum number of neighbors per node. Higher values increase accuracy but slow graph traversal
search_list_size         int     100                                       Search parameter during construction. Higher values improve graph quality but slow index builds
max_alpha                float   1.2                                       Alpha parameter in the algorithm. Higher values improve graph quality but slow index builds
num_dimensions           int     0 (all dimensions)                        Number of dimensions to index. Useful for Matryoshka embeddings
num_bits_per_dimension   int     2 (if less than 900 dims), 1 otherwise    Bits per dimension when using Statistical Binary Quantization
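
Two of the less common parameters can be sketched as follows. The index names and the value 256 are hypothetical, and the table is the `document_embedding` table from the samples.

```sql
-- Index only 256 dimensions of a Matryoshka-style embedding (256 is an illustrative value).
CREATE INDEX document_embedding_short_idx ON document_embedding
USING diskann (embedding vector_cosine_ops)
WITH (num_dimensions = 256);

-- Store uncompressed vectors. L2 is used here because
-- vector_ip_ops is not compatible with plain storage.
CREATE INDEX document_embedding_plain_idx ON document_embedding
USING diskann (embedding vector_l2_ops)
WITH (storage_layout = 'plain');
```

Building both indexes on one table is only for demonstration. In practice, pick the layout and dimension count that fit the workload and build a single index.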