Skip to main content
Convert text into an array of token IDs using Cohere’s tokenizer. Useful for counting tokens before API calls or analyzing tokenization patterns.

Samples

Tokenize text

SELECT ai.cohere_tokenize(
    'embed-english-v3.0',
    'PostgreSQL is a powerful database'
);
Returns: {5432, 8754, 389, 264, 8147, 4729}

Count tokens

SELECT array_length(ai.cohere_tokenize(
    'embed-english-v3.0',
    'PostgreSQL is a powerful database'
), 1) AS token_count;

Arguments

NameTypeDefaultRequiredDescription
modelTEXT-Cohere model for tokenization
text_inputTEXT-Text to tokenize
api_keyTEXTNULLCohere API key
api_key_nameTEXTNULLName of secret containing the API key
verboseBOOLEANFALSEEnable verbose logging

Returns

INT[]: Array of token IDs.