pgvector

pgvector is an open-source extension to PostgreSQL® designed to add support for storing, indexing vectors and searching vector similarity.

Vectors are really useful in multiple fields, among them semantic search, similarity search, machine learning, multimedia data handling and Natural Language Processing (NLP).

The extension supports the following features:

  • Storing vectors of multiple sizes
  • Indexing vectors for efficient comparaison using multiple algorithms
  • Searching exact and approximate nearest neighbor
  • Single-precision, half-precision, binary, and sparse vectors
  • L2 distance, inner product, cosine distance, L1 distance, Hamming distance, and Jaccard distance

More info on the official documentation

Enabling pgvector

To enable pgvector:

  1. Provision a new PostgreSQL® database
  2. Access your database using the Interactive Remote Console
  3. Enable the vector extension using the following commands:

    From the PostgreSQL® console, run the following command:

     CREATE EXTENSION IF NOT EXISTS vector;
    

    The output should look like this:

     CREATE EXTENSION
     my_app_4553=>
    
  4. pgvector is now enabled and ready to be used!

Example: Storing Vector Embeddings

More and more company around generative Artificial Intelligence are proposing vector embeddings models to compute vector embeddings for any type of content.

These vectors allow the comparaison of arbitrary contents based on vector arithmetics. Thus, a common usage of pgvector is to add a vector column in your data table, and fill it with generated embeddings.

-- The size of the vector should be modified according to the model you are using
ALTER TABLE items ADD COLUMN embedding vector(3);

Then you can easily make a request in order to find rows which are similar to a given input. You’ll need to generate the embedding of the input and compare it with your data. In the following example, get the 5 “closest” rows:

-- vector_data has to be set, either by the embedding of another row, or by an
-- externally computed embedding
SELECT * FROM items ORDER BY embedding <-> '[vector_data]' LIMIT 5;

Example: have a look at OpenAI Embeddings Documentation


Suggest edits

pgvector

©2025 Scalingo