Retrieval-Augmented Generation - Embeddings and Documents

tags:: Large Language Models AI

Augment a large language model by retrieving related data and adding it to the prompt before the model executes its instructions.

The best way is to retrieve related documents via embeddings, which capture the "meaning" of a string as an n-dimensional vector. You store these vectors in a vector store and retrieve documents by maximizing the cosine similarity between the vectors in the database and the vector of the query.
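At its core, retrieval by cosine similarity can be sketched in a few lines. The toy 3-dimensional vectors below are made up and stand in for real model-generated embeddings, which have hundreds or thousands of dimensions:

```python
import math

# Toy document embeddings; in practice these come from an embedding model.
DOCS = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.2, 0.1],
    "stocks": [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, k=1):
    # Rank stored documents by cosine similarity to the query embedding,
    # most similar first, and return the top k names.
    ranked = sorted(DOCS, key=lambda name: cosine_similarity(query_vec, DOCS[name]),
                    reverse=True)
    return ranked[:k]

print(retrieve([0.85, 0.15, 0.05], k=2))  # a "pets"-like query → ['cats', 'dogs']
```

A real vector store does the same ranking, but with an index (e.g. approximate nearest-neighbor search) instead of a full scan.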

Supabase

With Supabase you can store the embeddings in a Postgres database using the pgvector extension.

My questions:

So apparently Supabase's vecs client automatically splits data into chunks.

Supabase also provides a GitHub Action, embeddings-generator, that generates embeddings for the Markdown files in a repository and stores them in a Supabase database.