NLP SIMILARITY 2: Use vector databases and LLM word embeddings for semantic similarity search
Learn to use a Large Language Model (LLM) to create word embeddings and store them in a vector database for semantic search over your own data.
In this article you will learn how to use a sentence transformer from Hugging Face to create word embeddings from your own data, how to store them in the Chroma vector database, and how to build a semantic search on top of it.
Embeddings
I won’t spend too much time describing what embeddings are, because I already covered that in another article. In short, word embeddings are numerical representations of words that capture their semantic meaning, typically in the form of dense vectors in a multi-dimensional space.
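To make “dense vectors in a multi-dimensional space” concrete, here is a toy sketch using made-up 4-dimensional vectors (real models produce hundreds of dimensions) and cosine similarity, a common way to compare such vectors:

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up toy "embeddings" for illustration only.
king  = [0.9, 0.8, 0.1, 0.2]
queen = [0.8, 0.9, 0.2, 0.1]
apple = [0.1, 0.2, 0.9, 0.8]

print(cosine_similarity(king, queen))  # ~0.99: vectors point the same way
print(cosine_similarity(king, apple))  # ~0.33: vectors point apart
```

A real embedding model places semantically related words close together, so the same cosine computation yields high scores for related words and low scores for unrelated ones.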
These vectors are generated using natural language processing techniques and can be used to measure similarities between words. For more details about embeddings go to: