Glossary

Embedding

An embedding is a numeric vector that represents the meaning of text, images, or other data in a high-dimensional space. Items with similar meaning produce vectors that sit close together, so systems can compare, cluster, and retrieve content by semantic similarity instead of exact matches.
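A minimal sketch of what "close together" means in practice, assuming cosine similarity as the closeness measure (one common choice, not the only one). The three vectors are hand-made toy values for illustration, not the output of any real embedding model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors (assumed values; real embeddings usually have
# hundreds or thousands of dimensions).
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.75, 0.2]
invoice = [0.1, 0.05, 0.95]

# The related pair scores higher than the unrelated pair.
print(cosine_similarity(cat, kitten))
print(cosine_similarity(cat, invoice))
```

With these toy values, the cat/kitten pair scores much closer to 1 than the cat/invoice pair, which is the property that similarity-based retrieval relies on.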

Synonyms: vector embedding, text embedding, semantic vector, dense representation

Embeddings are the bridge between human language and similarity math. An embedding model maps each input to a fixed-length vector so that semantically related items cluster together, enabling vector search, clustering, classification, and deduplication. In a retrieval pipeline, both the indexed chunks and the incoming query are embedded with the same model so that distances between them are meaningful. Because the embedding model defines the space, its version is metadata worth tracking for reproducibility and controlled reindexing.
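The retrieval flow described above can be sketched roughly as follows. Everything named here is hypothetical: `toy_embed` is a word-count stand-in for a real embedding model (it captures word overlap, not true semantics), and `VOCAB`, `MODEL_VERSION`, and `Record` exist only for illustration. The point is the shape of the pipeline: one embedding function for both chunks and queries, and a model version stored next to every vector.

```python
import math
from dataclasses import dataclass

MODEL_VERSION = "toy-bow-v1"  # hypothetical version label
VOCAB = ["cat", "kitten", "pet", "invoice", "payment", "tax"]


def toy_embed(text):
    """Stand-in for an embedding model: a fixed-length word-count vector."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


@dataclass
class Record:
    text: str
    vector: list
    model_version: str  # tracked so a later upgrade can trigger reindexing


def build_index(chunks):
    """Embed every chunk with the same model and record its version."""
    return [Record(c, toy_embed(c), MODEL_VERSION) for c in chunks]


def search(index, query, top_k=1):
    """Embed the query with the SAME function used for the index."""
    q = toy_embed(query)
    ranked = sorted(index, key=lambda r: cosine(q, r.vector), reverse=True)
    return ranked[:top_k]


index = build_index(["my cat is a pet", "pay the invoice before tax day"])
best = search(index, "kitten or cat as a pet")[0]
print(best.text, best.model_version)
```

In a real system `toy_embed` would be replaced by a call to an actual embedding model, and the linear scan in `search` by a vector index; the same-model rule and the stored version would carry over unchanged.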

Frequently asked questions

Why does the embedding model version matter?
Vectors from different models are not comparable. Storing the model version with each embedding lets you detect drift and reindex safely when you upgrade the embedding model.
Can embeddings be reversed back to the original text?
Not exactly, but embeddings can leak sensitive information, so they should inherit the same tenant isolation and access controls as the source content they represent.
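Both answers can be illustrated with one small sketch, under the assumption that each stored vector carries a tenant identifier and a model version as metadata. The `StoredEmbedding` type, its field names, and the version labels are hypothetical, not the schema of any particular vector store.

```python
from dataclasses import dataclass


@dataclass
class StoredEmbedding:
    doc_id: str
    tenant_id: str       # access control inherited from the source document
    model_version: str   # which embedding model produced the vector
    vector: list


def searchable(records, tenant_id, current_version):
    """Keep only vectors this tenant may see AND that share the query's model."""
    return [
        r for r in records
        if r.tenant_id == tenant_id and r.model_version == current_version
    ]


def needs_reindex(records, current_version):
    """List documents whose vectors came from an older model version."""
    return [r.doc_id for r in records if r.model_version != current_version]


records = [
    StoredEmbedding("a1", "tenant-a", "v2", [0.1, 0.9]),
    StoredEmbedding("a2", "tenant-a", "v1", [0.7, 0.3]),
    StoredEmbedding("b1", "tenant-b", "v2", [0.2, 0.8]),
]

visible = searchable(records, "tenant-a", "v2")
stale = needs_reindex(records, "v2")
print([r.doc_id for r in visible])
print(stale)
```

Filtering before any similarity comparison means a query never scores against another tenant's vectors or against vectors from a different model, and the stale list gives a concrete reindexing backlog after an upgrade.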