Glossary

Vector search

Vector search finds content by meaning rather than by exact words. Text is converted into high-dimensional embeddings, and a similarity metric such as cosine distance ranks stored vectors by how close they are to the query vector, so conceptually related passages can be returned even when no keyword matches.
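The ranking step can be illustrated with a minimal brute-force sketch. The three-dimensional vectors, the document ids, and the `search` helper are all hypothetical stand-ins: a real embedding model would produce the vectors, typically with hundreds of dimensions.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def search(query_vec, index, k=2):
    """Brute-force search: score every stored vector, return the k closest ids."""
    ranked = sorted(index.items(),
                    key=lambda item: cosine_distance(query_vec, item[1]))
    return [doc_id for doc_id, _ in ranked[:k]]

# Hypothetical hand-written embeddings, used only for illustration.
index = {
    "reset-password": [0.9, 0.1, 0.0],
    "change-login-credentials": [0.8, 0.3, 0.1],
    "quarterly-revenue": [0.0, 0.2, 0.9],
}

# Stand-in for an embedded query such as "forgot my passcode".
query = [0.85, 0.2, 0.05]
print(search(query, index))
```

The two account-related documents rank ahead of the unrelated one because their vectors point in nearly the same direction as the query, regardless of which words they contain.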

Synonyms: semantic search, similarity search, nearest-neighbor search, embedding search

Vector search underpins semantic retrieval: it matches meaning rather than strings. The query is embedded into the same vector space as the indexed content, and the index returns the nearest vectors according to a distance metric. To stay fast at scale, production systems use approximate nearest-neighbor indexes, accepting a small loss of accuracy in exchange for much lower latency. Vector search is most effective when paired with keyword search in a hybrid retriever, because the keyword side preserves exact identifiers that pure semantic matching can lose.
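One way to combine the two result lists in a hybrid retriever is reciprocal rank fusion (RRF). This is a technique chosen here for illustration, not one the text above prescribes. In the sketch below the document ids and both input rankings are hypothetical, and `k=60` is a commonly used smoothing constant rather than a required value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked id lists; each hit contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results for a query containing the exact identifier "ERR-4012".
keyword_hits = ["doc-ERR-4012", "doc-timeouts"]                # exact-match side
vector_hits = ["doc-timeouts", "doc-retries", "doc-ERR-4012"]  # semantic side

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Because RRF uses only rank positions, it avoids having to normalize keyword scores and vector distances onto a common scale. Here the exact-identifier document stays near the top even though the semantic ranking placed it last.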

Frequently asked questions

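What is an embedding in vector search?
An embedding is a numeric vector, produced by an embedding model, that represents the meaning of a piece of text. Texts with similar meanings land close together in the vector space.
What is approximate nearest neighbor (ANN) search?
ANN search trades a small amount of accuracy for large speed gains, using index structures that keep similarity lookups fast even as the number of stored vectors grows into the millions.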
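As a rough illustration of the accuracy-for-speed trade, here is a toy sketch of one ANN approach, random-hyperplane locality-sensitive hashing. It is an assumption picked for brevity; production indexes often use other structures, such as graph-based ones. The `HyperplaneLSH` class and its parameters are hypothetical.

```python
import random

class HyperplaneLSH:
    """Toy ANN index: vectors on the same side of every random plane share a bucket."""

    def __init__(self, dim, num_planes=8, seed=0):
        rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(num_planes)]
        self.buckets = {}

    def _signature(self, vec):
        # One bit per plane: which side of the plane the vector falls on.
        return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0.0
                     for plane in self.planes)

    def add(self, doc_id, vec):
        self.buckets.setdefault(self._signature(vec), []).append(doc_id)

    def candidates(self, query_vec):
        # Only one bucket is inspected, not the whole collection.
        return list(self.buckets.get(self._signature(query_vec), []))

lsh = HyperplaneLSH(dim=3)
lsh.add("a", [1.0, 0.0, 0.0])
lsh.add("b", [-1.0, 0.0, 0.0])  # opposite direction, so a different bucket
print(lsh.candidates([2.0, 0.0, 0.0]))
```

A lookup touches a single bucket, which is where the speed comes from. The cost is that a true neighbor falling just across a plane is missed, which is the small accuracy loss described above.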