Chunking
Chunking is the process of splitting source documents into smaller retrieval units before embedding them. The chunk size and boundary strategy determine how precisely a retriever can locate a relevant fact, balancing recall, precision, and embedding cost across the knowledge base.
Synonyms: text chunking, document segmentation, passage splitting, chunk strategy
Chunking is where retrieval quality is quietly won or lost. The strategy can be a fixed token window, an overlapping sliding window, or boundaries that follow semantic structure such as headings and sections. Each chunk is embedded and indexed with metadata (source, language, timestamps, content hash) so that retrieval can filter, deduplicate, and refresh incrementally. Because every downstream answer is only as good as the passage it retrieves, deliberate chunking is a prerequisite for grounded, citable responses.
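As a rough illustration of the overlapping sliding window strategy, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than a reference implementation: the `Chunk` record, the `chunk_tokens` function, the default `size` and `overlap` values, and the use of whitespace-separated words as a stand-in for a real tokenizer. Timestamps are left out to keep the output deterministic.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Chunk:
    # Hypothetical record; field names are illustrative only.
    text: str
    source: str
    language: str
    start_token: int
    content_hash: str


def chunk_tokens(text, source, language="en", size=200, overlap=50):
    """Split text into overlapping windows of whitespace-separated tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    tokens = text.split()  # assumption: whitespace words approximate tokens
    step = size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        body = " ".join(tokens[start:start + size])
        # The hash identifies the chunk content, so identical passages
        # can be deduplicated and unchanged ones skipped on refresh.
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        chunks.append(Chunk(body, source, language, start, digest))
        if start + size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks
```

With `size=4` and `overlap=2`, a ten-word document yields four chunks, each sharing two words with its neighbor. Setting `overlap=0` reduces this to the fixed token window strategy. Boundaries that follow headings or sections would need a structure-aware splitter in place of the word loop.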