You built an internal knowledge base. Uploaded your docs. Added AI search.
Someone searches for "PTO policy" and gets nothing. You know the document exists.
You check. It is called "Time Off Guidelines." The AI had no idea they meant the same thing.
Your search was smart about meaning OR exact words. Never both.
INTERMEDIATE - Requires embeddings and basic search infrastructure to be in place.
Keyword search finds exact matches. If someone searches "Q4 2024 revenue report," it finds documents with those exact words. Fast, precise, predictable. But completely blind to meaning. "Quarterly earnings summary" returns nothing.
Semantic search understands meaning. It knows "PTO policy" and "time off guidelines" are the same concept. But it can struggle with specifics. Search for "Form W-2" and it might return general tax documents instead of the actual form.
Hybrid search runs both. Keyword search catches exact matches. Semantic search catches meaning matches. The results merge into a single ranked list. When one method fails, the other compensates.
Keyword search is dumb but precise. Semantic search is smart but fuzzy. Together, they catch what neither would find alone.
Hybrid search solves a universal problem: how do you find something when you might describe it differently than how it was labeled?
Use multiple detection methods with different failure modes. When one method misses, another catches it. The combination is more reliable than either alone.
Exact word matching
Counts how often query words appear in documents. Weights rare words higher than common ones. "Form W-2" matches documents containing exactly those words.
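The scoring idea above can be sketched in a few lines. This is a simplified TF-IDF scorer for illustration, not what a production engine runs (those typically use BM25 or a similar formula). The `tokenize` and `keyword_scores` helpers and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split into tokens, keeping hyphenated codes like 'w-2' whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())

def keyword_scores(query, docs):
    """Score each document by the summed TF-IDF of the query terms it contains."""
    tokenized = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    n_docs = len(docs)
    query_terms = set(tokenize(query))
    scores = {}
    for doc_id, counts in tokenized.items():
        score = 0.0
        for term in query_terms:
            tf = counts[term]  # how often the term appears in this document
            if tf == 0:
                continue
            # Document frequency: how many documents contain the term at all.
            df = sum(1 for c in tokenized.values() if term in c)
            # Rare terms (low df) get a higher weight than common ones.
            idf = math.log(1 + n_docs / df)
            score += tf * idf
        if score > 0:
            scores[doc_id] = score
    return scores

# Made-up sample documents.
docs = {
    "w2": "Form W-2 wage and tax statement",
    "tax": "General tax filing guide and form checklist",
    "pto": "Time Off Guidelines",
}

w2_scores = keyword_scores("Form W-2", docs)
pto_scores = keyword_scores("PTO policy", docs)
```

With these sample documents, "Form W-2" scores the actual form above the general tax guide because "w-2" is the rarer term, while "PTO policy" matches nothing at all: the blind spot described earlier.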
Meaning-based matching
Converts query and documents to embeddings. Finds documents with similar meaning even if words differ. "Time off request" matches "PTO policy."
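A minimal sketch of the matching step, assuming the embeddings already exist. The three-dimensional vectors below are hand-made stand-ins so the example runs on its own; real embeddings come from a model and have hundreds of dimensions. Cosine similarity is one common way to compare them, and `semantic_rank` is a hypothetical helper name.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def semantic_rank(query_vec, doc_vecs):
    """Return (doc_id, similarity) pairs, most similar first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hand-made vectors for illustration only (not output from a real model).
doc_vecs = {
    "pto_policy": [0.9, 0.1, 0.0],
    "expense_report": [0.1, 0.9, 0.1],
    "onboarding": [0.2, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # stand-in for the embedding of "time off request"

ranked = semantic_rank(query_vec, doc_vecs)
```

No words are compared anywhere in this code. The query lands next to "pto_policy" purely because the two vectors point in nearly the same direction.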
Combining the results
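Both searches return ranked results. Reciprocal Rank Fusion (RRF) combines them using each item's position in each list rather than its raw score, so the two methods' score scales never need to be reconciled. Items that rank high in both get boosted. Items that only one method found still appear.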
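RRF is short enough to sketch in full. Each document earns 1 / (k + rank) from every list it appears in, and the contributions are summed. The constant k = 60 is a conventional default, not a value taken from this article, and the document IDs are made up.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc IDs into one list, best first.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is a common default, assumed here.
    """
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)

# Made-up results for a query like "Form W-2".
keyword_ranking = ["form_w2", "tax_guide", "w2_faq"]
semantic_ranking = ["tax_overview", "form_w2", "tax_guide"]

fused = reciprocal_rank_fusion([keyword_ranking, semantic_ranking])
print(fused)  # ['form_w2', 'tax_guide', 'tax_overview', 'w2_faq']
```

"form_w2" and "tax_guide" rise to the top because both methods found them. "tax_overview" and "w2_faq" were each found by only one method, yet they still appear in the merged list.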
A team member needs the PTO policy but searches "vacation days." Your document is titled "Time Off Guidelines." Without hybrid search, they find nothing. With this flow, they get the right document in under a second, regardless of how they phrase it.
Your internal docs were written by 12 different people over 5 years. Nobody used consistent terminology. Keyword search finds almost nothing because "vacation time," "PTO," "time off," and "leave" are all different words for the same thing.
Instead: Start with 70% semantic, 30% keyword. Tune based on what your users actually search for.
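One way to apply a 70/30 split is a weighted sum of normalized scores, sketched below. This is an alternative to rank-based fusion, and it assumes both searches expose raw scores. Min-max normalization is an assumption chosen for simplicity, and the function names and sample scores are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale scores to the 0..1 range so the two methods are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}

def weighted_hybrid(kw_scores, sem_scores, semantic_weight=0.7):
    """Blend keyword and semantic scores; the keyword weight is the remainder."""
    kw = min_max_normalize(kw_scores)
    sem = min_max_normalize(sem_scores)
    keyword_weight = 1.0 - semantic_weight
    combined = {}
    for doc_id in set(kw) | set(sem):
        # A document missing from one result list contributes 0 for that method.
        combined[doc_id] = (keyword_weight * kw.get(doc_id, 0.0)
                            + semantic_weight * sem.get(doc_id, 0.0))
    return sorted(combined.items(), key=lambda pair: pair[1], reverse=True)

# Made-up scores for a query like "vacation days".
kw_scores = {"leave_form": 4.2, "pto_accrual": 1.1}
sem_scores = {"time_off_guidelines": 0.91, "leave_form": 0.62, "holiday_calendar": 0.40}

results = weighted_hybrid(kw_scores, sem_scores, semantic_weight=0.7)
```

Here "time_off_guidelines" comes first even though keyword search never found it, because the semantic side carries most of the weight. Changing `semantic_weight` is the tuning knob: lower it if your search logs show users mostly typing exact names.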
Someone searches for "error code E-4523" and gets documents about general troubleshooting instead of the specific error. Semantic search understood "error" but missed the exact code that matters.
Instead: Detect when queries contain codes, IDs, or exact phrases. Boost keyword weight for those queries.
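A rough sketch of that detection step. The heuristic is deliberately crude and is an assumption: any token containing a digit, or any double-quoted phrase, marks the query as wanting exact matches. The names `looks_exact` and `choose_weights` and the specific weights are hypothetical starting points.

```python
import re

# A double-quoted phrase signals that the user wants those exact words.
QUOTED_PHRASE = re.compile(r'"[^"]+"')

def looks_exact(query):
    """Guess whether the query contains a code, ID, or exact phrase."""
    if QUOTED_PHRASE.search(query):
        return True
    # Tokens with digits ("E-4523", "W-2") are treated as codes or IDs.
    return any(any(ch.isdigit() for ch in token) for token in query.split())

def choose_weights(query, default_semantic=0.7, exact_semantic=0.3):
    """Pick semantic/keyword weights; favor keyword when the query looks exact."""
    semantic = exact_semantic if looks_exact(query) else default_semantic
    return {"semantic": semantic, "keyword": 1.0 - semantic}

code_query = choose_weights("error code E-4523")
plain_query = choose_weights("vacation days")
quoted_query = choose_weights('"time off guidelines"')
```

A digit check will also fire on queries that merely mention a year or a quarter, so treat it as a first pass and refine the rules against real search logs.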
You invested in embeddings and vector search. Keyword search feels like a step backward. But users keep reporting missing results for exact names, form numbers, and technical terms.
Instead: Smart and precise are different qualities. You need both. Add keyword search even if it feels redundant.
You know how to combine keyword and semantic search. The natural next step is understanding how to re-order those combined results for even better accuracy.