Semantic Search
Semantic search is a retrieval method that finds results based on the meaning of a query rather than exact keyword matches. Traditional keyword search (also called lexical search) looks for documents containing the same words as the query. Semantic search converts both the query and the documents into vector representations, called embeddings, and finds documents whose meaning is closest to the query's meaning, even if they share few or no words.
The distinction matters in practice. A keyword search for "how to prevent LLM fabrication" would likely miss a document titled "Strategies for Reducing AI Hallucinations" because the terms do not overlap. A semantic search would recognize that fabrication and hallucination refer to the same concept in this context and surface the relevant result. This makes semantic search particularly valuable for natural-language queries, which is how users typically interact with AI-powered products.
The technology relies on embedding models: neural networks that convert text into high-dimensional vectors in which semantically similar texts are positioned near each other. Documents are embedded ahead of time; when a user submits a query, it is embedded into the same vector space, and a nearest-neighbor search finds the most similar documents. Modern embedding models such as OpenAI's text-embedding-3 family, Cohere's embed models, and open-source alternatives like BGE and E5 produce embeddings that capture nuanced semantic relationships.
Exa is one of the most prominent web-scale semantic search APIs, using a proprietary embeddings-based index to find pages that are semantically similar to a query. This is fundamentally different from SERP APIs, which return results from a traditional, largely keyword-driven search engine index such as Google's. For RAG pipelines and agent systems, semantic search often produces more relevant retrieval results than keyword search, particularly for complex or ambiguous queries.
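The embed-then-compare flow described above can be sketched with a toy example. This is a minimal illustration, not how any particular product works: the three-dimensional vectors are hand-written stand-ins for what an embedding model would produce (real embeddings have hundreds or thousands of dimensions), and the names `keyword_overlap`, `nearest_neighbors`, `TOY_INDEX`, and `QUERY_VEC` are hypothetical. It uses brute-force cosine similarity, whereas production systems usually rely on approximate nearest-neighbor indexes.

```python
import math

def keyword_overlap(query, text):
    """Lexical baseline: number of lowercase words shared by query and text."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def nearest_neighbors(query_vec, index, k=2):
    """Brute-force search: score every document, return the top k as (score, title)."""
    scored = [(cosine_similarity(query_vec, vec), title) for title, vec in index.items()]
    scored.sort(reverse=True)
    return scored[:k]

# ASSUMPTION: hand-written toy vectors standing in for real model output.
TOY_INDEX = {
    "Strategies for Reducing AI Hallucinations": [0.9, 0.1, 0.0],
    "Choosing a Vector Database": [0.1, 0.9, 0.1],
    "Sourdough Baking Basics": [0.0, 0.1, 0.9],
}

QUERY = "how to prevent LLM fabrication"
# ASSUMPTION: the vector an embedding model might assign to QUERY.
QUERY_VEC = [0.8, 0.2, 0.1]

# The query and the relevant title share no words, so lexical overlap is zero.
overlap = keyword_overlap(QUERY, "Strategies for Reducing AI Hallucinations")

# Vector similarity still ranks the hallucination article first.
results = nearest_neighbors(QUERY_VEC, TOY_INDEX, k=2)
for score, title in results:
    print(f"{score:.3f}  {title}")
```

With these toy vectors the hallucination article ranks first even though `overlap` is zero, which mirrors the fabrication/hallucination example in the text.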
For product builders, the choice between semantic and keyword search affects retrieval quality, which in turn affects the quality of LLM-generated answers. Many production systems use a hybrid approach that combines keyword and semantic search to get the benefits of both: semantic search catches conceptual matches, while keyword search ensures exact-match precision.
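One common way to combine the two result lists is reciprocal rank fusion (RRF). The text does not say which fusion method any given system uses, so the sketch below is only one possible approach. The function name, the document ids, and the two input rankings are hypothetical; `k=60` is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    Each document earns 1 / (k + rank) from every list it appears in,
    with rank starting at 1. Documents ranked well by several retrievers
    accumulate the highest totals.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# ASSUMPTION: illustrative outputs of a keyword retriever and a semantic retriever.
keyword_ranking = ["exact_term_doc", "both_doc", "keyword_only_doc"]
semantic_ranking = ["both_doc", "concept_only_doc", "exact_term_doc"]

fused = reciprocal_rank_fusion([keyword_ranking, semantic_ranking])
print(fused)
```

Because RRF works on ranks rather than raw scores, it avoids having to normalize keyword scores and cosine similarities onto a common scale. A weighted sum of normalized scores is another option when the relative weight of each retriever needs tuning.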