Grounding
Grounding is the practice of anchoring a language model's output to verifiable external sources. An ungrounded model generates text based on statistical patterns learned during training, which can produce fluent but factually incorrect responses — hallucinations. A grounded model, by contrast, has access to retrieved documents, search results, or structured data that it can reference when forming its answer. The term is borrowed from cognitive science, where grounding refers to connecting abstract symbols to real-world referents. In the LLM context, it means connecting the model's generation process to actual data. Google, OpenAI, and Anthropic all use variations of grounding in their products: Google's Gemini can be grounded with Google Search results, OpenAI's ChatGPT uses web browsing for grounding, and various API providers offer grounding as a feature for enterprise deployments.

For product builders, grounding is not a single technique but a design goal that can be achieved through several mechanisms. Retrieval-augmented generation (RAG) is the most common: retrieve relevant documents and include them in the prompt. Search-augmented generation is a variant where the retrieval step queries a search engine rather than a static corpus. Tool use is another approach, where the model calls external APIs to look up specific facts. Citation generation — having the model output source URLs alongside its claims — is a complementary technique that makes grounding auditable.

The business case for grounding is straightforward. If your AI product gives wrong answers, users stop trusting it. In regulated industries — healthcare, finance, legal — ungrounded outputs can create liability. Even in less regulated contexts, a product that confidently states outdated pricing, misattributes quotes, or invents statistics will lose credibility quickly. Grounding does not eliminate all errors, but it dramatically reduces the rate of fabricated claims and gives users a way to verify what the model says.
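The RAG-style mechanism above can be sketched in a few lines. This is a minimal, self-contained illustration, not a production retriever: the `retrieve` function uses naive keyword overlap as a stand-in for real embedding-based or search-engine retrieval, and the corpus, document IDs, and URLs are invented for the example. The key idea is that the prompt carries the retrieved evidence and instructs the model to cite it, which is what makes the output auditable.

```python
def retrieve(query, corpus, k=2):
    """Rank documents by naive keyword overlap with the query.
    A real system would use embeddings or a search API here."""
    terms = set(query.lower().split())
    return sorted(
        corpus,
        key=lambda doc: len(terms & set(doc["text"].lower().split())),
        reverse=True,
    )[:k]

def build_grounded_prompt(query, docs):
    """Assemble a prompt that pins the model to the retrieved evidence
    and asks for inline citations by document id."""
    evidence = "\n".join(
        f"[{d['id']}] {d['text']} (source: {d['url']})" for d in docs
    )
    return (
        "Answer using ONLY the evidence below. Cite sources as [id]. "
        "If the evidence is insufficient, say so.\n\n"
        f"Evidence:\n{evidence}\n\n"
        f"Question: {query}\nAnswer:"
    )

# Hypothetical two-document corpus for illustration.
corpus = [
    {"id": "d1", "url": "https://example.com/pricing",
     "text": "The Pro plan costs $20 per month."},
    {"id": "d2", "url": "https://example.com/faq",
     "text": "Refunds are issued within 14 days."},
]

query = "How much does the Pro plan cost per month?"
docs = retrieve(query, corpus)
prompt = build_grounded_prompt(query, docs)
print(prompt)
```

The assembled prompt, rather than the model's parametric memory, now carries the pricing fact; swapping the toy retriever for a SERP API or vector search changes the evidence source without changing the overall design.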
Web data tools are central to grounding strategies. SERP APIs, AI search APIs, and web scraping infrastructure provide the external evidence that models need to stay accurate and current.