How to Constrain AI Output with a Knowledge Base
Before You Start
The effectiveness of knowledge base constraints depends entirely on the coverage and quality of your knowledge base. A comprehensive, accurate knowledge base produces a system that rarely hallucinates within its domain. A sparse or outdated knowledge base produces a system that either hallucinates to fill gaps or gives unhelpful "I don't know" responses to questions it should be able to answer. Before building constraints, audit your knowledge base coverage against the types of questions your users actually ask. Identify the top 50 most common queries and verify that your knowledge base contains the information needed to answer each one. Gaps in coverage should be filled before constraints are tightened.
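The audit described above can be automated. The sketch below is a minimal, illustrative version: `can_answer` is a hypothetical stand-in for your actual retrieval check (for example, running the query against your vector index and testing whether any passage scores above a threshold).

```python
from collections import Counter

def audit_coverage(queries, can_answer, top_n=50):
    """Check the most common user queries against the knowledge base.

    `queries` is the raw query log; `can_answer` is a callable that
    returns True when the KB contains the information needed for a
    query (a hypothetical stand-in for your real retrieval check).
    Returns the coverage ratio and the list of uncovered queries.
    """
    top_queries = [q for q, _ in Counter(queries).most_common(top_n)]
    gaps = [q for q in top_queries if not can_answer(q)]
    coverage = 1 - len(gaps) / len(top_queries) if top_queries else 0.0
    return coverage, gaps

# Toy example: the KB "covers" a query if it mentions a known topic
kb_topics = {"authentication", "billing", "deployment"}
can_answer = lambda q: any(t in q for t in kb_topics)
queries = ["how does authentication work", "billing policy",
           "deployment regions", "vacation policy"]
coverage, gaps = audit_coverage(queries, can_answer)
# gaps -> ["vacation policy"], coverage -> 0.75
```

Run this against your real query log periodically; the `gaps` list is your prioritized backlog of knowledge base content to write before tightening constraints.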
Step-by-Step Implementation
A knowledge base that only supports vector search is limited, because vector search retrieves contextually similar passages but cannot answer precise factual queries directly. Build your knowledge base with at least two access patterns. First, a vector index over your document collection for semantic retrieval of relevant passages. Second, a structured store (knowledge graph, database, or key-value store) for precise fact lookup. The vector index handles "tell me about authentication" queries by retrieving relevant passages. The structured store handles "what authentication method does the project use" queries by returning a specific, verified answer. Both are needed because different question types need different retrieval approaches.
```python
# Multi-source knowledge base structure
class KnowledgeBase:
    def __init__(self):
        self.vector_index = VectorIndex()    # Semantic search
        self.knowledge_graph = Graph()       # Entity facts
        self.memory_store = MemoryStore()    # Verified observations

    def query(self, question, user_id=None):
        # Semantic retrieval for context
        passages = self.vector_index.search(question, top_k=5)

        # Entity lookup for specific facts
        entities = extract_entities(question)
        facts = []
        for entity in entities:
            facts.extend(self.knowledge_graph.get_facts(entity))

        # User-specific memories if available
        memories = []
        if user_id:
            memories = self.memory_store.recall(
                query=question, user_id=user_id, limit=5
            )
        return passages, facts, memories
```

When a user asks a question, query all sources in your knowledge base and combine the results into a unified context block. Passages from vector search provide narrative context and background. Facts from the knowledge graph provide specific, verified data points. Memories from the persistent memory store provide user-specific and project-specific context. Label each section clearly in the context block so the model knows what type of information it is working with and can weight its responses accordingly.
```python
def build_constrained_context(question, user_id):
    passages, facts, memories = kb.query(question, user_id)
    context = ""
    if facts:
        context += "VERIFIED FACTS:\n"
        for f in facts:
            context += f"- {f.subject} {f.predicate} {f.object}\n"
        context += "\n"
    if memories:
        context += "PROJECT CONTEXT:\n"
        for m in memories:
            conf = f"[confidence: {m.confidence:.1f}]"
            context += f"- {conf} {m.content}\n"
        context += "\n"
    if passages:
        context += "REFERENCE DOCUMENTS:\n"
        for p in passages:
            context += f"[{p.source_id}] {p.content}\n\n"
    return context
```

The constraint prompt is where the actual restriction happens. It must clearly communicate four things to the model: use only the provided knowledge base content, do not add information from your training data, cite which part of the knowledge base supports each claim, and explicitly state when the knowledge base does not contain the answer. The prompt should also establish a hierarchy: verified facts take priority over reference documents, which take priority over the model's own knowledge (which should not be used at all for constrained generation).
```python
CONSTRAINT_PROMPT = """You are a knowledge assistant that answers
ONLY from the provided knowledge base. You must never use your
training data to answer factual questions.

Priority order for information:
1. VERIFIED FACTS: Treat these as confirmed. State them directly.
2. PROJECT CONTEXT: Use confidence scores. High confidence (0.7+)
   can be stated as fact. Lower confidence should be qualified.
3. REFERENCE DOCUMENTS: Use for context and explanation. Cite
   the source identifier when referencing.

Critical rules:
- If the knowledge base has no relevant information, say:
  "This is not covered in the available knowledge base."
- Never invent facts, statistics, names, dates, or specifics.
- Never combine a knowledge base fact with your own knowledge
  to create a claim that goes beyond the source.
- If asked about something partially covered, answer what you
  can and clearly state what is not covered.

{constrained_context}

Question: {question}"""
```

Even with strong constraint prompts, models occasionally introduce entities (names, products, technologies, organizations) that do not appear in the knowledge base. An entity verification step extracts all named entities from the generated response and checks each one against the knowledge graph. Entities that exist in the graph are confirmed. Entities that do not exist in the graph are flagged as potentially fabricated. This check is fast (entity extraction plus graph lookups) and catches one of the most common and most damaging hallucination types: the fabricated proper noun.
```python
def verify_entities(response, knowledge_graph):
    entities = extract_entities(response)
    issues = []
    for entity in entities:
        if not knowledge_graph.entity_exists(entity.text):
            issues.append({
                "entity": entity.text,
                "type": entity.label,
                "status": "not_in_knowledge_base"
            })
    return issues
```

When users ask questions outside the knowledge base's coverage, the constrained system needs to respond helpfully without hallucinating. The worst response is fabrication. The second worst is a terse "I don't know" that provides no value. The best response acknowledges the gap, explains what related information is available, and suggests how the user might find the answer. "I don't have specific information about deployment regions in the knowledge base. The documentation does cover deployment architecture in general (see the Infrastructure section). For region-specific details, the DevOps team's runbook would be the authoritative source." This response is honest, helpful, and directs the user toward the right resource without fabricating the answer.
Tuning Constraint Strictness
Knowledge base constraints exist on a spectrum from loose (prefer the knowledge base but allow supplementation from training data) to strict (answer only from the knowledge base, refuse everything else). The right level depends on your application. A customer-facing support bot handling billing questions should use strict constraints because fabricating a billing policy could create legal liability. A developer tool that answers general programming questions can use looser constraints because the cost of an occasional inaccuracy is low and the benefit of broader coverage is high.
You can also vary constraint strictness by question type within the same application. Questions about specific facts (dates, names, configurations, policies) get strict constraints. Questions about general concepts (how does X work, what is the difference between A and B) get moderate constraints that allow the model to supplement the knowledge base with general technical knowledge. Questions that are clearly creative or speculative (what should we consider for the redesign) get minimal constraints because creativity requires going beyond the existing knowledge base.
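Per-question-type strictness can be implemented as a small routing layer in front of the constraint prompt. The keyword heuristic below is purely illustrative (a production system might use an LLM-based classifier instead); the marker lists and strictness labels are assumptions, not a fixed scheme.

```python
# Map question type -> constraint strictness (labels are illustrative)
STRICTNESS_RULES = {
    "factual":    "strict",    # dates, names, configurations, policies
    "conceptual": "moderate",  # how does X work, A vs B comparisons
    "creative":   "minimal",   # open-ended or speculative questions
}

FACTUAL_MARKERS = ("when", "who", "which", "what version", "policy", "date")
CREATIVE_MARKERS = ("should we", "could we", "brainstorm", "redesign")

def classify_question(question):
    """Crude keyword heuristic; swap in a real classifier in production."""
    q = question.lower()
    if any(m in q for m in CREATIVE_MARKERS):
        return "creative"
    if any(m in q for m in FACTUAL_MARKERS):
        return "factual"
    return "conceptual"

def strictness_for(question):
    return STRICTNESS_RULES[classify_question(question)]
```

The router's output then selects which constraint prompt variant wraps the context block, so a single application can be strict about billing policies and relaxed about brainstorming.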
Ground your AI in a knowledge base that grows smarter over time. Adaptive Recall provides vector search, knowledge graph, and confidence-scored memories that constrain generation to verified facts.
Get Started Free