Module 04

Hallucinations & Mitigations

Duration: ~30 minutes of video content
Timestamps: 1:20:32 - 2:07:28

4.1 Understanding Hallucinations

What Are Hallucinations?

Hallucination: When an LLM generates plausible-sounding but factually incorrect or fabricated information.

User: Who wrote the book "The Azure Sky"?
Model: "The Azure Sky" was written by Jonathan Mitchell in 1987.

Reality: This book doesn't exist. The model invented it.

Root Causes

| Cause | Explanation |
|---|---|
| Training Objective | Models learn to generate plausible text, not truthful text |
| Pattern Completion | Statistical patterns don't encode truth |
| People-Pleasing | Models prefer giving answers over admitting ignorance |
| Compression Loss | Parameters store approximations, not facts |

Key Insight from Karpathy

"Models fabricate information due to their statistical nature. They learn they must always provide answers, even for nonsensical questions."


4.2 Types of Hallucinations

Categorization

┌─────────────────────────────────────────────────────────────┐
│                  HALLUCINATION TAXONOMY                      │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  1. FACTUAL HALLUCINATIONS                                   │
│     • Wrong dates, numbers, names                            │
│     • Invented citations or quotes                           │
│     • Non-existent events or people                          │
│                                                              │
│  2. FABRICATED CONTENT                                       │
│     • Made-up books, papers, products                        │
│     • Fictional URLs or sources                              │
│     • Invented code libraries                                │
│                                                              │
│  3. CONFLATION                                               │
│     • Mixing up similar entities                             │
│     • Combining features of different things                 │
│     • Temporal confusion (wrong time periods)                │
│                                                              │
│  4. SELF-HALLUCINATION                                       │
│     • Claiming capabilities it doesn't have                  │
│     • Inventing its own creation story                       │
│     • False claims about training data                       │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Self-Identity Hallucinations

Untuned base models confidently make up their origins:

User: Who created you?
Base Model: I was created by [random company], a team of researchers at [random university]...

Reality: Without hardcoded system prompts, models confabulate their identity.

4.3 Mitigation Strategy 1: Training for Uncertainty

Meta's Factuality Research

A systematic approach to reduce hallucinations:

┌─────────────────────────────────────────────────────────────┐
│              META FACTUALITY PIPELINE                        │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Step 1: Extract training snippets                          │
│          (Get passages from training data)                   │
│                     ↓                                        │
│  Step 2: Generate factual questions                          │
│          (Create Q&A pairs from passages)                    │
│                     ↓                                        │
│  Step 3: Produce model answers                               │
│          (Run model on questions multiple times)             │
│                     ↓                                        │
│  Step 4: Score accuracy                                      │
│          (Compare answers to source passages)                │
│                     ↓                                        │
│  Step 5: Train refusal behavior                              │
│          (Model learns to say "I don't know" when uncertain) │
│                                                              │
└─────────────────────────────────────────────────────────────┘
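
A minimal sketch of this pipeline in Python, assuming hypothetical helper callables (make_qa, sample_answer, is_supported) rather than Meta's actual tooling:

def build_refusal_examples(snippets, make_qa, sample_answer, is_supported,
                           n_samples=5, threshold=0.5):
    """Turn training snippets into SFT pairs, replacing unreliable answers
    with an explicit refusal (Steps 2-5 of the pipeline above)."""
    examples = []
    for snippet in snippets:                                   # Step 1: extracted snippets
        question, reference = make_qa(snippet)                 # Step 2: factual Q&A pair
        answers = [sample_answer(question) for _ in range(n_samples)]  # Step 3
        accuracy = sum(is_supported(a, reference) for a in answers) / n_samples  # Step 4
        if accuracy < threshold:
            # Step 5: the model can't reliably answer this, so train it to say so
            completion = "I don't have reliable information about that."
        else:
            completion = reference
        examples.append({"prompt": question, "completion": completion})
    return examples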

Teaching "I Don't Know"

Include training examples like:

User: What was the GDP of Atlantis in 2020?
Assistant: I don't have information about this. Atlantis is a mythical
location, so it wouldn't have economic statistics.

User: Who won the 2030 World Cup?
Assistant: I don't have information about events after my training
cutoff date. The 2030 World Cup hasn't occurred yet.
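
In chat-formatted SFT data, these refusals are ordinary prompt/response pairs; a minimal illustration (the record schema here is an assumption, not any specific vendor's format):

refusal_examples = [
    {"messages": [
        {"role": "user", "content": "What was the GDP of Atlantis in 2020?"},
        {"role": "assistant", "content": "I don't have information about this. "
                                         "Atlantis is a mythical location."},
    ]},
    {"messages": [
        {"role": "user", "content": "Who won the 2030 World Cup?"},
        {"role": "assistant", "content": "I don't have information about events "
                                         "after my training cutoff date."},
    ]},
]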

4.4 Mitigation Strategy 2: Tool Integration

The Tool Use Pattern

Instead of answering from parametric memory, models can call external tools:

┌─────────────────────────────────────────────────────────────┐
│                   TOOL-AUGMENTED LLM                         │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  User: What is Apple's current stock price?                  │
│                                                              │
│  ┌──────────────────┐                                       │
│  │    LLM MODEL     │                                       │
│  └────────┬─────────┘                                       │
│           │ Recognizes need for real-time data               │
│           ▼                                                  │
│  <SEARCH_START>Apple stock price<SEARCH_END>                │
│           │                                                  │
│           ▼                                                  │
│  ┌──────────────────┐                                       │
│  │  SEARCH ENGINE   │ → Returns: "$192.45 as of 2:30 PM"    │
│  └──────────────────┘                                       │
│           │                                                  │
│           ▼                                                  │
│  Model: Based on my search, Apple (AAPL) is currently        │
│  trading at $192.45.                                         │
│                                                              │
└─────────────────────────────────────────────────────────────┘
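
A minimal orchestration loop for this pattern, assuming illustrative tool tokens, a run_search callable, and a generic model.generate() interface (none of these names come from a specific vendor's API):

SEARCH_START, SEARCH_END = "<SEARCH_START>", "<SEARCH_END>"

def answer_with_search(model, run_search, user_message, max_rounds=3):
    # Build a running transcript that the model continues on each round.
    transcript = f"User: {user_message}\nAssistant: "
    for _ in range(max_rounds):
        output = model.generate(transcript)            # model may emit a tool call
        transcript += output
        if SEARCH_START not in output:
            return output                              # plain answer, no tool needed
        # Extract the query between the tool tokens and run the search.
        query = output.split(SEARCH_START, 1)[1].split(SEARCH_END, 1)[0]
        result = run_search(query)
        # Feed the result back so the model can ground its final answer in it.
        transcript += f"\n<SEARCH_RESULT>{result}</SEARCH_RESULT>\n"
    return transcript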

Available Tools

| Tool | Purpose | Reduces Hallucinations For |
|---|---|---|
| Web Search | Current information | Facts, news, prices |
| Calculator | Math operations | Arithmetic, formulas |
| Code Interpreter | Execute code | Complex calculations |
| Database | Structured queries | Company data, statistics |
| Knowledge Base | Verified facts | Domain-specific info |

Training Tool Use

Models learn tool patterns through examples:

User: What's 17.5% of 4,382?
Assistant: <CALC>4382 * 0.175</CALC>
System: 766.85
Assistant: 17.5% of 4,382 is 766.85.
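
Wiring up the <CALC> round-trip above takes only a tag parser plus a restricted evaluator; a sketch (the tag names come from the example, the safe-evaluation approach is an assumption):

import ast
import operator as op

# Allow only basic arithmetic so the tool can't execute arbitrary code.
_OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul,
        ast.Div: op.truediv, ast.USub: op.neg}

def safe_eval(expr):
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expr, mode="eval").body)

def run_calc_tag(model_output):
    """Extract the expression inside <CALC>...</CALC> and evaluate it."""
    expr = model_output.split("<CALC>", 1)[1].split("</CALC>", 1)[0]
    return safe_eval(expr)

print(run_calc_tag("<CALC>4382 * 0.175</CALC>"))   # -> 766.85 (up to float rounding)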

For KeenDreams: This pattern directly applies - KeenDreams can be a "memory tool" that the LLM learns to query when project context is needed.


4.5 Mitigation Strategy 3: Context Injection

Retrieval-Augmented Generation (RAG)

Provide relevant information in the context window:

┌─────────────────────────────────────────────────────────────┐
│                    RAG PATTERN                               │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  1. User Query: "What are the project requirements?"         │
│                         ↓                                    │
│  2. Retrieval: Search knowledge base for relevant docs       │
│                         ↓                                    │
│  3. Context Construction:                                    │
│     ┌─────────────────────────────────────────────────────┐ │
│     │ System: You are a helpful assistant. Use the        │ │
│     │ following documents to answer questions.            │ │
│     │                                                     │ │
│     │ [Document 1: Requirements.md content...]            │ │
│     │ [Document 2: Specs.md content...]                   │ │
│     │                                                     │ │
│     │ User: What are the project requirements?            │ │
│     └─────────────────────────────────────────────────────┘ │
│                         ↓                                    │
│  4. Model generates answer from provided context             │
│                                                              │
└─────────────────────────────────────────────────────────────┘
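
A minimal RAG sketch, assuming a hypothetical retrieve() function over the knowledge base and the same generic model.generate() interface used in the other sketches:

def answer_with_rag(model, retrieve, question, top_k=3):
    # Step 2: fetch the documents most relevant to the query.
    docs = retrieve(question, top_k=top_k)
    # Step 3: build a context window that contains the retrieved documents.
    doc_block = "\n\n".join(f"[Document {i + 1}: {doc}]" for i, doc in enumerate(docs))
    prompt = (
        "System: You are a helpful assistant. Use the following documents "
        "to answer questions.\n\n"
        f"{doc_block}\n\n"
        f"User: {question}\nAssistant:"
    )
    # Step 4: the model answers from the provided context, not parametric memory.
    return model.generate(prompt)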

Why Context > Parameters

| Aspect | Parametric | Contextual |
|---|---|---|
| Accuracy | Approximate | Exact |
| Updatability | Requires retraining | Instant |
| Verifiability | Opaque | Traceable |
| Reliability | Variable | High |

Key Insight

"Pasting information directly into context windows produces higher-quality outputs than relying on parametric knowledge."

For KeenDreams: This is the core value proposition - acting as the retrieval layer that surfaces relevant project memories into context, dramatically reducing hallucinations about project state.


4.6 Mitigation Strategy 4: Verification Chains

Multi-Step Verification

Have models check their own work:

User: Summarize this research paper.

Model (Step 1): [Initial summary]

Model (Step 2 - Verification):
Let me verify my summary against the source:
- Claim 1: "The study found X" ✓ (Page 3, paragraph 2)
- Claim 2: "Results showed Y" ✓ (Table 2)
- Claim 3: "The authors concluded Z" ⚠️ (I should reread the conclusion)

Model (Step 3): [Corrected summary with citations]
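
One way to run this chain programmatically is a second and third pass over the model's own output; a sketch using the same assumed model.generate() interface:

def summarize_with_verification(model, source_text):
    # Pass 1: initial summary.
    summary = model.generate(f"Summarize this paper:\n\n{source_text}")
    # Pass 2: audit each claim against the source.
    audit = model.generate(
        "Check each claim in the summary against the source. Mark every claim "
        "SUPPORTED or UNSUPPORTED and cite where you found it.\n\n"
        f"Source:\n{source_text}\n\nSummary:\n{summary}"
    )
    # Pass 3: rewrite, keeping only supported claims and adding citations.
    return model.generate(
        "Rewrite the summary so it contains only SUPPORTED claims, with citations.\n\n"
        f"Summary:\n{summary}\n\nAudit:\n{audit}"
    )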

Cross-Model Verification

Use multiple models to check each other:

# Illustrative pseudocode: generate(), verify(), is_supported, and
# flag_for_human_review are assumed interfaces, not a specific library's API.

# Generate with Model A
response_a = model_a.generate(prompt)

# Verify with Model B against the original source documents
verification = model_b.verify(
    claim=response_a,
    context=original_documents,
)

# Return the verified response, or flag it for human review
final_response = (response_a if verification.is_supported
                  else flag_for_human_review(response_a))

4.7 Practical Hallucination Detection

Red Flags

Watch for these patterns that often indicate hallucinations:

| Pattern | Example | Risk Level |
|---|---|---|
| Specific numbers without source | "Studies show 73.2% of..." | High |
| Named citations | "According to Smith et al. (2019)..." | High |
| Detailed URLs | "Available at example.com/specific/path" | Very High |
| Confident edge cases | Obscure historical details | High |
| Technical specifics | Exact code library versions | Medium |
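
These patterns can be screened for mechanically before a human or a second model takes a closer look; a rough regex-based sketch (the patterns mirror the table above and are heuristics, not a validated detector):

import re

# Heuristic red-flag patterns: (regex, description, risk level).
RED_FLAGS = [
    (r"\b\d{1,3}(\.\d+)?%", "specific percentage without source", "high"),
    (r"\bet al\.\s*\(\d{4}\)", "named citation", "high"),
    (r"https?://\S+/\S+", "detailed URL", "very high"),
    (r"\bversion\s+\d+\.\d+(\.\d+)?\b", "exact library version", "medium"),
]

def flag_hallucination_risks(response):
    """Return (description, risk, matched text) for every red-flag hit."""
    hits = []
    for pattern, description, risk in RED_FLAGS:
        for match in re.finditer(pattern, response, flags=re.IGNORECASE):
            hits.append((description, risk, match.group(0)))
    return hits

# flag_hallucination_risks("Studies show 73.2% of users...")
# -> [("specific percentage without source", "high", "73.2%")]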

Verification Techniques

  1. Ask for sources: "What source supports this claim?"
  2. Cross-reference: Search for claimed facts independently
  3. Challenge specifics: "Are you certain about that number?"
  4. Request uncertainty: "Rate your confidence 1-10"

4.8 Key Takeaways

Summary

| Strategy | Approach | Effectiveness |
|---|---|---|
| Uncertainty Training | Train model to say "I don't know" | Medium |
| Tool Integration | External search, calculators | High |
| Context Injection | RAG, document retrieval | Very High |
| Verification Chains | Self-check, multi-model | High |

For AI Analytics Platforms

Critical Monitoring Points:

  1. Hallucination Detection: Flag responses with high-risk patterns
  2. Source Attribution: Track whether responses cite provided context
  3. Confidence Scores: Capture model-expressed uncertainty
  4. Tool Usage: Monitor when models invoke external tools
  5. Verification Loops: Log correction chains

For KeenDreams

Applicable Learnings:

  1. Memory as Grounding: KeenDreams provides factual anchor for project context
  2. Source Tracking: Always include provenance with retrieved memories
  3. Confidence Metadata: Store reliability scores with memories
  4. Verification Integration: Enable models to cross-check against cloud brain

Practice Questions

  1. Why do LLMs hallucinate even when they "know" the correct answer?
  2. How does tool integration reduce hallucinations?
  3. When is parametric knowledge acceptable vs. requiring context?
  4. How would you design a hallucination detection system?

Next Module

Module 5: Reinforcement Learning


Timestamps: 1:20:32 - Hallucinations & Tool Use | 1:41:46 - Knowledge of Self | Research on factuality and mitigation strategies