Module 10 — Knowledge Architecture & RAG Design

Learning Objectives

Explain what RAG (Retrieval-Augmented Generation) is and why it matters for PMs
Design AI-ready documentation that retrieves accurately
Apply chunking and tagging principles to product knowledge bases
Evaluate when RAG is the right architecture vs. fine-tuning
Build a knowledge base audit checklist for your product

Lesson 10.1 — What Is RAG and Why Should PMs Care?

Retrieval-Augmented Generation (RAG) is the architecture behind many enterprise AI systems that answer questions about internal documents. Rather than relying on training data alone, a RAG system retrieves the most relevant chunks from a knowledge base and injects them into the prompt before the model generates a response.

RAG in plain English: Instead of asking an AI to remember everything, you ask it to search your documents first, then answer. The quality of the answer depends almost entirely on the quality of the documents.

RAG vs. Fine-Tuning

Dimension	RAG	Fine-Tuning
When to use	Knowledge changes frequently	Style or behaviour needs to change
Update cost	Low — update documents, not the model	High — retrain required
Transparency	High — can cite sources	Low — knowledge is baked into weights
PM control	Full — PMs own the document layer	Limited — requires ML engineering
Best for	Internal wikis, product specs, support docs	Tone, domain vocabulary, task specialisation

PM insight: You may not build RAG systems yourself, but you own the documents that feed them. Bad documentation = bad AI answers. Your knowledge architecture decisions directly determine AI quality.

How RAG Works — The PM-Relevant Steps

1. Ingest: Documents are split into chunks and stored with embeddings (vector representations)
2. Retrieve: The user query is converted to an embedding; the most similar chunks are retrieved
3. Augment: Retrieved chunks are injected into the prompt as context
4. Generate: The LLM generates a response grounded in the retrieved chunks

The PM-controllable layer is step 1 (Ingest): the quality, structure, and completeness of the documents that get ingested.
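To make the four steps concrete, here is a minimal sketch in Python. It is a toy, not a production design: the "embedding" is a simple word-count vector standing in for a real embedding model, the two sample chunks are hypothetical, and the final LLM call is left out. All function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ingest(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Step 1: store each chunk alongside its embedding."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], query: str, k: int = 1) -> list[str]:
    """Step 2: embed the query and return the k most similar chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def augment(query: str, context: list[str]) -> str:
    """Step 3: inject the retrieved chunks into the prompt as context."""
    joined = "\n\n".join(context)
    return f"Answer using only this context:\n\n{joined}\n\nQuestion: {query}"


# Hypothetical knowledge base with two chunks.
index = ingest([
    "Refund policy: customers can request a refund within 30 days of purchase.",
    "Export feature: reports can be exported as CSV or PDF from the dashboard.",
])
question = "How many days do customers have to request a refund?"
top_chunks = retrieve(index, question)
prompt = augment(question, top_chunks)
# Step 4 (generate) would send `prompt` to an LLM; that call is omitted here.
print(prompt)
```

Notice that nothing in steps 2 to 4 can rescue a chunk that is vague or missing: the model only sees what retrieval hands it.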

Lesson 10.2 — Designing AI-Ready Documentation

AI-ready documentation is structured for retrieval, not just human reading. Its principles differ from those of traditional documentation.

Chunking Principles

Principle	Why it matters	Practice
One idea per chunk	Retrieval returns the most relevant chunk; mixed topics dilute relevance	Each section header = one concept
Self-contained chunks	Retrieved chunks reach the model without their surrounding context	Each chunk should make sense in isolation
Consistent length	Very short or very long chunks reduce retrieval quality	Aim for 200–600 words per chunk
Descriptive headings	Headings are often included in the embedding signal	Use precise, keyword-rich headings
Metadata tagging	Enables filtered retrieval (e.g., "only search compliance docs")	Add front matter: product, team, date, category
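The "one idea per chunk" and "consistent length" rows can be checked mechanically. The sketch below assumes documents use `## ` headings as chunk boundaries (as in the template in this lesson) and flags chunks outside the 200 to 600 word range. The function names and the sample document are hypothetical, and real chunkers often work differently.

```python
import re


def chunk_by_heading(text: str) -> list[dict]:
    """Split a document into one chunk per '## ' heading, keeping the heading with its body."""
    chunks = []
    # Split at the start of every line that begins a level-2 heading.
    for part in re.split(r"(?m)^(?=## )", text):
        part = part.strip()
        if not part.startswith("## "):
            continue  # skip anything before the first heading (e.g. front matter)
        heading, _, body = part.partition("\n")
        chunks.append({
            "heading": heading[3:].strip(),
            "text": part,
            "words": len(body.split()),
        })
    return chunks


def length_flags(chunks: list[dict], min_words: int = 200, max_words: int = 600) -> list[str]:
    """Return the headings of chunks whose body falls outside the target word range."""
    return [c["heading"] for c in chunks if not min_words <= c["words"] <= max_words]


# Hypothetical document: one well-sized section and one that is far too short.
doc = "## Refund eligibility\n" + "word " * 250 + "\n\n## Export formats\nCSV and PDF are supported.\n"
chunks = chunk_by_heading(doc)
print([(c["heading"], c["words"]) for c in chunks])
print(length_flags(chunks))
```

A flagged heading is a prompt for a human decision, not an automatic failure: a short section may need merging with a neighbour, or may simply need more context written into it.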

AI-Ready Document Template

---
title: [Precise, descriptive title]
product: [Product name]
category: [spec | policy | guide | reference]
team: [Product | Engineering | Support]
date: YYYY-MM-DD
status: [draft | review | approved | deprecated]
---

## [Clear section heading]
[Self-contained content, 200-600 words, one concept only]

## [Next concept heading]
[Self-contained content...]
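A quick way to keep front matter honest is to check it automatically. The following is a sketch only: it reads simple `key: value` lines with the standard library instead of a full YAML parser, treats the category and status options from the template as the allowed values, and uses hypothetical sample content.

```python
from datetime import date

REQUIRED_FIELDS = ("title", "product", "category", "team", "date", "status")
ALLOWED_VALUES = {
    "category": {"spec", "policy", "guide", "reference"},
    "status": {"draft", "review", "approved", "deprecated"},
}


def parse_front_matter(text: str) -> dict:
    """Read simple 'key: value' pairs between the opening and closing '---' lines."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def front_matter_problems(text: str) -> list[str]:
    """List missing fields, unexpected category/status values, and malformed dates."""
    fields = parse_front_matter(text)
    problems = [f"missing or empty: {name}" for name in REQUIRED_FIELDS if not fields.get(name)]
    for name, allowed in ALLOWED_VALUES.items():
        if fields.get(name) and fields[name] not in allowed:
            problems.append(f"unexpected {name}: {fields[name]}")
    if fields.get("date"):
        try:
            date.fromisoformat(fields["date"])
        except ValueError:
            problems.append(f"date is not YYYY-MM-DD: {fields['date']}")
    return problems


# Hypothetical document with a typo in the status field.
sample = """---
title: Refund eligibility rules
product: FlowScale
category: policy
team: Support
date: 2024-05-01
status: aproved
---

## Refund eligibility
Customers can request a refund within 30 days of purchase.
"""
print(front_matter_problems(sample))
```

In a real pipeline a YAML library would replace the hand-rolled parser; the point here is only that all six fields can be verified before a document is ingested.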

Common AI-Readiness Failures

Implicit knowledge: "As discussed in Q3 planning..." — the AI has no access to that meeting
Nested conditionals: Long if/then/else structures become ambiguous when chunked
Acronyms without definition: Every acronym must be spelled out on first use, even in internal docs
Version stacking: Old and new versions of the same policy in the same file create contradictions
Image-only content: Diagrams with no text description are invisible to RAG
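Two of the failures above, implicit knowledge and undefined acronyms, lend themselves to a rough automated check. This sketch is deliberately crude: the phrase list is an assumption you would tune for your own team, and an acronym counts as "defined" only if it appears in parentheses somewhere in the text, for example "Retrieval-Augmented Generation (RAG)".

```python
import re

# Assumed starter list of phrases that point at context outside the document.
IMPLICIT_PHRASES = ("as discussed", "as mentioned", "as agreed", "see above")


def implicit_references(text: str) -> list[str]:
    """Return the implicit-knowledge phrases that appear in the text."""
    lowered = text.lower()
    return [phrase for phrase in IMPLICIT_PHRASES if phrase in lowered]


def undefined_acronyms(text: str) -> list[str]:
    """Return all-caps tokens (2 to 5 letters) that never appear in parentheses."""
    acronyms = set(re.findall(r"\b[A-Z]{2,5}\b", text))
    defined = set(re.findall(r"\(([A-Z]{2,5})\)", text))
    return sorted(acronyms - defined)


# Hypothetical chunk with one implicit reference and one undefined acronym.
chunk = (
    "As discussed in Q3 planning, the SLA for refunds is 5 days. "
    "Retrieval-Augmented Generation (RAG) systems cite this policy."
)
print(implicit_references(chunk))
print(undefined_acronyms(chunk))
```

A check like this will produce false positives (product names in capitals, for instance), so treat the output as a reading list for a human reviewer.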

Archive, don't delete: Deprecated documents should be clearly marked status: deprecated and moved to an archive folder — not deleted. Deletion breaks retrieval for historical queries.

Lesson 10.3 — Knowledge Base Governance

A knowledge base without governance degrades fast. As a PM, you are often responsible for the product knowledge that engineering, support, and AI systems rely on.

Knowledge Base Health Metrics

Metric	What to measure	Target
Freshness	% of docs updated in the last 90 days	>80%
Coverage	% of product features with a spec	>90%
Orphan rate	% of docs with no owner assigned	<10%
Deprecation hygiene	% of deprecated docs properly tagged	100%
Retrieval accuracy	% of AI answers citing the correct doc	>85%
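If your documents live in a system that can export an inventory, the five metrics reduce to simple percentages. The sketch below assumes a hypothetical inventory format: each document has `feature`, `category`, `updated`, `owner`, `status`, and a manually reviewed `superseded` flag, and retrieval accuracy comes from a small hand-labelled evaluation set. None of these field names come from a real tool.

```python
from datetime import date, timedelta


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place; 0.0 when there is nothing to measure."""
    return round(100 * part / whole, 1) if whole else 0.0


def health_snapshot(docs: list[dict], features: list[str],
                    eval_results: list[dict], today: date) -> dict:
    """Compute the five knowledge base health metrics as percentages."""
    cutoff = today - timedelta(days=90)
    superseded = [d for d in docs if d["superseded"]]
    specced = {d["feature"] for d in docs if d["category"] == "spec"}
    return {
        "freshness": pct(sum(d["updated"] >= cutoff for d in docs), len(docs)),
        "coverage": pct(len(specced & set(features)), len(features)),
        "orphan_rate": pct(sum(not d["owner"] for d in docs), len(docs)),
        "deprecation_hygiene": pct(
            sum(d["status"] == "deprecated" for d in superseded), len(superseded)),
        "retrieval_accuracy": pct(
            sum(r["cited"] == r["expected"] for r in eval_results), len(eval_results)),
    }


# Hypothetical inventory, feature list, and evaluation results.
docs = [
    {"feature": "refunds", "category": "spec", "updated": date(2024, 5, 20),
     "owner": "ana", "status": "approved", "superseded": False},
    {"feature": "export", "category": "spec", "updated": date(2023, 11, 2),
     "owner": "", "status": "approved", "superseded": True},
    {"feature": "refunds", "category": "policy", "updated": date(2024, 4, 1),
     "owner": "li", "status": "deprecated", "superseded": True},
    {"feature": "billing", "category": "guide", "updated": date(2024, 6, 1),
     "owner": "sam", "status": "approved", "superseded": False},
]
features = ["refunds", "export", "billing", "sso"]
eval_results = [
    {"expected": "refund-spec", "cited": "refund-spec"},
    {"expected": "export-spec", "cited": "refund-spec"},
]
snapshot = health_snapshot(docs, features, eval_results, today=date(2024, 6, 15))
print(snapshot)
```

Comparing each value against its target in the table above shows where to act first; in this made-up inventory, every metric misses its target.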

Lab 10 — Knowledge Base Audit & AI-Readiness Rewrite

You will audit a set of FlowScale documents for AI-readiness and rewrite one to meet the standard.

Audit an existing document

Take any product document you have (a spec, policy, or wiki page). Paste it into your AI agent with this prompt:

Prompt: "Review this document for AI-readiness in a RAG system. Flag any issues with: chunk boundaries (mixed topics), self-containedness (chunks that assume external context), implicit knowledge (references to meetings or decisions not in the document), acronyms without definitions, version stacking, and missing metadata. Provide a score from 1–10 and specific fixes for each issue found."

Rewrite for AI-readiness

Take the lowest-scoring section from your audit and rewrite it using the AI-ready document template from Lesson 10.2. Use this prompt to help:

Prompt: "Rewrite this section to be AI-ready for RAG retrieval. Apply these rules: one concept per section, self-contained (no external references), 200–400 words, descriptive heading, all acronyms defined on first use, add YAML front matter with title, product, category, team, date, and status fields."

Create a knowledge base health snapshot

Using the metrics table from Lesson 10.3, create a simple health snapshot for your product's knowledge base (or use FlowScale as a hypothetical). For each metric, estimate the current value and identify the single biggest improvement action.

Deliverables

Audit report: scored document with flagged issues and recommended fixes
Rewritten section: AI-ready version with YAML front matter and chunked structure
Health snapshot: 5-metric table with current estimate and top action per metric

How to Verify Completion

Audit report identifies at least 3 specific AI-readiness issues with actionable fixes
Rewritten section has valid YAML front matter with all 6 fields populated
Rewritten section contains no external references ("as discussed", "per Q3 planning", etc.)
Each section in the rewrite covers exactly one concept and is 200–400 words
Health snapshot covers all 5 metrics with realistic current estimates

Done when: You paste your rewritten section into a fresh AI chat, ask it a question about that section's topic, and receive an accurate, source-citable answer without the AI needing any additional context.

Module 10 Quiz

1. What does RAG stand for and what is its core advantage over fine-tuning for PMs?

RAG = Retrieval-Augmented Generation. Its key PM advantage is that you update the underlying documents (which PMs control) rather than retraining the model (which requires ML engineering). The result is low update cost and high PM ownership.

2. What is the recommended word count per document chunk for optimal RAG retrieval?

200–600 words is the practical sweet spot. Very short chunks lack enough context for accurate answers; very long chunks dilute the relevance signal and reduce retrieval precision.

3. Which of these is an example of an AI-readiness failure in documentation?

Implicit knowledge — references to meetings, decisions, or context not included in the document — is one of the most common AI-readiness failures. The AI has no access to that context when the chunk is retrieved.

4. Why should deprecated documents be tagged rather than deleted?

Deletion breaks retrieval for historical queries. If someone asks "what was our refund policy in 2023?", a deleted policy document means an AI system cannot answer. Tagging with status: deprecated and archiving is the correct approach.

5. What does the PM-controllable layer in a RAG pipeline primarily consist of?

PMs own the document layer. The quality of your product specs, support docs, and policies directly determines the quality of every AI answer generated from them. This is the highest-leverage point of PM influence in a RAG system.

6. When is fine-tuning preferable to RAG?

Fine-tuning changes model behaviour — how it writes, what vocabulary it uses, how it approaches tasks. RAG changes what the model knows. If you need a consistent domain tone or specialised task format, fine-tuning is the right choice. For current product knowledge, use RAG.
