Module 10 · 55 min

Knowledge Architecture & RAG Design

Structure product knowledge so AI can find, retrieve, and use it reliably.

Learning Objectives

Lesson 10.1 — What Is RAG and Why Should PMs Care?

Retrieval-Augmented Generation (RAG) is the architecture behind most enterprise AI systems that answer questions about internal documents. Rather than relying on training data alone, a RAG system retrieves relevant chunks from a knowledge base and injects them into the prompt before generating a response.

RAG in plain English: Instead of asking an AI to remember everything, you ask it to search your documents first, then answer. The quality of the answer depends almost entirely on the quality of the documents.

RAG vs. Fine-Tuning

| Dimension | RAG | Fine-Tuning |
|---|---|---|
| When to use | Knowledge changes frequently | Style or behaviour needs to change |
| Update cost | Low: update documents, not the model | High: retrain required |
| Transparency | High: can cite sources | Low: knowledge is baked into weights |
| PM control | Full: PMs own the document layer | Limited: requires ML engineering |
| Best for | Internal wikis, product specs, support docs | Tone, domain vocabulary, task specialisation |

PM insight: You may not build RAG systems yourself, but you own the documents that feed them. Bad documentation = bad AI answers. Your knowledge architecture decisions directly determine AI quality.

How RAG Works — The PM-Relevant Steps

  1. Ingest: Documents are split into chunks and stored with embeddings (vector representations)
  2. Retrieve: User query is converted to an embedding; most similar chunks are retrieved
  3. Augment: Retrieved chunks are injected into the prompt as context
  4. Generate: LLM generates a response grounded in the retrieved chunks

The PM-controllable layer is step 1 — the quality, structure, and completeness of the documents.
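
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `embed` function is a toy bag-of-words counter standing in for a real embedding model, the sample chunks are hypothetical, and the generate step is left out because it depends on whichever LLM your team uses.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def ingest(chunks: list[str]) -> list[tuple[str, Counter]]:
    # Step 1: store each chunk alongside its vector.
    return [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(index: list[tuple[str, Counter]], query: str, k: int = 1) -> list[str]:
    # Step 2: embed the query and return the k most similar chunks.
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def augment(query: str, context: list[str]) -> str:
    # Step 3: inject the retrieved chunks into the prompt.
    joined = "\n\n".join(context)
    return f"Answer using only this context:\n\n{joined}\n\nQuestion: {query}"

# Step 4 (generate) would send the prompt to an LLM; omitted here.

# Hypothetical sample chunks, for illustration only.
chunks = [
    "Refund policy: customers can request a refund within 30 days of purchase.",
    "Export feature: reports can be exported as CSV or PDF from the dashboard.",
]
question = "How many days do customers have to request a refund?"
index = ingest(chunks)
top = retrieve(index, question)
prompt = augment(question, top)
print(prompt)
```

Notice that nothing in steps 2 to 4 can recover information that was never written into a chunk in step 1, which is why the document layer carries so much weight.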

Lesson 10.2 — Designing AI-Ready Documentation

AI-ready documentation is structured for retrieval, not just human reading. The principles differ from traditional documentation.

Chunking Principles

| Principle | Why it matters | Practice |
|---|---|---|
| One idea per chunk | Retrieval returns the most relevant chunk; mixed topics dilute relevance | Each section header = one concept |
| Self-contained chunks | Retrieved chunks are shown without surrounding context | Each chunk should make sense in isolation |
| Consistent length | Very short or very long chunks reduce retrieval quality | Aim for 200–600 words per chunk |
| Descriptive headings | Headings are often included in the embedding signal | Use precise, keyword-rich headings |
| Metadata tagging | Enables filtered retrieval (e.g., "only search compliance docs") | Add front matter: product, team, date, category |
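
To make the "one idea per chunk" and "consistent length" principles concrete, here is a rough sketch of how a chunker might split a document on section headings and flag chunks outside the 200–600 word range. It assumes each concept starts with a `## ` heading; the function names and the sample document are hypothetical, and real ingestion pipelines often use more elaborate splitting rules.

```python
import re
from typing import Optional

MIN_WORDS, MAX_WORDS = 200, 600  # target range from the chunking principles

def split_by_heading(markdown: str) -> list[dict]:
    """Return one chunk per '## ' heading. Text before the first heading is ignored."""
    chunks = []
    for part in re.split(r"(?m)^## ", markdown)[1:]:
        heading, _, body = part.partition("\n")
        chunks.append({"heading": heading.strip(), "body": body.strip()})
    return chunks

def length_issue(chunk: dict) -> Optional[str]:
    """Describe the length problem for a chunk, or return None if it is in range."""
    words = len(chunk["body"].split())
    if words < MIN_WORDS:
        return f"too short ({words} words)"
    if words > MAX_WORDS:
        return f"too long ({words} words)"
    return None

# Hypothetical document: one section of 250 words, one of 40 words.
doc = "## Refund eligibility window\n" + "word " * 250 + "\n## Export formats\n" + "word " * 40
for chunk in split_by_heading(doc):
    print(chunk["heading"], "->", length_issue(chunk) or "ok")
```

A report like this gives you a quick list of sections to merge or split before the document is ingested.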

AI-Ready Document Template

---
title: [Precise, descriptive title]
product: [Product name]
category: [spec | policy | guide | reference]
team: [Product | Engineering | Support]
date: YYYY-MM-DD
status: [draft | review | approved | deprecated]
---

## [Clear section heading]
[Self-contained content, 200-600 words, one concept only]

## [Next concept heading]
[Self-contained content...]
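
A template only helps if documents actually follow it, so teams sometimes add an automated check. The sketch below reads the front matter and reports missing or invalid fields. It is deliberately simple: `parse_front_matter` handles only flat `key: value` lines rather than full YAML, `team` is treated as free text, and the sample document (including its date and the misspelled status) is made up for illustration.

```python
import re
from datetime import date

REQUIRED = ["title", "product", "category", "team", "date", "status"]
ALLOWED = {
    "category": {"spec", "policy", "guide", "reference"},
    "status": {"draft", "review", "approved", "deprecated"},
}

def parse_front_matter(text: str) -> dict[str, str]:
    """Read flat 'key: value' lines between the opening and closing '---'.
    Not a full YAML parser; it only covers the flat template shown above."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def validate(fields: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = [f"missing field: {name}" for name in REQUIRED if not fields.get(name)]
    for name, allowed in ALLOWED.items():
        if fields.get(name) and fields[name] not in allowed:
            problems.append(f"invalid {name}: {fields[name]}")
    value = fields.get("date", "")
    if value:
        try:
            if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
                raise ValueError(value)
            date.fromisoformat(value)  # rejects impossible dates such as 2024-02-31
        except ValueError:
            problems.append(f"date is not YYYY-MM-DD: {value}")
    return problems

# Hypothetical document with a misspelled status value.
sample = """---
title: Refund eligibility window
product: FlowScale
category: policy
team: Support
date: 2024-01-15
status: aproved
---

## Refund eligibility window
Customers can request a refund within the eligibility window.
"""
print(validate(parse_front_matter(sample)))
```

Run on the sample, the check reports the misspelled status, the kind of small error that would otherwise silently break status-based filtering.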

Common AI-Readiness Failures

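The failures to look for are mixed-topic chunks, chunks that assume external context, implicit knowledge (references to meetings or decisions not in the document), acronyms without definitions, version stacking, and missing metadata.

Archive, don't delete: Deprecated documents should be clearly marked with status: deprecated and moved to an archive folder, not deleted. Deletion breaks retrieval for historical queries.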
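
One way a retrieval layer might honour the archive rule is to filter on the status field before similarity search runs: deprecated documents are skipped for everyday questions but remain available when a query is explicitly historical. The sketch below assumes a hypothetical `Doc` record and `candidates` helper; how a real system decides that a query is historical is out of scope here.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    title: str
    status: str  # draft | review | approved | deprecated
    date: str    # YYYY-MM-DD

def candidates(docs: list[Doc], include_deprecated: bool = False) -> list[Doc]:
    """Filter on status metadata before any similarity search happens."""
    if include_deprecated:
        return list(docs)
    return [d for d in docs if d.status != "deprecated"]

# Hypothetical knowledge base with an archived policy and its replacement.
docs = [
    Doc("Refund policy (2023)", "deprecated", "2023-02-01"),
    Doc("Refund policy", "approved", "2024-03-10"),
]
current = candidates(docs)                              # everyday questions
historical = candidates(docs, include_deprecated=True)  # "what was the policy in 2023?"
print([d.title for d in current])
print([d.title for d in historical])
```

If the 2023 policy had been deleted instead of tagged, the second call would have nothing to return.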

Lesson 10.3 — Knowledge Base Governance

A knowledge base without governance degrades fast. As a PM, you are often responsible for the product knowledge that engineering, support, and AI systems rely on.

Knowledge Base Health Metrics

| Metric | What to measure | Target |
|---|---|---|
| Freshness | % of docs updated in last 90 days | >80% |
| Coverage | % of product features with a spec | >90% |
| Orphan rate | % of docs with no owner assigned | <10% |
| Deprecation hygiene | % of deprecated docs properly tagged | 100% |
| Retrieval accuracy | % of AI answers citing the correct doc | >85% |
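
Three of these metrics can be computed directly from document metadata if your knowledge base exposes it. The sketch below shows one possible calculation for freshness, coverage, and orphan rate; the `DocRecord` fields, the feature list, and the sample values are all hypothetical. Deprecation hygiene and retrieval accuracy are left out because they need an archive inventory and an evaluation set respectively.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class DocRecord:
    title: str
    feature: Optional[str]  # product feature this doc specifies, if any
    owner: Optional[str]
    last_updated: date

def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal; 0.0 when there is nothing to measure."""
    return round(100 * part / whole, 1) if whole else 0.0

def health_snapshot(docs: list[DocRecord], features: list[str], today: date) -> dict[str, float]:
    cutoff = today - timedelta(days=90)
    fresh = sum(d.last_updated >= cutoff for d in docs)
    orphans = sum(not d.owner for d in docs)
    covered = {d.feature for d in docs if d.feature} & set(features)
    return {
        "freshness_pct": pct(fresh, len(docs)),
        "coverage_pct": pct(len(covered), len(features)),
        "orphan_rate_pct": pct(orphans, len(docs)),
    }

# Hypothetical inventory, for illustration only.
docs = [
    DocRecord("Export spec", "export", "dana", date(2024, 5, 20)),
    DocRecord("Billing spec", "billing", None, date(2024, 4, 2)),
    DocRecord("SSO guide", "sso", "lee", date(2023, 11, 15)),
    DocRecord("Glossary", None, "dana", date(2024, 5, 1)),
]
features = ["export", "billing", "sso", "webhooks", "audit-log"]
snapshot = health_snapshot(docs, features, today=date(2024, 6, 1))
print(snapshot)
```

In this made-up inventory all three values miss the targets in the table, which would point to assigning owners and writing the two missing specs as the first actions.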

Lab 10 — Knowledge Base Audit & AI-Readiness Rewrite

You will audit a set of FlowScale documents for AI-readiness and rewrite one to meet the standard.

Step 1: Audit an existing document

Take any product document you have (a spec, policy, or wiki page). Paste it into your AI agent with this prompt:

Prompt: "Review this document for AI-readiness in a RAG system. Flag any issues with: chunk boundaries (mixed topics), self-containedness (chunks that assume external context), implicit knowledge (references to meetings or decisions not in the document), acronyms without definitions, version stacking, and missing metadata. Provide a score from 1–10 and specific fixes for each issue found."

Step 2: Rewrite for AI-readiness

Take the lowest-scoring section from your audit and rewrite it using the AI-ready document template from Lesson 10.2. Use this prompt to help:

Prompt: "Rewrite this section to be AI-ready for RAG retrieval. Apply these rules: one concept per section, self-contained (no external references), 200–400 words, descriptive heading, all acronyms defined on first use, add YAML front matter with title, product, category, team, date, and status fields."

Step 3: Create a knowledge base health snapshot

Using the metrics table from Lesson 10.3, create a simple health snapshot for your product's knowledge base (or use FlowScale as a hypothetical). For each metric, estimate the current value and identify the single biggest improvement action.

Deliverables

  • Audit report: scored document with flagged issues and recommended fixes
  • Rewritten section: AI-ready version with YAML front matter and chunked structure
  • Health snapshot: 5-metric table with current estimate and top action per metric

How to Verify Completion

  • Audit report identifies at least 3 specific AI-readiness issues with actionable fixes
  • Rewritten section has valid YAML front matter with all 6 fields populated
  • Rewritten section contains no external references ("as discussed", "per Q3 planning", etc.)
  • Each section in the rewrite covers exactly one concept and is 200–400 words
  • Health snapshot covers all 5 metrics with realistic current estimates
Done when: You paste your rewritten section into a fresh AI chat, ask it a question about that section's topic, and receive an accurate, source-citable answer without the AI needing any additional context.
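
Some of the verification criteria above can be pre-checked with a small script before you hand in the rewrite. The sketch below flags phrases that point outside the document, acronyms that never appear in a parenthetical definition, and sections outside the 200–400 word range. The pattern list is a hypothetical starter set and the acronym check is a rough heuristic, so treat the output as prompts for a human review rather than a verdict.

```python
import re

# Phrases that usually point at context outside the document (hypothetical starter list).
EXTERNAL_PATTERNS = [
    r"\bas discussed\b",
    r"\bper (the )?Q[1-4] planning\b",
    r"\bas mentioned (earlier|previously)\b",
    r"\bsee (the )?(meeting|thread|email)\b",
]

def external_references(text: str) -> list[str]:
    """Return every matched phrase, grouped by pattern in list order."""
    hits = []
    for pattern in EXTERNAL_PATTERNS:
        hits += [m.group(0) for m in re.finditer(pattern, text, flags=re.IGNORECASE)]
    return hits

def undefined_acronyms(text: str) -> list[str]:
    """Acronyms (2-5 capitals) that never appear in parentheses,
    as in 'Service Level Agreement (SLA)'. Heuristic only."""
    acronyms = set(re.findall(r"\b[A-Z]{2,5}\b", text))
    defined = set(re.findall(r"\(([A-Z]{2,5})\)", text))
    return sorted(acronyms - defined)

def word_count_ok(text: str, low: int = 200, high: int = 400) -> bool:
    """Check the lab's 200-400 word target for a single section."""
    return low <= len(text.split()) <= high

# Hypothetical section with two external references and one undefined acronym.
section = (
    "Refunds follow the Service Level Agreement (SLA). "
    "As discussed, the CSM approves exceptions per Q3 planning."
)
print(external_references(section))
print(undefined_acronyms(section))
print(word_count_ok(section))
```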

Module 10 Quiz

1. What does RAG stand for and what is its core advantage over fine-tuning for PMs?

RAG = Retrieval-Augmented Generation. Its key PM advantage is that you update the underlying documents (which PMs control) rather than retraining the model (which requires ML engineering). The result is low update cost and high PM ownership.

2. What is the recommended word count per document chunk for optimal RAG retrieval?

200–600 words is the practical sweet spot. Very short chunks lack enough context for accurate answers; very long chunks dilute the relevance signal and reduce retrieval precision.

3. What is a common example of an AI-readiness failure in documentation?

Implicit knowledge — references to meetings, decisions, or context not included in the document — is one of the most common AI-readiness failures. The AI has no access to that context when the chunk is retrieved.

4. Why should deprecated documents be tagged rather than deleted?

Deletion breaks retrieval for historical queries. If someone asks "what was our refund policy in 2023?", a deleted policy document means an AI system cannot answer. Tagging with status: deprecated and archiving is the correct approach.

5. What does the PM-controllable layer in a RAG pipeline primarily consist of?

PMs own the document layer. The quality of your product specs, support docs, and policies directly determines the quality of every AI answer generated from them. This is the highest-leverage point of PM influence in a RAG system.

6. When is fine-tuning preferable to RAG?

Fine-tuning changes model behaviour — how it writes, what vocabulary it uses, how it approaches tasks. RAG changes what the model knows. If you need a consistent domain tone or specialised task format, fine-tuning is the right choice. For current product knowledge, use RAG.