Who this path serves

Teams where answers must be grounded in internal documents, policies, or knowledge bases that change frequently. The hardest bugs are often retrieval and freshness—not raw fluency.

Step 1 — RAG in depth

Begin with the RAG experiment hub and RAG in production. Use the retrieval theme as a checklist for chunking and routing.
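To make the chunking item on that checklist concrete, here is a minimal sketch of fixed-size chunking with overlap — the baseline most pipelines start from before moving to structure-aware splitting. The function name, sizes, and overlap are illustrative assumptions, not anything prescribed by the hub documents.

```python
def chunk_text(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    Hypothetical baseline chunker: fixed windows with a small overlap so
    sentences spanning a boundary survive in at least one chunk.
    """
    if size <= overlap:
        raise ValueError("chunk size must exceed overlap")
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
        if start + size >= len(text):
            break
    return chunks
```

A checklist review would then ask: do chunk boundaries respect headings and tables, and does the overlap actually capture the sentences your users query for?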

Step 2 — Evaluation with retrieval in mind

Proceed to the evaluation hub and Beyond accuracy, focusing on suites that include empty retrieval results, stale citations, and ambiguous queries. Cross-reference safety for policy-bound corpora.
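A suite of that shape can be sketched as a list of retrieval edge cases scored against an answer function. Everything below is an illustrative assumption — the `RetrievalCase` fields, the convention that `answer_fn` returns `None` to abstain, and the sample cases are not from the linked hubs.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class RetrievalCase:
    query: str
    retrieved: list[str]   # simulated retrieval results for this case
    expect_abstain: bool   # should the system decline to answer?

def evaluate(answer_fn: Callable[[str, list[str]], Optional[str]],
             cases: list[RetrievalCase]) -> dict[str, int]:
    """Score an answer function on retrieval edge cases.

    answer_fn returns None when it abstains. A suite that omits
    empty-retrieval cases will green-light systems that answer anyway.
    """
    results = {"pass": 0, "fail": 0}
    for case in cases:
        abstained = answer_fn(case.query, case.retrieved) is None
        results["pass" if abstained == case.expect_abstain else "fail"] += 1
    return results

cases = [
    RetrievalCase("What is our refund window?", ["Refunds: 30 days."], False),
    RetrievalCase("What is our crypto policy?", [], True),  # empty retrieval
]
```

A system that always answers passes the first case and fails the second — exactly the broken experience an accuracy-only suite would miss.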

Step 3 — Prompt interfaces for citations

Finish with the prompt experiment / interface essay so citation rules are encoded where the model cannot casually bypass them.
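One way to encode a citation rule outside the prompt is a post-hoc check in the interface layer: reject or regenerate any answer that cites a document id not present in the retrieved set. The `[doc:ID]` citation format and function name here are assumptions for illustration, not a format the essay prescribes.

```python
import re

def validate_citations(answer: str, retrieved_ids: set[str]) -> list[str]:
    """Return citation ids in the answer that match no retrieved document.

    Because this runs in code after generation, the model cannot casually
    bypass it the way it can drift past a prompt instruction.
    """
    cited = re.findall(r"\[doc:([\w-]+)\]", answer)  # assumed [doc:ID] format
    return [c for c in cited if c not in retrieved_ids]
```

An empty return list means every citation is grounded; anything else is grounds for rejection before the answer reaches the user.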

Related

Operations for pipelines; Path C when you need a rollout checklist across all layers.

Why retrieval before generic eval depth

If your users live in documents, fixing chunking and freshness removes whole classes of bugs that no amount of prompt tuning will address. Evaluation suites that ignore empty retrieval or stale citations will green-light broken experiences—so this path front-loads RAG reality.

Adjacent paths

Path A builds prompt and eval foundations without assuming doc-heavy traffic. The reading paths overview lists every sequence.

Corpus ownership

Clarify who approves document updates, how legal holds or takedowns affect embeddings, and how deletions propagate to the index. Many RAG failures trace to process gaps—an updated PDF that never reached the pipeline—rather than vector dimension or model choice.
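Deletion propagation is also checkable in code: on a schedule, diff the ids in the vector index against the ids in the source corpus, and treat any leftover as a purge failure. This is a minimal sketch assuming you can enumerate both id sets; the function name is hypothetical.

```python
def stale_index_entries(index_doc_ids: set[str],
                        corpus_doc_ids: set[str]) -> set[str]:
    """Ids still present in the vector index after removal from the corpus.

    A takedown or legal hold that deletes a document must also purge its
    embeddings; a non-empty result here means the process gap is real.
    """
    return index_doc_ids - corpus_doc_ids
```

The same diff in the other direction (corpus ids missing from the index) catches the updated PDF that never reached the pipeline.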