# Chapter 17 Extension Exercises

These exercises extend the guided lab. Students should attempt them after completing the starter source inventory, chunk schema, evaluation set, and architecture note.

## Exercise 1: Add multilingual source handling

Extend the chunk schema and evaluation set for a bilingual assistant that serves English and Uzbek policy content. Define how language detection, translation, embedding model choice, citation language, and fallback behavior will work. Explain whether you will embed translated text, original text, or both, and describe how you will measure retrieval quality for each language.
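
One possible starting point for the schema extension is sketched below. It assumes ISO 639-1 language codes (`en`, `uz`) and a design in which the original text is always embedded and a stored translation is embedded optionally. The `Chunk` fields and the helpers `texts_to_embed` and `citation_language` are hypothetical names chosen for illustration, not part of the starter lab.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    source_id: str
    language: str                               # language of `text` as authored
    text: str
    translated_text: Optional[str] = None       # stored translation, if any
    translation_language: Optional[str] = None  # language of `translated_text`


def texts_to_embed(chunk: Chunk, embed_translation: bool) -> List[Tuple[str, str]]:
    """Return (language, text) pairs to send to the embedding model."""
    pairs = [(chunk.language, chunk.text)]
    if embed_translation and chunk.translated_text and chunk.translation_language:
        pairs.append((chunk.translation_language, chunk.translated_text))
    return pairs


def citation_language(query_language: str, chunk: Chunk) -> str:
    """Cite in the query language when the chunk has text in it; otherwise
    fall back to the language the chunk was authored in."""
    if query_language == chunk.language:
        return chunk.language
    if chunk.translated_text and chunk.translation_language == query_language:
        return query_language
    return chunk.language
```

Tagging every embedded text with its language makes it possible to report retrieval quality separately for English and Uzbek queries, which the exercise asks you to measure.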

## Exercise 2: Design a hybrid retrieval policy

Add a retrieval policy that combines keyword filtering, vector similarity, metadata filters, and reranking. Specify which metadata filters must run before vector search and which reranking features can run after candidate retrieval. Include a short explanation of why permission checks should not be delayed until prompt assembly.
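
The ordering of stages can be sketched in plain Python, as below. This is a toy illustration under stated assumptions, not a production retriever: the `Doc` fields, the `retrieve` function, the in-memory list of documents, and the 0.7/0.3 blend weights are all hypothetical. The point is only that permission and metadata filters run before any similarity scoring, so restricted text never reaches the candidate list, the reranker, or the prompt.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set


@dataclass
class Doc:
    doc_id: str
    text: str
    vector: List[float]
    allowed_groups: FrozenSet[str]
    department: str


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(docs: List[Doc], query_vector: List[float], query_terms: Set[str],
             user_groups: FrozenSet[str], department: Optional[str] = None,
             k: int = 3) -> List[str]:
    # Stage 1: permission and metadata filters run BEFORE any scoring.
    visible = [
        d for d in docs
        if d.allowed_groups & user_groups
        and (department is None or d.department == department)
    ]
    # Stage 2: vector similarity over the visible documents only.
    scored = sorted(((cosine(query_vector, d.vector), d) for d in visible),
                    key=lambda pair: pair[0], reverse=True)
    candidates = scored[: k * 2]

    # Stage 3: rerank candidates by blending similarity with keyword overlap.
    def rerank_score(sim: float, doc: Doc) -> float:
        words = set(doc.text.lower().split())
        overlap = len(words & query_terms) / len(query_terms) if query_terms else 0.0
        return 0.7 * sim + 0.3 * overlap  # illustrative weights, not tuned

    reranked = sorted(candidates, key=lambda pair: rerank_score(*pair), reverse=True)
    return [doc.doc_id for _, doc in reranked[:k]]
```

If the permission check were moved to prompt assembly instead, a restricted document could still occupy candidate slots, influence reranking, and appear in retrieval logs even when it is dropped from the final prompt.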

## Exercise 3: Build a freshness and rollback plan

Design an indexing workflow for sources with daily, weekly, monthly, and on-change updates. Define how content hashes, document versions, embedding model versions, blue-green index aliases, and rollback procedures will work when a bad parsing job or embedding model change reduces answer quality.
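
A minimal sketch of two pieces of such a plan follows: deciding whether a document needs re-embedding, and swapping a blue-green alias with a one-step rollback. It assumes SHA-256 over whitespace-normalized text as the content hash and an in-memory stand-in for the alias. The names `content_hash`, `needs_reembedding`, and `IndexAlias`, and the shape of the stored record, are hypothetical; a real vector store or search engine would provide its own alias mechanism.

```python
import hashlib
from typing import Optional


def content_hash(text: str) -> str:
    """SHA-256 of whitespace-normalized text, so reflowed text hashes the same."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def needs_reembedding(stored: Optional[dict], new_text: str,
                      embedding_model_version: str) -> bool:
    """Re-embed when the document is new, its content changed, or the
    embedding model version differs from the one recorded at index time."""
    if stored is None:
        return True
    return (stored["content_hash"] != content_hash(new_text)
            or stored["embedding_model_version"] != embedding_model_version)


class IndexAlias:
    """In-memory stand-in for a blue-green alias: queries read `live`."""

    def __init__(self, live: str) -> None:
        self.live = live
        self.previous: Optional[str] = None

    def promote(self, candidate: str) -> None:
        # Keep the old index around so it can be restored.
        self.previous, self.live = self.live, candidate

    def rollback(self) -> None:
        if self.previous is None:
            raise RuntimeError("no previous index to roll back to")
        self.live, self.previous = self.previous, None
```

Because the previous index is kept intact until the new one has passed evaluation, rolling back from a bad parsing job or model change is an alias switch rather than a full reindex.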

## Exercise 4: Expand the evaluation set

Increase `evaluation_questions.csv` to at least 50 questions. At minimum, include 25 normal grounded-answer questions, 5 source-lookup questions, 5 freshness questions, 5 permission-dependent questions, 5 prompt-injection or jailbreak attempts, and 5 sensitive-data refusal cases. Document how you would score context precision, context recall, faithfulness, answer relevance, refusal accuracy, citation correctness, and latency.
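
The sketch below shows one way to check the category minimums and to compute the two retrieval metrics in their simplest set-based form. It assumes each row of the evaluation set has a `category` column and that every question is labeled with the IDs of its relevant chunks; the column name and the category labels are hypothetical. Some evaluation frameworks define context precision with rank weighting, so treat these definitions as a baseline rather than a standard.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set

# Minimum counts taken from the exercise text; the label strings are assumed.
REQUIRED_MINIMUMS = {
    "grounded_answer": 25,
    "source_lookup": 5,
    "freshness": 5,
    "permission_dependent": 5,
    "prompt_injection": 5,
    "sensitive_data_refusal": 5,
}


def category_shortfalls(rows: Iterable[Dict[str, str]]) -> Dict[str, int]:
    """Return how many questions each under-filled category still needs."""
    counts = Counter(row["category"] for row in rows)
    return {
        category: minimum - counts[category]
        for category, minimum in REQUIRED_MINIMUMS.items()
        if counts[category] < minimum
    }


def context_precision(retrieved_ids: List[str], relevant_ids: Set[str]) -> float:
    """Fraction of retrieved chunks that are labeled relevant."""
    if not retrieved_ids:
        return 0.0
    hits = sum(1 for chunk_id in retrieved_ids if chunk_id in relevant_ids)
    return hits / len(retrieved_ids)


def context_recall(retrieved_ids: List[str], relevant_ids: Set[str]) -> float:
    """Fraction of labeled relevant chunks that were retrieved."""
    if not relevant_ids:
        return 1.0  # nothing to find, so nothing was missed
    return len(set(retrieved_ids) & relevant_ids) / len(relevant_ids)
```

Rows can be loaded with `csv.DictReader` from the standard library and passed directly to `category_shortfalls`; an empty result means every minimum is met.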

## Exercise 5: Prepare an operations dashboard specification

Write a dashboard specification for production RAG operations. Include ingestion lag, parsing failure rate, embedding backlog, index freshness, retrieval latency, reranker latency, answer latency, citation coverage, refusal rate, groundedness score, top failing questions, cost per answer, and incident review queue size.
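
A specification is easier to review when each metric has an explicit definition. As a hedged illustration, the sketch below derives four of the listed metrics from per-request answer events. The event fields (`refused`, `citations`, `latency_ms`, `cost_usd`), the nearest-rank percentile method, and the choice to divide total cost by non-refused answers are all assumptions; your specification should state whichever definitions you actually adopt.

```python
import math
from typing import Dict, List


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError("percentile of empty list")
    ordered = sorted(values)
    rank = max(math.ceil(pct / 100 * len(ordered)), 1)
    return ordered[rank - 1]


def summarize(events: List[dict]) -> Dict[str, float]:
    """Aggregate per-request answer events into dashboard metrics."""
    total = len(events)
    if total == 0:
        return {}
    answered = [e for e in events if not e["refused"]]
    cited = [e for e in answered if e["citations"] > 0]
    total_cost = sum(e["cost_usd"] for e in events)
    return {
        # Share of all requests that ended in a refusal.
        "refusal_rate": (total - len(answered)) / total,
        # Share of non-refused answers that carry at least one citation.
        "citation_coverage": len(cited) / len(answered) if answered else 0.0,
        # 95th-percentile end-to-end latency over all requests.
        "answer_latency_p95_ms": percentile([e["latency_ms"] for e in events], 95),
        # Total spend (refusals included) divided by non-refused answers.
        "cost_per_answer_usd": total_cost / len(answered) if answered else 0.0,
    }
```

Writing the definitions down this way exposes choices that a metric name alone hides, such as whether refusals count toward cost or latency.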
