# Chapter 3 Lab: Open-Source Tool Evaluation and Project Template

This lab supports Chapter 3 of **Data Engineering in Action**. It is the **Part 1 mini-capstone**: instead of installing a heavy platform immediately, you will make a disciplined tool choice, record the decision, and create a reusable project skeleton for the TuranMart data platform.

## Learning Goals

By completing this lab, you will practice converting business requirements into evaluation criteria, scoring open-source tools with explicit evidence, writing an architecture decision record, and preparing a project template that later chapters can extend with databases, object storage, transformations, orchestration, tests, and observability.

## Files

| Path | Purpose |
|---|---|
| `tool_evaluation_matrix.csv` | Starter weighted scorecard for comparing Airflow, Dagster, and Prefect as orchestration candidates. |
| `architecture_decision_record.md` | Starter ADR documenting the initial orchestration decision and review trigger. |
| `project_template_tree.txt` | Target project folder structure for the first TuranMart open-source data platform template. |
| `tests/validate_scores.py` | Lightweight validation script that computes weighted scores and checks the matrix shape. |
| `exercises/README.md` | Student exercises that extend the guided lab into a stronger evidence-based selection process. |
| `../../../shared/solutions/ch03_open_source_ecosystem/solution.md` | Instructor/reference solution guide. |

## Quick Start

From the repository root, run:

```bash
python shared/labs/ch03_open_source_ecosystem/tests/validate_scores.py
```

The script prints weighted scores for each candidate and warns if required criteria are missing. The lab does not require Docker Compose because the goal is a decision artifact and project template rather than a running service.
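The weighted-score calculation can be sketched as follows. This is a minimal illustration, not the contents of `validate_scores.py`: the column names, criteria, weights, and scores in `SAMPLE_MATRIX` are all assumptions made up for the example, and the real `tool_evaluation_matrix.csv` may be laid out differently.

```python
import csv
import io

# Hypothetical matrix layout and made-up numbers, for illustration only.
# The real tool_evaluation_matrix.csv may use different columns, criteria,
# weights, and score scales.
SAMPLE_MATRIX = """criterion,weight,airflow,dagster,prefect
operational_simplicity,0.40,3,4,4
community_maturity,0.35,5,3,3
local_dev_experience,0.25,3,5,4
"""

CANDIDATES = ("airflow", "dagster", "prefect")


def weighted_scores(csv_text, candidates=CANDIDATES):
    """Return {candidate: sum(weight * score)} after basic shape checks."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("matrix has no criteria rows")
    # Assumption: weights are fractions that must sum to 1.0.
    total_weight = sum(float(row["weight"]) for row in rows)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total_weight}, expected 1.0")
    return {
        name: sum(float(row["weight"]) * float(row[name]) for row in rows)
        for name in candidates
    }


if __name__ == "__main__":
    for name, score in weighted_scores(SAMPLE_MATRIX).items():
        print(f"{name}: {score:.2f}")
```

The sample numbers exist only to exercise the arithmetic and do not represent an evaluation of any tool. Rejecting weights that do not sum to 1.0 is one possible shape check; the lab script may enforce different rules.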

## Suggested Workflow

Begin by reading the TuranMart scenario in Chapter 3. Then open `tool_evaluation_matrix.csv` and adjust one row at a time. Every score should be backed by evidence: documentation, a release note, a minimal local test, a security policy, a license record, a reference deployment, or a short proof-of-concept result. Once you have rerun `tests/validate_scores.py` to recalculate the weighted scores, update `architecture_decision_record.md` so that the decision, alternatives, consequences, and review date match your evidence.

Finally, review `project_template_tree.txt` and create the same structure in a temporary folder on your machine. Do not add every possible tool. The purpose of the template is to create a stable starting point that can grow naturally as the book introduces PostgreSQL, object storage, batch processing, streaming, transformations, orchestration, and observability.
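One way to recreate the structure in a temporary folder is sketched below. The folder names in `TEMPLATE_DIRS` are hypothetical placeholders, not the contents of `project_template_tree.txt`, and the sketch does not parse that file; substitute the directories the tree actually lists.

```python
import tempfile
from pathlib import Path

# Hypothetical folder names: replace them with the entries listed in
# project_template_tree.txt, which this sketch does not parse.
TEMPLATE_DIRS = [
    "docs/decisions",
    "ingestion",
    "transformations",
    "orchestration",
    "tests",
]


def create_skeleton(root, relative_dirs):
    """Create each directory under root and add a .gitkeep placeholder."""
    created = []
    for rel in relative_dirs:
        target = Path(root) / rel
        target.mkdir(parents=True, exist_ok=True)
        # Empty placeholder so version control can track the empty folder.
        (target / ".gitkeep").touch()
        created.append(target)
    return created


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp(prefix="turanmart_template_"))
    print(f"Created skeleton under {root}")
    for path in create_skeleton(root, TEMPLATE_DIRS):
        print(path.relative_to(root))
```

Because `mkdir` is called with `exist_ok=True`, rerunning the function against the same root is harmless, which makes it easy to add folders as later chapters introduce new components.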

## Completion Checklist

| Check | Expected result |
|---|---|
| Matrix validates | `validate_scores.py` prints scores for all candidates and exits successfully. |
| Evidence is explicit | Each criterion has a practical evidence requirement rather than an opinion-only score. |
| ADR is complete | Context, decision, alternatives, consequences, and review date are present. |
| Template is reproducible | The folder tree can be recreated by another reader without extra explanation. |
| Tool choice is revisitable | The ADR names at least one assumption that could change the decision later. |
