# Chapter 6 Lab: Object Storage and Data Lakes with MinIO and S3 APIs

This lab builds a deterministic Bronze-Silver-Gold data lake for TuranMart. The default execution path writes a local folder that mirrors object-storage keys, so the lab runs the same way on any machine and does not require Docker. The included Docker Compose file is optional: it starts MinIO so you can inspect the same layout through an S3-compatible object-storage interface.

## What you will build

| Layer | Output | Purpose |
|---|---|---|
| Bronze | `bronze/commerce/orders/ingest_date=2026-05-30/batch_id=001/` | Raw source files and ingestion manifest. |
| Silver | `silver/commerce/orders_clean/order_date=*/part-0000.parquet` | Valid, typed, product-enriched Parquet rows. |
| Silver rejects | `silver/commerce/orders_rejected/ingest_date=2026-05-30/rejected_orders.jsonl` | Invalid rows with rejection reasons. |
| Gold | `gold/commerce/daily_revenue/` | Daily revenue aggregate in Parquet and CSV form. |
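
The paths in the table follow a `layer/domain/dataset/name=value/` pattern, where each `name=value` segment is a partition. As a minimal sketch of how such keys can be assembled, the hypothetical helper below (`partition_key` is not part of the lab code) rebuilds two of the keys from the table. How `starter.py` actually constructs its paths may differ.

```python
def partition_key(layer, domain, dataset, partitions, filename=""):
    """Build an object key from a layer, domain, dataset, and ordered partitions.

    Hypothetical helper for illustration; not taken from starter.py.
    """
    parts = [layer, domain, dataset]
    # Each partition becomes a "name=value" path segment, in the order given.
    parts += [f"{name}={value}" for name, value in partitions]
    prefix = "/".join(parts)
    # A prefix ends with "/"; a full object key ends with the file name.
    return f"{prefix}/{filename}" if filename else f"{prefix}/"


bronze_prefix = partition_key(
    "bronze", "commerce", "orders",
    [("ingest_date", "2026-05-30"), ("batch_id", "001")],
)
rejects_key = partition_key(
    "silver", "commerce", "orders_rejected",
    [("ingest_date", "2026-05-30")],
    filename="rejected_orders.jsonl",
)

print(bronze_prefix)
print(rejects_key)
```

Object stores have no real directories, so the "folders" are only shared key prefixes. That is why a local folder tree can mirror the bucket layout one-to-one.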

## Quick start

```bash
cd shared/labs/ch06_object_storage_data_lake
python3 -m pip install -r requirements.txt
python3 starter.py --reset --lake-root .lake/turanmart-lake-dev
python3 tests/validate_lab.py --lake-root .lake/turanmart-lake-dev
```

Expected validation output:

```text
BRONZE orders rows: 8
SILVER clean orders rows: 6
SILVER rejected orders rows: 2
GOLD daily revenue rows: 3
GOLD total_revenue: 575.50
VALIDATION PASSED
```
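
The expected counts are consistent with a simple conservation rule: every Bronze row ends up either in Silver clean or in Silver rejects (8 = 6 + 2). The sketch below expresses that rule as a check. It is an assumption that `tests/validate_lab.py` enforces this; the function name `rows_are_conserved` is hypothetical.

```python
def rows_are_conserved(bronze_rows, clean_rows, rejected_rows):
    """Return True when no Bronze row was lost or duplicated on the way to Silver."""
    return bronze_rows == clean_rows + rejected_rows


# Counts copied from the expected validation output above.
expected = {"bronze": 8, "clean": 6, "rejected": 2}

ok = rows_are_conserved(expected["bronze"], expected["clean"], expected["rejected"])
print("row conservation holds:", ok)
```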

## Optional MinIO inspection

Start MinIO if you want to use the local S3-compatible service.

```bash
docker compose up -d
```

Open `http://localhost:9001` and sign in with username `minioadmin` and password `minioadmin`. Create a bucket named `turanmart-lake-dev`. You can then upload the generated `bronze/`, `silver/`, and `gold/` folders through the console or extend the challenge exercise to upload files with `boto3`.
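
For the `boto3` route, the following is a minimal sketch rather than the lab's reference solution. It assumes that the S3 API listens on `http://localhost:9000`, that the console credentials above also work as API access keys, that the bucket already exists, and that `boto3` is installed (it may not be listed in `requirements.txt`). The function names `iter_lake_objects` and `upload_lake` are hypothetical.

```python
from pathlib import Path


def iter_lake_objects(lake_root):
    """Yield (local_path, object_key) for every file under lake_root."""
    root = Path(lake_root)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Object keys always use forward slashes, regardless of the OS.
            yield path, path.relative_to(root).as_posix()


def upload_lake(lake_root, bucket="turanmart-lake-dev",
                endpoint_url="http://localhost:9000"):
    """Upload every file under lake_root to the bucket; return the file count."""
    import boto3  # imported lazily so iter_lake_objects works without boto3

    s3 = boto3.client(
        "s3",
        endpoint_url=endpoint_url,           # assumed MinIO API endpoint
        aws_access_key_id="minioadmin",      # assumed to match the console login
        aws_secret_access_key="minioadmin",
    )
    count = 0
    for path, key in iter_lake_objects(lake_root):
        s3.upload_file(str(path), bucket, key)
        count += 1
    return count
```

With MinIO running and the bucket created, calling `upload_lake(".lake/turanmart-lake-dev")` should make the `bronze/`, `silver/`, and `gold/` prefixes appear in the console.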

## Troubleshooting

| Symptom | Cause | Fix |
|---|---|---|
| `ModuleNotFoundError: No module named 'pyarrow'` | Requirements were installed into another Python environment. | Run `python3 -m pip install -r requirements.txt` from this folder, using the same `python3` that runs `starter.py`. |
| MinIO console is unavailable | Docker is stopped or ports `9000` and `9001` are occupied. | Start Docker or change the Compose port mapping. |
| Validator reports wrong totals | The lake folder contains stale files or edited raw data. | Delete `.lake/` or rerun `python3 starter.py --reset ...`. |
| No Parquet files are created | `pyarrow` is missing or the script failed before Silver writes. | Reinstall requirements and rerun from the lab folder. |
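
Both `pyarrow` rows above come down to which interpreter is running the lab. A quick way to check, sketched here with only the standard library (`module_available` is a hypothetical helper name), is to print the interpreter path and test whether `pyarrow` can be found from it:

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the current interpreter can locate the named module."""
    return importlib.util.find_spec(name) is not None


# Shows which Python is running and whether pyarrow is visible to it.
print("Interpreter:", sys.executable)
print("pyarrow importable:", module_available("pyarrow"))
```

If the interpreter path is not the one you installed the requirements into, rerun the install command with that exact interpreter.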

## Cleanup

```bash
docker compose down -v
rm -rf .lake
```
