# Chapter 6 Solution Guide: Object Storage and Data Lakes

The reference implementation builds a small but realistic data lake layout. It keeps Bronze data source-native, validates and enriches records in Silver, and publishes a compact Gold aggregate for business consumption.

## Design explanation

| Design decision | Explanation |
|---|---|
| Local folder as default target | The folder mirrors object-storage keys, so the lab remains deterministic even when Docker or cloud credentials are unavailable. |
| MinIO as optional inspection layer | MinIO teaches S3-compatible bucket and object concepts without requiring a managed cloud account. |
| Bronze immutability | Raw `orders.jsonl`, `products.csv`, and `manifest.json` are copied into the Bronze prefix without modifying source evidence. |
| Silver Parquet | Valid rows are typed, enriched with product attributes, and written as Parquet partitioned by `order_date`. |
| Rejection file | Invalid rows are preserved with rejection reasons instead of disappearing from the audit trail. |
| Gold aggregate | `daily_revenue` demonstrates the transition from row-level evidence to BI-ready business metrics. |

## Expected results

| Metric | Expected value |
|---|---:|
| Bronze order rows | 8 |
| Silver clean order rows | 6 |
| Silver rejected order rows | 2 |
| Gold daily revenue rows | 3 |
| Gold total revenue | 575.50 |

The two rejected rows are `O-1007`, because it references an unknown product, and `O-1008`, because it has an invalid timestamp and a non-positive quantity. A production implementation would also record pipeline run IDs, source offsets, source checksums, and data contract versions.

## Extension guidance

For the challenge exercise, keep the local writer as the source of truth and add a separate upload function that walks the lake root and uploads each file to MinIO with the relative path as the object key. This preserves deterministic validation while letting readers practice S3-compatible operations.
