Week 5 was practical infrastructure: reading and writing files, JSON serialization, pathlib for modern path handling, and JSONL for large datasets.
Sounds dry. It is not. If you test AI systems, you live in JSONL.
What I Built
dataset-loader - a dataset manager with two components:
- dataset_loader.py - reads .json and .jsonl files, provides iteration and filtering
- schema_validator.py - validates dataset entries against a defined schema
Sample datasets included for both formats.
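Here is a minimal sketch of the two components. The names (iter_jsonl, load_dataset, validate_entry) are illustrative, and the required-key schema is assumed from the sample records below rather than taken from the repo:

```python
import json
from pathlib import Path
from typing import Iterator

# Assumed schema, inferred from the sample records below - not the repo's real one.
REQUIRED_KEYS = {"id", "input", "expected", "category"}

def iter_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed object per non-blank line of a .jsonl file."""
    with path.open(encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)

def load_dataset(path: str | Path) -> list[dict]:
    """Dispatch on suffix: .json holds one array, .jsonl one object per line."""
    path = Path(path)
    if path.suffix == ".jsonl":
        return list(iter_jsonl(path))
    return json.loads(path.read_text(encoding="utf-8"))

def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry passes."""
    return [f"missing key: {key}" for key in REQUIRED_KEYS - entry.keys()]
```

Filtering then falls out for free: `[e for e in load_dataset("sample.jsonl") if e.get("category") == "math"]` (file name hypothetical).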
The Honest Takeaway
JSONL (JSON Lines) is one object per line. No outer array. Each line is independently parseable.
{"id": "001", "input": "What is 2+2?", "expected": "4", "category": "math"}
{"id": "002", "input": "Summarize this in one sentence.", "expected": "...", "category": "summarization"}
Why does this matter for AI testing? Three reasons:
- Streaming. You can read a 100k-record eval dataset line by line without loading it all into memory.
- Debugging. When a test fails, you find the exact line. No nested array traversal. (See the sketch after this list.)
- Tooling compatibility. OpenAI fine-tuning, RAGAS, DeepEval, Langfuse - they all use JSONL. Learn it once, use it everywhere.
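That debugging point is worth seeing. Because each line stands alone, a bad record reports its exact line number instead of killing the whole load. A sketch, assuming a hypothetical eval_set.jsonl:

```python
import json

# Hypothetical file name. One malformed line is reported and skipped;
# every other record still loads - impossible with a single JSON array.
with open("eval_set.jsonl", encoding="utf-8") as f:
    for lineno, line in enumerate(f, start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            print(f"line {lineno}: {err}")
            continue
        # ... run the eval case against `record` here ...
```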
Building the loader in week 5 meant that later, when eval frameworks expected this format, I already understood it from the ground up.
What’s Next
Week 6: Error handling and logging. LLM APIs fail. Building code that handles those failures gracefully.