
Week 7 - Dataclasses, attrs, Pydantic, and the Validation Boundary

7.1 Conceptual Core

  • The single most important architectural decision in a typed Python codebase: where is the validation boundary? Internal types should be cheap (@dataclass(slots=True, frozen=True)); boundary types (HTTP request bodies, message-bus payloads, LLM outputs) should validate (pydantic.BaseModel).
  • "Parse, don't validate." Once a value is past the boundary, it should be a typed object that cannot be malformed; checks afterward are dead code.

7.2 Mechanical Detail

  • dataclasses.dataclass parameters: frozen, slots, kw_only, eq, order, repr, match_args. Mutable defaults must use field(default_factory=list), never a bare [] (dataclasses rejects a bare list, dict, or set default with ValueError when the class is defined).
  • attrs (the original): built-in validators and converters, evolve, slots by default with attrs.define. Still relevant; dataclass won the stdlib slot but attrs keeps innovating.
  • pydantic v2 (Rust core, ~10x faster than v1): BaseModel, Field(..., gt=0, le=100), model_validator, field_validator, discriminated unions, Annotated[..., AfterValidator(...)]. JSON schema export for free.
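  • TypedDict (PEP 589): for dict-shaped data with known keys (e.g., LLM tool-call payloads). Cheaper than Pydantic because it does no runtime validation. Pair it with typing.cast at the boundary (a static-only assertion, nothing is checked) or with pydantic.TypeAdapter when the data does need validating.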

7.3 Lab - "The Three-Layer Cake"

  1. Build an HTTP service (FastAPI, but kept small) with three layers:
       • Boundary layer: Pydantic RequestModel / ResponseModel.
       • Domain layer: @dataclass(slots=True, frozen=True) value objects.
       • Persistence layer: TypedDict rows from sqlite3.
  2. Write explicit converters between each layer. Resist the urge to make them the same type.
  3. Benchmark a 10k-request loop with Pydantic v1 (if available; v2 releases expose the v1 API under the pydantic.v1 namespace) vs. v2. Record the speedup you actually measure and compare it against the ~10x claim.

7.4 Idiomatic & Linter Drill

  • Enable ruff D (pydocstyle). Document every public class and function. Pick Google or NumPy docstring style and enforce it through ruff's pydocstyle convention setting.
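
For reference, a Google-style docstring of the kind the D rules would check could look like this. The function clamp is a hypothetical example, not part of the lab code.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Clamp a value to a closed interval.

    Args:
        value: The number to clamp.
        low: Lower bound of the interval.
        high: Upper bound of the interval.

    Returns:
        The value limited to the range [low, high].

    Raises:
        ValueError: If low is greater than high.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))
```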

7.5 Production Hardening Slice

  • Add schemathesis or property-based tests against your FastAPI app. Generate inputs from the OpenAPI schema; confirm 5xx never occurs on valid input shapes.
