00 - Introduction

What this session is

A 10-minute read. No code. Sets expectations.

What you're going to be able to do, eventually

By the end, you will be able to:

  • Manipulate tensors confidently with PyTorch.
  • Build, train, and use a small neural network from scratch.
  • Load a pre-trained transformer from Hugging Face and use it.
  • Fine-tune that transformer on your own data (parameter-efficient, with LoRA).
  • Build a small Retrieval-Augmented Generation (RAG) app.
  • Evaluate model quality the right way (most people get this wrong).
  • Serve a model behind an HTTP API.
  • Clone an AI OSS project, find a small fix, and submit a PR.

The deal

  • It's slow on purpose. One concept per page.
  • Python fluency assumed. Read a stack trace, write a function, walk a list.
  • No math PhD required. Linear algebra at the "dot product and matmul" level. We explain everything else inline.
  • GPU is helpful but not mandatory. Pages 01-08 work on CPU. Page 09+ benefits from GPU; Google Colab's free tier suffices.
  • You will be confused. Often. AI has more vocabulary than any other technical area on this site. Don't panic.
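As a quick self-check on the "Python fluency" bar above: if every line of a snippet like this reads as obvious (define a function, walk a list, predict the output), you meet it. The function name here is just an illustration, not part of the path's material.

```python
def word_lengths(words):
    """Return the length of each word in a list."""
    lengths = []
    for word in words:           # walk the list item by item
        lengths.append(len(word))
    return lengths

print(word_lengths(["tensor", "model", "token"]))  # [6, 5, 5]
```

If that snippet is unclear, work through a Python basics resource first; every later page assumes this level without comment.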

A note on hype vs honesty

The AI field has more hype than any other in software. To stay sane:

  • Models are token predictors. They are not "intelligent" in the way the marketing implies. They are very good at pattern completion over enormous corpora. That's an extraordinary thing - and that's all it is.
  • Most "AI products" are wrappers around APIs. The actual engineering: tokenization, retrieval, prompt design, evaluation. The "model" itself is often someone else's pre-trained checkpoint.
  • Evaluation is the hard part. "Looks good" is not evaluation. We'll do this properly in page 11.

This path treats AI as a practical engineering domain - what works, how it's built, how to ship it. We don't speculate about AGI.

What you need

  • A computer (any OS).
  • Python ≥3.10 (set up in Python From Scratch path).
  • A text editor.
  • ~5 hours/week. Path is sized for 4-6 months.
  • A GPU for pages 09+ (or use Google Colab / Kaggle for free).

What you do NOT need

  • A PhD or MS.
  • A formal math background beyond high school algebra + intuitive linear algebra (we cover what you need).
  • A cloud account or paid API. Open-source models run locally; we use them.
  • C++ / CUDA. Those are senior-path material (AI Systems senior reference).

How long this realistically takes

4-6 months at 5 hours/week to "submit a PR."

The slowest pages are 07 (transformers) and 09 (fine-tuning). Plan on one or two re-reads of each.

What success looks like

You'll be able to:

  • Look at a model.py in any HF model and roughly understand what it does.
  • Build a small project end-to-end: load data, train, evaluate, serve.
  • Read a research paper's abstract, introduction, and experiments sections and predict what its code does.
  • Submit a fix to a real AI OSS project.

You will not be able to:

  • Train a frontier LLM. (Multi-million-dollar GPU farms; not in 6 months.)
  • Tell people you're "an ML engineer." (That takes years of work past this.)
  • Pass a FAANG ML interview. (Different focus: leetcode plus theory.)

What you'll have: the foundation to keep going. The AI Expert Roadmap is the natural follow-up - 12 months of structured study from here.

One last thing before we start

If a page feels too dense - stop, re-read. Still dense? Skip, come back.

The AI field uses jargon shamelessly. When a word appears you haven't seen, this path defines it inline. If a word slips through without definition, that's a bug - note it.

Ready? Next: Setup →
