
14 - Anatomy of a Small Python OSS Repo

What this session is

About 45 minutes. We'll walk through the file layout of a real (small) Python open-source project, file by file, so you know what every common piece is for. The next page asks you to make a contribution; this page makes the project feel less like a maze.

We'll use the modern Python project layout as our template. There's no single official spec, but the conventions are stable enough that you can predict where things live.

A typical small Python project, from the top

After you git clone a project and cd into it, you'll usually see something like:

.
├── README.md
├── LICENSE
├── CONTRIBUTING.md
├── CODE_OF_CONDUCT.md
├── pyproject.toml
├── .gitignore
├── .pre-commit-config.yaml
├── .github/
│   ├── workflows/         (GitHub Actions CI files)
│   ├── ISSUE_TEMPLATE/
│   └── PULL_REQUEST_TEMPLATE.md
├── src/
│   └── mypackage/
│       ├── __init__.py
│       ├── core.py
│       └── cli.py
├── tests/
│   ├── conftest.py
│   ├── test_core.py
│   └── test_cli.py
├── docs/
│   ├── conf.py
│   ├── index.rst              (or .md if using MyST)
│   └── ...
├── examples/
│   └── basic.py
└── tox.ini                    (older projects)

Not every project has all of these. The shape varies, but the roles are consistent.

What each piece is for

Root-level files

  • README.md - the project's homepage. Should give you: one-line description, install instructions, smallest working example. If the README isn't useful, the project is incomplete.

  • LICENSE - legal terms (MIT, Apache 2.0, BSD, GPL). Know the license before contributing. Some projects (Apache Software Foundation, CNCF) require signing a CLA (Contributor License Agreement) - a bot will usually prompt you on your first PR.

  • CONTRIBUTING.md - the most important file for you right now. Spells out how to propose changes, conventions, branch naming, commit message style, how tests should look. Read it before doing anything.

  • CODE_OF_CONDUCT.md - community standards. Usually the Contributor Covenant. "Be respectful, no harassment" is the gist.

  • pyproject.toml - project metadata, dependencies, build config, tool config (ruff, mypy, pytest, coverage). Modern projects put nearly all configuration here.

  • setup.py / setup.cfg - older alternatives to pyproject.toml. You'll see them in projects predating ~2021.

  • requirements.txt / requirements-dev.txt - pinned dependencies (often for applications, less for libraries). Some projects have both pyproject.toml and requirements.txt.

  • tox.ini - config for tox, an older multi-env test runner. Increasingly replaced by nox (Python-based config). If a project uses tox, tox -e py311 runs the test suite against Python 3.11.

  • .gitignore - files git should ignore.

  • .pre-commit-config.yaml - config for pre-commit, a tool that runs linters/formatters before each commit. If the project uses it, install with pre-commit install after cloning - it'll catch style issues automatically.

.github/

GitHub-specific configuration:

  • workflows/ - CI pipelines (YAML files). One file per workflow. Reading them tells you what the project considers "the green path" - the exact commands your PR will be measured against.

  • ISSUE_TEMPLATE/ - templates for issue types.

  • PULL_REQUEST_TEMPLATE.md - what GitHub pre-fills the PR description with. Address every checkbox.

  • CODEOWNERS - who automatically reviews PRs touching a file.
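If you want a quick list of the commands a workflow runs, a rough line-based scan is often enough. The sketch below is a heuristic, not a YAML parser: it only looks for run: steps, and the sample workflow and the extract_run_commands helper are hypothetical.

```python
import re

# Hypothetical workflow file, inlined so the sketch runs anywhere.
SAMPLE_WORKFLOW = """\
name: CI
on: [push, pull_request]
jobs:
  tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12"]
    steps:
      - uses: actions/checkout@v4
      - run: pip install -e ".[dev]"
      - name: Lint and test
        run: |
          ruff check .
          pytest
"""

# Captures everything before "run:" (indentation plus an optional "- ")
# and whatever follows it on the same line.
RUN_LINE = re.compile(r"^(\s*(?:-\s+)?)run:\s*(.*)$")
BLOCK_MARKERS = {"|", ">", "|-", ">-"}


def extract_run_commands(workflow_text: str) -> list[str]:
    """Collect shell commands from `run:` steps using a line-based heuristic."""
    commands = []
    lines = workflow_text.splitlines()
    i = 0
    while i < len(lines):
        match = RUN_LINE.match(lines[i])
        i += 1
        if not match:
            continue
        indent = len(match.group(1))  # column where the "run" key starts
        value = match.group(2).strip()
        if value not in BLOCK_MARKERS:
            if value:
                commands.append(value)  # single-line form: `run: pytest`
            continue
        # Block form: gather the following lines that are indented further.
        while i < len(lines):
            line = lines[i]
            if line.strip() and len(line) - len(line.lstrip()) <= indent:
                break
            if line.strip():
                commands.append(line.strip())
            i += 1
    return commands


print(extract_run_commands(SAMPLE_WORKFLOW))
```

For anything beyond a quick look, reading the YAML yourself (or using a real YAML parser) is more reliable; workflows can also call reusable actions whose commands do not appear in the file.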

src/<package>/ (or <package>/ at top level)

The actual code. The src/ layout (vs top-level) is the modern best practice - it forces you to install the package to use it, which catches a class of "works on my machine but breaks in CI" bugs.
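One way to see which copy of a package Python would actually import is to ask the import system for its location. This is a small sketch using only the standard library; import_origin is a hypothetical helper name, and json stands in for the project's package.

```python
import importlib.util
from typing import Optional


def import_origin(package: str) -> Optional[str]:
    """Return the file Python would load for `package`, or None if it can't be found."""
    spec = importlib.util.find_spec(package)
    return spec.origin if spec else None


# The standard library's json package stands in for the project here.
print(import_origin("json"))
print(import_origin("no_such_package_xyz"))
```

Run against the project's own package name from the repository root, the printed path shows whether you are exercising the installed package or a stray local folder.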

Inside, each folder of .py files normally contains an __init__.py, which marks it as a regular package. (Python also supports "namespace packages" that omit the file, but most projects don't rely on that.)

An __init__.py often:

  • Re-exports the public API: from .core import MainClass, main_function.
  • Sets __version__ = "1.2.3" for runtime version access.
  • Is sometimes empty (when the package is just a folder grouping).
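The re-export pattern is easier to see in action. The sketch below builds a throwaway package in a temporary directory and imports it; mypackage, MainClass, and main_function are the placeholder names from the layout above, not a real library, and adding the directory to sys.path is only a stand-in for installing the package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (placeholder names, not a real library).
root = Path(tempfile.mkdtemp())
pkg = root / "mypackage"
pkg.mkdir()

(pkg / "core.py").write_text(
    "class MainClass:\n"
    "    pass\n"
    "\n"
    "def main_function():\n"
    "    return 'hello from core'\n"
)

(pkg / "__init__.py").write_text(
    "from .core import MainClass, main_function\n"
    "\n"
    "__version__ = '1.2.3'\n"
    "__all__ = ['MainClass', 'main_function']\n"
)

sys.path.insert(0, str(root))  # stand-in for installing the package
mypackage = importlib.import_module("mypackage")

# Users import from the package, not from the internal module.
print(mypackage.main_function())
print(mypackage.__version__)
print(mypackage.MainClass.__module__)  # shows where the class really lives
```

Because of the re-export, callers never need to know that MainClass is defined in core.py, which leaves maintainers free to move it later.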

tests/

Tests, usually mirroring the source layout: test_*.py files hold the tests, and conftest.py holds shared fixtures (page 10).

Common shape:

tests/
├── conftest.py            # shared fixtures, available to all tests below
├── test_core.py           # tests for src/mypackage/core.py
├── test_cli.py
└── integration/
    └── test_end_to_end.py

Tests are usually run from the project root with pytest.

docs/

Documentation source. Common Python tools:

  • Sphinx - the original. Files in .rst (reStructuredText) or .md (with the MyST extension). Generates HTML, PDF, ePub.
  • MkDocs (often with the Material theme) - Markdown-only, simpler. The platform you're reading is built with this.

The hosted docs are usually on Read the Docs (free for OSS) or GitHub Pages.
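For Sphinx projects, docs/conf.py is an ordinary Python file of settings. A minimal sketch might look like the following; every value is a placeholder, and a real project's conf.py will usually contain more.

```python
# docs/conf.py - a minimal sketch; every value here is a placeholder.

project = "mypackage"
author = "The mypackage contributors"

extensions = [
    "sphinx.ext.autodoc",  # pull API docs from docstrings
    "myst_parser",         # lets Sphinx read Markdown (.md) sources
]

# Accept both reStructuredText and Markdown source files.
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}

html_theme = "alabaster"  # Sphinx's built-in default theme
```

The extensions list is the first thing to check when a docs build fails locally: each entry must be installed in your environment.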

examples/

Runnable example code showing how to use the project. Read these - they're the "official" way to use the API. Often the fastest way to understand a library.

Makefile or noxfile.py or tox.ini

A script of common dev commands. Open it and read the targets:

  • make test or nox -s test - run tests.
  • make lint - run linters.
  • make docs - build docs locally.
  • make format - auto-format with black/ruff.

These commands often pass project-specific flags you'd get wrong from memory. Use them.

Common tools you'll meet

Modern Python projects use a stack of tooling. Recognize the names:

  • ruff (Rust-implemented) - linter and formatter, often 10-100× faster than the tools it replaces. Replaces flake8, isort, sometimes black. Increasingly the default since ~2024.
  • black - opinionated formatter. Older but still widely used.
  • mypy or pyright - static type checkers. Run them to catch type bugs without running the code.
  • pytest - the test framework (page 10).
  • coverage / pytest-cov - measure test coverage.
  • pre-commit - runs these on every commit.

When the CI workflow runs ruff check && mypy && pytest, that's what your PR will be measured against. Run them locally first.
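If you'd rather drive those checks from one script, the && behavior (stop at the first failure) is easy to reproduce. This is a sketch using only the standard library; run_checks is a hypothetical helper, and the demo uses tiny Python one-liners in place of the real tools so it runs anywhere.

```python
import subprocess
import sys


def run_checks(commands: list[list[str]]) -> int:
    """Run each command in order; stop at the first failure, like `a && b && c`."""
    for command in commands:
        print("$", " ".join(command))
        result = subprocess.run(command)
        if result.returncode != 0:
            return result.returncode
    return 0


# Stand-in commands so the sketch does not depend on ruff, mypy, or pytest.
demo = [
    [sys.executable, "-c", "print('lint ok')"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],  # simulated failure
    [sys.executable, "-c", "print('never runs')"],
]

exit_code = run_checks(demo)
print("exit code:", exit_code)
```

For a real project, swap the demo list for the commands from the workflow file, for example [["ruff", "check", "."], ["mypy", "src/"], ["pytest"]], adjusted to whatever the project actually runs.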

A worked walkthrough: hynek/structlog

Let's apply the above to a real project: hynek/structlog, a structured logging library. Clone it:

git clone https://github.com/hynek/structlog ~/code/structlog
cd ~/code/structlog
ls

You should see roughly:

README.md  LICENSE  CHANGELOG.md  CONTRIBUTING.md
pyproject.toml  tox.ini
src/structlog/
tests/
docs/
.github/

Apply what you just learned:

  1. README.md - read it. What does structlog do? (Structured logging for Python.)
  2. pyproject.toml - package name? (structlog.) Dependencies? (Almost none, which is often a sign of careful maintenance.)
  3. src/structlog/ - the code. Open __init__.py. Note what's re-exported - that's the public API.
  4. tests/ - test files organized to match the source modules. Standard layout.
  5. docs/ - Sphinx-based docs (conf.py, plus .md or .rst source files).
  6. .github/workflows/ - open the workflow YAML. CI runs on multiple Python versions; runs pytest, mypy, ruff.
  7. tox.ini - config for tox, a multi-version test runner. tox -e py312 runs tests on Python 3.12; check the env list in tox.ini for the exact environment names.
  8. CONTRIBUTING.md - read it end to end. If it isn't at the root, look in .github/ or docs/; GitHub recognizes all three locations.

Five minutes later, you have a map. You haven't read the implementation; you don't need to.
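The same mapping exercise can be sketched as a small script that labels each top-level entry with its likely role. The ROLES table is condensed from the sections above and is an illustration, not an exhaustive list; map_layout and the demo directory are hypothetical.

```python
import tempfile
from pathlib import Path

# Roles for common top-level entries, condensed from the sections above.
ROLES = {
    "README.md": "project overview",
    "LICENSE": "legal terms",
    "CONTRIBUTING.md": "contribution rules",
    "CODE_OF_CONDUCT.md": "community standards",
    "CHANGELOG.md": "release history",
    "pyproject.toml": "metadata, dependencies, tool config",
    "tox.ini": "tox test environments",
    "noxfile.py": "nox sessions",
    "Makefile": "dev command shortcuts",
    ".gitignore": "files git should ignore",
    ".pre-commit-config.yaml": "pre-commit hooks",
    ".github": "CI workflows and templates",
    "src": "package source",
    "tests": "test suite",
    "docs": "documentation source",
    "examples": "runnable examples",
}


def map_layout(repo: Path) -> dict[str, str]:
    """Label each top-level entry of a repo with its likely role."""
    return {
        entry.name: ROLES.get(entry.name, "unrecognized - check the README")
        for entry in sorted(repo.iterdir())
        if entry.name != ".git"
    }


# Demo on a throwaway directory so the sketch runs without a clone.
demo = Path(tempfile.mkdtemp())
for name in ("README.md", "pyproject.toml", "mystery.cfg"):
    (demo / name).touch()
(demo / "src").mkdir()

layout = map_layout(demo)
for name, role in layout.items():
    print(f"{name:<16} {role}")
```

To try it on the clone from this walkthrough, call map_layout(Path("~/code/structlog").expanduser()). Anything reported as unrecognized is worth a quick look in the README.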

The conventions in CONTRIBUTING.md

Open the file and look for:

  • Setup instructions. Usually pip install -e ".[dev]" or pip install -e ".[tests,docs]". The quotes keep shells such as zsh from treating the square brackets as a glob pattern.
  • How to run tests. pytest, tox, nox.
  • Code style. Usually "run pre-commit install and the rest is automated."
  • Type-checking. Run mypy or pyright - your PR must pass.
  • Commit message format. Some require Conventional Commits, most don't.
  • CHANGELOG. Some require you to add a line to CHANGELOG.md describing your change.
  • Sign-off / CLA. Some require git commit -s for DCO; some require signing a CLA via a bot.

Follow them. The maintainers will be relieved.

Exercise

Use the project you picked on page 13.

  1. Clone it locally.
  2. Walk the layout, file by file, mapping each piece to the categories above.
  3. Read CONTRIBUTING.md end to end.
  4. Open one CI workflow YAML in .github/workflows/. Identify: what commands does CI run? On what Python versions?
  5. Run those CI commands locally:
    python -m venv .venv && source .venv/bin/activate   # on Windows: .venv\Scripts\activate
    pip install -e ".[dev]"   # quotes stop zsh from globbing the brackets
    pytest
    ruff check .
    mypy src/
    
    Adjust to match whatever the project's CONTRIBUTING says.
  6. Open the issue you tentatively picked. Identify the three files most likely to be involved in the fix (guess based on file names and grep).
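For step 6, the "guess based on file names and grep" part can be sketched in a few lines: count how often a keyword from the issue appears in each .py file and look at the top few. rank_files is a hypothetical helper, and the demo repository and the keyword "timeout" are made up for illustration.

```python
import tempfile
from collections import Counter
from pathlib import Path


def rank_files(repo: Path, keyword: str, top: int = 3) -> list[tuple[str, int]]:
    """Return the `top` .py files that mention `keyword` most often."""
    hits = Counter()
    for path in repo.rglob("*.py"):
        relative = path.relative_to(repo)
        if ".venv" in relative.parts or ".git" in relative.parts:
            continue  # skip the virtualenv and git internals
        text = path.read_text(encoding="utf-8", errors="ignore")
        count = text.lower().count(keyword.lower())
        if count:
            hits[relative.as_posix()] = count
    return hits.most_common(top)


# Demo on a throwaway repo so the sketch runs without a clone.
demo = Path(tempfile.mkdtemp())
files = {
    "src/mypackage/core.py": "def connect(timeout):\n    return timeout\n",
    "src/mypackage/cli.py": "print('no match here')\n",
    "tests/test_core.py": "def test_timeout():\n    assert True\n",
}
for relative, text in files.items():
    target = demo / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")

ranked = rank_files(demo, "timeout")
print(ranked)
```

In a real clone, rank_files(Path("."), "some_keyword") gives a starting list. A raw count is a crude signal, so treat the result as a guess to verify by reading the files.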

You're now ready to actually make a change.

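What you might wonder

"What if a project doesn't follow the standard layout?" Some don't. Read the README and CONTRIBUTING.md; they'll explain. If neither does, start from the entry point (for example, the [project.scripts] table in pyproject.toml or a __main__.py file) and follow the imports from there.

"What's src/ vs no src/?" Mostly a layout choice, but src/<pkg>/ prevents a subtle bug: without it, running Python from the project root imports the local source folder instead of the installed package, so packaging mistakes can go unnoticed until someone installs the project. Modern projects use src/; older ones often don't.

"What's __init_subclass__ and other dunders?" Double-underscore ("dunder") methods that Python calls for you at specific moments. Recognize them; look up what each one does when you need to.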

"What's noxfile.py vs tox.ini?" Both are matrix runners (run tests across Python versions, dep versions). nox is Python-based config (more flexible); tox is INI (older). Pick whichever the project uses; don't mix.

"What if CI breaks on main when I clone?" A red flag about project health. Consider another project. At minimum, ask in the issue tracker whether main is in a known-broken state.

Done

You can now:

  • Recognize the typical Python project layout.
  • Locate every common file/folder by role.
  • Read CONTRIBUTING.md for conventions you'll need to follow.
  • Read CI workflows to know exactly what your PR will be measured against.
  • Make a confident guess at which files a given change will touch.

You're ready to actually do the thing.

Next: Your first contribution →
