15 - Your First Contribution¶
What this session is¶
The whole thing. Walk through an AI OSS contribution end-to-end.
The workflow¶
- Fork on GitHub.
- Clone your fork.
- Add upstream as remote.
- Branch off main.
- Set up the dev environment (install with extras; run tests).
- Change the file(s).
- Run lint + tests locally.
- Push to your fork; open PR.
Step 1: Fork & clone¶
```shell
git clone git@github.com:<you>/peft.git
cd peft
git remote add upstream git@github.com:huggingface/peft.git
git fetch upstream
```
Step 2: Branch¶
Always a fresh branch off main.
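The step can be sketched in a scratch repo (the branch name `docs/fix-typo` is just an example):

```shell
set -e
# Scratch repo standing in for your clone.
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git -c user.email=you@example.com -c user.name=You commit -q --allow-empty -m "init"
# The step itself: a fresh branch off main.
git switch -c docs/fix-typo main
git branch --show-current   # -> docs/fix-typo
```

In your real clone, branch from `upstream/main` after a `git fetch upstream` so the branch starts from the latest upstream state.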
Step 3: Set up dev environment¶
For most HF projects, install in editable mode with the dev extras, then run the tests once to confirm a clean baseline.
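A minimal sketch of that setup, assuming a standard pip-installable repo with a `dev` extra (extras names differ per project; the repo's `CONTRIBUTING.md` and `pyproject.toml` are authoritative):

```shell
# Run inside your clone. The "[dev]" extra is an assumption; check the
# repo's pyproject.toml / CONTRIBUTING.md for the extras it actually defines.
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest tests/ -x   # confirm the suite is green before touching anything
```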
For projects requiring a GPU, run only the CPU tests first.
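A CPU-only run might look like this (the marker name is an assumption; many repos tag GPU tests with a pytest marker, but names vary, so check the Makefile or test configuration):

```shell
# Hypothetical marker name; confirm it in the repo's pyproject.toml or Makefile.
pytest tests/ -x -m "not gpu"
```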
If anything fails on a fresh clone, fix that first or ask in the issue.
Step 4: Make the change¶
Small. Focused. Tested.
- Docs typo / clarification - edit the `.md` file in `docs/source/`.
- Add an example - add a new file under `examples/`.
- Fix a bug - change the code; add or update a test that proves the fix.
For a first PR, prefer the first two. Bug fixes are great once you know the project.
Step 5: Re-run CI's commands locally¶
Look in `.github/workflows/tests.yml` for the exact commands CI runs, and run the same ones locally.
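Those workflow steps typically reduce to Make targets like the following (target names are assumptions; confirm them in the repo's Makefile):

```shell
# Run from the repo root; target names vary by project.
make quality   # lint and formatting checks
make test      # unit test suite
```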
All green? Push. Red? Fix locally first.
Step 6: Commit and push¶
DCO if required (git commit -s).
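A scratch-repo demo of what `-s` actually does, for projects whose CI enforces the DCO:

```shell
set -e
# Scratch repo standing in for your clone.
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
echo "fix" > notes.md
git add notes.md
# -s appends the Signed-off-by trailer that DCO checks look for.
git -c user.email=you@example.com -c user.name="Your Name" commit -q -s -m "docs: fix typo"
git log -1 --format=%B   # message ends with: Signed-off-by: Your Name <you@example.com>
```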
Step 7: Open the PR¶
On upstream repo, "Compare & pull request."
- Title. Short, descriptive. Conventional-commit style if the project uses it.
- Description. What changed, why, how tested. `Closes #123` references the issue.
- Checklist. Address every item in the PR template.
Submit. CI runs. Fix anything red by pushing more commits.
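If you prefer the terminal, the GitHub CLI can open the same PR (the title, body, and issue number below are placeholders):

```shell
# Requires an authenticated `gh` (https://cli.github.com); run from your branch.
gh pr create \
  --title "docs: clarify LoRA target_modules for Llama-style models" \
  --body "Updates the outdated example. Closes #123"
```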
Worked example: typo in PEFT LoRA docs¶
Suppose you noticed docs/source/conceptual_guides/lora.md has an outdated target_modules=["query_key_value"] example that no longer applies to current Llama configs.
```shell
git clone git@github.com:<you>/peft.git
cd peft
git remote add upstream git@github.com:huggingface/peft.git
git fetch upstream
git checkout -b docs/lora-target-modules-llama
# Edit docs/source/conceptual_guides/lora.md
# Add a note: "For Llama-style models, use ['q_proj','v_proj']."
make quality
make docs
git add docs/source/conceptual_guides/lora.md
git commit -m "docs: clarify LoRA target_modules for Llama-style models"
git push origin docs/lora-target-modules-llama
```
Open PR. Wait for review.
What review looks like¶
- "LGTM, merging." Done.
- "Could you change these?" Address. Push commits to same branch.
- "Not quite - we already have a section for this." Update or close.
- Silence for a week → polite check-in comment.
HF teams are responsive (usually within days).
After the merge¶
- Update your fork's `main`.
- Delete the branch.
- Take a screenshot.
- Sit with it.
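Updating your fork's `main` can be demonstrated offline with two local repos standing in for upstream and your clone:

```shell
set -e
# Simulate "upstream" with a local repo (offline stand-in).
upstream=$(mktemp -d)
git -C "$upstream" init -q -b main
git -C "$upstream" -c user.email=you@example.com -c user.name=You \
    commit -q --allow-empty -m "init"
# Your clone; in this demo origin and upstream point at the same repo.
clone=$(mktemp -d)/clone
git clone -q "$upstream" "$clone"
cd "$clone"
git remote add upstream "$upstream"
# Upstream moves ahead, e.g. your PR gets merged there.
git -C "$upstream" -c user.email=you@example.com -c user.name=You \
    commit -q --allow-empty -m "merge PR"
# The actual sync: fast-forward local main to upstream's main.
git fetch -q upstream
git merge -q --ff-only upstream/main
git log -1 --format=%s   # -> merge PR
```

On a real fork you would finish with `git push origin main` and `git branch -d <branch>`.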
After your first PR¶
- Pick another issue. Familiarity compounds - the second PR is much easier.
- After 3-5 PRs in one project, become a regular. Review others' PRs.
- Pick a model architecture you care about. Contribute an integration.
- Move toward research code: paper implementations, training-script improvements.
What you might wonder¶
"PR sits for weeks?" HF responds fast. Other AI projects (research orgs, slower-paced labs) can take longer. Polite check-in after 7-10 days.
"What about PyTorch core?"
Larger surface, more rigorous review. CLA required, RFCs for non-trivial changes. Start with the docs/ tree there.
"What about OpenAI / Anthropic SDKs?"
Yes, they accept PRs to their clients (openai-python, anthropic-sdk-python). Closed-source models, open-source clients.
"Maintainer rude?" Disengage. Try another project. AI OSS has many welcoming homes.
Done with this path¶
You've:

- Installed PyTorch and the AI Python stack.
- Trained a small neural net on MNIST.
- Used Hugging Face for text generation.
- Fine-tuned a model with LoRA.
- Built a small RAG pipeline.
- Evaluated outputs honestly.
- Served a model locally.
- Read a real AI OSS project.
- Submitted a PR.
What you should do next: build a small AI tool you actually want to exist. The technology rewards practice. Pick one problem, build the simplest possible solution, iterate.
Recommended next paths on this site:
- AI Systems Engineering (senior reference) - 24-week deep dive: kernels, distributed training, inference serving, evaluation infrastructure.
- Python from Scratch - if your Python feels shaky.
- Linux from Scratch - the substrate AI runs on.
- Kubernetes from Scratch - where AI serving infra lives.
Congratulations. You are no longer a beginner.