05 - Classes¶

What this session is¶

About an hour. You'll learn how to define your own types in Python using classes. By the end you'll know how to bundle data together and attach behavior to it, and you'll have met Python's self, __init__, and the slightly newer @dataclass shortcut.

The problem this solves¶

Every variable so far has held one value - one int, one str. Real things have many properties at once: a person has a name, an age, a city. You could pass each property as a separate parameter:

def describe(name, age, city):
    print(f"{name}, age {age}, lives in {city}")

That works for two or three properties. At six, you're sad. At twelve, you're lost. A class lets you bundle them.

A class¶

class Person:
    def __init__(self, name, age, city):
        self.name = name
        self.age = age
        self.city = city

alice = Person("Alice", 30, "Lagos")
print(alice.name)            # Alice
print(alice.age)             # 30
print(alice.city)            # Lagos

Type and run.

What's new:

class Person: - defines a new class called Person. The convention is CamelCase for class names.
def __init__(self, name, age, city): - a special method called when you create a Person. Its job is to set up the new object. The double-underscore name (__init__) is convention for "called by Python machinery, not by you directly."
self - the first parameter of every method. Refers to the object being operated on. You don't pass it explicitly when calling; Python supplies it.
self.name = name - set the new object's name attribute to the value passed in.
Person("Alice", 30, "Lagos") - create a new Person. Python calls __init__ for you, passing your arguments. The result is the newly-built object.
alice.name - read an attribute. (Setting works the same way: alice.name = "Alicia".)
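Each object carries its own copy of the attributes. Here is a minimal sketch reusing the Person class from above; the second person, bob, and his values are made up for illustration.

```python
class Person:
    def __init__(self, name, age, city):
        self.name = name
        self.age = age
        self.city = city

alice = Person("Alice", 30, "Lagos")
bob = Person("Bob", 25, "Accra")     # hypothetical second person

alice.name = "Alicia"                # set an attribute on alice only

print(alice.name)                    # Alicia
print(bob.name)                      # Bob - unaffected
```

Changing alice leaves bob alone: the class is the blueprint, and every call to Person(...) builds a separate object from it.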

Methods¶

A method is a function defined inside a class. Like __init__, it takes self as the first parameter:

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def greet(self):
        return f"Hi, I'm {self.name}"

    def birthday(self):
        self.age += 1

alice = Person("Alice", 30)
print(alice.greet())     # Hi, I'm Alice
alice.birthday()
print(alice.age)         # 31

Methods can read and modify self's attributes. birthday() mutates alice in place - no need to return anything.

When you call alice.greet(), Python implicitly passes alice as self. You write greet(self); you call alice.greet(). Don't get tripped up by this.
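The implicit self can be made visible. A small sketch, using the Person class above: calling the method through the class and handing over the object yourself does the same thing as the usual dotted call. You'd rarely write the second form; it's here only to show what Python does for you.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def greet(self):
        return f"Hi, I'm {self.name}"

alice = Person("Alice", 30)

normal = alice.greet()            # Python passes alice as self
explicit = Person.greet(alice)    # same call, with self passed by hand

print(normal)                     # Hi, I'm Alice
print(normal == explicit)         # True
```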

A useful trick: `__repr__`¶

If you print an object without overriding anything, you get something ugly: <__main__.Person object at 0x10502f140>. Useless.

Define __repr__ to make it print nicely:

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return f"Person(name={self.name!r}, age={self.age})"

alice = Person("Alice", 30)
print(alice)     # Person(name='Alice', age=30)

!r in an f-string calls repr() on the value - which for strings adds quotes. Useful for debug output.

(There's also __str__, used by str(obj) and by print(). If you define only __repr__, Python falls back to it in those places too. Define __repr__ first; add __str__ only if you need a different "friendly" form.)
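To see the two side by side, here is a sketch that defines both on the same class. The friendly wording returned by __str__ is invented for this example.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        # Unambiguous form, aimed at whoever is debugging
        return f"Person(name={self.name!r}, age={self.age})"

    def __str__(self):
        # Friendly form, aimed at whoever reads the output
        return f"{self.name} ({self.age})"

alice = Person("Alice", 30)
print(alice)          # Alice (30)  - print() uses __str__
print(repr(alice))    # Person(name='Alice', age=30)
print([alice])        # [Person(name='Alice', age=30)]  - containers use __repr__
```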

A modern shortcut: `@dataclass`¶

The class above has a lot of boilerplate. Almost every class with data in it starts the same way: take values in __init__, store as attributes, add __repr__. Python has a built-in shortcut: @dataclass.

from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int
    city: str = "unknown"

alice = Person("Alice", 30, "Lagos")
print(alice)            # Person(name='Alice', age=30, city='Lagos')
print(alice.name)       # Alice

The @dataclass decorator (a thing applied to a class - page 08 explains decorators) auto-generates __init__, __repr__, and equality (==). You declare the fields as class-level annotations with optional defaults.
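A short sketch of what the generated methods give you, using the same Person dataclass. The variable names a, b, and c are just for illustration.

```python
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int
    city: str = "unknown"

a = Person("Alice", 30)              # city falls back to its default
b = Person(name="Alice", age=30)     # keyword arguments work too
c = Person("Alice", 31)

print(a)          # Person(name='Alice', age=30, city='unknown')
print(a == b)     # True  - same field values
print(a == c)     # False - age differs
```

A hand-written class without its own equality method compares by identity instead, so two separately built objects with the same values would not be equal.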

Modern Python code uses @dataclass heavily for "things with data and not much else." Reach for it before writing a class with a long __init__ of self.x = x lines.

You can still add methods normally:

from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

    def greet(self) -> str:
        return f"Hi, I'm {self.name}"

alice = Person("Alice", 30)
print(alice.greet())     # Hi, I'm Alice

Inheritance (briefly)¶

Classes can inherit from other classes - pick up their attributes and methods. Used heavily in some codebases, sparingly in others.

class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return "(generic animal sound)"

class Dog(Animal):
    def speak(self):
        return f"{self.name} says woof"

class Cat(Animal):
    def speak(self):
        return f"{self.name} says meow"

for pet in [Dog("Rex"), Cat("Whiskers")]:
    print(pet.speak())

Output:

Rex says woof
Whiskers says meow

class Dog(Animal): means "Dog is an Animal, plus customizations." Dog inherits __init__ from Animal (so we didn't need to define it). The speak method in Dog overrides the one in Animal.
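A sketch of both halves of that paragraph: inherited behavior and overridden behavior. The Fish class is hypothetical, added only to show what happens when a subclass overrides nothing. isinstance is a built-in that checks the "is-a" relationship.

```python
class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return "(generic animal sound)"

class Dog(Animal):
    def speak(self):
        return f"{self.name} says woof"

class Fish(Animal):       # hypothetical: overrides nothing
    pass

rex = Dog("Rex")          # uses Animal's __init__
nemo = Fish("Nemo")

print(rex.speak())                 # Rex says woof
print(nemo.speak())                # (generic animal sound)
print(isinstance(rex, Animal))     # True
```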

Modern Python advice: prefer composition over inheritance. Inheritance is a tight coupling that bites later. Use it when the relationship is naturally "is-a" (a Dog IS an Animal); reach for "has-a" (a Garage HAS a Car) by storing instances as attributes instead.
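As a sketch of "has-a", here are the Garage and Car from the sentence above. The attribute names (model, car) and the describe method are assumptions made up for this example.

```python
class Car:
    def __init__(self, model):
        self.model = model

class Garage:
    def __init__(self, car):
        self.car = car      # a Garage HAS a Car, stored as an attribute

    def describe(self):
        return f"Garage holding a {self.car.model}"

garage = Garage(Car("Beetle"))
print(garage.describe())    # Garage holding a Beetle
print(garage.car.model)     # Beetle
```

Garage does not inherit anything from Car; it just holds one. Swapping in a different Car later means changing one attribute, not the class hierarchy.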

Exercise¶

In a new file shapes.py:

Define a class Rectangle with two attributes: width and height.
Add an __init__ taking both as parameters.
Add an area() method returning width * height.
Add a perimeter() method returning 2 * (width + height).
Create a Rectangle(5, 3). Print its area and perimeter. Expected: 15 and 16.
Now rewrite it as a @dataclass - the hand-written __init__ disappears, replaced by two annotated fields.
Stretch: add a __repr__ (or let @dataclass give you one). Print a rectangle; confirm the output is readable.
Stretch: write a function larger_of(a, b) that returns whichever rectangle has the bigger area. Test with two rectangles.

What you might wonder¶

"Why self? Other languages use this." Convention from the language's first design (1991). The Python community settled on self; you'll see it everywhere. You can name it differently - this, obj, anything - but don't. Sticking to self is one of the strongest conventions in Python.

"What's __init__ vs __new__?" __init__ initializes an already-created object. __new__ actually creates the object. You will essentially never write __new__. Forget it for now.

"What if I don't write __init__?" You get a default one that takes no arguments. You can still set attributes after creation: p = Person(); p.name = "Alice". Useful sometimes; less clear than constructor-injection.

"Are there private attributes?" Not enforced. By convention, attributes starting with _ (one underscore) are "internal - don't touch from outside." Attributes starting with __ (two underscores) get name-mangled to discourage external access. Python's philosophy: "we're all consenting adults" - the convention is a contract, not a wall.

"Should I use @dataclass everywhere?" For "things with data and minor logic" - yes. For things with significant behavior, or that need custom validation, or that don't quite fit the dataclass mold - regular classes are fine. Mixing both in a project is normal.

Done¶

You can now:

- Define your own types with class.
- Use __init__ to set up new objects.
- Read and write attributes via self.x.
- Define methods that operate on self.
- Add __repr__ for useful debug output.
- Use @dataclass for the common "bundle of data" case.
- Know that inheritance exists; prefer composition.

You can now model real things - people, accounts, points, rectangles, anything with structure. Combined with what came before, you can write programs that work with non-trivial domains.

Next page: how Python handles collections - many things at once.
