Most Python test suites have the same problem: they cost more trouble than they save.
If you have ever touched automated tests, I'm sure you've encountered at least a few of these:
- CI jobs that take forever.
- Tests break every time you refactor - one simple refactoring, and 65 tests need an update.
- You keep restarting CI pipelines because of some flaky test.
- You add a new property to your model, and 1/3 of your test suite needs updating.
So many tests and so much work, but bugs still make it to production somehow. So what's going on? Why do we test at all?
Why do we test?
It's not for coverage numbers. It's not because someone told us every method needs a test. It's because the businesses we work for need to ship changes - fast and safely.
The world changes. Requirements change. Tax rates change. Now, with all of the AI in our lives, things change faster than ever. And when they do, we need to update our code to match the new expectations without breaking everything else. Automated tests are the way to make that safe.
But that's the case only if the tests are actually valuable. Otherwise, we find ourselves in the 9th circle of Dante's testing hell.
You can read more about Dante's Divine Comedy here 😉
High quality tests
After years of writing (and deleting) tests, I've landed on 7 qualities that separate tests worth keeping from tests that just slow you down:
- Fast: If your test suite takes 20 minutes to run, you won't run it. Ergo, fast feedback is gone.
- Reliable: If you're restarting jobs because tests fail randomly, you can't trust your test suite. And if you can't trust it, you can't claim that a passing build means it's safe to deploy to production.
- Repeatable: Passes on your machine but fails on CI? Again, trust is gone.
- Resistant to refactoring: If you keep the behavior, tests should stay the same. Otherwise, you lack regression protection. If code and tests keep changing together, there's hardly any safety net. You simply always update your tests to pass.
- Readable: When a test fails, can you find the bug in 30 seconds? Or do you start a wild goose chase?
- Resistant to unrelated changes: Adding a field to a model shouldn't blow up tests that have nothing to do with it.
- Thorough: Passing tests means nothing if they don't fail when behavior is missing or unintentionally altered.
So if you feel like your tests are slowing you down, you might be right! Anyhow, the solution is not to write fewer (or more) tests. It's to write better tests. Even if that means fewer of them.
Patterns, patterns, patterns
The best way to start writing better tests is by learning patterns that make tests valuable by default - so you stop fighting your test suite and start trusting it. In some cases, this requires a significant mindset shift in how to write code and tests.
This includes patterns like:
- factory fixtures - so adding a field doesn't break 100 tests,
- behavior-focused testing - so refactoring doesn't mean rewriting tests,
- contract tests - so you can swap implementations without fear.
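To make the second of those concrete, here's a minimal sketch of behavior-focused testing. The `ShoppingCart` class and the test are hypothetical, invented purely for illustration:

```python
# A minimal, hypothetical example of behavior-focused testing.
class ShoppingCart:
    def __init__(self):
        self._items = []  # internal detail; tests should not touch this

    def add(self, name: str, price: int) -> None:
        self._items.append((name, price))

    def total(self) -> int:
        return sum(price for _, price in self._items)

# Behavior-focused: assert on the observable outcome (the total),
# not on the internal list. Refactoring `_items` into, say, a dict
# won't break this test as long as the behavior stays the same.
def test_total_sums_item_prices():
    cart = ShoppingCart()
    cart.add("book", 10)
    cart.add("pen", 2)
    assert cart.total() == 12
```

The test survives any refactoring of the internals because it only talks to the public interface.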
For example, here's how one pattern can save you from a headache and wasted tokens. Imagine you have a User model with name and email. You've got 50 tests that create users because a user object is needed (e.g., to check that a welcome email is sent, or that a user can authenticate with a password).
```python
from dataclasses import dataclass

@dataclass
class User:
    name: str
    email: str
```
Now you add a new required field role. By default, all 50 tests will break, yet none of them tests anything related to roles. Claude Code will spend 5 minutes updating the boilerplate in those tests. That's a pure waste of time and tokens.
With a factory fixture, on the other hand, zero existing tests break. The factory fixture can provide a sensible default for role, and only tests that actually care about roles need to set it explicitly. That's a single-line change.
```python
from typing import Callable

import pytest

from models import User

@pytest.fixture
def create_user() -> Callable[..., User]:
    def _create_user(**kwargs) -> User:
        return User(
            name=kwargs.get("name", "John Doe"),
            email=kwargs.get("email", "john@doe.com"),
            role=kwargs.get("role", "regular_user"),
        )
    return _create_user
```
And it's similar with other patterns. When applied correctly, they can simplify your (and your AI agent's) life by making it easier to develop new things while ensuring that existing ones don't break.
Conclusion
Low-quality tests don't protect us. They just increase frustration. High-quality tests, on the other hand, make our lives easier - way easier! There are many patterns, like factory fixtures, that can help us write valuable tests. You can find all of them with detailed explanations and real-world examples in Complete Python Testing Guide.
Happy engineering!