Unit Testing Best Practices That Actually Matter

Writing unit tests is one of those things that feels like a chore — until a "small change" breaks the entire system. Then those tests become your best friends.
But here's the thing most tutorials don't tell you: bad tests are worse than no tests. They give you false confidence, slow down your team, and break every time you refactor. I've spent years writing both kinds, and I want to share what actually works.
What You'll Learn:
✅ The AAA pattern for structuring every test
✅ Why each test should have exactly one reason to fail
✅ How to keep tests isolated and deterministic
✅ When to mock — and when not to
✅ The "magic 80%" approach to code coverage
✅ Real Python code examples throughout
The AAA Pattern — Structure Every Test the Same Way
The Arrange-Act-Assert pattern is the gold standard for test structure. It keeps your tests readable and ensures anyone on the team understands what's being verified.
```python
def test_apply_discount_reduces_price():
    # Arrange - Set up the objects, data, and preconditions
    original_price = 100.0
    discount_percent = 20

    # Act - Execute the function you're testing
    result = apply_discount(original_price, discount_percent)

    # Assert - Verify the outcome
    assert result == 80.0
```

Every test follows the same rhythm:
- Arrange: Set up objects, data, and preconditions
- Act: Execute the specific function or method you're testing
- Assert: Verify the outcome matches your expectations
When someone reads your test, they should immediately understand: "Here's the setup, here's what we're doing, and here's what should happen."
Test One Thing at a Time
A unit test should have a single reason to fail. If you're asserting ten different things in one test, it becomes impossible to diagnose what went wrong.
```python
# ❌ Bad — testing too many things at once
def test_user_registration():
    user = register_user("alice", "alice@example.com", "password123")
    assert user.username == "alice"
    assert user.email == "alice@example.com"
    assert user.is_active is True
    assert user.role == "member"
    assert len(user.permissions) == 3
    assert user.created_at is not None
```

```python
# ✅ Good — separate tests for separate concerns
def test_register_user_sets_username():
    user = register_user("alice", "alice@example.com", "password123")
    assert user.username == "alice"

def test_register_user_is_active_by_default():
    user = register_user("alice", "alice@example.com", "password123")
    assert user.is_active is True

def test_register_user_has_member_role():
    user = register_user("alice", "alice@example.com", "password123")
    assert user.role == "member"
```

When a test with one assertion fails, the name alone tells you what's broken. When a test with ten assertions fails, you have to read the traceback, find the line number, and figure out which of the ten things went wrong.
Keep Tests Isolated and Deterministic
A test should never depend on the result of another test. Run them in any order, run them in parallel — the result should always be the same.
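Determinism also means taming hidden inputs like randomness and time. One common approach is to inject the source of randomness so each test can seed it explicitly (`pick_sample_user` here is a hypothetical example):

```python
import random

def pick_sample_user(rng, users):
    # The random source is injected, so tests control it explicitly
    return rng.choice(users)

def test_pick_sample_user_is_deterministic():
    users = ["alice", "bob", "carol"]
    # Same seed, same result — in any order, on any machine
    assert pick_sample_user(random.Random(7), users) == pick_sample_user(random.Random(7), users)

test_pick_sample_user_is_deterministic()
```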
No Shared Mutable State
```python
# ❌ Bad — tests share state through a global list
cart_items = []

def test_add_item():
    cart_items.append("laptop")
    assert len(cart_items) == 1

def test_add_another_item():
    cart_items.append("mouse")
    assert len(cart_items) == 2  # Depends on test_add_item running first!
```

```python
# ✅ Good — each test creates its own state
def test_add_item():
    cart = ShoppingCart()
    cart.add("laptop")
    assert cart.count() == 1

def test_add_another_item():
    cart = ShoppingCart()
    cart.add("mouse")
    assert cart.count() == 1  # Independent, deterministic
```

Fast Execution
Unit tests should run in milliseconds, not seconds. If your tests take seconds each, you're probably doing integration testing — hitting a real database, calling an API, or reading from the filesystem.
| Test Type | Typical Speed | What It Tests |
|---|---|---|
| Unit Test | 1–10 ms | A single function or class |
| Integration Test | 100–1000 ms | Multiple components together |
| End-to-End Test | 1–30 seconds | Full user flow through the system |
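One practical way to keep the slower tiers out of your fast feedback loop is a pytest marker. The marker name `integration` is a convention you would register in your pytest config, not a pytest built-in:

```python
import pytest

@pytest.mark.integration
def test_full_checkout_flow():
    # Slow, cross-component test — skip it in the default run
    # with: pytest -m "not integration"
    ...
```

The unit suite then stays fast enough to run on every save, while integration tests run in CI.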
Mock External Dependencies
When your code talks to a database, an API, or the filesystem, use mocks or stubs to simulate those interactions.
```python
from unittest.mock import Mock

def test_get_user_profile_returns_formatted_name():
    # Arrange — mock the database
    mock_db = Mock()
    mock_db.find_user.return_value = {"first_name": "Alice", "last_name": "Smith"}
    service = UserService(db=mock_db)

    # Act
    profile = service.get_profile(user_id=42)

    # Assert
    assert profile.display_name == "Alice Smith"
    mock_db.find_user.assert_called_once_with(42)
```

The test doesn't need a real database. It verifies that your code correctly formats the name — not whether PostgreSQL is running.
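A Mock object isn't required — a hand-rolled stub often reads just as clearly. A self-contained sketch with minimal stand-ins for the service and database (the real `UserService` may differ):

```python
from dataclasses import dataclass

@dataclass
class Profile:
    display_name: str

class UserService:
    # Minimal stand-in for the service under test
    def __init__(self, db):
        self.db = db

    def get_profile(self, user_id):
        row = self.db.find_user(user_id)
        return Profile(display_name=f"{row['first_name']} {row['last_name']}")

class FakeDb:
    # A hand-rolled stub: returns canned data instead of hitting a real database
    def find_user(self, user_id):
        return {"first_name": "Alice", "last_name": "Smith"}

def test_get_user_profile_with_stub():
    service = UserService(db=FakeDb())
    assert service.get_profile(user_id=42).display_name == "Alice Smith"

test_get_user_profile_with_stub()
```

Stubs like this shine when the canned behavior is reused across many tests; mocks shine when you need to assert how the dependency was called.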
Name Your Tests Expressively
The name of your test should read like a sentence describing a requirement. When a test fails in CI, the name alone should tell you what's broken.
A common pattern: MethodName_StateUnderTest_ExpectedBehavior
```python
# ❌ Bad — tells you nothing
def test1():
    ...

def test_login():
    ...
```

```python
# ✅ Good — reads like a specification
def test_login_with_empty_password_returns_error():
    ...

def test_login_with_expired_token_redirects_to_signin():
    ...

def test_calculate_tax_with_zero_amount_returns_zero():
    ...
```

When you see a test report like this, you can immediately understand the codebase's behavior:
```
✅ test_login_with_valid_credentials_returns_token
✅ test_login_with_wrong_password_returns_401
❌ test_login_with_expired_token_redirects_to_signin
✅ test_login_with_locked_account_returns_403
```

You know exactly which scenario is broken without reading a single line of test code.
Test the Public API, Not the Implementation
Avoid testing private methods directly. If you feel the need to test a private method, it's usually a sign that the class is doing too much — that logic should probably live in its own class with a public interface.
```python
# ❌ Bad — testing internal implementation details
def test_user_service_internal_hash_password():
    service = UserService()
    hashed = service._hash_password("secret")  # Accessing a private method
    assert hashed.startswith("$2b$")
```

```python
# ✅ Good — testing the behavior through the public API
def test_user_service_create_user_stores_hashed_password():
    mock_db = Mock()
    service = UserService(db=mock_db)
    service.create_user("alice", "secret")
    stored_password = mock_db.save_user.call_args[0][1]
    assert stored_password != "secret"  # It's hashed, not plaintext
```

Why this matters: when you refactor internals (changing the hashing algorithm, renaming private methods), tests that check behavior still pass. Tests that check implementation break immediately — even though nothing is actually wrong.
Isolate From Frameworks and External Services
Think of your code like a pilot in a flight simulator. You want to test if the pilot can land the plane (your logic) — without needing a real runway (the framework and external services).
External Services — Strictly Isolate
If your test calls a real database or a third-party API, it's not a unit test anymore — it's an integration test.
```python
# ❌ Integration test pretending to be a unit test
def test_get_weather():
    result = weather_service.get_current("Ho Chi Minh City")  # Calls real API
    assert result["temperature"] > 0
```

```python
# ✅ Actual unit test — mock the HTTP call
def test_get_weather_parses_response():
    mock_client = Mock()
    mock_client.get.return_value = {"temp": 32, "unit": "celsius"}
    service = WeatherService(client=mock_client)
    result = service.get_current("Ho Chi Minh City")
    assert result["temperature"] == 32
```

Frameworks — Keep Business Logic Separate
You want to avoid "framework bloat" in your tests. If you need to boot up an entire application context to check a math function, your architecture is too tightly coupled.
```python
# ❌ Needs FastAPI running to test business logic
from fastapi.testclient import TestClient
from main import app

def test_discount_calculation():
    client = TestClient(app)
    response = client.post("/api/discount", json={"price": 100, "percent": 20})
    assert response.json()["final_price"] == 80.0
```

```python
# ✅ Test the logic directly — no framework needed
def test_discount_calculation():
    result = calculate_discount(price=100, percent=20)
    assert result == 80.0
```

Standard Libraries — Don't Mock Everything
You don't need to mock len(), str.split(), or math.sqrt(). They're fast, stable, and deterministic. Only mock things that are slow, non-deterministic, or have side effects.
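As a rule of thumb: fake the clock, not the math. A sketch with a hypothetical injected clock (the `current_hour` interface is an assumption for illustration):

```python
from unittest.mock import Mock

def greeting(clock):
    # The clock is worth faking: real time is non-deterministic
    hour = clock.current_hour()
    return "Good morning" if hour < 12 else "Good afternoon"

def test_greeting_before_noon():
    fake_clock = Mock()
    fake_clock.current_hour.return_value = 9
    # String comparison and branching run for real — only the clock is mocked
    assert greeting(fake_clock) == "Good morning"

test_greeting_before_noon()
```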
Target the "Magic 80%" — Not 100%
Don't obsess over 100% code coverage. Aiming for 100% leads to testing trivial things (like getters and setters) while ignoring complex edge cases.
Focus your energy on three areas:
1. Happy Paths
The expected, normal flow through your code.
```python
def test_transfer_money_succeeds():
    sender = Account(balance=1000)
    receiver = Account(balance=500)
    transfer(sender, receiver, amount=200)
    assert sender.balance == 800
    assert receiver.balance == 700
```

2. Boundary Conditions
What happens at 0, -1, the maximum allowed value, or an empty list?
```python
def test_transfer_exact_balance_succeeds():
    sender = Account(balance=100)
    receiver = Account(balance=0)
    transfer(sender, receiver, amount=100)
    assert sender.balance == 0
    assert receiver.balance == 100

def test_transfer_zero_amount_does_nothing():
    sender = Account(balance=100)
    receiver = Account(balance=50)
    transfer(sender, receiver, amount=0)
    assert sender.balance == 100
    assert receiver.balance == 50
```

3. Error Handling
Does it raise the right exception when things go wrong?
```python
import pytest

def test_transfer_insufficient_funds_raises_error():
    sender = Account(balance=50)
    receiver = Account(balance=100)
    with pytest.raises(InsufficientFundsError):
        transfer(sender, receiver, amount=200)

def test_transfer_negative_amount_raises_error():
    sender = Account(balance=100)
    receiver = Account(balance=50)
    with pytest.raises(ValueError):
        transfer(sender, receiver, amount=-50)
```

Quick Reference: Good vs. Bad Tests
| Feature | Good Unit Test | Bad Unit Test |
|---|---|---|
| Speed | Runs in milliseconds | Slow — waits for I/O |
| Reliability | Deterministic — same result every time | Flaky — randomly fails |
| Scope | Small, focused unit of logic | Crosses multiple layers |
| Maintenance | Easy to update when logic changes | Brittle — breaks on minor refactors |
| Naming | test_login_with_empty_password_returns_error | test1 or test_login |
| Assertions | One logical concept per test | Ten assertions in one test |
| Dependencies | Mocked — no real DB or API | Requires running services |
The Mindset Shift
Writing good tests isn't about achieving a coverage number. It's about building confidence.
Every test you write should answer one question: "If someone changes this code tomorrow, will this test catch it if something breaks?"
If the answer is yes — it's a good test. If the answer is "it'll probably break even if nothing is wrong" — that's a bad test, and you should delete it.
"A test that never fails is worthless. A test that always fails is worthless. A test that fails only when something is actually broken — that's gold."
The goal isn't to have more tests. It's to have the right tests — fast, reliable, focused, and actually useful.