
Unit Testing Best Practices That Actually Matter

Tags: testing, unit-tests, software-engineering, best-practices, python

Writing unit tests is one of those things that feels like a chore — until a "small change" breaks the entire system. Then those tests become your best friends.

But here's the thing most tutorials don't tell you: bad tests are worse than no tests. They give you false confidence, slow down your team, and break every time you refactor. I've spent years writing both kinds, and I want to share what actually works.

What You'll Learn:
✅ The AAA pattern for structuring every test
✅ Why each test should have exactly one reason to fail
✅ How to keep tests isolated and deterministic
✅ When to mock — and when not to
✅ The "magic 80%" approach to code coverage
✅ Real Python code examples throughout

The AAA Pattern — Structure Every Test the Same Way

The Arrange-Act-Assert pattern is the gold standard for test structure. It keeps your tests readable and ensures anyone on the team understands what's being verified.

def test_apply_discount_reduces_price():
    # Arrange - Set up the objects, data, and preconditions
    original_price = 100.0
    discount_percent = 20
 
    # Act - Execute the function you're testing
    result = apply_discount(original_price, discount_percent)
 
    # Assert - Verify the outcome
    assert result == 80.0

Every test follows the same rhythm:

  1. Arrange: Set up objects, data, and preconditions
  2. Act: Execute the specific function or method you're testing
  3. Assert: Verify the outcome matches your expectations

When someone reads your test, they should immediately understand: "Here's the setup, here's what we're doing, and here's what should happen."
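The test above calls apply_discount, which isn't shown here. A minimal sketch that would make it pass might look like this (one plausible implementation, not necessarily the real one):

```python
def apply_discount(price: float, percent: float) -> float:
    """Return the price after subtracting a percentage discount."""
    return price * (1 - percent / 100)
```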

Test One Thing at a Time

A unit test should have a single reason to fail. If you're asserting ten different things in one test, it becomes impossible to diagnose what went wrong.

# ❌ Bad — testing too many things at once
def test_user_registration():
    user = register_user("alice", "alice@example.com", "password123")
    assert user.username == "alice"
    assert user.email == "alice@example.com"
    assert user.is_active is True
    assert user.role == "member"
    assert len(user.permissions) == 3
    assert user.created_at is not None

# ✅ Good — separate tests for separate concerns
def test_register_user_sets_username():
    user = register_user("alice", "alice@example.com", "password123")
    assert user.username == "alice"
 
def test_register_user_is_active_by_default():
    user = register_user("alice", "alice@example.com", "password123")
    assert user.is_active is True
 
def test_register_user_has_member_role():
    user = register_user("alice", "alice@example.com", "password123")
    assert user.role == "member"

When a test with one assertion fails, the name alone tells you what's broken. When a test with ten assertions fails, you have to read the traceback, find the line number, and figure out which of the ten things went wrong.
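When several single-assertion tests share identical setup, pytest's parametrize keeps each scenario a separately-reported failure without copy-pasting the arrange step. The User and register_user below are hypothetical stand-ins, included only so the example runs; the post doesn't show the real ones:

```python
import pytest
from dataclasses import dataclass

# Hypothetical stand-ins for the post's register_user -- assumptions, not
# the real implementation.
@dataclass
class User:
    username: str
    email: str
    is_active: bool = True
    role: str = "member"

def register_user(username, email, password):
    return User(username=username, email=email)

# Each (attribute, expected) pair becomes its own test case with its own
# pass/fail status and a readable id in the report.
@pytest.mark.parametrize(
    "attribute, expected",
    [
        ("username", "alice"),
        ("is_active", True),
        ("role", "member"),
    ],
)
def test_register_user_defaults(attribute, expected):
    user = register_user("alice", "alice@example.com", "password123")
    assert getattr(user, attribute) == expected
```

You still get one reason to fail per case, but the shared arrange step is written once.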

Keep Tests Isolated and Deterministic

A test should never depend on the result of another test. Run them in any order, run them in parallel — the result should always be the same.

No Shared Mutable State

# ❌ Bad — tests share state through a global list
cart_items = []
 
def test_add_item():
    cart_items.append("laptop")
    assert len(cart_items) == 1
 
def test_add_another_item():
    cart_items.append("mouse")
    assert len(cart_items) == 2  # Depends on test_add_item running first!

# ✅ Good — each test creates its own state
def test_add_item():
    cart = ShoppingCart()
    cart.add("laptop")
    assert cart.count() == 1
 
def test_add_another_item():
    cart = ShoppingCart()
    cart.add("mouse")
    assert cart.count() == 1  # Independent, deterministic
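If that per-test setup gets repetitive, a pytest fixture hands each test its own fresh instance automatically. The ShoppingCart here is a minimal hypothetical stand-in so the example is runnable:

```python
import pytest

# Minimal hypothetical ShoppingCart -- the post doesn't show the real class.
class ShoppingCart:
    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def count(self):
        return len(self._items)

# The fixture runs once per test, so no state leaks between tests.
@pytest.fixture
def cart():
    return ShoppingCart()

def test_add_item(cart):
    cart.add("laptop")
    assert cart.count() == 1

def test_add_two_items(cart):
    cart.add("laptop")
    cart.add("mouse")
    assert cart.count() == 2
```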

Fast Execution

Unit tests should run in milliseconds, not seconds. If your tests take seconds each, you're probably doing integration testing — hitting a real database, calling an API, or reading from the filesystem.

| Test Type | Typical Speed | What It Tests |
| --- | --- | --- |
| Unit Test | 1–10 ms | A single function or class |
| Integration Test | 100–1000 ms | Multiple components together |
| End-to-End Test | 1–30 seconds | Full user flow through the system |
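One practical way to keep the fast suite fast is to tag anything slower with a pytest marker and exclude it from the default run. The marker name slow is a common convention, not something built into pytest:

```python
import pytest

# Tag integration-speed tests so the fast feedback loop can skip them.
@pytest.mark.slow
def test_full_checkout_flow():
    ...
```

Then run `pytest -m "not slow"` during development (registering the marker in your pytest config avoids unknown-marker warnings).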

Mock External Dependencies

When your code talks to a database, an API, or the filesystem, use mocks or stubs to simulate those interactions.

from unittest.mock import Mock
 
def test_get_user_profile_returns_formatted_name():
    # Arrange — mock the database
    mock_db = Mock()
    mock_db.find_user.return_value = {"first_name": "Alice", "last_name": "Smith"}
 
    service = UserService(db=mock_db)
 
    # Act
    profile = service.get_profile(user_id=42)
 
    # Assert
    assert profile.display_name == "Alice Smith"
    mock_db.find_user.assert_called_once_with(42)

The test doesn't need a real database. It verifies that your code correctly formats the name — not whether PostgreSQL is running.
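For context, here is a minimal UserService the test above would exercise. The post doesn't show it, so this is one plausible shape; the key point it illustrates is that the db dependency is injected, which is exactly what makes it mockable:

```python
from dataclasses import dataclass

@dataclass
class Profile:
    display_name: str

class UserService:
    def __init__(self, db):
        # The database is injected rather than constructed internally,
        # so tests can pass a Mock in its place.
        self.db = db

    def get_profile(self, user_id):
        row = self.db.find_user(user_id)
        return Profile(display_name=f"{row['first_name']} {row['last_name']}")
```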

Name Your Tests Expressively

The name of your test should read like a sentence describing a requirement. When a test fails in CI, the name alone should tell you what's broken.

A common pattern is MethodName_StateUnderTest_ExpectedBehavior, which in Python's snake_case becomes test_<method>_<state>_<expected_behavior>.

# ❌ Bad — tells you nothing
def test1():
    ...
 
def test_login():
    ...
 
# ✅ Good — reads like a specification
def test_login_with_empty_password_returns_error():
    ...
 
def test_login_with_expired_token_redirects_to_signin():
    ...
 
def test_calculate_tax_with_zero_amount_returns_zero():
    ...

When you see a test report like this, you can immediately understand the codebase's behavior:

✅ test_login_with_valid_credentials_returns_token
✅ test_login_with_wrong_password_returns_401
❌ test_login_with_expired_token_redirects_to_signin
✅ test_login_with_locked_account_returns_403

You know exactly which scenario is broken without reading a single line of test code.

Test the Public API, Not the Implementation

Avoid testing private methods directly. If you feel the need to test a private method, it's usually a sign that the class is doing too much — that logic should probably live in its own class with a public interface.

# ❌ Bad — testing internal implementation details
def test_user_service_internal_hash_password():
    service = UserService()
    hashed = service._hash_password("secret")  # Accessing private method
    assert hashed.startswith("$2b$")
 
# ✅ Good — testing the behavior through the public API
def test_user_service_create_user_stores_hashed_password():
    mock_db = Mock()  # stand-in for the real database
    service = UserService(db=mock_db)
    service.create_user("alice", "secret")
 
    stored_password = mock_db.save_user.call_args[0][1]
    assert stored_password != "secret"  # It's hashed, not plaintext

Why this matters: When you refactor internals (changing the hashing algorithm, renaming private methods), tests that check behavior still pass. Tests that check implementation break immediately — even though nothing is actually wrong.
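When the urge to test a private method is strong, the usual fix is the one hinted at above: promote that logic into its own small class with a public interface. A hedged sketch, where PasswordHasher is an invented name and sha256 is a placeholder, not a recommendation for real password storage:

```python
import hashlib

class PasswordHasher:
    """Hashing logic extracted from UserService so it is publicly testable.

    sha256 is a stand-in for illustration -- use a purpose-built password
    hash (e.g. bcrypt) in production code.
    """

    def hash(self, plaintext: str) -> str:
        return hashlib.sha256(plaintext.encode()).hexdigest()

def test_hasher_never_returns_plaintext():
    hashed = PasswordHasher().hash("secret")
    assert hashed != "secret"
```

Now the hashing rule has a public home, and UserService can take a hasher as a dependency instead of hiding one behind an underscore.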

Isolate From Frameworks and External Services

Think of your code like a pilot in a flight simulator. You want to test if the pilot can land the plane (your logic) — without needing a real runway (the framework and external services).

External Services — Strictly Isolate

If your test calls a real database or a third-party API, it's not a unit test anymore — it's an integration test.

# ❌ Integration test pretending to be a unit test
def test_get_weather():
    result = weather_service.get_current("Ho Chi Minh City")  # Calls real API
    assert result["temperature"] > 0
 
# ✅ Actual unit test — mock the HTTP call
def test_get_weather_parses_response():
    mock_client = Mock()
    mock_client.get.return_value = {"temp": 32, "unit": "celsius"}
 
    service = WeatherService(client=mock_client)
    result = service.get_current("Ho Chi Minh City")
 
    assert result["temperature"] == 32

Frameworks — Keep Business Logic Separate

You want to avoid "framework bloat" in your tests. If you need to boot up an entire application context to check a math function, your architecture is too tightly coupled.

# ❌ Needs FastAPI running to test business logic
from fastapi.testclient import TestClient
from main import app
 
def test_discount_calculation():
    client = TestClient(app)
    response = client.post("/api/discount", json={"price": 100, "percent": 20})
    assert response.json()["final_price"] == 80.0
 
# ✅ Test the logic directly — no framework needed
def test_discount_calculation():
    result = calculate_discount(price=100, percent=20)
    assert result == 80.0

Standard Libraries — Don't Mock Everything

You don't need to mock len(), str.split(), or math.sqrt(). They're fast, stable, and deterministic. Only mock things that are slow, non-deterministic, or have side effects.
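A concrete example of "mock only the non-deterministic part": freeze the clock, but leave pure stdlib functions alone. The token_expired function here is hypothetical, invented for illustration:

```python
import time
from unittest.mock import patch

def token_expired(issued_at: float, ttl_seconds: float) -> bool:
    # time.time() is the non-deterministic part -- the only thing worth mocking.
    return time.time() - issued_at > ttl_seconds

def test_token_expired_after_ttl():
    # Pin "now" to a fixed value so the test is deterministic.
    with patch("time.time", return_value=1_000_000.0):
        # Issued 120s "ago" with a 60s TTL -> expired.
        assert token_expired(issued_at=999_880.0, ttl_seconds=60) is True
```

The subtraction and comparison are plain arithmetic and need no mocking at all.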

Target the "Magic 80%" — Not 100%

Don't obsess over 100% code coverage. Aiming for 100% leads to testing trivial things (like getters and setters) while ignoring complex edge cases.

Focus your energy on three areas:

1. Happy Paths

The expected, normal flow through your code.

def test_transfer_money_succeeds():
    sender = Account(balance=1000)
    receiver = Account(balance=500)
 
    transfer(sender, receiver, amount=200)
 
    assert sender.balance == 800
    assert receiver.balance == 700

2. Boundary Conditions

What happens at 0, -1, the maximum allowed value, or an empty list?

def test_transfer_exact_balance_succeeds():
    sender = Account(balance=100)
    receiver = Account(balance=0)
 
    transfer(sender, receiver, amount=100)
 
    assert sender.balance == 0
    assert receiver.balance == 100
 
def test_transfer_zero_amount_does_nothing():
    sender = Account(balance=100)
    receiver = Account(balance=50)
 
    transfer(sender, receiver, amount=0)
 
    assert sender.balance == 100
    assert receiver.balance == 50

3. Error Handling

Does it raise the right exception when things go wrong?

import pytest
 
def test_transfer_insufficient_funds_raises_error():
    sender = Account(balance=50)
    receiver = Account(balance=100)
 
    with pytest.raises(InsufficientFundsError):
        transfer(sender, receiver, amount=200)
 
def test_transfer_negative_amount_raises_error():
    sender = Account(balance=100)
    receiver = Account(balance=50)
 
    with pytest.raises(ValueError):
        transfer(sender, receiver, amount=-50)

Quick Reference: Good vs. Bad Tests

| Feature | Good Unit Test | Bad Unit Test |
| --- | --- | --- |
| Speed | Runs in milliseconds | Slow — waits for I/O |
| Reliability | Deterministic — same result every time | Flaky — randomly fails |
| Scope | Small, focused unit of logic | Crosses multiple layers |
| Maintenance | Easy to update when logic changes | Brittle — breaks on minor refactors |
| Naming | test_login_with_empty_password_returns_error | test1 or test_login |
| Assertions | One logical concept per test | Ten assertions in one test |
| Dependencies | Mocked — no real DB or API | Requires running services |

The Mindset Shift

Writing good tests isn't about achieving a coverage number. It's about building confidence.

Every test you write should answer one question: "If someone changes this code tomorrow, will this test catch it if something breaks?"

If the answer is yes — it's a good test. If the answer is "it'll probably break even if nothing is wrong" — that's a bad test, and you should delete it.

"A test that never fails is worthless. A test that always fails is worthless. A test that fails only when something is actually broken — that's gold."

The goal isn't to have more tests. It's to have the right tests — fast, reliable, focused, and actually useful.
