
Process vs Thread: What Every Developer Should Know

Tags: operating-systems · concurrency · backend · python · go

Every developer hears "process" and "thread" constantly — in job interviews, system design discussions, and error messages. But many people treat them as interchangeable or have a fuzzy understanding of the difference.

This post cuts through the confusion. By the end, you'll know exactly what a process and a thread are, how they differ in memory and communication, and when to use each.

What You'll Learn

✅ What a process is and how the OS manages it
✅ What a thread is and how it relates to a process
✅ How memory is shared (or not) between them
✅ How they communicate
✅ The performance trade-offs of each
✅ Real code examples in Python, Go, and Java
✅ When to use processes vs threads


The Mental Model

Think of a process as a running application. When you open Chrome, that's a process. Open VS Code — another process. Each lives in its own isolated world.

A thread is a unit of execution inside a process. Chrome might have one thread rendering the page, another running JavaScript, and another handling network requests — all inside the same Chrome process.


What Is a Process?

A process is an independent program in execution. The operating system gives each process:

  • Its own virtual memory space (code, stack, heap, data)
  • Its own file descriptors (open files, sockets)
  • Its own CPU registers and program counter
  • A unique Process ID (PID)

Because each process has its own memory space, one process cannot accidentally read or corrupt another process's memory. This isolation is the key characteristic of processes.
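You can see some of these identifiers from Python's standard library. A quick sketch using `os.getpid()` and `os.getppid()`:

```python
import os

# Every running Python interpreter is itself a process with its own PID.
pid = os.getpid()    # this process's ID
ppid = os.getppid()  # the parent that launched it (e.g. your shell or IDE)
print(f"PID: {pid}, parent PID: {ppid}")
```

Run it twice and you'll see a different PID each time — each run is a fresh, isolated process.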

Process lifecycle

A process goes through these states:

| State | Description |
|---|---|
| New | Being created |
| Ready | Waiting to be assigned a CPU |
| Running | Currently executing on a CPU |
| Blocked/Waiting | Waiting for I/O or an event |
| Terminated | Finished execution |

What Is a Thread?

A thread is the smallest unit of execution that the OS scheduler can manage. Every process has at least one thread — the main thread. Additional threads can be created to do work in parallel.

Threads within the same process share:

  • The heap (dynamic memory)
  • Global variables and static data
  • Open file descriptors
  • Code (the executable instructions)

But each thread has its own:

  • Stack (local variables, function call history)
  • Program counter (which instruction it's executing)
  • CPU registers

| Component | Thread 1 | Thread 2 | Thread 3 |
|---|---|---|---|
| Heap | shared | shared | shared |
| Code | shared | shared | shared |
| Global variables | shared | shared | shared |
| Stack | Stack 1 | Stack 2 | Stack 3 |
| Program counter | PC 1 | PC 2 | PC 3 |
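The "shared heap, separate execution" split is easy to observe: three threads append to one shared list, yet each reports a distinct identity. A small sketch using each thread's default name:

```python
import threading

names = []                  # one shared heap object, visible to every thread
lock = threading.Lock()

def record():
    me = threading.current_thread().name  # unique per Thread object
    with lock:
        names.append(me)

threads = [threading.Thread(target=record) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(set(names)))  # 3 — three distinct threads wrote to one shared list
```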

Key Differences Side by Side

| Aspect | Process | Thread |
|---|---|---|
| Memory | Isolated — own address space | Shared within the process |
| Creation cost | High (OS must set up memory) | Low (shares parent memory) |
| Context switch | Expensive (save/restore full state) | Cheaper (shares most state) |
| Communication | IPC (pipes, sockets, shared mem) | Direct (shared memory) |
| Crash impact | Other processes unaffected | Can crash entire process |
| Parallelism | True parallelism across CPUs | Depends on language/GIL |
| Isolation | Strong | Weak |

Memory: The Core Difference

This is the most important distinction. Let's make it concrete.

Processes: Isolated memory

# process_memory.py
import multiprocessing
 
counter = 0  # This is in the parent process's memory
 
def increment():
    global counter
    counter += 1
    print(f"Child process counter: {counter}")  # Prints 1
 
if __name__ == "__main__":
    p = multiprocessing.Process(target=increment)
    p.start()
    p.join()
    print(f"Parent process counter: {counter}")  # Still 0!

Output:

Child process counter: 1
Parent process counter: 0

The child process got a copy of the parent's memory (on Linux via fork's copy-on-write; on Windows and macOS the child re-imports the module in a fresh interpreter). Changes in the child do NOT affect the parent.

Threads: Shared memory

# thread_memory.py
import threading
 
counter = 0  # Shared between all threads
 
def increment():
    global counter
    counter += 1
    print(f"Thread counter: {counter}")
 
threads = [threading.Thread(target=increment) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
 
print(f"Final counter: {counter}")  # 3 (all threads modified same variable)

Output:

Thread counter: 1
Thread counter: 2
Thread counter: 3
Final counter: 3

All threads modify the same counter variable. This is powerful but dangerous — without synchronization, you get race conditions.


Communication

Inter-Process Communication (IPC)

Since processes can't share memory directly, they need explicit IPC mechanisms:

Pipes — a message channel between two processes (note: multiprocessing.Pipe() is bidirectional by default; pass duplex=False for a one-way pipe):

import multiprocessing
 
def producer(conn):
    conn.send("Hello from child process!")
    conn.close()
 
if __name__ == "__main__":
    parent_conn, child_conn = multiprocessing.Pipe()
    p = multiprocessing.Process(target=producer, args=(child_conn,))
    p.start()
    print(parent_conn.recv())  # "Hello from child process!"
    p.join()

Shared memory — explicitly allocated region both processes can access:

import multiprocessing
 
def worker(shared_val):
    shared_val.value += 10
 
if __name__ == "__main__":
    val = multiprocessing.Value('i', 0)  # Shared integer, starts at 0
    p = multiprocessing.Process(target=worker, args=(val,))
    p.start()
    p.join()
    print(val.value)  # 10

Other IPC mechanisms include: sockets, message queues, signals, and memory-mapped files.

Thread Communication

Threads communicate simply by reading/writing shared variables — but you must use locks to prevent race conditions:

import threading
 
counter = 0
lock = threading.Lock()
 
def safe_increment():
    global counter
    with lock:  # Only one thread enters at a time
        counter += 1
 
threads = [threading.Thread(target=safe_increment) for _ in range(1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()
 
print(counter)  # Always 1000 (race-condition free)
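Besides raw locks, the standard library's `queue.Queue` gives threads a safe hand-off channel with the locking built in. A sketch of a minimal producer/consumer pair:

```python
import queue
import threading

q = queue.Queue()
results = []

def consumer():
    while True:
        item = q.get()    # blocks until an item arrives
        if item is None:  # sentinel: stop
            break
        results.append(item * 2)

t = threading.Thread(target=consumer)
t.start()
for i in range(3):
    q.put(i)              # thread-safe: no explicit lock needed
q.put(None)
t.join()

print(results)  # [0, 2, 4]
```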

Performance Trade-offs

Creation cost

Creating a process is significantly more expensive than creating a thread:

import time
import threading
import multiprocessing
 
def do_nothing():
    pass
 
# Benchmark thread creation
start = time.perf_counter()
threads = [threading.Thread(target=do_nothing) for _ in range(1000)]
for t in threads: t.start()
for t in threads: t.join()
thread_time = time.perf_counter() - start
 
# Benchmark process creation
start = time.perf_counter()
processes = [multiprocessing.Process(target=do_nothing) for _ in range(100)]
for p in processes: p.start()
for p in processes: p.join()
process_time = time.perf_counter() - start
 
print(f"1000 threads: {thread_time:.3f}s")
print(f"100 processes: {process_time:.3f}s")
# Threads are ~10-50x cheaper to create

Context switching

When the OS switches which thread/process runs on a CPU:

  • Thread switch: Save and restore registers + stack pointer. Shared memory stays in cache.
  • Process switch: Everything a thread switch does, plus switching to a new address space (loading a new page-table base), which flushes the TLB and evicts hot cache lines. Much more expensive.

Parallelism — the Python GIL

Python (CPython) has the Global Interpreter Lock (GIL), which means only one thread executes Python bytecode at a time — even on multi-core CPUs. (The experimental free-threaded build introduced in Python 3.13 removes the GIL, but it is not the default.) For CPU-bound tasks in Python, threads don't help:

# CPU-bound: threads won't parallelize (GIL)
import threading, time
 
def count_to_million():
    count = 0
    while count < 1_000_000:
        count += 1
 
# Single-threaded
start = time.perf_counter()
count_to_million()
count_to_million()
print(f"Sequential: {time.perf_counter() - start:.2f}s")
 
# Multi-threaded (GIL makes this NOT faster)
start = time.perf_counter()
t1 = threading.Thread(target=count_to_million)
t2 = threading.Thread(target=count_to_million)
t1.start(); t2.start()
t1.join(); t2.join()
print(f"2 threads: {time.perf_counter() - start:.2f}s")  # Same or slower!

For CPU-bound Python code, use multiprocessing (each process has its own GIL):

import multiprocessing, time
 
def count_to_million():
    count = 0
    while count < 1_000_000:
        count += 1
 
start = time.perf_counter()
p1 = multiprocessing.Process(target=count_to_million)
p2 = multiprocessing.Process(target=count_to_million)
p1.start(); p2.start()
p1.join(); p2.join()
print(f"2 processes: {time.perf_counter() - start:.2f}s")  # ~2x faster on multi-core!
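One important nuance: the GIL is released during blocking I/O, so Python threads genuinely overlap I/O-bound work. A sketch with time.sleep standing in for network calls:

```python
import threading
import time

def fake_request():
    time.sleep(0.2)  # the GIL is released while blocked, as with real network I/O

start = time.perf_counter()
threads = [threading.Thread(target=fake_request) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(f"5 overlapping waits: {elapsed:.2f}s")  # ~0.2s total, not 1.0s
```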

Go and Java don't have this limitation — their threads truly run in parallel on multiple CPUs.


Go: Goroutines (Lightweight Threads)

Go takes a different approach with goroutines — extremely lightweight "green threads" managed by the Go runtime (not the OS). You can run millions of goroutines with little overhead.

package main
 
import (
    "fmt"
    "sync"
)
 
func worker(id int, wg *sync.WaitGroup) {
    defer wg.Done()
    fmt.Printf("Worker %d running\n", id)
}
 
func main() {
    var wg sync.WaitGroup
 
    // Spawn 10 goroutines
    for i := 1; i <= 10; i++ {
        wg.Add(1)
        go worker(i, &wg)
    }
 
    wg.Wait()
    fmt.Println("All workers done")
}

Goroutines communicate via channels (instead of shared memory):

package main
 
import "fmt"
 
func produce(ch chan<- int) {
    for i := 0; i < 5; i++ {
        ch <- i  // Send to channel
    }
    close(ch)
}
 
func main() {
    ch := make(chan int)
    go produce(ch)
 
    for val := range ch {  // Receive from channel
        fmt.Println("Received:", val)
    }
}

Go's motto: "Do not communicate by sharing memory; share memory by communicating."


Java: Threads and Virtual Threads

Java uses OS threads by default. Since Java 21, virtual threads (Project Loom) offer lightweight concurrency similar to Go's goroutines:

import java.util.concurrent.Executors;

public class ThreadDemo {
    public static void main(String[] args) throws InterruptedException {
        // Traditional OS thread
        Thread thread = new Thread(() -> {
            System.out.println("Running in OS thread: " + Thread.currentThread());
        });
        thread.start();
        thread.join();

        // Virtual thread (Java 21+) — much cheaper
        Thread vThread = Thread.ofVirtual().start(() -> {
            System.out.println("Running in virtual thread: " + Thread.currentThread());
        });
        vThread.join();

        // Spawn thousands of virtual threads easily
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                int taskId = i;
                executor.submit(() -> System.out.println("Task " + taskId));
            }
        }
    }
}

When to Use Processes vs Threads

Use processes when:

  • Isolation matters — a crash in one worker should not take down others (web servers, microservices)
  • CPU-bound tasks in Python — bypass the GIL
  • Security boundaries — processes can have different permissions
  • Independent tasks — no need for tight communication
  • Running different programs — spawning a shell command, a separate binary

Real examples: Nginx worker processes, Python multiprocessing, Chrome tabs, Celery workers

Use threads when:

  • I/O-bound tasks — waiting on network, disk, database (threads idle cheaply)
  • Shared state is needed — threads naturally share memory
  • Low overhead matters — thread creation is cheap
  • Java/Go/C++/Rust — threads are truly parallel (no GIL)
  • Responsive UIs — keep the main thread free while background threads do work

Real examples: Web server request handlers, background tasks in a GUI app, database connection pools

Decision chart

  • Need crash isolation, a security boundary, or CPU-bound Python work? → Processes
  • Mostly waiting on I/O, or need cheap shared state and low overhead? → Threads

Common Pitfalls

1. Race conditions (threads)

# BROKEN — race condition
import threading
 
balance = 1000
 
def withdraw(amount):
    global balance
    if balance >= amount:
        # Another thread can run here, causing double-spend!
        balance -= amount
 
threads = [threading.Thread(target=withdraw, args=(100,)) for _ in range(20)]
for t in threads: t.start()
for t in threads: t.join()
print(balance)  # Could be negative!

Fix: hold a lock around the check and the update so they execute as one atomic step. (threading.local() is for state each thread should own privately — it does not fix races on genuinely shared data like a balance.)
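Here is the fix applied to the example above — one lock makes the check and the subtraction a single atomic step, so with 20 attempted withdrawals of 100 from 1000, exactly 10 succeed:

```python
import threading

balance = 1000
lock = threading.Lock()

def withdraw(amount):
    global balance
    with lock:                  # check and update are now one atomic step
        if balance >= amount:
            balance -= amount

threads = [threading.Thread(target=withdraw, args=(100,)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)  # 0 — ten withdrawals succeed, ten are refused, never negative
```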

2. Deadlocks (threads)

import threading
 
lock_a = threading.Lock()
lock_b = threading.Lock()
 
def thread1():
    with lock_a:
        with lock_b:  # Thread 1 holds A, waits for B
            print("Thread 1 done")
 
def thread2():
    with lock_b:
        with lock_a:  # Thread 2 holds B, waits for A — DEADLOCK
            print("Thread 2 done")

Fix: always acquire locks in the same order.
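Applying that fix to the example above: both threads take lock_a before lock_b, so neither can hold one lock while waiting on the other.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
done = []

def worker(name):
    with lock_a:      # always A first...
        with lock_b:  # ...then B, in every thread — no circular wait possible
            done.append(name)

t1 = threading.Thread(target=worker, args=("thread 1",))
t2 = threading.Thread(target=worker, args=("thread 2",))
t1.start(); t2.start()
t1.join(); t2.join()

print(done)  # both threads finish; order may vary
```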

3. Zombie processes

A process that has finished but whose parent hasn't called wait() becomes a zombie — it still has an entry in the process table.

import multiprocessing, time
 
def child():
    print("Child done")
 
p = multiprocessing.Process(target=child)
p.start()
# Always call p.join() so the parent reaps the child and avoids a zombie!
p.join()

Summary

Process:

  • Independent program with its own memory
  • Isolated — crashes don't affect other processes
  • Expensive to create, expensive to switch
  • Communicate via IPC (pipes, sockets, shared memory)
  • Best for: crash isolation, CPU-bound Python, security boundaries

Thread:

  • Execution unit inside a process
  • Shares memory with siblings — fast communication, but risk of race conditions
  • Cheap to create, cheaper to switch
  • Communicate via shared memory + locks
  • Best for: I/O-bound tasks, shared state, Java/Go/Rust CPU-bound work

The rule of thumb: if you need isolation, use processes. If you need speed and shared state, use threads.

Understanding this distinction will help you design better systems, debug concurrency bugs faster, and make smarter architecture decisions.
