
Go Goroutines and Concurrency Fundamentals


Goroutines are one of Go's most powerful features, making concurrent programming simple and efficient. In this comprehensive guide, you'll learn how to write concurrent Go programs, understand the runtime scheduler, and build production-ready applications.

What Are Goroutines?

A goroutine is a lightweight thread managed by the Go runtime. Unlike OS threads, goroutines are incredibly cheap to create and use minimal memory.

Key Characteristics:

  • Lightweight: Start with only 2KB of stack space (grows as needed)
  • Cheap to create: Spawn thousands or even millions of goroutines
  • Managed by Go runtime: No manual thread management
  • Communicates via channels: Share memory by communicating

Your First Goroutine

package main
 
import (
    "fmt"
    "time"
)
 
func sayHello() {
    fmt.Println("Hello from goroutine!")
}
 
func main() {
    // Launch goroutine with 'go' keyword
    go sayHello()
 
    // Give goroutine time to execute
    time.Sleep(100 * time.Millisecond)
 
    fmt.Println("Main function")
}

Output:

Hello from goroutine!
Main function

The go keyword launches a new goroutine that runs concurrently with the calling code.


Goroutines vs Threads

Traditional OS Threads

Memory per thread:     ~1-2 MB
Creation cost:         Expensive (system call)
Context switching:     Slow (kernel involvement)
Max threads:           ~10,000 (depends on OS/memory)

Go Goroutines

Memory per goroutine:  ~2 KB initial (grows dynamically)
Creation cost:         Very cheap (user space)
Context switching:     Fast (Go scheduler)
Max goroutines:        Millions (depends on memory)

Example: Creating 10,000 Goroutines

package main
 
import (
    "fmt"
    "sync"
)
 
func main() {
    var wg sync.WaitGroup
 
    for i := 0; i < 10000; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            // Goroutine does some work
            _ = id * 2
        }(i)
    }
 
    wg.Wait()
    fmt.Println("All 10,000 goroutines completed!")
}

This completes in milliseconds on modern hardware. Try doing this with OS threads!


How Goroutines Work: The Go Scheduler

Go uses an M:N scheduler - it multiplexes M goroutines onto N OS threads.

The GMP Model

Components:

  • G (Goroutine): The lightweight execution unit
  • M (Machine): An OS thread
  • P (Processor): A scheduling context (logical CPU)
┌─────────────────────────────────────┐
│         Go Runtime Scheduler         │
├─────────────────────────────────────┤
│  P (Processor 1)    P (Processor 2) │
│      ↓                    ↓          │
│  M (OS Thread 1)    M (OS Thread 2) │
│      ↓                    ↓          │
│  G₁, G₂, G₃...      G₄, G₅, G₆...   │
└─────────────────────────────────────┘

How It Works:

  1. Each P has a local run queue of goroutines
  2. M (OS thread) executes goroutines from P's queue
  3. When a goroutine blocks (I/O, syscall), the M can be detached
  4. Work stealing: Idle P steals goroutines from busy P's queue

GOMAXPROCS: Controlling Parallelism

GOMAXPROCS sets the number of P (processors) available:

package main
 
import (
    "fmt"
    "runtime"
)
 
func main() {
    // Get current GOMAXPROCS
    fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
 
    // Set to 4 processors
    runtime.GOMAXPROCS(4)
 
    // Or use all available CPUs (default behavior)
    runtime.GOMAXPROCS(runtime.NumCPU())
}

Default: GOMAXPROCS = number of CPU cores

When to adjust:

  • Usually, leave the default alone
  • Lower it when running in a container whose CPU quota is smaller than the host's core count (the runtime sizes GOMAXPROCS from the host, not the cgroup limit)
  • Raising it above NumCPU rarely helps - the scheduler already keeps all cores busy

Goroutine Lifecycle

package main
 
import (
    "fmt"
    "time"
)
 
func worker(id int) {
    fmt.Printf("Worker %d: Starting\n", id)
 
    // Simulate work
    time.Sleep(1 * time.Second)
 
    fmt.Printf("Worker %d: Done\n", id)
}
 
func main() {
    fmt.Println("Main: Starting")
 
    // Launch 3 goroutines
    for i := 1; i <= 3; i++ {
        go worker(i)
    }
 
    // Wait for goroutines to finish
    time.Sleep(2 * time.Second)
 
    fmt.Println("Main: Exiting")
}

Output (worker order may vary):

Main: Starting
Worker 1: Starting
Worker 2: Starting
Worker 3: Starting
Worker 1: Done
Worker 2: Done
Worker 3: Done
Main: Exiting

Important: When main() returns, all remaining goroutines are terminated immediately!


Synchronizing Goroutines with WaitGroups

Waiting with time.Sleep() is unreliable - you may wait too long, or not long enough. Use sync.WaitGroup instead:

Basic WaitGroup Usage

package main
 
import (
    "fmt"
    "sync"
    "time"
)
 
func worker(id int, wg *sync.WaitGroup) {
    defer wg.Done() // Decrement counter when done
 
    fmt.Printf("Worker %d: Starting\n", id)
    time.Sleep(1 * time.Second)
    fmt.Printf("Worker %d: Done\n", id)
}
 
func main() {
    var wg sync.WaitGroup
 
    for i := 1; i <= 3; i++ {
        wg.Add(1) // Increment counter
        go worker(i, &wg)
    }
 
    wg.Wait() // Block until counter reaches 0
    fmt.Println("All workers completed")
}

WaitGroup Methods:

  • Add(delta int): Increment counter by delta
  • Done(): Decrement counter by 1 (same as Add(-1))
  • Wait(): Block until counter reaches 0

Common WaitGroup Patterns

Pattern 1: Pass WaitGroup by Pointer

func worker(id int, wg *sync.WaitGroup) {
    defer wg.Done()
    // Do work
}
 
func main() {
    var wg sync.WaitGroup
    wg.Add(5)
 
    for i := 0; i < 5; i++ {
        go worker(i, &wg) // Pass pointer
    }
 
    wg.Wait()
}

Pattern 2: Inline Goroutine with Closure

func main() {
    var wg sync.WaitGroup
 
    for i := 0; i < 5; i++ {
        wg.Add(1)
 
        go func(id int) {
            defer wg.Done()
            fmt.Println("Worker", id)
        }(i) // Pass i as argument to avoid closure issues
    }
 
    wg.Wait()
}

❌ Common Mistake: Capturing Loop Variable (Go < 1.22)

// WRONG before Go 1.22: all goroutines share the same 'i'
for i := 0; i < 5; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done()
        fmt.Println(i) // Race condition in older Go: captures the shared loop variable
    }()
}

// CORRECT: Pass i as argument
for i := 0; i < 5; i++ {
    wg.Add(1)
    go func(id int) {
        defer wg.Done()
        fmt.Println(id)
    }(i) // Copy value
}

Note: Since Go 1.22, each iteration of a for loop gets its own copy of the loop variable, so the first form is no longer a data race. Passing the value explicitly still works on every version and makes the intent obvious.

Concurrency vs Parallelism

Concurrency: dealing with multiple things at once (a property of the design).
Parallelism: doing multiple things at once (a property of the execution).

Concurrency (Not Parallel)

// Single CPU core, time-slicing between goroutines
GOMAXPROCS = 1
 
Timeline:
T1: [G1][G2][G1][G3][G2][G1][G3]
    └── Interleaved execution (concurrent, not parallel)

Parallelism

// Multiple CPU cores, truly simultaneous execution
GOMAXPROCS = 4
 
Timeline:
Core 1: [G1][G1][G1][G1]
Core 2: [G2][G2][G2][G2]
Core 3: [G3][G3][G3][G3]
Core 4: [G4][G4][G4][G4]
    └── Parallel execution

Rob Pike's Quote:

"Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once."

CPU-Bound vs I/O-Bound

CPU-Bound Tasks: Heavy computation (limited by CPU)

func fibonacci(n int) int {
    if n <= 1 {
        return n
    }
    return fibonacci(n-1) + fibonacci(n-2)
}
 
// Benefit from GOMAXPROCS = NumCPU()

I/O-Bound Tasks: Waiting for network, disk, etc. (limited by I/O)

func fetchURL(url string) {
    resp, _ := http.Get(url) // Blocks on network
    defer resp.Body.Close()
    // Process response
}
 
// Can spawn many more goroutines than CPUs
// Go scheduler handles blocking efficiently

Example: I/O-Bound Concurrency

package main
 
import (
    "fmt"
    "io"
    "net/http"
    "sync"
    "time"
)
 
func fetchURL(url string, wg *sync.WaitGroup) {
    defer wg.Done()
 
    start := time.Now()
    resp, err := http.Get(url)
    if err != nil {
        fmt.Printf("Error fetching %s: %v\n", url, err)
        return
    }
    defer resp.Body.Close()
 
    // Read response body
    bytes, _ := io.ReadAll(resp.Body)
 
    fmt.Printf("Fetched %s (%d bytes) in %v\n",
        url, len(bytes), time.Since(start))
}
 
func main() {
    urls := []string{
        "https://golang.org",
        "https://github.com",
        "https://stackoverflow.com",
    }
 
    var wg sync.WaitGroup
    start := time.Now()
 
    for _, url := range urls {
        wg.Add(1)
        go fetchURL(url, &wg)
    }
 
    wg.Wait()
    fmt.Printf("Total time: %v\n", time.Since(start))
}

Sequential vs Concurrent:

  • Sequential: ~3 seconds (the requests add up: 1s + 1s + 1s)
  • Concurrent: ~1 second (bounded by the slowest single request)

Race Conditions: The Enemy of Concurrency

A race condition occurs when multiple goroutines access shared data concurrently, and at least one modifies it.

Example: Counter Race Condition

package main
 
import (
    "fmt"
    "sync"
)
 
var counter int // Shared variable
 
func increment(wg *sync.WaitGroup) {
    defer wg.Done()
    counter++ // NOT thread-safe!
}
 
func main() {
    var wg sync.WaitGroup
 
    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go increment(&wg)
    }
 
    wg.Wait()
    fmt.Println("Counter:", counter) // Expected: 1000, Actual: ???
}

Output (varies):

Counter: 987   // Lost updates due to race condition!

Why? The counter++ operation is NOT atomic:

1. Read counter value
2. Increment value
3. Write back to counter

Multiple goroutines can read the same value before any write, causing lost updates.


Detecting Race Conditions

Go has a built-in race detector:

go run -race main.go
go test -race ./...
go build -race

Example with Race Detector:

$ go run -race main.go
 
==================
WARNING: DATA RACE
Write at 0x00c000012090 by goroutine 7:
  main.increment()
      /path/to/main.go:10 +0x3a
 
Previous write at 0x00c000012090 by goroutine 6:
  main.increment()
      /path/to/main.go:10 +0x3a
==================
Counter: 987
Found 1 data race(s)

Always run tests with -race flag in CI/CD!


Avoiding Race Conditions

Solution 1: Mutex (Mutual Exclusion)

package main
 
import (
    "fmt"
    "sync"
)
 
var (
    counter int
    mu      sync.Mutex // Protects counter
)
 
func increment(wg *sync.WaitGroup) {
    defer wg.Done()
 
    mu.Lock()   // Acquire lock
    counter++   // Critical section
    mu.Unlock() // Release lock
}
 
func main() {
    var wg sync.WaitGroup
 
    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go increment(&wg)
    }
 
    wg.Wait()
    fmt.Println("Counter:", counter) // Always 1000
}

Solution 2: Atomic Operations

package main
 
import (
    "fmt"
    "sync"
    "sync/atomic"
)
 
var counter int64 // atomic ops need fixed-size types (int32/int64); Go 1.19+ also offers atomic.Int64
 
func increment(wg *sync.WaitGroup) {
    defer wg.Done()
    atomic.AddInt64(&counter, 1) // Atomic increment
}
 
func main() {
    var wg sync.WaitGroup
 
    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go increment(&wg)
    }
 
    wg.Wait()
    fmt.Println("Counter:", atomic.LoadInt64(&counter)) // Always 1000
}

When to use each:

  • Mutex: Protecting complex critical sections
  • Atomic: Simple operations (increment, load, store)

Solution 3: Don't Share Memory (Channels)

package main

import "fmt"

// counter owns the count; no other goroutine touches it
func counter(ch <-chan int, done chan<- struct{}) {
	count := 0
	for range ch {
		count++
	}
	fmt.Println("Counter:", count)
	close(done)
}

func main() {
	ch := make(chan int)
	done := make(chan struct{})

	go counter(ch, done)

	for i := 0; i < 1000; i++ {
		ch <- 1
	}

	close(ch)
	<-done // wait for counter to print before main exits
}

Go proverb: "Don't communicate by sharing memory; share memory by communicating."

We'll explore channels in detail in the next post!


Goroutine Leaks: A Common Pitfall

A goroutine leak occurs when goroutines are started but never terminate.

Example: Goroutine Leak

package main
 
import (
    "fmt"
    "time"
)
 
func leakyWorker() {
    for {
        // Infinite loop, goroutine never exits!
        time.Sleep(1 * time.Second)
    }
}
 
func main() {
    for i := 0; i < 10; i++ {
        go leakyWorker() // 10 goroutines that never stop
    }
 
    time.Sleep(5 * time.Second)
    fmt.Println("Main exiting") // the 10 goroutines are still running here
}

Problem: in this toy program main's exit kills the goroutines, but in a long-running process (a server, say) they would run forever, consuming memory and scheduler time with no way to stop them.

Fixing Goroutine Leaks with Context

package main
 
import (
    "context"
    "fmt"
    "time"
)
 
func worker(ctx context.Context, id int) {
    for {
        select {
        case <-ctx.Done():
            fmt.Printf("Worker %d: Stopping\n", id)
            return
        default:
            // Do work
            time.Sleep(500 * time.Millisecond)
        }
    }
}
 
func main() {
    ctx, cancel := context.WithCancel(context.Background())
 
    for i := 0; i < 10; i++ {
        go worker(ctx, i)
    }
 
    time.Sleep(2 * time.Second)
 
    fmt.Println("Main: Canceling workers")
    cancel() // Signal all goroutines to stop
 
    time.Sleep(1 * time.Second)
    fmt.Println("Main: Exiting")
}

Always ensure goroutines can be stopped gracefully!


Real-World Example: Concurrent Web Scraper

Let's build a practical concurrent web scraper:

package main
 
import (
    "fmt"
    "io"
    "net/http"
    "sync"
    "time"
)
 
type Result struct {
    URL    string
    Status int
    Size   int
    Err    error
}
 
func fetch(url string, results chan<- Result, wg *sync.WaitGroup) {
    defer wg.Done()
 
    start := time.Now()
    resp, err := http.Get(url)
    if err != nil {
        results <- Result{URL: url, Err: err}
        return
    }
    defer resp.Body.Close()
 
    body, err := io.ReadAll(resp.Body)
    if err != nil {
        results <- Result{URL: url, Status: resp.StatusCode, Err: err}
        return
    }
 
    results <- Result{
        URL:    url,
        Status: resp.StatusCode,
        Size:   len(body),
    }
 
    fmt.Printf("Fetched %s in %v\n", url, time.Since(start))
}
 
func main() {
    urls := []string{
        "https://golang.org",
        "https://github.com",
        "https://stackoverflow.com",
        "https://reddit.com",
        "https://news.ycombinator.com",
    }
 
    results := make(chan Result, len(urls))
    var wg sync.WaitGroup
 
    // Launch goroutines
    for _, url := range urls {
        wg.Add(1)
        go fetch(url, results, &wg)
    }
 
    // Close results channel when all done
    go func() {
        wg.Wait()
        close(results)
    }()
 
    // Collect results
    for result := range results {
        if result.Err != nil {
            fmt.Printf("❌ %s: %v\n", result.URL, result.Err)
        } else {
            fmt.Printf("✅ %s: %d (%d bytes)\n",
                result.URL, result.Status, result.Size)
        }
    }
}

Features:

  • Concurrent fetching with goroutines
  • Results collected via channel
  • WaitGroup ensures all goroutines complete
  • Graceful error handling

Best Practices for Goroutines

1. Always Know When Goroutines Will Stop

// ❌ BAD: Goroutine may leak
go func() {
    for {
        doWork()
    }
}()
 
// ✅ GOOD: Goroutine can be stopped
go func(ctx context.Context) {
    for {
        select {
        case <-ctx.Done():
            return
        default:
            doWork()
        }
    }
}(ctx)

2. Use WaitGroups for Synchronization

// ❌ BAD: Using sleep
go worker()
time.Sleep(1 * time.Second) // Unreliable!
 
// ✅ GOOD: Using WaitGroup
var wg sync.WaitGroup
wg.Add(1)
go func() {
    defer wg.Done()
    worker()
}()
wg.Wait() // Reliable

3. Avoid Capturing Loop Variables (Go < 1.22)

// ❌ BAD before Go 1.22: all goroutines share the same 'i'
for i := 0; i < 10; i++ {
    go func() {
        fmt.Println(i)
    }()
}

// ✅ GOOD: Pass as argument (safe on every Go version)
for i := 0; i < 10; i++ {
    go func(id int) {
        fmt.Println(id)
    }(i)
}

4. Limit Goroutine Count for Resource-Intensive Tasks

// Worker pool pattern (covered in next post)
const maxWorkers = 10
semaphore := make(chan struct{}, maxWorkers)
 
for _, task := range tasks {
    semaphore <- struct{}{} // Acquire
    go func(t Task) {
        defer func() { <-semaphore }() // Release
        processTask(t)
    }(task)
}

5. Always Run Tests with Race Detector

go test -race ./...

Performance Considerations

Goroutine Creation Overhead

Creating goroutines is cheap but not free:

func BenchmarkGoroutineCreation(b *testing.B) {
    for i := 0; i < b.N; i++ {
        go func() {}()
    }
}
 
// Typical result: on the order of a microsecond or less per goroutine (hardware-dependent)

When to avoid creating too many:

  • Very short-lived tasks (< 1 microsecond)
  • Tight loops with millions of iterations
  • Use worker pools instead

Memory Usage

Each goroutine uses ~2KB of stack initially:

1,000 goroutines    = ~2 MB
10,000 goroutines   = ~20 MB
100,000 goroutines  = ~200 MB
1,000,000 goroutines = ~2 GB

Monitor with:

fmt.Println("Goroutines:", runtime.NumGoroutine())

Common Goroutine Patterns

Pattern 1: Fire-and-Forget

go logToFile(data) // Don't wait for completion

Pattern 2: Fan-Out (Distribute Work)

for _, task := range tasks {
    go processTask(task)
}

Pattern 3: Fan-In (Collect Results)

results := make(chan Result)
for _, url := range urls {
    go fetch(url, results)
}

Pattern 4: Worker Pool

const numWorkers = 5
jobs := make(chan Job, 100)
 
for i := 0; i < numWorkers; i++ {
    go worker(jobs)
}

We'll explore these patterns in depth in future posts!


Debugging Goroutines

1. Check Goroutine Count

package main
 
import (
    "fmt"
    "runtime"
    "time"
)
 
func main() {
    fmt.Println("Start goroutines:", runtime.NumGoroutine())
 
    for i := 0; i < 10; i++ {
        go func() {
            time.Sleep(1 * time.Second)
        }()
    }
 
    time.Sleep(100 * time.Millisecond)
    fmt.Println("Running goroutines:", runtime.NumGoroutine())
 
    time.Sleep(2 * time.Second)
    fmt.Println("End goroutines:", runtime.NumGoroutine())
}

2. Stack Traces

Get all goroutine stack traces:

import (
    "os"
    "runtime/pprof"
)
 
func dumpGoroutines() {
    pprof.Lookup("goroutine").WriteTo(os.Stdout, 1)
}

Or send SIGQUIT to running process:

kill -QUIT <pid>

Summary and Key Takeaways

Goroutines are lightweight threads managed by the Go runtime
Use the go keyword to launch goroutines
WaitGroups synchronize goroutine completion
Race conditions occur when goroutines access shared data unsafely
Use -race flag to detect race conditions
Mutexes and atomic operations protect shared state
GOMAXPROCS controls parallelism (default = NumCPU)
Concurrency ≠ Parallelism: Concurrency is design, parallelism is execution
Goroutine leaks happen when goroutines never terminate
Use context.Context to cancel goroutines gracefully
Monitor goroutine count with runtime.NumGoroutine()


What's Next?

Now that you understand goroutines, you're ready to learn Channels - Go's way of communicating between goroutines:

Next Post: Channels and Communication (GO-9)

Topics:

  • Channel basics (buffered vs unbuffered)
  • Select statement for multiplexing
  • Channel patterns (fan-out, fan-in, pipelines)
  • Worker pools
  • Context for cancellation
  • When to use channels vs mutexes

Practice Exercises

Exercise 1: Parallel Sum

Write a program that sums an array using multiple goroutines:

func parallelSum(numbers []int, numWorkers int) int {
    // TODO: Divide array into chunks
    // TODO: Sum each chunk in a goroutine
    // TODO: Combine results
}

Exercise 2: Concurrent File Processor

Process multiple files concurrently:

func processFiles(filenames []string) []Result {
    // TODO: Read and process each file in a goroutine
    // TODO: Collect results
}

Exercise 3: Rate-Limited API Caller

Make concurrent API calls with rate limiting:

func fetchURLs(urls []string, maxConcurrent int) []Response {
    // TODO: Limit to maxConcurrent goroutines at a time
    // TODO: Fetch all URLs
}


Additional Resources

Books:

  • "Concurrency in Go" by Katherine Cox-Buday
  • "Go in Action" by William Kennedy

Questions or feedback? Let me know in the comments below!

Happy concurrent programming! 🚀
