Event-Driven Architecture: Messages, Events & Async Communication

In traditional request/response architectures, components call each other directly and wait for a reply. Service A calls Service B, which calls Service C, and the entire chain must be available right now. This creates tight coupling, cascading failures, and systems that break under load.
Event-driven architecture flips this model. Instead of telling other components what to do, a component announces what happened. Other components react to these announcements independently, at their own pace, without the producer needing to know they exist.
This decoupling is what makes event-driven systems resilient, scalable, and extensible — but it also introduces challenges around eventual consistency, error handling, and debugging that require careful design.
In this post, we'll cover:
✅ Events vs commands vs queries — the fundamental building blocks
✅ Event notification vs event-carried state transfer
✅ Message brokers: Kafka, RabbitMQ, Redis Streams
✅ Pub/sub vs point-to-point messaging
✅ Event choreography vs orchestration
✅ Eventual consistency and compensation
✅ Idempotency and exactly-once processing
✅ Event schema evolution and versioning
✅ Dead letter queues and error handling
✅ Event storming as a design technique
✅ Practical implementation with Kafka and Spring Boot
Events vs Commands vs Queries
Before diving into architecture, let's clarify three fundamental message types. Getting this distinction right is critical — mixing them up leads to tangled, confusing systems.
Events: "Something Happened"
An event is a notification that something occurred in the past. It's immutable, factual, and named in past tense.
OrderPlaced ← An order was placed
PaymentProcessed ← A payment was processed
UserRegistered ← A user registered
InventoryDepleted ← Inventory reached zero
Key properties of events:
- Immutable — you can't change the past; events are facts
- Past tense — named after what already happened
- Producer doesn't care who's listening — zero, one, or fifty consumers
- No expectation of response — fire and forget
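These properties map naturally onto an immutable value type. As a minimal sketch (the names are illustrative, not from any framework), an event can be modeled as a Java record named in past tense:

```java
import java.time.Instant;
import java.util.UUID;

// Immutable fact, named in past tense: an order was placed.
// Records have no setters, so the "past" cannot be rewritten.
record OrderPlaced(String eventId, String orderId, Instant occurredAt) {

    // Factory that stamps the event with a unique ID and a timestamp
    static OrderPlaced of(String orderId) {
        return new OrderPlaced(UUID.randomUUID().toString(), orderId, Instant.now());
    }
}

class EventDemo {
    public static void main(String[] args) {
        OrderPlaced event = OrderPlaced.of("order-789");
        System.out.println(event.orderId()); // order-789
    }
}
```

The unique `eventId` also becomes the idempotency key later on, which is one reason to stamp it at creation time.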
Commands: "Do This"
A command is a request for someone to do something. It's directed at a specific receiver, named in imperative form, and expects a result.
PlaceOrder ← "Hey Order Service, place this order"
ProcessPayment ← "Hey Payment Service, charge this card"
SendEmail ← "Hey Email Service, send this email"
ReserveInventory ← "Hey Inventory Service, hold these items"
Key properties of commands:
- Directed — sent to a specific handler
- Imperative — named as instructions
- Can be rejected — the handler may refuse
- Exactly one handler — not broadcast to multiple consumers
Queries: "Tell Me"
A query is a request for information. It doesn't change state — it's a pure read.
GetOrderDetails ← "What are the details of order 123?"
ListUserOrders ← "What orders does this user have?"
CheckInventory ← "How many widgets are in stock?"
Why This Matters
The confusion happens when developers use events like commands or vice versa:
When you publish PleaseProcessPayment, you've created a command disguised as an event. The Order Service is still telling the Payment Service what to do — just indirectly. When you publish OrderPlaced, the Order Service simply states a fact. The Payment Service decides on its own to react by processing a payment.
This distinction drives loose coupling: the producer doesn't know or care what happens next.
Event Patterns: Notification vs State Transfer
Not all events carry the same amount of information. There are two fundamentally different approaches to what goes inside an event.
Event Notification
An event notification is thin — it tells you something happened and provides just enough information to identify the subject. If you need details, you call back.
{
"type": "OrderPlaced",
"orderId": "order-789",
"timestamp": "2026-03-02T10:30:00Z"
}
The consumer receives this event and thinks: "An order was placed. Let me call the Order Service API to get the details I need."
Pros:
- Events are small and fast to publish
- Source of truth stays in the originating service
- No risk of stale data in the event
Cons:
- Creates runtime coupling — the consumer must call back to the producer
- Producer must be available when the consumer processes the event
- More network calls
Event-Carried State Transfer
An event with state transfer is fat — it carries all the data the consumers need, so they don't have to call back.
{
"type": "OrderPlaced",
"orderId": "order-789",
"userId": "user-123",
"items": [
{"productId": "prod-1", "name": "Widget", "quantity": 2, "price": 29.99},
{"productId": "prod-2", "name": "Gadget", "quantity": 1, "price": 49.99}
],
"totalAmount": 109.97,
"shippingAddress": {
"street": "123 Main St",
"city": "Springfield",
"zipCode": "62704"
},
"timestamp": "2026-03-02T10:30:00Z"
}
Pros:
- No runtime coupling — consumers are fully autonomous
- Producer can be offline when consumers process the event
- Consumers can build local read models (CQRS pattern)
- Fewer network calls
Cons:
- Larger events consume more bandwidth and storage
- Data may become stale if the event was published hours ago
- Schema evolution is more complex with larger payloads
Which to Choose?
| Scenario | Notification | State Transfer |
|---|---|---|
| Consumer needs 1-2 fields | ✅ | Over-engineered |
| Consumer needs full entity | Extra call | ✅ |
| Producer uptime guaranteed | ✅ | ✅ |
| Producer may be offline | Risky | ✅ |
| Building local read models | ❌ | ✅ |
| Bandwidth constrained | ✅ | Heavy |
Practical guideline: Start with event-carried state transfer for events that cross service boundaries. Use thin notifications for internal domain events within a single service.
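To see concretely what consumer autonomy buys you with fat events, here's a local read model built purely from event payloads, with no callback to the producer. This is a minimal plain-Java sketch (no broker; class and field names are illustrative, and amounts are in cents to keep arithmetic exact):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OrderSummaryProjection {

    // Fat event: carries everything this consumer needs (amounts in cents)
    record OrderPlaced(String orderId, String userId, long totalAmountCents) {}

    // Local read model, built entirely from events received so far
    private final Map<String, Long> totalSpentByUser = new HashMap<>();

    void on(OrderPlaced event) {
        totalSpentByUser.merge(event.userId(), event.totalAmountCents(), Long::sum);
    }

    long totalSpentCents(String userId) {
        return totalSpentByUser.getOrDefault(userId, 0L);
    }

    public static void main(String[] args) {
        OrderSummaryProjection projection = new OrderSummaryProjection();
        List.of(
            new OrderPlaced("order-1", "user-123", 10997),
            new OrderPlaced("order-2", "user-123", 2000)
        ).forEach(projection::on);
        System.out.println(projection.totalSpentCents("user-123")); // 12997
    }
}
```

With thin notifications, the same projection would need a synchronous lookup per event, reintroducing the runtime coupling the table above warns about.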
Message Brokers
Events need infrastructure to travel from producers to consumers. A message broker is the middleware that stores, routes, and delivers messages between services.
Apache Kafka
Kafka is a distributed event streaming platform designed for high-throughput, fault-tolerant, durable event processing.
Key concepts:
- Topics — named categories of events (like order-events, user-events)
- Partitions — a topic is split into partitions for parallelism; events with the same key go to the same partition (ordering guarantee)
- Consumer groups — every group independently receives the full stream; within a group, each partition is consumed by exactly one member
- Retention — events are stored for a configurable duration (days, weeks, or forever with compaction)
- Offsets — each consumer tracks its position; can replay from any point
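The per-key ordering guarantee falls out of how keys map to partitions. Here's a simplified sketch of the idea (Kafka's real partitioner uses murmur2 hashing, but the principle is `hash(key) mod partitionCount`):

```java
class PartitionRouter {

    // Simplified stand-in for Kafka's default partitioner (the real one
    // uses murmur2 on the serialized key, not String.hashCode)
    static int partitionFor(String key, int partitionCount) {
        // Math.floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        int first = partitionFor("order-789", 12);
        int second = partitionFor("order-789", 12);
        // Same key always lands on the same partition, so all events
        // for one order are consumed in the order they were produced
        System.out.println(first == second); // true
    }
}
```

This is also why changing the partition count of a live topic breaks key-to-partition affinity: the modulus changes, so existing keys may route to new partitions.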
Kafka is best for:
- High-throughput event streams (millions of events/second)
- Event sourcing and replay
- Building real-time data pipelines
- Cases where event ordering matters
// Kafka producer — Spring Boot
@Service
public class OrderEventPublisher {
private final KafkaTemplate<String, OrderEvent> kafkaTemplate;
public void publishOrderPlaced(Order order) {
OrderPlacedEvent event = new OrderPlacedEvent(
UUID.randomUUID().toString(), // eventId
order.getId(),
order.getUserId(),
order.getItems(),
order.getTotalAmount(),
Instant.now()
);
// Key = orderId → all events for same order go to same partition
kafkaTemplate.send("order-events", order.getId(), event);
}
}
// Kafka consumer — Spring Boot
@Component
public class PaymentEventConsumer {
@KafkaListener(topics = "order-events", groupId = "payment-service")
public void handleOrderPlaced(OrderPlacedEvent event) {
log.info("Processing payment for order: {}", event.getOrderId());
paymentService.processPayment(
event.getUserId(),
event.getTotalAmount(),
event.getOrderId()
);
}
}
RabbitMQ
RabbitMQ is a traditional message broker built on the AMQP protocol. It focuses on message routing, acknowledgment, and delivery guarantees.
Key concepts:
- Exchanges — receive messages and route them to queues based on rules
- Queues — store messages until consumed
- Bindings — rules that connect exchanges to queues (routing keys, patterns)
- Acknowledgments — consumers confirm processing; unacknowledged messages are redelivered
- Exchange types — direct (exact routing key), fanout (broadcast), topic (pattern matching), headers
RabbitMQ is best for:
- Task queues with complex routing requirements
- Request/reply patterns
- Priority queues
- When you need message-level acknowledgments
- Lower throughput but richer routing
Kafka vs RabbitMQ
| Feature | Kafka | RabbitMQ |
|---|---|---|
| Model | Distributed log | Message queue |
| Throughput | Very high (millions/sec) | High (tens of thousands/sec) |
| Message retention | Configurable (days/weeks/forever) | Until consumed |
| Replay | Yes (re-read from any offset) | No (consumed = gone) |
| Ordering | Per partition | Per queue |
| Routing | Topic + partition key | Exchange + routing key + binding |
| Consumer model | Pull (consumers poll) | Push (broker delivers) |
| Best for | Event streaming, event sourcing | Task queues, complex routing |
Redis Streams
Redis Streams is a lightweight alternative built into Redis. It's ideal for simpler event-driven patterns without the operational overhead of Kafka or RabbitMQ.
# Producer — add event to stream
XADD order-events * type OrderPlaced orderId order-789 amount 109.97
# Consumer — read from stream (consumer group)
XREADGROUP GROUP payment-service consumer-1 COUNT 10 BLOCK 5000 STREAMS order-events >
Redis Streams is best for:
- Simple event patterns with low-to-moderate throughput
- When you already use Redis for caching
- Prototyping event-driven systems before committing to Kafka
Pub/Sub vs Point-to-Point
Two fundamental messaging patterns determine how messages flow from producers to consumers.
Point-to-Point (Queue)
Each message is delivered to exactly one consumer. Multiple consumers compete for messages from the same queue (competing consumers pattern).
Use cases:
- Task distribution — spread work across workers
- Email sending — each email sent once by one worker
- Image processing — each image processed by one worker
- Any job where duplicate processing is wasteful
Publish/Subscribe (Pub/Sub)
Each message is delivered to all subscribers. Every consumer group gets a copy of every message.
Use cases:
- Event broadcasting — multiple systems need the same event
- OrderPlaced → payment, inventory, notification, analytics all react
- Real-time updates — multiple dashboards need the same data
- Audit logging — every event is recorded for compliance
Kafka's Hybrid Model
Kafka elegantly combines both patterns using consumer groups:
- Between groups → pub/sub — both Payment and Notification groups get every event
- Within a group → point-to-point — each partition is consumed by exactly one member of the group
This means you get pub/sub (multiple services react) with built-in load balancing (multiple instances of each service share the work).
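A toy sketch of the hybrid model (illustrative only; real Kafka assigns partitions via pluggable assignors such as range or cooperative-sticky, not simple round-robin): every group sees every partition, but within a group each partition has exactly one owner:

```java
import java.util.List;

class ConsumerGroupSketch {

    // Within one group, each partition is owned by exactly one member
    // (round-robin here for simplicity)
    static String ownerOf(int partition, List<String> groupMembers) {
        return groupMembers.get(partition % groupMembers.size());
    }

    public static void main(String[] args) {
        List<String> paymentGroup = List.of("payment-1", "payment-2");
        List<String> notificationGroup = List.of("notif-1");

        for (int partition = 0; partition < 4; partition++) {
            // Both groups see every partition (pub/sub between groups),
            // but inside each group only one member consumes it (point-to-point)
            System.out.printf("partition %d: payment -> %s, notification -> %s%n",
                partition,
                ownerOf(partition, paymentGroup),
                ownerOf(partition, notificationGroup));
        }
    }
}
```

Adding a second instance of the notification service would split its four partitions two-and-two, scaling consumption without any producer change.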
Choreography vs Orchestration
When a business process spans multiple services, you need to coordinate the steps. There are two approaches: choreography (decentralized, event-driven) and orchestration (centralized, command-driven).
Choreography: Dance Without a Director
In choreography, each service listens for events and independently decides what to do next. There's no central coordinator — services react to each other like dancers following the music.
// Choreography — each service reacts independently
// Order Service publishes event
@Service
public class OrderService {
public Order createOrder(CreateOrderCommand cmd) {
Order order = orderRepository.save(Order.create(cmd));
eventPublisher.publish(new OrderPlacedEvent(order));
return order;
}
@EventListener
public void onPaymentProcessed(PaymentProcessedEvent event) {
Order order = orderRepository.findById(event.getOrderId());
order.confirm();
orderRepository.save(order);
eventPublisher.publish(new OrderConfirmedEvent(order));
}
}
// Inventory Service reacts to OrderPlaced
@Component
public class InventoryEventHandler {
@KafkaListener(topics = "order-events")
public void onOrderPlaced(OrderPlacedEvent event) {
inventoryService.reserve(event.getItems());
eventPublisher.publish(new InventoryReservedEvent(event.getOrderId()));
}
}
// Payment Service reacts to InventoryReserved
@Component
public class PaymentEventHandler {
@KafkaListener(topics = "inventory-events")
public void onInventoryReserved(InventoryReservedEvent event) {
paymentService.charge(event.getOrderId());
eventPublisher.publish(new PaymentProcessedEvent(event.getOrderId()));
}
}
Pros:
- Loose coupling — services don't know about each other
- Easy to add new consumers without changing existing services
- No single point of failure (no central coordinator)
Cons:
- Hard to see the full flow — logic is scattered across services
- Difficult to track where a process is stuck
- Circular event chains can create infinite loops
- Compensating transactions are complex
Orchestration: Follow the Conductor
In orchestration, a central orchestrator (saga coordinator) explicitly calls each service in sequence, handling the flow logic in one place.
// Orchestration — saga defines the entire flow
@Component
public class CreateOrderSaga {
public void execute(CreateOrderCommand command) {
try {
// Step 1: Create order
Order order = orderService.createOrder(command);
// Step 2: Reserve inventory
inventoryService.reserve(order.getItems());
// Step 3: Process payment
Payment payment = paymentService.charge(
order.getUserId(), order.getTotalAmount()
);
// Step 4: Confirm order
orderService.confirm(order.getId(), payment.getId());
// Step 5: Send notification
notificationService.sendOrderConfirmation(order);
} catch (InventoryException e) {
// Compensate: cancel the order
orderService.cancel(command.getOrderId());
throw e;
} catch (PaymentException e) {
// Compensate: release inventory, cancel order
inventoryService.release(command.getItems());
orderService.cancel(command.getOrderId());
throw e;
}
}
}
Pros:
- Full visibility — the entire flow is in one place
- Easy to add error handling and compensation
- Easy to track the current state of a process
- Straightforward testing
Cons:
- Central coordinator is a potential single point of failure
- Tighter coupling — the orchestrator knows about all services
- Can become a "God object" as the flow grows
When to Use Which
| Factor | Choreography | Orchestration |
|---|---|---|
| Number of steps | 2-3 steps | 4+ steps |
| Error handling complexity | Simple | Complex |
| Need for process visibility | Low | High |
| Team autonomy | Services evolve independently | Orchestrator team coordinates |
| Adding new steps | Easy (add new consumer) | Requires orchestrator change |
| Debugging | Hard (trace events across services) | Easy (check orchestrator state) |
Practical guideline: Use choreography for simple, loosely-coupled event flows (e.g., notifications, analytics). Use orchestration for business-critical multi-step processes where visibility and error handling matter (e.g., order fulfillment, payment workflows).
Eventual Consistency and Compensation
In event-driven systems, data is eventually consistent — different services may have different views of the truth at any given moment. This is a fundamental trade-off you must embrace, not fight.
Why Not Strong Consistency?
In a monolith with a single database, you get ACID transactions:
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
-- Both updates succeed or both fail. Always consistent.
In a distributed system with separate databases per service, you can't do this across service boundaries. You have two options:
- Distributed transactions (2PC) — coordinate a commit across databases. Doesn't scale, blocks on failures, avoided in practice.
- Eventual consistency with compensation — each service commits locally, and if something goes wrong downstream, you compensate (undo) the earlier steps.
Compensation: Undoing What Was Done
When a step in a saga fails, you need to undo the previous steps. These undo operations are called compensating transactions.
Important: Compensating transactions are not always true reversals. You can't "un-send" an email or "un-charge" a credit card instantly. Compensations are semantic inverses — sending a cancellation email, issuing a refund, etc.
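At runtime, an orchestrator can track compensations as a stack and unwind it in reverse on failure. This is a minimal, framework-free sketch (step names are illustrative):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class CompensationSketch {

    // Runs the saga, recording each action; on failure, unwinds the
    // compensation stack in reverse order. Returns the action log.
    static List<String> runSaga() {
        List<String> log = new ArrayList<>();
        Deque<Runnable> compensations = new ArrayDeque<>();
        try {
            log.add("createOrder");
            compensations.push(() -> log.add("cancelOrder"));

            log.add("reserveInventory");
            compensations.push(() -> log.add("releaseInventory"));

            log.add("chargePayment");
            throw new IllegalStateException("card declined"); // payment step fails

        } catch (IllegalStateException e) {
            // Semantic inverses run newest-first: release inventory, then cancel order
            while (!compensations.isEmpty()) {
                compensations.pop().run();
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runSaga());
        // [createOrder, reserveInventory, chargePayment, releaseInventory, cancelOrder]
    }
}
```

Pushing a compensation only after its forward step succeeds guarantees you never try to undo work that was never done.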
// Compensation example
public class CompensatingActions {
// Original action // Compensation
// createOrder() → cancelOrder()
// reserveInventory() → releaseInventory()
// chargePayment() → refundPayment()
// sendConfirmationEmail() → sendCancellationEmail()
// awardLoyaltyPoints() → revokeLoyaltyPoints()
}
Designing for Eventual Consistency
UI patterns:
- Show "Processing..." states instead of immediate confirmations
- Use optimistic UI updates with background sync
- Display order status as a timeline: Placed → Confirmed → Shipped
Data patterns:
- Accept that different services have temporarily different views
- Use timestamps and versioning to detect stale data
- Design idempotent operations (next section)
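The versioning pattern can be sketched in a few lines (plain Java; names are illustrative): apply an incoming update only if it's newer than the local copy, so stale or out-of-order events don't overwrite fresher data:

```java
class VersionedUpdate {
    record ProductView(String productId, String name, long version) {}

    // Local copy, currently at version 3
    private ProductView current = new ProductView("prod-1", "Widget", 3);

    // Apply only if the incoming view is strictly newer; stale events are ignored
    boolean apply(ProductView incoming) {
        if (incoming.version() <= current.version()) {
            return false; // stale or duplicate event, keep what we have
        }
        current = incoming;
        return true;
    }

    public static void main(String[] args) {
        VersionedUpdate store = new VersionedUpdate();
        System.out.println(store.apply(new ProductView("prod-1", "Widget v2", 5))); // true
        System.out.println(store.apply(new ProductView("prod-1", "Widget v1", 4))); // false: older
    }
}
```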
Idempotency: Processing Messages Safely
In distributed systems, messages can be delivered more than once. A network timeout might cause a retry even though the first attempt succeeded. A consumer crash after processing but before acknowledging means the message is redelivered.
Idempotency means processing the same message twice produces the same result as processing it once.
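A broker-free sketch of the idea (an in-memory processed-ID set standing in for a durable database table, which a real consumer would need):

```java
import java.util.HashSet;
import java.util.Set;

class IdempotentConsumer {

    // Stands in for a processed_events table; in production this must be durable
    // and committed in the same transaction as the side effect
    private final Set<String> processedEventIds = new HashSet<>();
    double chargedTotal = 0.0;

    void handle(String eventId, double amount) {
        if (!processedEventIds.add(eventId)) {
            return; // already processed: a redelivery is a no-op
        }
        chargedTotal += amount; // side effect runs at most once per eventId
    }

    public static void main(String[] args) {
        IdempotentConsumer consumer = new IdempotentConsumer();
        // Same message delivered three times (e.g. after a timeout-triggered retry)
        consumer.handle("evt-1", 109.97);
        consumer.handle("evt-1", 109.97);
        consumer.handle("evt-1", 109.97);
        System.out.println(consumer.chargedTotal); // 109.97
    }
}
```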
Why Messages Get Delivered Multiple Times
- Producer retries — a network timeout triggers a retry even though the first publish actually succeeded
- Consumer crashes — the consumer processes a message but dies before acknowledging it, so the broker redelivers
- Rebalancing — when consumers join or leave a group, in-flight messages may be reprocessed by the new owner
Strategies for Idempotency
Strategy 1: Idempotency Key (Most Common)
Include a unique event ID. Track which events have already been processed.
@KafkaListener(topics = "order-events")
public void handleOrderPlaced(OrderPlacedEvent event) {
// Check if already processed
if (processedEventRepository.existsById(event.getEventId())) {
log.info("Event {} already processed, skipping", event.getEventId());
return;
}
// Process the event
paymentService.charge(event.getUserId(), event.getTotalAmount());
// Mark as processed (must be in same transaction as the side effect)
processedEventRepository.save(new ProcessedEvent(event.getEventId()));
}
Strategy 2: Natural Idempotency
Design operations that are naturally idempotent — running them twice has the same effect as running once.
// ❌ NOT idempotent — running twice adds 100 twice
account.setBalance(account.getBalance() + 100);
// ✅ Idempotent — running twice has the same result
account.setBalance(500); // Set to absolute value
// ✅ Idempotent — conditional update
UPDATE orders SET status = 'CONFIRMED'
WHERE id = ? AND status = 'PENDING'; -- Only updates if still PENDING
Strategy 3: Database Constraints
Use unique constraints to prevent duplicate processing at the database level.
-- Unique constraint prevents double-processing
CREATE TABLE processed_payments (
order_id VARCHAR(255) PRIMARY KEY, -- Can't insert twice
payment_id VARCHAR(255) NOT NULL,
processed_at TIMESTAMP DEFAULT NOW()
);
-- Insert will fail on duplicate
INSERT INTO processed_payments (order_id, payment_id)
VALUES ('order-789', 'payment-456');
At-Least-Once vs At-Most-Once vs Exactly-Once
| Guarantee | Meaning | Risk | When to Use |
|---|---|---|---|
| At-most-once | Deliver once, never retry | Message loss | Metrics, non-critical analytics |
| At-least-once | Retry until acknowledged | Duplicates | Most business events (with idempotency) |
| Exactly-once | Deliver exactly once | Expensive | Financial transactions (Kafka transactions) |
Practical reality: Most systems use at-least-once delivery with idempotent consumers. True exactly-once is expensive and complex — Kafka supports it within Kafka transactions, but cross-system exactly-once requires careful design.
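The difference between the guarantees comes down to when you acknowledge. A toy sketch of at-least-once (ack only after processing succeeds), showing where duplicates come from; the crash is simulated by a flag:

```java
import java.util.ArrayDeque;
import java.util.Queue;

class AtLeastOnceSketch {

    // At-least-once: remove (ack) the message only AFTER processing succeeds.
    // A crash between processing and ack leaves the message on the broker,
    // so it is redelivered, which is exactly why consumers must be idempotent.
    static int timesProcessed(boolean crashBeforeAck) {
        Queue<String> broker = new ArrayDeque<>();
        broker.add("evt-1");
        int processedCount = 0;

        broker.peek();                // deliver without removing
        processedCount++;             // side effect happens here
        if (!crashBeforeAck) {
            broker.poll();            // ack: now safe to remove
        }
        // On restart, anything still queued is delivered and processed again
        processedCount += broker.size();
        return processedCount;
    }

    public static void main(String[] args) {
        System.out.println(timesProcessed(false)); // 1: processed once, then acked
        System.out.println(timesProcessed(true));  // 2: crash before ack, duplicate
    }
}
```

At-most-once is the mirror image: ack first, process second, and a crash loses the message instead of duplicating it.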
Event Schema Evolution
Events are contracts between producers and consumers. As your system evolves, event schemas change. Handling these changes without breaking consumers is critical.
The Problem
// Version 1: OrderPlaced event
{
"orderId": "order-789",
"amount": 109.97
}
// Version 2: Added currency field
{
"orderId": "order-789",
"amount": 109.97,
"currency": "USD" ← New field
}
// Version 3: Renamed amount → totalAmount
{
"orderId": "order-789",
"totalAmount": 109.97, ← Renamed!
"currency": "USD"
}
If old consumers expect amount but new events have totalAmount, they break.
Schema Compatibility Rules
Backward compatible (safe to deploy consumers first):
- ✅ Adding a new optional field
- ✅ Adding a new field with a default value
- ❌ Removing a field
- ❌ Renaming a field
- ❌ Changing a field type
Forward compatible (safe to deploy producers first):
- ✅ Removing an optional field
- ✅ Adding a new field (consumers ignore unknown fields)
- ❌ Adding a required field
- ❌ Changing a field type
Full compatible (safe in any deployment order):
- ✅ Adding optional fields with defaults
- That's about it — full compatibility is restrictive
Schema Registry
Use a schema registry (like Confluent Schema Registry for Kafka) to enforce compatibility rules automatically.
// Avro schema evolution — backward compatible
// Version 1
{
"type": "record",
"name": "OrderPlacedEvent",
"fields": [
{"name": "orderId", "type": "string"},
{"name": "amount", "type": "double"}
]
}
// Version 2 — backward compatible (new field with default)
{
"type": "record",
"name": "OrderPlacedEvent",
"fields": [
{"name": "orderId", "type": "string"},
{"name": "amount", "type": "double"},
{"name": "currency", "type": "string", "default": "USD"}
]
}
Practical Schema Evolution Tips
- Always add, never remove — add new fields with defaults; deprecate old fields but keep them
- Never rename — add a new field and populate both old and new during a transition period
- Use schema registry — automated compatibility checking catches breaking changes in CI
- Version your events — include a schemaVersion field so consumers can handle multiple versions
// Handle multiple event versions
public void handleOrderPlaced(JsonNode event) {
int version = event.has("schemaVersion")
? event.get("schemaVersion").asInt()
: 1; // Default to v1 for old events
switch (version) {
case 1:
processV1(event); // Uses "amount"
break;
case 2:
processV2(event); // Uses "totalAmount" + "currency"
break;
default:
log.warn("Unknown schema version: {}", version);
processLatest(event); // Best-effort
}
}
Dead Letter Queues and Error Handling
What happens when a consumer can't process a message? Maybe the data is malformed, a dependency is down, or there's a bug in the processing logic. You can't just drop the message — and you can't retry forever.
Dead Letter Queue (DLQ)
A dead letter queue is a separate queue where messages go after they've failed processing a certain number of times.
// Spring Kafka — Dead Letter Queue configuration
@Configuration
public class KafkaConfig {
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String>
kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, String> factory =
new ConcurrentKafkaListenerContainerFactory<>();
// Retry 3 times with backoff
factory.setCommonErrorHandler(new DefaultErrorHandler(
new DeadLetterPublishingRecoverer(kafkaTemplate),
new FixedBackOff(1000L, 3) // 1s delay, 3 retries
));
return factory;
}
}
// Consumer — failures are automatically retried then sent to DLQ
@KafkaListener(topics = "order-events")
public void handleOrderEvent(OrderEvent event) {
// If this throws 3 times, the message goes to "order-events.DLT"
processOrder(event);
}
// DLQ consumer — for investigation and manual retry
@KafkaListener(topics = "order-events.DLT")
public void handleDeadLetter(ConsumerRecord<String, String> record) {
log.error("Dead letter received. Key: {}, Value: {}, Error: {}",
record.key(), record.value(),
new String(record.headers().lastHeader("kafka_dlt-exception-message").value())
);
alertService.notifyOncall("Dead letter in order-events", record);
}
Error Categories and Strategies
Not all errors are the same. Your error handling strategy should distinguish between them:
| Error Type | Example | Strategy |
|---|---|---|
| Transient | Database timeout, network blip | Retry with exponential backoff |
| Poison message | Malformed JSON, invalid data | Send to DLQ immediately |
| Business error | Insufficient funds, product unavailable | Publish failure event, trigger compensation |
| Bug | NullPointerException, logic error | Send to DLQ, fix code, reprocess |
@KafkaListener(topics = "order-events")
public void handleOrderEvent(OrderEvent event) {
try {
processOrder(event);
} catch (TransientException e) {
// Rethrow — let the retry mechanism handle it
throw e;
} catch (ValidationException e) {
// Poison message — log and skip (don't retry)
log.error("Invalid event, skipping: {}", event, e);
deadLetterService.send(event, e);
} catch (InsufficientFundsException e) {
// Business error — publish compensation event
eventPublisher.publish(new PaymentFailedEvent(event.getOrderId(), e.getMessage()));
}
}
Event Storming: Discovering Events
How do you figure out which events your system needs? Event storming is a collaborative workshop technique created by Alberto Brandolini that helps teams discover domain events, commands, and aggregates.
The Process
- Gather the right people — domain experts, developers, product managers
- Orange sticky notes — write domain events (past tense) on orange stickies and place them on a timeline
- Blue sticky notes — identify the commands (actions) that trigger each event
- Yellow sticky notes — identify the actors (users, systems) that issue commands
- Pale yellow sticky notes — identify the aggregates (entities) that handle commands and produce events
- Pink sticky notes — mark hotspots (confusion, disagreement, complexity)
- Identify bounded contexts — group related events and aggregates into bounded contexts
Event Storming Output → Architecture
The bounded contexts you discover directly map to microservice boundaries. The events become the messages flowing between services. The commands become the API endpoints within each service.
| Event Storming Artifact | Maps To |
|---|---|
| Bounded context | Microservice |
| Domain event | Message/event in broker |
| Command | API endpoint |
| Aggregate | Domain entity |
| Actor | User role or external system |
| Policy ("When X, then Y") | Event handler / consumer |
Practical Implementation: Order Processing System
Let's build a complete event-driven order processing system with Spring Boot and Kafka.
Event Definitions
// Base event class
public abstract class DomainEvent {
private final String eventId;
private final String eventType;
private final Instant timestamp;
private final int schemaVersion;
protected DomainEvent(String eventType) {
this.eventId = UUID.randomUUID().toString();
this.eventType = eventType;
this.timestamp = Instant.now();
this.schemaVersion = 1;
}
}
// Specific events
public class OrderPlacedEvent extends DomainEvent {
private final String orderId;
private final String userId;
private final List<OrderItem> items;
private final BigDecimal totalAmount;
public OrderPlacedEvent(String orderId, String userId,
List<OrderItem> items, BigDecimal totalAmount) {
super("OrderPlaced");
this.orderId = orderId;
this.userId = userId;
this.items = items;
this.totalAmount = totalAmount;
}
}
public class PaymentProcessedEvent extends DomainEvent {
private final String orderId;
private final String paymentId;
private final BigDecimal amount;
private final PaymentStatus status; // SUCCESS, FAILED
// constructor...
}
public class InventoryReservedEvent extends DomainEvent {
private final String orderId;
private final List<ReservedItem> reservedItems;
// constructor...
}
Producer: Order Service
@Service
public class OrderService {
private final OrderRepository orderRepository;
private final KafkaTemplate<String, DomainEvent> kafkaTemplate;
public Order placeOrder(PlaceOrderRequest request) {
// 1. Validate and create order locally
Order order = Order.create(
request.getUserId(),
request.getItems()
);
order = orderRepository.save(order);
// 2. Publish event — consumers react independently
kafkaTemplate.send(
"order-events",
order.getId(), // Partition key = orderId
new OrderPlacedEvent(
order.getId(),
order.getUserId(),
order.getItems(),
order.getTotalAmount()
)
);
log.info("Order placed and event published: {}", order.getId());
return order;
}
@KafkaListener(topics = "payment-events", groupId = "order-service")
public void onPaymentResult(PaymentProcessedEvent event) {
Order order = orderRepository.findById(event.getOrderId())
.orElseThrow();
if (event.getStatus() == PaymentStatus.SUCCESS) {
order.confirm(event.getPaymentId());
log.info("Order confirmed: {}", order.getId());
} else {
order.cancel("Payment failed"); // PaymentProcessedEvent carries no reason field
log.warn("Order cancelled due to payment failure: {}", order.getId());
}
orderRepository.save(order);
}
}
Consumer: Payment Service
@Service
public class PaymentEventHandler {
private final PaymentService paymentService;
private final ProcessedEventRepository processedEvents;
private final KafkaTemplate<String, DomainEvent> kafkaTemplate;
@KafkaListener(topics = "order-events", groupId = "payment-service")
@Transactional
public void onOrderPlaced(OrderPlacedEvent event) {
// Idempotency check
if (processedEvents.existsById(event.getEventId())) {
log.info("Event already processed: {}", event.getEventId());
return;
}
try {
// Process payment
Payment payment = paymentService.charge(
event.getUserId(),
event.getTotalAmount()
);
// Publish success event
kafkaTemplate.send("payment-events", event.getOrderId(),
new PaymentProcessedEvent(
event.getOrderId(),
payment.getId(),
event.getTotalAmount(),
PaymentStatus.SUCCESS
)
);
} catch (PaymentException e) {
// Publish failure event
kafkaTemplate.send("payment-events", event.getOrderId(),
new PaymentProcessedEvent(
event.getOrderId(),
null,
event.getTotalAmount(),
PaymentStatus.FAILED
)
);
}
// Mark event as processed
processedEvents.save(new ProcessedEvent(event.getEventId()));
}
}
Consumer: Notification Service
@Component
public class NotificationEventHandler {
private final EmailService emailService;
private final UserClient userClient;
@KafkaListener(
topics = {"order-events", "payment-events"},
groupId = "notification-service"
)
public void handleEvent(DomainEvent event) {
switch (event.getEventType()) {
case "OrderPlaced" -> handleOrderPlaced((OrderPlacedEvent) event);
case "PaymentProcessed" -> handlePayment((PaymentProcessedEvent) event);
}
}
private void handleOrderPlaced(OrderPlacedEvent event) {
User user = userClient.getUser(event.getUserId());
emailService.send(
user.getEmail(),
"Order Received",
"Your order " + event.getOrderId() + " has been received!"
);
}
private void handlePayment(PaymentProcessedEvent event) {
// Only send email on success — failure handled separately
if (event.getStatus() == PaymentStatus.SUCCESS) {
// Fetch user info and send confirmation...
}
}
}
Infrastructure: Kafka Configuration
# application.yml — Spring Boot Kafka config
spring:
kafka:
bootstrap-servers: localhost:9092
producer:
key-serializer: org.apache.kafka.common.serialization.StringSerializer
value-serializer: org.springframework.kafka.support.serializer.JsonSerializer
acks: all # Wait for all replicas to acknowledge
retries: 3
properties:
enable.idempotence: true # Kafka producer idempotency
consumer:
key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
value-deserializer: org.springframework.kafka.support.serializer.JsonDeserializer
auto-offset-reset: earliest
enable-auto-commit: false # Manual commit for reliability
properties:
spring.json.trusted.packages: "com.example.events"
Common Anti-Patterns
1. Event Soup
Publishing events for everything — including internal implementation details — creates noise that makes it impossible to understand the system.
// ❌ Event soup — too granular
UserTableRowInserted
OrderValidationStarted
InventoryDatabaseQueryExecuted
CacheInvalidated
LogFileRotated
// ✅ Meaningful domain events
UserRegistered
OrderPlaced
PaymentProcessed
InventoryDepleted
Rule: Only publish events that represent meaningful business occurrences that other services would actually care about.
2. Async Everything
Not everything should be asynchronous. If the user is waiting for a response and the operation takes 50ms, making it async adds complexity without benefit.
// ❌ Over-engineering: async for a simple lookup
User clicks "View Profile" → publishes ViewProfileRequested event
→ Profile Service consumes event → publishes ProfileLoaded event
→ API Gateway consumes event → returns to user
// Result: 500ms latency instead of 50ms, plus eventual consistency headaches
// ✅ Pragmatic: sync for reads, async for writes
GET /users/123 → direct synchronous call → 50ms response
POST /orders → publish OrderPlaced event → "Processing..." response

3. Temporal Coupling Through Events
Using events to create synchronous chains disguised as async:
// ❌ Sync chain disguised as events
OrderService publishes "PleaseValidateUser" event
→ waits for "UserValidated" event (blocking!)
→ then publishes "PleaseCheckInventory" event
→ waits for "InventoryChecked" event (blocking!)

If you're waiting for a response, just make a synchronous call. Events are for fire-and-forget and independent reactions.
4. No Event Versioning
Changing event schemas without versioning breaks consumers silently:
// Week 1: {orderId: "123", amount: 100}
// Week 2: {orderId: "123", totalAmount: 100} ← Breaking change!
// Old consumers crash: "amount" field is null

Always version your schemas, and use a schema registry in production.
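Until a proper schema registry is in place, consumers can at least be written as tolerant readers: accept both the old and new field names and fall back to a safe default, so a rename rolls out without crashing anyone. A minimal sketch (hypothetical helper, payload modeled as a plain map):

```java
import java.util.Map;

// Tolerant reader: prefers the v2 field name ("totalAmount") but still
// accepts the v1 name ("amount"), defaulting to 0 if neither is present.
public class TolerantReader {
    static long readAmount(Map<String, Object> event) {
        Object value = event.getOrDefault("totalAmount", event.get("amount"));
        return value == null ? 0L : ((Number) value).longValue();
    }

    public static void main(String[] args) {
        // v1 payload still works...
        System.out.println(readAmount(Map.of("orderId", "123", "amount", 100)));
        // ...and so does the renamed v2 payload
        System.out.println(readAmount(Map.of("orderId", "123", "totalAmount", 100)));
    }
}
```

This is a stopgap, not a substitute for versioning: it buys you a migration window, but the old field should eventually be retired on a documented schedule.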
Summary
Event-driven architecture decouples services by having them communicate through events rather than direct calls. This creates systems that are more resilient, scalable, and extensible — at the cost of increased complexity around consistency and debugging.
Core concepts:
- Events announce what happened (past tense, immutable)
- Commands request actions (imperative, directed)
- Queries request information (read-only)
Event patterns:
- Event notification — thin events, consumers call back for details
- Event-carried state transfer — fat events, consumers are autonomous
Message brokers:
- Kafka — high-throughput event streaming with replay
- RabbitMQ — flexible routing with message-level acknowledgment
- Redis Streams — lightweight, good for simple patterns
Coordination patterns:
- Choreography — decentralized, services react independently (2-3 steps)
- Orchestration — centralized saga coordinator manages the flow (4+ steps)
Reliability patterns:
- Idempotency — deduplicate with idempotency keys so at-least-once delivery yields exactly-once effects
- Dead letter queues — capture failed messages for investigation
- Schema evolution — add fields with defaults, never remove or rename
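The idempotency pattern above can be reduced to a few lines: record each event's ID before applying its side effect, and skip events you have already seen. A minimal in-memory sketch (hypothetical names; in production the processed-ID store would be a database table updated in the same transaction as the side effect):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Idempotent consumer: a processed-ID set acts as the idempotency store,
// so a redelivered event produces its side effect exactly once.
public class IdempotentConsumer {
    private final Set<String> processedIds = new HashSet<>();
    final List<String> emailsSent = new ArrayList<>();

    void onOrderPlaced(String eventId, String orderId) {
        if (!processedIds.add(eventId)) {
            return; // duplicate delivery: side effect already applied, skip
        }
        emailsSent.add("confirmation for " + orderId);
    }

    public static void main(String[] args) {
        IdempotentConsumer consumer = new IdempotentConsumer();
        consumer.onOrderPlaced("evt-1", "order-42");
        consumer.onOrderPlaced("evt-1", "order-42"); // broker redelivers the same event
        System.out.println(consumer.emailsSent.size()); // one email, not two
    }
}
```

Note that this requires a stable event ID stamped by the producer; deduplicating on payload contents is fragile, because two legitimately distinct events can carry identical data.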
When to use event-driven architecture:
- Multiple services need to react to the same business event
- You need temporal decoupling (producer/consumer don't need simultaneous availability)
- You're building a system that needs to scale reads and writes independently
- You need an audit trail of everything that happened
When NOT to use event-driven architecture:
- Simple CRUD operations with synchronous user expectations
- Small systems where the complexity isn't justified
- When you don't have infrastructure for message brokers and monitoring
Events represent the language of your business. Get the events right, and the architecture follows.
What's Next in the Software Architecture Series
This is post 6 of 12 in the Software Architecture Patterns series:
- ✅ ARCH-1: Software Architecture Patterns Roadmap
- ✅ ARCH-2: Monolithic Architecture
- ✅ ARCH-3: Layered (N-Tier) Architecture
- ✅ ARCH-4: MVC, MVP & MVVM Patterns
- ✅ ARCH-5: Microservices Architecture
- ✅ ARCH-6: Event-Driven Architecture (this post)
- 🔜 ARCH-7: CQRS & Event Sourcing
- 🔜 ARCH-8: Hexagonal Architecture (Ports & Adapters)
- 🔜 ARCH-9: Clean Architecture
- 🔜 ARCH-10: Domain-Driven Design (DDD)
- 🔜 ARCH-11: Serverless & Function-as-a-Service
- 🔜 ARCH-12: Choosing the Right Architecture