TestForge Blog

Outbox Pattern Guide — How to Keep Data Consistency in Event-Driven Systems

When a service must both update its database and publish an event, the dual-write problem appears quickly. This post explains why the Outbox Pattern matters, how to design the outbox table, how publisher workers operate, and how to handle retries, duplicates, and production observability.

TestForge Team

Why dual writes are dangerous

Imagine an order service that must do two things:

  1. store the order in the database
  2. publish an OrderCreated event to Kafka

If only one of those succeeds, the system becomes inconsistent.

  • database commit succeeds but event publish fails
  • event publish succeeds but database commit fails

That is the classic dual-write problem.
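
The failure window is easy to see in code. The following is an illustrative sketch, not real service code: `save_order` and `publish_event` are hypothetical stand-ins for a database commit and a Kafka publish, and in-memory lists stand in for both systems.

```python
# Sketch of the dual-write problem: the database commit and the broker
# publish are two independent operations, so one can succeed while the
# other fails. All names here are illustrative.

orders_db = []   # stands in for the orders table
broker = []      # stands in for Kafka

def save_order(order):
    orders_db.append(order)  # step 1: "commit" succeeds

def publish_event(event, *, fail=False):
    if fail:
        raise ConnectionError("broker unreachable")
    broker.append(event)

def create_order(order, *, broker_down=False):
    save_order(order)
    try:
        # step 2: publish can fail independently of step 1
        publish_event({"type": "OrderCreated", "order_id": order["id"]},
                      fail=broker_down)
    except ConnectionError:
        # The order is already committed; the event is simply lost.
        pass

create_order({"id": 1}, broker_down=True)
print(len(orders_db), len(broker))  # 1 order stored, 0 events published
```

The order exists but its event never will, and nothing in the system records that fact.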

Why not use distributed transactions

Distributed transactions are theoretically appealing, but in practice they often add too much coupling and complexity.

Common issues:

  • tighter dependency between systems
  • harder operations and recovery
  • poor fit across heterogeneous databases and brokers

That is why the Outbox Pattern has become a practical standard in event-driven microservices.

The core idea of the Outbox Pattern

Instead of trying to commit business data and broker publish in one distributed transaction, do this:

  • write business data
  • write an outbox event row

inside the same local database transaction.

Then a separate publisher process reads the outbox table and publishes the event asynchronously.

Basic flow

Application
 -> DB Transaction
    -> orders insert
    -> outbox insert
 -> Commit

Publisher Worker
 -> read unpublished outbox rows
 -> publish to broker
 -> mark as published

The key benefit is that events are not silently lost between business write and publish.
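
The transactional step above can be sketched as follows, with SQLite standing in for the service database and a simplified version of the schema shown later in the post (types and column set are trimmed for brevity):

```python
import json
import sqlite3

# Sketch: business write + outbox write in ONE local transaction.
# SQLite stands in for the real database; the schema is simplified.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.execute("""CREATE TABLE outbox_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    aggregate_type TEXT, aggregate_id TEXT, event_type TEXT,
    payload TEXT, status TEXT DEFAULT 'PENDING')""")

def create_order(order_id, total):
    with conn:  # one transaction: both rows commit, or neither does
        conn.execute("INSERT INTO orders (id, total) VALUES (?, ?)",
                     (order_id, total))
        conn.execute(
            "INSERT INTO outbox_events (aggregate_type, aggregate_id, "
            "event_type, payload) VALUES (?, ?, ?, ?)",
            ("Order", str(order_id), "OrderCreated",
             json.dumps({"order_id": order_id, "total": total})))

create_order(1, 99.5)
pending = conn.execute(
    "SELECT COUNT(*) FROM outbox_events WHERE status = 'PENDING'"
).fetchone()[0]
print(pending)  # 1 pending event, committed atomically with the order
```

Because both inserts share one transaction, a crash before commit leaves neither row behind, and a successful commit guarantees the event row exists for the publisher to pick up.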

A practical outbox table

Typical columns:

  • id
  • aggregate_type
  • aggregate_id
  • event_type
  • payload
  • status
  • created_at
  • published_at
  • retry_count

For example:

CREATE TABLE outbox_events (
  id BIGSERIAL PRIMARY KEY,
  aggregate_type VARCHAR(100) NOT NULL,
  aggregate_id VARCHAR(100) NOT NULL,
  event_type VARCHAR(100) NOT NULL,
  payload JSONB NOT NULL,
  status VARCHAR(20) NOT NULL DEFAULT 'PENDING',
  retry_count INT NOT NULL DEFAULT 0,
  created_at TIMESTAMP NOT NULL DEFAULT NOW(),
  published_at TIMESTAMP
);

How should publishing work?

Two broad approaches are common.

Polling publisher

  • periodically fetch PENDING rows
  • publish to the broker
  • mark successful rows as PUBLISHED

Pros:

  • simple and easy to build

Cons:

  • introduces a small publishing delay
  • requires managing poll frequency and batch size
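
A minimal polling pass might look like the sketch below. `publish` is an injected callable standing in for a real broker client, and SQLite stands in for the database; in PostgreSQL you would typically also add an index on `(status, created_at)` and use `FOR UPDATE SKIP LOCKED` so multiple workers can poll safely.

```python
import sqlite3

# Sketch of one polling-publisher pass over a simplified outbox table.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE outbox_events (
    id INTEGER PRIMARY KEY, event_type TEXT, payload TEXT,
    status TEXT DEFAULT 'PENDING', retry_count INTEGER DEFAULT 0)""")
conn.executemany(
    "INSERT INTO outbox_events (event_type, payload) VALUES (?, ?)",
    [("OrderCreated", '{"order_id": 1}'),
     ("OrderCreated", '{"order_id": 2}')])
conn.commit()

def poll_once(publish, batch_size=100):
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox_events "
        "WHERE status = 'PENDING' ORDER BY id LIMIT ?",
        (batch_size,)).fetchall()
    for event_id, event_type, payload in rows:
        try:
            publish(event_type, payload)
            conn.execute("UPDATE outbox_events SET status = 'PUBLISHED' "
                         "WHERE id = ?", (event_id,))
        except Exception:
            # Leave the row PENDING and count the retry for later backoff.
            conn.execute("UPDATE outbox_events "
                         "SET retry_count = retry_count + 1 "
                         "WHERE id = ?", (event_id,))
        conn.commit()

published = []
poll_once(lambda event_type, payload: published.append((event_type, payload)))
print(len(published))  # 2
```

Note that the status update happens after the publish call, which is exactly the crash window discussed below: duplicates are possible and must be tolerated downstream.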

CDC-based publishing

  • use tools like Debezium
  • stream outbox changes from the database log

Pros:

  • low latency
  • highly automated

Cons:

  • higher operational complexity
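
As a rough sketch of the CDC route, a Debezium connector can be pointed at the outbox table and combined with Debezium's outbox `EventRouter` transform. The hostnames and database names below are placeholders, and the option set is deliberately minimal; consult the Debezium documentation for the full configuration.

```json
{
  "name": "orders-outbox-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "db-host",
    "database.dbname": "orders",
    "table.include.list": "public.outbox_events",
    "transforms": "outbox",
    "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter"
  }
}
```
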

Many teams start with polling and consider CDC later at larger scale.

Duplicate delivery must be expected

Outbox helps prevent event loss, but it does not magically give you exactly-once semantics.

A common scenario:

  • publish succeeds
  • worker crashes before updating outbox status

On retry, the event may be published again.

That means consumers must be designed for idempotency:

  • use idempotency keys
  • tolerate duplicate messages
  • track already processed event ids when necessary
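
The simplest form of that idempotency is remembering processed event IDs. The sketch below keeps them in an in-memory set for illustration; in production the processed-ID record would live in the consumer's own database, updated in the same transaction as the business effect.

```python
# Sketch of an idempotent consumer: redelivered events become no-ops.
processed_ids = set()
totals = {"orders": 0}

def handle_order_created(event):
    if event["id"] in processed_ids:
        return False  # duplicate: already applied, skip the side effect
    totals["orders"] += 1          # the business effect
    processed_ids.add(event["id"])  # record the event as processed
    return True

event = {"id": "evt-42", "type": "OrderCreated"}
print(handle_order_created(event))  # True  (first delivery applied)
print(handle_order_created(event))  # False (duplicate ignored)
print(totals["orders"])             # 1
```
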

Operational metrics matter

Outbox is not only an application pattern. It is also an operational pipeline.

Important signals:

  • backlog of PENDING events
  • retry growth
  • per-event-type publish failures
  • consumer-side delay

If you do not observe those, the outbox can fail silently even when the pattern is “implemented.”
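
The first two signals can be read straight off the outbox table. A minimal sketch, with SQLite standing in for the database and illustrative names throughout; in production these numbers would be exported to a metrics system on a schedule.

```python
import sqlite3

# Sketch: backlog and retry signals derived from the outbox table itself.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE outbox_events (
    id INTEGER PRIMARY KEY, status TEXT, retry_count INTEGER DEFAULT 0)""")
conn.executemany(
    "INSERT INTO outbox_events (status, retry_count) VALUES (?, ?)",
    [("PUBLISHED", 0), ("PENDING", 0), ("PENDING", 3)])
conn.commit()

def outbox_health():
    backlog, max_retries = conn.execute(
        "SELECT COUNT(*), COALESCE(MAX(retry_count), 0) "
        "FROM outbox_events WHERE status = 'PENDING'").fetchone()
    return {"pending_backlog": backlog, "max_retry_count": max_retries}

print(outbox_health())  # {'pending_backlog': 2, 'max_retry_count': 3}
```

A steadily growing `pending_backlog`, or a `max_retry_count` that keeps climbing, is the earliest warning that the publisher is stuck while the application happily keeps writing events.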

Common mistakes

  • creating the outbox but not monitoring it
  • no retry policy
  • no schema versioning in payload
  • assuming consumers will never see duplicates
  • no replay plan after publisher failure

When Outbox is especially effective

  • order, payment, shipping, and lifecycle events
  • systems where the database remains the source of truth
  • domains where event loss is unacceptable

If ultra-low latency is critical and your team can handle the complexity, CDC-based designs may be worth the extra effort.

Closing thoughts

The Outbox Pattern is one of the most practical ways to manage data consistency in event-driven systems without relying on fragile distributed transactions.

Its real value comes not only from recording the event, but from building the full operating model around it:

  • retries
  • duplicate handling
  • backlog monitoring
  • replay strategy

That is what turns Outbox from a diagram pattern into a production-safe architecture.