Message Queue Selection and Patterns#

Every microservice architecture eventually needs asynchronous communication. Synchronous HTTP calls between services create tight coupling, cascading failures, and latency chains. Message queues decouple producers from consumers, absorb traffic spikes, and enable event-driven workflows. The hard part is picking the right one.

Core Concepts That Apply Everywhere#

Before comparing specific systems, understand the delivery guarantees they can offer:

  • At-most-once: The message might be lost, but it is never delivered twice. Fast, no overhead, acceptable for metrics or logs where occasional loss is tolerable.
  • At-least-once: The message is guaranteed to arrive, but might arrive more than once. The consumer must handle duplicates (idempotency). This is the most common choice.
  • Exactly-once: The message arrives exactly once. This is extremely hard to achieve in distributed systems. Kafka offers it within its ecosystem via transactional producers and consumers, but end-to-end exactly-once across system boundaries requires idempotent consumers anyway.

Ordering matters too. Some systems guarantee order within a partition or queue. Others provide no ordering at all. If your consumers process messages out of order, you need to handle that in application logic or choose a system that preserves order.

RabbitMQ#

RabbitMQ is a traditional message broker implementing AMQP 0.9.1. Messages flow from producers to exchanges, which route them to queues based on bindings. Consumers pull from queues.

Core Model#

Producer -> Exchange -> Binding -> Queue -> Consumer

Exchange types determine routing behavior:

  • Direct: Routes to queues whose binding key exactly matches the message routing key. Use for point-to-point delivery.
  • Fanout: Routes to all bound queues regardless of routing key. Use for broadcast/pub-sub.
  • Topic: Routes based on pattern matching with * (one word) and # (zero or more words). Routing key order.created.us matches bindings order.created.* and order.#.
  • Headers: Routes based on message header attributes instead of routing key. Rarely used.
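The `*`/`#` matching rules are easy to get wrong, so here is a small Python sketch of the topic-matching semantics themselves (not RabbitMQ client code, just the rules as described above):

```python
def topic_matches(binding: str, routing_key: str) -> bool:
    """AMQP topic matching: '*' matches exactly one word,
    '#' matches zero or more words."""
    def match(pattern, words):
        if not pattern:
            return not words
        head, rest = pattern[0], pattern[1:]
        if head == "#":
            # '#' can absorb zero or more words
            return any(match(rest, words[i:]) for i in range(len(words) + 1))
        if words and (head == "*" or head == words[0]):
            return match(rest, words[1:])
        return False
    return match(binding.split("."), routing_key.split("."))

print(topic_matches("order.created.*", "order.created.us"))  # True
print(topic_matches("order.#", "order.created.us"))          # True
print(topic_matches("order.*", "order.created.us"))          # False: '*' is one word
```

Note that `order.#` also matches the bare key `order`, because `#` can match zero words; this is the main behavioral difference from `*`.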

Dead Letter Exchanges#

When a message is rejected (nacked without requeue), its TTL expires, or the queue exceeds its max length, it gets routed to a dead letter exchange (DLX). This is essential for handling poison messages that crash consumers.

# Declare the dead letter exchange and queue
rabbitmqadmin declare exchange name=dlx type=direct
rabbitmqadmin declare queue name=orders-dlq

# Bind the DLQ to the DLX
rabbitmqadmin declare binding source=dlx destination=orders-dlq routing_key=orders

# Configure the main queue to use the DLX
rabbitmqadmin declare queue name=orders \
  arguments='{"x-dead-letter-exchange":"dlx","x-dead-letter-routing-key":"orders","x-message-ttl":300000}'

When to Use RabbitMQ#

RabbitMQ fits when you need flexible routing logic, per-message acknowledgment, priority queues, or traditional work queue patterns where each message is processed by exactly one consumer. It is mature, well-documented, and has client libraries in every language. It does not scale horizontally as well as Kafka – adding nodes requires clustering and you must manage queue mirroring or quorum queues for high availability.

Apache Kafka#

Kafka is a distributed log, not a traditional message queue. Producers append messages to the end of a topic’s partition log. Consumers read from a specific offset in the log. Messages are retained based on time or size, not consumed-and-deleted.

Core Model#

Producer -> Topic -> Partition 0 [msg0, msg1, msg2, ...]
                  -> Partition 1 [msg0, msg1, msg2, ...]
                  -> Partition 2 [msg0, msg1, msg2, ...]
                                         ^
                                   Consumer Group reads
                                   from assigned partitions

Topics are divided into partitions. Each partition is an ordered, immutable sequence of messages. Ordering is guaranteed within a partition only. A consumer group distributes partitions among its members – each partition is consumed by exactly one consumer in the group.

Partition Key Strategy#

The partition key determines which partition a message lands in. All messages with the same key go to the same partition, preserving order for that key.

# Order events partitioned by customer_id (kafka-python producer)
from kafka import KafkaProducer
producer = KafkaProducer(bootstrap_servers="kafka:9092")
# All events for customer "C-1234" go to the same partition,
# guaranteeing order per customer across created/updated/cancelled events
producer.send("orders", key=b"C-1234", value=b'{"status": "created"}')

If you do not set a key, Kafka round-robins across partitions. You lose ordering guarantees but get even distribution. Choose your key based on what entity needs ordered processing. For order events, partition by order ID or customer ID. For user activity, partition by user ID.
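Kafka's default partitioner hashes the key (murmur2) and maps it onto the partition count. A simplified Python sketch, using a stand-in hash rather than murmur2, demonstrates the property that matters: the same key always lands in the same partition.

```python
import hashlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # Stand-in for Kafka's murmur2 hash; any stable hash
    # yields the same-key -> same-partition property
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Every event keyed by customer "C-1234" maps to one partition,
# so its created/updated/cancelled events stay ordered
p1 = partition_for(b"C-1234", 12)
p2 = partition_for(b"C-1234", 12)
print(p1 == p2)  # True
```

A corollary: changing the partition count changes the key-to-partition mapping, which is why partition counts are usually planned up front rather than grown casually.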

Consumer Group Scaling#

To scale consumption, add consumers to the group – up to the number of partitions. If you have 12 partitions and 6 consumers, each consumer handles 2 partitions. If you add a 13th consumer, it sits idle. Plan your partition count based on your maximum expected consumer parallelism.

# Create a topic with 12 partitions and replication factor 3
kafka-topics.sh --create --topic orders \
  --partitions 12 --replication-factor 3 \
  --bootstrap-server kafka:9092

# Check consumer group lag
kafka-consumer-groups.sh --describe --group order-processor \
  --bootstrap-server kafka:9092
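The partition-to-consumer arithmetic above can be sketched as a simple round-robin assignment (an illustration of the distribution behavior, not Kafka's actual assignor implementations):

```python
def assign_round_robin(partitions: int, consumers: list) -> dict:
    """Distribute partition numbers across group members round-robin."""
    assignment = {c: [] for c in consumers}
    for p in range(partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment

# 12 partitions, 6 consumers: each member handles 2 partitions
six = assign_round_robin(12, [f"c{i}" for i in range(6)])

# 13 consumers for 12 partitions: one member gets nothing and sits idle
thirteen = assign_round_robin(12, [f"c{i}" for i in range(13)])
idle = [c for c, parts in thirteen.items() if not parts]
print(idle)  # ['c12']
```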

When to Use Kafka#

Kafka excels at high-throughput event streaming, event sourcing, log aggregation, and scenarios where consumers need to replay historical messages. It handles millions of messages per second across a cluster. The trade-off is operational complexity – you manage brokers, ZooKeeper (or KRaft), partition rebalancing, and schema evolution. Kafka is overkill for simple task queues with low throughput.

NATS#

NATS is a lightweight, high-performance messaging system. Core NATS is fire-and-forget pub/sub with no persistence. JetStream adds persistence, at-least-once/exactly-once delivery, and stream-based consumption.

Core NATS#

Messages are published to subjects – dot-separated hierarchical strings like orders.created or payments.us.processed. Subscribers use wildcards: * matches one token (orders.* matches orders.created but not orders.us.created) and > matches one or more tokens (orders.> matches both).
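NATS wildcard semantics differ subtly from AMQP topics (`>` must be the final token and matches one *or more* tokens, never zero). A small Python sketch of the matching rules, not NATS client code:

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """NATS subject matching: '*' matches exactly one token,
    '>' (final token only) matches one or more tokens."""
    pat, toks = pattern.split("."), subject.split(".")
    for i, p in enumerate(pat):
        if p == ">":
            # '>' requires at least one remaining token
            return len(toks) > i
        if i >= len(toks) or (p != "*" and p != toks[i]):
            return False
    return len(pat) == len(toks)

print(subject_matches("orders.*", "orders.created"))     # True
print(subject_matches("orders.*", "orders.us.created"))  # False
print(subject_matches("orders.>", "orders.us.created"))  # True
print(subject_matches("orders.>", "orders"))             # False: '>' needs a token
```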

Core NATS has no persistence. If no subscriber is listening when a message is published, the message is lost. This is acceptable for real-time notifications, request-reply patterns, and status updates where stale data has no value.

JetStream#

JetStream adds persistence by capturing messages into streams. Consumers read from streams with acknowledged delivery.

# Create a stream capturing all order subjects
nats stream add ORDERS --subjects "orders.>" \
  --retention limits --max-msgs=-1 --max-bytes=1GB \
  --max-age 72h --storage file --replicas 3

# Create a durable consumer
nats consumer add ORDERS order-processor \
  --deliver all --ack explicit --max-deliver 5 \
  --filter "orders.created"

JetStream supports pull-based and push-based consumers, exactly-once semantics via message deduplication, and key-value and object stores built on top of streams.
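JetStream's deduplication works by having producers attach a message ID header; a second publish with the same ID inside the stream's duplicate window is dropped. A toy Python model of that window behavior (a hypothetical sketch, not the NATS client API):

```python
import time

class DedupWindow:
    """JetStream-style dedup sketch: a message ID seen within
    the window is rejected as a duplicate."""
    def __init__(self, window_secs=120.0):
        self.window = window_secs
        self.seen = {}  # msg_id -> timestamp

    def accept(self, msg_id, now=None):
        now = time.monotonic() if now is None else now
        # Evict IDs older than the duplicate window
        self.seen = {m: t for m, t in self.seen.items()
                     if now - t < self.window}
        if msg_id in self.seen:
            return False  # duplicate within window
        self.seen[msg_id] = now
        return True

w = DedupWindow(window_secs=120)
print(w.accept("order-42", now=0.0))    # True: first delivery
print(w.accept("order-42", now=10.0))   # False: duplicate
print(w.accept("order-42", now=200.0))  # True: window expired
```

The window is the catch: dedup only protects against retries inside it, so consumers should still be idempotent for anything redelivered later.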

When to Use NATS#

NATS fits when you need extremely low latency, a simple operational footprint, and a single binary with no external dependencies. Core NATS is ideal for ephemeral messaging. JetStream covers persistent messaging needs without Kafka’s complexity. NATS works particularly well for edge computing and IoT where resource constraints matter. It is less proven than Kafka for sustained, very high-throughput workloads at massive scale.

Redis Streams#

Redis Streams add a log-like data structure to Redis. They support consumer groups, acknowledgment, and pending message tracking.

# Add messages to a stream
XADD orders * customer_id C-1234 total 99.50 status created
XADD orders * customer_id C-5678 total 45.00 status created

# Create a consumer group starting from the beginning
XGROUP CREATE orders order-processors 0

# Read as a consumer in the group
XREADGROUP GROUP order-processors consumer-1 COUNT 10 BLOCK 5000 STREAMS orders >

# Acknowledge processed messages
XACK orders order-processors 1234567890-0

Consumer Groups#

Redis Streams consumer groups work similarly to Kafka’s – each message is delivered to one consumer in the group. Unacknowledged messages are tracked in a pending entries list (PEL). You can claim stuck messages from failed consumers:

# Check pending messages
XPENDING orders order-processors

# Claim messages idle for more than 60 seconds
XAUTOCLAIM orders order-processors consumer-2 60000 0
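The PEL mechanics above can be modeled in a few lines of Python (a toy illustration of the deliver/ack/claim lifecycle, not redis-py):

```python
class PendingEntries:
    """Toy model of a Redis Streams PEL: delivered-but-unacked
    messages tracked per consumer, claimable once idle too long."""
    def __init__(self):
        self.pel = {}  # msg_id -> (consumer, delivered_at)

    def deliver(self, msg_id, consumer, now):
        self.pel[msg_id] = (consumer, now)

    def ack(self, msg_id):
        self.pel.pop(msg_id, None)

    def autoclaim(self, new_consumer, min_idle, now):
        claimed = []
        for msg_id, (owner, t) in list(self.pel.items()):
            if now - t >= min_idle:
                # Reassign the stuck message and reset its idle clock
                self.pel[msg_id] = (new_consumer, now)
                claimed.append(msg_id)
        return claimed

pel = PendingEntries()
pel.deliver("1234567890-0", "consumer-1", now=0.0)
# consumer-1 crashes without acking; 65s later consumer-2 claims it
print(pel.autoclaim("consumer-2", min_idle=60.0, now=65.0))  # ['1234567890-0']
```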

When to Use Redis Streams#

Redis Streams fit when you already run Redis and need lightweight messaging without deploying a separate system. They work well for moderate throughput scenarios, real-time event processing, and as an activity stream or notification system. They do not match Kafka’s throughput at scale, and Redis’s single-threaded model limits per-node performance. Data lives in memory (with optional persistence), so storage costs are higher than disk-based systems.

Decision Matrix#

| Factor | RabbitMQ | Kafka | NATS | Redis Streams |
|---|---|---|---|---|
| Throughput | 10K-50K msg/s | 1M+ msg/s | 100K+ msg/s | 50K-100K msg/s |
| Ordering | Per-queue | Per-partition | Per-subject (JetStream) | Per-stream |
| Replay | No (consumed = gone) | Yes (log retention) | Yes (JetStream) | Yes (stream retention) |
| Routing flexibility | Excellent (exchanges) | Partition key only | Subject wildcards | Stream key only |
| Operational complexity | Medium | High | Low | Low (if Redis exists) |
| Delivery guarantee | At-least-once | At-least-once / exactly-once | At-most-once / at-least-once (JetStream) | At-least-once |

Choosing the Right System#

Use RabbitMQ when you need complex routing logic, priority queues, per-message TTL, or classic work queue patterns. It is the best general-purpose broker for request-response and task distribution.

Use Kafka when you need high-throughput event streaming, message replay, event sourcing, or feeding multiple independent consumers from the same data. Accept the operational cost.

Use NATS when you need minimal operational overhead, very low latency, and a lightweight binary that handles both ephemeral and persistent messaging. JetStream closes most feature gaps with Kafka for moderate-scale use cases.

Use Redis Streams when you already run Redis and your messaging needs are moderate. Adding a dedicated message broker for low-volume use cases is unnecessary complexity when Redis Streams cover the requirement.

Patterns That Apply to All Systems#

Regardless of which system you choose, implement these patterns:

Idempotent consumers: Assume messages will be delivered more than once. Use a unique message ID and track processed IDs to skip duplicates.
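A minimal sketch of the idempotency pattern, assuming messages carry a unique `id` field. A real service would persist the seen-ID set (e.g. a database table with a unique constraint), not hold it in memory:

```python
processed = set()  # in a real system: durable store with a unique key

def handle(message: dict) -> None:
    """Skip messages whose ID has already been processed, so
    redelivery under at-least-once semantics is a no-op."""
    msg_id = message["id"]
    if msg_id in processed:
        return  # duplicate delivery: already handled
    # ... apply business logic here ...
    processed.add(msg_id)

handle({"id": "evt-1", "total": 99.50})
handle({"id": "evt-1", "total": 99.50})  # redelivery, safely skipped
print(len(processed))  # 1
```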

Dead letter handling: Route messages that fail processing repeatedly to a separate queue for inspection. Do not let poison messages block the main queue.

Backpressure: When consumers fall behind, have a strategy. Options include dropping messages, buffering to disk, or signaling producers to slow down.
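One of those options, drop-oldest buffering, can be sketched with a bounded deque (an illustration of the trade-off, not production code; blocking producers or spilling to disk are the alternatives):

```python
from collections import deque

class BoundedBuffer:
    """Bounded buffer that sheds the oldest messages when
    consumers fall behind (drop-oldest backpressure)."""
    def __init__(self, capacity: int):
        self.buf = deque(maxlen=capacity)  # deque evicts from the left when full
        self.dropped = 0

    def offer(self, msg) -> None:
        if len(self.buf) == self.buf.maxlen:
            self.dropped += 1  # the oldest message is about to be evicted
        self.buf.append(msg)

b = BoundedBuffer(capacity=3)
for i in range(5):
    b.offer(i)
print(list(b.buf), b.dropped)  # [2, 3, 4] 2
```

Drop-oldest is only acceptable when newer data supersedes older (metrics, status updates); for business events, prefer blocking or spilling, and monitor the drop counter either way.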

Schema evolution: Messages are contracts. Use a schema registry (Confluent Schema Registry, or equivalent) or at minimum version your message formats so producers and consumers can evolve independently.