A benchmarking harness for comparing PostgreSQL-backed job queue systems under realistic, long-horizon workloads.
The goal is a fair, reproducible, public-API-only comparison of how different queue libraries behave when you push them past warm-up — focusing on the things that show up in production: latency tail, throughput stability, table bloat, and recovery from chaos.
Eight Postgres-backed queues, same hardware, same harness. Three contracts in the lineup — event bus, job queue, visibility-timeout queue — so the throughput list isn't a single ranking. The 2026-05-09 sweep has the per-cell numbers, chaos behaviour, and bloat resistance.
Headline comparisons from that run:
- Peak clean throughput: pgque 39.9 k jobs/s in single-consumer event-bus mode; awa 14.2 k as the fastest full job queue; pgmq 11.3 k as a visibility-timeout queue before anti-scaling at higher worker counts.
- Chaos recovery: awa, pgque, and river recover from every scenario. The other five adapters either hit zero or fail to produce recovery samples in at least one chaos cell.
- Bloat / pressure: five adapters time out under at least one sustained-pressure cell; only awa, oban, and pgque complete all four pressure scenarios.
| System | Contract | Chaos recovery | Pressure cells | Notable caveat |
|---|---|---|---|---|
| awa | job queue | 5/5 | 4/4 | Full job-queue feature surface; fastest job queue in this run. |
| pgque | event/message bus | 5/5 | 4/4 | Single-consumer mode; batched success ack is a different contract. |
| river | job queue | 5/5 | 2/4 | Times out in two sustained-pressure cells. |
| oban | job queue | 4/5 | 4/4 | Handles pressure cells but has lower throughput in this run. |
| pg-boss | job queue | 3/5 | 2/4 | Postgres-level chaos exits the worker; times out in two pressure cells. |
| absurd | job queue | 3/5 | 2/4 | Shutdown timeout under pressure. |
| procrastinate | job queue | 3/5 | 2/4 | Weak repeated-kill recovery; times out in two pressure cells. |
| pgmq | visibility-timeout queue | 3/5 | 2/4 | Anti-scales past 16 workers and has the active-readers cliff. |
Throughput is one shape of the question. The other shape is what each system actually gives you. This table captures the documented feature surface — things you'd reach for in real applications. Cells reflect what's available out of the box on the default open-source distribution.
| | awa | Absurd | pg-boss | pgmq | pgque | Oban | Procrastinate | River |
|---|---|---|---|---|---|---|---|---|
| Language / runtime | Rust + Python | Python | Node.js | Postgres extension (Rust core) | Postgres extension (PL/pgSQL) | Elixir | Python | Go |
| Postgres extension required | no | no | no | yes[^1] | optional[^2] | no | no | no |
| Producer surface — bulk insert | ✓ | — | ✓ | ✓ | ✓ | ✓ | ✓ | ✓[^3] |
| Storage shape on hot path | append-only + receipt ring | row-mutating | row-mutating | partitioned archive | append-only + ticker | row-mutating | row-mutating | row-mutating |
| Priorities | ✓[^4] | — | ✓ | — | — | ✓ | ✓ | ✓ |
| Retries with backoff | ✓ | ✓ | ✓ | ✓[^5] | ✓ | ✓ | ✓ | ✓ |
| Cron / scheduled jobs | ✓ | — | ✓ | — | ✓[^6] | ✓ | ✓ | ✓ |
| Dead-letter queue | ✓[^7] | — | ✓[^8] | ✓[^9] | ✓ | ✓[^10] | ✓[^10] | ✓ |
| Unique jobs / dedup | ✓ | — | ✓[^11] | — | — | ✓ | ✓ | ✓ |
| Rate limiting per queue | ✓ | — | ✓[^12] | — | — | ✓[^13] | ✓[^14] | ✓ |
| Callbacks / external waits | ✓ | ✓[^15] | ✓[^16] | — | — | — | — | — |
| Web UI for ops | ✓[^17] | — | —[^18] | — | — | —[^19] | —[^20] | ✓ |
Dashes indicate "not provided as a documented feature out of the box", not "impossible". pgmq / pgque in particular are intentionally minimal — you build the worker, you choose the lifecycle. If you spot something wrong, please open a PR — corrections welcome from the maintainers of any of the systems listed.
Each system maps onto one of three application contracts.
Job queues — send a job, a worker runs it, the queue tracks retries and dead-lettering: awa, pg-boss, river, oban, absurd, procrastinate.
Visibility-timeout queue — pgmq. Send / read with timeout / ack-or-redeliver. No per-job retry counter, no scheduling, no DLQ beyond an archive table.
Event/message bus — pgque (PgQ lineage). Append-only event log, ticker forms batch boundaries, multiple consumer groups each track a cursor over the shared log (upstream calls it Kafka-shaped). pgque also runs as a single-consumer competing-consumers queue, which is how this bench drives it: one consumer per replica, `--worker-count` controls in-flight handler concurrency within that consumer.
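To make the visibility-timeout contract concrete, here is a minimal in-memory sketch of send / read with timeout / ack-or-redeliver. It is a toy model, not pgmq's API: the class, method names, and the explicit `now` argument (used instead of a real clock so the behaviour is deterministic) are all assumptions made for illustration.

```python
import itertools


class VisibilityTimeoutQueue:
    """Toy model of a visibility-timeout queue (hypothetical, not pgmq's API)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._messages = {}  # msg_id -> (visible_at, payload)

    def send(self, payload, now=0.0):
        """Store a message that is visible immediately."""
        msg_id = next(self._ids)
        self._messages[msg_id] = (now, payload)
        return msg_id

    def read(self, vt, now):
        """Return the oldest visible message and hide it for `vt` seconds."""
        for msg_id in sorted(self._messages):
            visible_at, payload = self._messages[msg_id]
            if visible_at <= now:
                self._messages[msg_id] = (now + vt, payload)
                return msg_id, payload
        return None

    def ack(self, msg_id):
        """Delete the message; without this it reappears after the timeout."""
        return self._messages.pop(msg_id, None) is not None


q = VisibilityTimeoutQueue()
first = q.send("resize-image", now=0.0)
claimed = q.read(vt=30.0, now=1.0)        # a worker claims the message
hidden = q.read(vt=30.0, now=10.0)        # still invisible, nothing to read
redelivered = q.read(vt=30.0, now=31.0)   # never acked, so it is handed out again
acked = q.ack(first)
after_ack = q.read(vt=30.0, now=100.0)    # gone for good once acked
```

Note that the model keeps no attempt counter: redelivery is driven purely by the timeout, which is the distinction drawn above between this contract and a job queue.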
| System | Contract | Peak (jobs/s) | At |
|---|---|---|---|
| pgque (single-consumer mode) | event bus | 39,898 | 1×256 w |
| awa | job queue | 14,158 | 1×256 w |
| pgmq | visibility-timeout | 11,277 | 1×16 w |
| pg-boss | job queue | 2,387 | 1×64 w |
| river | job queue | 501 | 1×64 w |
| absurd | job queue | 410 | 1×128 w |
| oban | job queue | 284 | 1×64 w |
| procrastinate | job queue | 269 | flat |
pgque's number is its single-consumer mode; native fan-out across multiple consumer groups isn't exercised here. pgmq peaks at 1×16 w and anti-scales to 3.2 k at 1×256 w (audit).
In the bench's single-consumer mode, pgque competes with the job queues. Two ways it differs from awa and the other five:
- Feature surface. Default install ships retries with backoff, per-message nack, DLQ. No priorities, no aging, no dedup, no rate limiting, no web UI. Delayed delivery (`send_at`) is in `sql/experimental/`.
- Ack granularity. `receive` returns a batch and `ack(batch_id)` finishes the batch in one row update. Failure handling is still per-message via `nack(batch_id, msg_id, retry_after, reason)`. A consumer that crashes mid-batch without acking redoes the whole batch on the next claim.
Whether that fits depends on the workload. Analytics events that are cheap and idempotent are comfortable with batched ack. Long-running jobs with side effects are better served by the per-job ack the six job queues give you.
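The batch-level ack behaviour can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not pgque's implementation: `BatchLog` and its internals are hypothetical, and only the shape of `receive`, `ack(batch_id)`, and `nack(batch_id, msg_id, retry_after, reason)` follows the description above.

```python
class BatchLog:
    """Toy model of batch-level ack with per-message nack (hypothetical)."""

    def __init__(self, messages, batch_size):
        self._pending = list(messages)  # messages not yet in a finished batch
        self._batch_size = batch_size
        self._open = None               # (batch_id, messages) currently claimed
        self._next_batch_id = 1
        self.retries = []               # (msg_id, retry_after, reason) from nack

    def receive(self):
        """Claim the next batch; an unacked batch is handed out again unchanged."""
        if self._open is None:
            if not self._pending:
                return None
            batch = self._pending[: self._batch_size]
            self._open = (self._next_batch_id, batch)
            self._next_batch_id += 1
        return self._open

    def nack(self, batch_id, msg_id, retry_after, reason):
        """Record one failed message for retry without failing its batch."""
        assert self._open is not None and self._open[0] == batch_id
        self.retries.append((msg_id, retry_after, reason))

    def ack(self, batch_id):
        """Finish the whole batch in one step."""
        assert self._open is not None and self._open[0] == batch_id
        del self._pending[: len(self._open[1])]
        self._open = None


log = BatchLog(["e1", "e2", "e3", "e4"], batch_size=3)

# First consumer handles e1 and e2, then crashes before acking.
batch_id, msgs = log.receive()
handled_before_crash = msgs[:2]

# The next claim sees the same batch, so e1 and e2 run a second time.
retry_id, retry_msgs = log.receive()
log.nack(retry_id, "e2", retry_after=60, reason="downstream 503")
log.ack(retry_id)
next_batch = log.receive()
```

The second `receive` returning `e1` and `e2` again is the cost being weighed: harmless for idempotent events, potentially expensive for handlers with side effects.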
Earlier reference runs: 2026-05-08 awa vs pgque v2 deep-dive · 2026-05-02 alpha.3 sweep · awa under a 10-minute held writing transaction · awa extended scaling (W=256/512/1024).
Author bias: this repo is owned by the author of awa, one of the systems benchmarked. Numbers are reproducible — re-run on your hardware and check.
Chaos scenarios run inside the same `bench.py` harness, as named compositions of phase types. Steady-state metrics, wait-event histograms, and per-phase aggregates carry over; the harness also emits `jobs_lost` and `chaos_recovery_time_s` into the recovery phase's `summary.json`.
The headline picture across all eight adapters is in the 2026-05-09 sweep — Phase B (40 cells, 5 scenarios × 8 systems). Three systems recover from every chaos scenario; the other five hit zero on at least one. The per-adapter audits in the same run name the root causes.
The available chaos scenarios are documented in docs/method.md. The cross-system chaos tracker is #12.
- awa (Rust + Python): 2026-05-09 sweep on `v0.6.0-alpha.9`.
- Absurd (Python)
- Oban (Elixir)
- pg-boss (Node.js)
- pgmq (Postgres extension; Python adapter; needs an extension-bearing image, run separately from the shared-image matrix)
- PgQue (plain SQL, no extension required; Python adapter; `pg_cron` optional, the harness runs the ticker + maint loops in-process instead)
- Procrastinate (Python)
- River (Go)
- Public APIs only. Each adapter integrates the system the way a real consumer would. No reaching into internal modules, no privileged SQL.
- Subprocess contract. Adapters are language-agnostic processes that emit one JSON sample per line on stdout. Adding a new system means writing one binary that respects the contract — see CONTRIBUTING_ADAPTERS.md.
- One Postgres for everyone. All systems run against the same `postgres:18.3-alpine` instance with the same `postgres.conf`, so no system gets a tuning advantage. (pgmq is the exception; it requires the Postgres extension and runs on a separate `pg18-pgmq` image.) The compose default caps Postgres at 4 CPUs for repeatable laptop and CI runs; set `POSTGRES_CPUS=N` when measuring a larger machine envelope.
- Long-horizon. Bloat and latency drift only show up after the first few minutes. Default scenarios run 30+ minutes.
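As a rough illustration of the subprocess contract, the sketch below writes one JSON object per line to a stream. The field names (`t`, `system`, `completed`) and the helper names are hypothetical; the actual sample schema is defined in CONTRIBUTING_ADAPTERS.md and is not reproduced here.

```python
import io
import json


def emit_sample(stream, **fields):
    """Write one sample as a single JSON line and flush so the reader sees it promptly."""
    stream.write(json.dumps(fields, separators=(",", ":")) + "\n")
    stream.flush()


def run_adapter(stream, ticks, completed_per_tick):
    """Emit one sample per tick; all field names here are illustrative only."""
    for tick in range(ticks):
        emit_sample(
            stream,
            t=tick,                        # hypothetical: sample index
            system="example-queue",        # hypothetical: adapter name
            completed=completed_per_tick,  # hypothetical: jobs finished this tick
        )


# A real adapter would pass sys.stdout; a StringIO keeps the example self-contained.
buffer = io.StringIO()
run_adapter(buffer, ticks=3, completed_per_tick=200)
lines = buffer.getvalue().splitlines()
```

Flushing after every line matters when stdout is a pipe, since a block-buffered writer would otherwise deliver samples to the reader in delayed bursts.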
```bash
# Init the pgque submodule (vendored at a pinned upstream SHA)
git submodule update --init --recursive

# Bring up Postgres (port 15555 by default)
docker compose up -d postgres

# Run a 5-minute smoke against one system
uv run bench run \
  --systems procrastinate \
  --producer-rate 200 \
  --worker-count 4 \
  --replicas 1 \
  --phase warmup=warmup:30s \
  --phase clean=clean:5m
```

Outputs land under `results/<run-id>/<system>/` as `manifest.json` + `summary.json` + per-sample `samples.ndjson`. To compare runs:

```bash
uv run bench compare results/<run-id>
```

Scenarios, phase types, and Postgres-side diagnostics (wait events, notification queue usage, active transactions) are documented in docs/method.md.
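For ad-hoc inspection outside `bench compare`, a newline-delimited JSON file can be read with the standard library alone. The sketch below is an assumption-laden example: the `phase` and `latency_ms` field names are hypothetical stand-ins, and the nearest-rank percentile is one common definition, not necessarily the one the harness uses.

```python
import json
import math


def load_samples(lines):
    """Parse newline-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Stand-in for the lines of a samples.ndjson file; field names are hypothetical.
raw = [
    '{"phase": "warmup", "latency_ms": 40.0}',
    '{"phase": "clean", "latency_ms": 12.0}',
    '{"phase": "clean", "latency_ms": 15.0}',
    '{"phase": "clean", "latency_ms": 90.0}',
    "",
    '{"phase": "clean", "latency_ms": 11.0}',
]
samples = load_samples(raw)

# Exclude warm-up so the tail reflects steady-state behaviour only.
clean = [s["latency_ms"] for s in samples if s["phase"] == "clean"]
p50 = percentile(clean, 50)
p99 = percentile(clean, 99)
```

With a real file, `raw` would be replaced by iterating over the opened `samples.ndjson`, after checking the actual field names against the sample contract.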
```
bench_harness/        # orchestrator, sample contract, comparison/plot
                      # tooling, independent of any specific SUT
tests/                # pytest suite for the harness itself
<system>-bench/       # one directory per system-under-test, each
                      # producing a binary that talks the JSON contract
docker-compose.yml    # shared Postgres + sidecars
postgres.conf         # shared tuning (work_mem, autovacuum, etc.)
bench.py              # main CLI: run | combine | compare
```
See CONTRIBUTING_ADAPTERS.md for the JSON contract and an end-to-end walk-through.
MIT — see LICENSE.
Footnotes
[^1]: pgmq can also be installed as SQL, but the benchmark and the common packaged distribution use the `pgmq` Postgres extension.
[^2]: pgque itself is PL/pgSQL. `pg_cron` is needed for the convenience `pgque.start()` ticker; callers may drive the ticker themselves instead.
[^3]: River's fast bulk path uses the Postgres `COPY` protocol.
[^4]: awa priorities include aging so lower-priority work is eventually promoted.
[^5]: pgmq is a visibility-timeout queue: redelivery is controlled by the visibility timeout rather than a job-framework retry policy with counted attempts and backoff.
[^6]: pgque supports delayed visibility, but not cron-style periodic scheduling.
[^7]: awa DLQ routing is opt-in via `dlq_enabled_by_default` or a per-queue override.
[^8]: pg-boss keeps failed/expired job history rather than exposing a separate DLQ queue abstraction.
[^9]: pgmq archives messages into queue-specific archive tables; that is retention/replay storage rather than a job-framework DLQ policy.
[^10]: Oban and Procrastinate retain exhausted failures in discarded/failed states rather than moving them to a separate queue table.
[^11]: pg-boss deduplication is expressed through singleton keys and singleton windows.
[^12]: pg-boss rate limiting is exposed as throttling.
[^13]: Oban OSS supports local queue limits; global rate limiting is an Oban Pro feature.
[^14]: Procrastinate can limit concurrency with locks/queueing policy, but does not expose a named per-queue rate-limit primitive.
[^15]: Absurd models external waits as durable workflow steps rather than queue-level callbacks.
[^16]: pg-boss exposes job lifecycle events/subscriptions rather than durable external-wait callbacks.
[^17]: awa includes the `awa serve` ops UI.
[^18]: pg-boss has third-party dashboards such as `pgboss-dashboard`, not an official bundled UI.
[^19]: Oban Web is part of Oban Pro.
[^20]: Procrastinate has community/third-party admin surfaces rather than a bundled official UI.

