{"id":2368,"date":"2026-06-25T20:12:52","date_gmt":"2026-06-25T19:12:52","guid":{"rendered":"https:\/\/denbeke.be\/blog\/?p=2368"},"modified":"2026-06-25T20:13:58","modified_gmt":"2026-06-25T19:13:58","slug":"can-dcb-event-sourcing-be-fast-and-flexible-a-postgres-benchmark","status":"publish","type":"post","link":"https:\/\/denbeke.be\/blog\/software\/can-dcb-event-sourcing-be-fast-and-flexible-a-postgres-benchmark\/","title":{"rendered":"Can DCB event sourcing be fast and flexible? A Postgres benchmark"},"content":{"rendered":"\n<p>Classic event sourcing has a rule: one aggregate, one stream, one consistency boundary. It is simple and it scales, but it is rigid. The moment you need an invariant that spans two aggregates, you are writing sagas and workarounds.<\/p>\n\n\n\n<p>Dynamic Consistency Boundaries (DCB) are meant to remove that constraint. The recurring objection is that they are slow. So I built a small DCB framework in Go on top of Postgres and benchmarked the append path. All concurrency is handled in the database, not the application, so the Go side is stateless: you can run many instances in parallel and still get full consistency. And it is plain Postgres, so anyone can reproduce it.<\/p>\n\n\n\n<h2>What is wrong with classic event sourcing?<\/h2>\n\n\n\n<p>The aggregate is your consistency boundary. Strong consistency exists only inside one aggregate&#8217;s stream, enforced with a version number and optimistic locking. You get small transactions, no distributed locks, and clean concurrency control.<\/p>\n\n\n\n<p>The cost is that the boundary is fixed at design time. You decide it while modelling your events: which event belongs to which stream. If two things must change together they have to share an aggregate, and a rule that spans separate aggregates cannot be enforced in one transaction. 
Getting it wrong means re-streaming events later, a migration you do not want to run on a live system.<\/p>\n\n\n\n<h2>What is DCB then?<\/h2>\n\n\n\n<p>DCB makes consistency a condition defined per command, as a query over the events, instead of a fixed boundary. You read the events your decision needs, record the position you read up to, and on append the store checks that no new matching events arrived in between. Same optimistic locking, dynamic boundary.<\/p>\n\n\n\n<p>Events carry tags, and a boundary is a query over those tags. Boundaries therefore do not need to be known up front: a new requirement is a new query over tags that already exist, with no re-streaming. The one design-time commitment that remains is the tags themselves. Forget to tag something you later need, and you have to backfill it.<\/p>\n\n\n\n<p>For the theory and the &#8220;Killing the Aggregate&#8221; talk behind it, see&nbsp;<a href=\"https:\/\/dcb.events\/\">dcb.events<\/a>.<\/p>\n\n\n\n<h2>The setup<\/h2>\n\n\n\n<p>The store is&nbsp;<strong>Scaleway Managed Database for PostgreSQL<\/strong>. I did not self-host Postgres on a VM; these are managed database nodes, so the numbers reflect the standard managed offering out of the box. Client to database round-trip was about 11 ms.<\/p>\n\n\n\n<ul><li>A&nbsp;<strong>10,000,000-event<\/strong>&nbsp;store.<\/li><li><strong>64 concurrent writers<\/strong>&nbsp;by default, plus a sweep over the number of users.<\/li><li>A real DCB append with&nbsp;<code>condition=check<\/code>: the in-lock condition check runs.<\/li><li><strong>Overlap<\/strong>&nbsp;= the fraction of writes aimed at one shared boundary. 0% means every writer hits its own entity, 100% means all hit the same one.<\/li><\/ul>\n\n\n\n<p>Two managed-database node types appear in the charts:<\/p>\n\n\n\n<ul><li><strong>Small node<\/strong>: PLAY2-PICO, 1 vCPU, 2 GB RAM. 
About&nbsp;<strong>\u20ac10\/month<\/strong>.<\/li><li><strong>Big node<\/strong>: general-purpose, 32 vCPU, 128 GB RAM. About&nbsp;<strong>\u20ac1,580\/month<\/strong>&nbsp;(\u20ac2.17\/hour).<\/li><\/ul>\n\n\n\n<p>The small node does about 2,800 appends\/s with staging; the big node reaches about 5,000\/s. Roughly 150x the price for under 2x the throughput. The ceiling is the database write path, not the core count.<\/p>\n\n\n\n<p>One caveat before the numbers: absolute throughput varied about&nbsp;<strong>3x between hosts<\/strong>&nbsp;at identical spec, purely from commit fsync latency. Benchmark your own target and trust ratios over absolute values.<\/p>\n\n\n\n<h2>Four locking strategies I compared<\/h2>\n\n\n\n<ul><li><strong>Global lock.<\/strong>&nbsp;One advisory lock, every append serializes. Consistent, plain&nbsp;<code>uint64<\/code>&nbsp;cursor, simple. Throughput-capped.<\/li><li><strong>Per-boundary locks.<\/strong>&nbsp;A lock per&nbsp;<code>(type, tag, value)<\/code>, so different entities append in parallel. Fast, but positions commit out of order, so it needs opaque&nbsp;<code>xid8<\/code>&nbsp;cursors.<\/li><li><strong>Hybrid.<\/strong>&nbsp;Per-boundary locks for the check, global lock for the insert. Gap-free, but lands at global speed.<\/li><li><strong>Staging + flusher.<\/strong>&nbsp;Appends land concurrently in a&nbsp;<code>staging<\/code>&nbsp;buffer; a single flusher moves them to&nbsp;<code>events<\/code>&nbsp;in order.<\/li><\/ul>\n\n\n\n<p>There is one design rule that makes all of this lockable: an append condition must be scoped to a boundary. It needs at least one event-type-plus-tag combination, for example&nbsp;<code>UserCreated<\/code>&nbsp;tagged&nbsp;<code>user:123<\/code>, which becomes the lock key. A bare type-level condition like &#8220;no&nbsp;<code>UserCreated<\/code>&nbsp;exists anywhere&#8221; has no boundary to lock on, so I disallow it. 
It would force a global lock over the whole type and defeat the per-boundary concurrency the other strategies rely on.<\/p>\n\n\n\n<h2>The results<\/h2>\n\n\n\n<p>The global lock is flat across overlap (about 1,250 to 1,580 appends\/s) and does not scale with cores. It serializes everything. That is the baseline.<\/p>\n\n\n\n<p>Per-boundary runs&nbsp;<strong>2x to 5x faster<\/strong>&nbsp;than global at 0% overlap, and the gap grows with cores and concurrency, up to about&nbsp;<strong>6,900 appends\/s<\/strong>&nbsp;at 512 users on the big node. At 100% overlap it decays back to global speed, because all keys collapse into one.<\/p>\n\n\n\n<p>The hybrid lands at roughly global speed. To stay gap-free it gives up parallel commits, which were the whole reason per-boundary was fast. Skip it.<\/p>\n\n\n\n<p>Four-way comparison on the small node, condition check on:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"695\" src=\"https:\/\/denbeke.be\/blog\/wp-content\/uploads\/2026\/06\/DCB-benchmark-small-postgres-database-server-1024x695.png\" alt=\"\" class=\"wp-image-2369\" srcset=\"https:\/\/denbeke.be\/blog\/wp-content\/uploads\/2026\/06\/DCB-benchmark-small-postgres-database-server-1024x695.png 1024w, https:\/\/denbeke.be\/blog\/wp-content\/uploads\/2026\/06\/DCB-benchmark-small-postgres-database-server-300x204.png 300w, https:\/\/denbeke.be\/blog\/wp-content\/uploads\/2026\/06\/DCB-benchmark-small-postgres-database-server-768x521.png 768w, https:\/\/denbeke.be\/blog\/wp-content\/uploads\/2026\/06\/DCB-benchmark-small-postgres-database-server.png 1462w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>Throughput vs clients on the big node (bars: staging; line: per-boundary):<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"713\" 
src=\"https:\/\/denbeke.be\/blog\/wp-content\/uploads\/2026\/06\/DCB-benchmark-big-database-server-multiple-users-1024x713.png\" alt=\"\" class=\"wp-image-2370\" srcset=\"https:\/\/denbeke.be\/blog\/wp-content\/uploads\/2026\/06\/DCB-benchmark-big-database-server-multiple-users-1024x713.png 1024w, https:\/\/denbeke.be\/blog\/wp-content\/uploads\/2026\/06\/DCB-benchmark-big-database-server-multiple-users-300x209.png 300w, https:\/\/denbeke.be\/blog\/wp-content\/uploads\/2026\/06\/DCB-benchmark-big-database-server-multiple-users-768x535.png 768w, https:\/\/denbeke.be\/blog\/wp-content\/uploads\/2026\/06\/DCB-benchmark-big-database-server-multiple-users.png 1456w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>The ceiling is the database write path, not the lock: WAL plus hot-page index contention. 5K versus 15K IOPS made no difference, and disabling fsync did not raise the per-boundary ceiling.<\/p>\n\n\n\n<h2>The key insight<\/h2>\n\n\n\n<p>The problem with per-boundary locks is&nbsp;<em>when<\/em>&nbsp;the sequence number is assigned. It is allocated at insert time, before the transaction commits, and concurrent transactions can commit in a different order than the one in which they took their numbers. So transaction B can grab sequence 101 and commit while transaction A still holds 100 uncommitted. A consumer tailing&nbsp;<code>events<\/code>&nbsp;by sequence number sees 101 become visible and advances its cursor past 100, so when A finally commits, the consumer has already skipped event 100. That is the gap, and it is why a plain monotonic cursor is no longer safe.<\/p>\n\n\n\n<p>This is also why per-boundary is fast: the speed comes from many appends fsync-ing concurrently (group commit), and that concurrency is exactly what lets commit order diverge from assignment order. 
The hybrid serializes the commit to keep the sequence gap-free, and in doing so throws the speed away.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote\"><p>The per-boundary throughput is the parallel commits, and the parallel commits are the gap.<\/p><\/blockquote>\n\n\n\n<p>So the apparent choice is: consistency and a plain cursor (global, hybrid), or throughput (per-boundary, with opaque&nbsp;<code>xid8<\/code>&nbsp;cursors). Unless you stop serializing the write and serialize only the sequence assignment, which is what staging does.<\/p>\n\n\n\n<h2>Can we have all four properties?<\/h2>\n\n\n\n<p>That is the staging strategy. Appends land concurrently in a small&nbsp;<code>staging<\/code>&nbsp;buffer; a single flusher moves them into&nbsp;<code>events<\/code>&nbsp;in order. One writer into&nbsp;<code>events<\/code>&nbsp;keeps&nbsp;<code>sequence_number<\/code>&nbsp;gap-free and monotonic, so consumers keep a plain&nbsp;<code>uint64<\/code>&nbsp;cursor. The condition check reads&nbsp;<code>staging<\/code>&nbsp;and&nbsp;<code>events<\/code>&nbsp;together, so it stays consistent across multiple boundaries.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th scope=\"col\">property<\/th><th scope=\"col\">global<\/th><th scope=\"col\">boundary<\/th><th scope=\"col\">hybrid<\/th><th scope=\"col\">staging<\/th><\/tr><\/thead><tbody><tr><td>multi-boundary DCB<\/td><td>\u2713<\/td><td>\u2713<\/td><td>\u2713<\/td><td>\u2713<\/td><\/tr><tr><td>gap-free consistency<\/td><td>\u2713<\/td><td><strong>\u2717<\/strong><\/td><td>\u2713<\/td><td>\u2713<\/td><\/tr><tr><td>plain&nbsp;<code>uint64<\/code>&nbsp;cursor<\/td><td>\u2713<\/td><td><strong>\u2717<\/strong><\/td><td>\u2713<\/td><td>\u2713<\/td><\/tr><tr><td>per-boundary concurrency<\/td><td><strong>\u2717<\/strong><\/td><td>\u2713<\/td><td><strong>\u2717<\/strong><\/td><td>\u2713<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>It is also fast: about&nbsp;<strong>2.8x global<\/strong>&nbsp;on the small node, around&nbsp;<strong>5,000 
appends\/s<\/strong>&nbsp;on the big node, and at high client counts it beats per-boundary (5,045 vs 4,165 at 768 clients), because each append hits the small staging buffer instead of the 10M-row&nbsp;<code>events<\/code>&nbsp;indexes. The flusher pays the big-index cost once per batch.<\/p>\n\n\n\n<p>On a&nbsp;<strong>100M-event<\/strong>&nbsp;table the numbers barely moved (staging peaked at 6,238\/s at 256 clients) and the single flusher held the backlog at zero. The condition check short-circuits on the primary key regardless of history depth.<\/p>\n\n\n\n<h2>But it is not free<\/h2>\n\n\n\n<ul><li><strong>Eventual read-visibility, which is normal event sourcing.<\/strong>&nbsp;An append lands in&nbsp;<code>staging<\/code>&nbsp;and appears in&nbsp;<code>events<\/code>&nbsp;on the next flush, tens of milliseconds later. This is the eventual consistency you already have in CQRS: projections and read models always lag the write side. Staging adds a small constant to a lag you already design around. Strong-consistency decisions are unaffected, since the check reads&nbsp;<code>staging<\/code>&nbsp;and&nbsp;<code>events<\/code>&nbsp;together.<\/li><li><strong>A flusher daemon.<\/strong>&nbsp;A single, always-on, leader-elected process. If it stops,&nbsp;<code>staging<\/code>&nbsp;fills.<\/li><li><strong>About 2x write amplification.<\/strong>&nbsp;Every event is written twice. Fine for small and medium events; for large write-bound events the edge returns to per-boundary.<\/li><li><strong>A conditional ceiling on hot boundaries.<\/strong>&nbsp;A conditional append to a boundary with a pending staging row conflicts until it flushes, so roughly one conditional append per flush cycle per hot boundary. A single hammered entity gave a 42% self-conflict rate. Fine for human-paced boundaries, not for a single hot counter.<\/li><\/ul>\n\n\n\n<p>Durability holds:&nbsp;<code>staging<\/code>&nbsp;is logged, so a successful append is on disk before it returns. 
A crash before the flush leaves the event in staging, flushed on restart.<\/p>\n\n\n\n<h2>What about the read side?<\/h2>\n\n\n\n<p>It is out of scope by design. In event sourcing the read side is decoupled from the write side: projections are separate consumers that tail the ordered event log at their own pace and scale independently. Staging changes nothing there, since consumers tail&nbsp;<code>events<\/code>&nbsp;by the same plain cursor. The only read on the write path is the decision read, which this benchmark exercises. The read side is a separate, well-understood concern, not a gap in these numbers.<\/p>\n\n\n\n<h2>So, is DCB fast enough to stay flexible?<\/h2>\n\n\n\n<p>Yes, with limits. For spread, human-paced workloads, which is most business event sourcing, you get flexible runtime constraints, single-digit-thousands of appends per second, and full consistency with a plain integer cursor, on managed Postgres. A single hot boundary still serializes everyone, &#8220;fast enough&#8221; depends on your workload, and staging adds a flusher daemon. But the claim that DCB cannot be fast on commodity infrastructure does not hold up.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Classic event sourcing has a rule: one aggregate, one stream, one consistency boundary. It is simple and it scales, but it is rigid. The moment you need an invariant that spans two aggregates, you are writing sagas and workarounds. Dynamic Consistency Boundaries (DCB) are meant to remove that constraint. The recurring objection is that they [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[235,161],"tags":[276,226,294,291,231,232,292],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v15.6.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Can DCB event sourcing be fast and flexible? 
A Postgres benchmark &ndash; DenBeke<\/title>\n<meta name=\"description\" content=\"Benchmarking Dynamic Consistency Boundaries (DCB) on Postgres: how to keep event sourcing flexible and still hit thousands of appends per second.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/denbeke.be\/blog\/software\/can-dcb-event-sourcing-be-fast-and-flexible-a-postgres-benchmark\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Can DCB event sourcing be fast and flexible? A Postgres benchmark &ndash; DenBeke\" \/>\n<meta property=\"og:description\" content=\"Benchmarking Dynamic Consistency Boundaries (DCB) on Postgres: how to keep event sourcing flexible and still hit thousands of appends per second.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/denbeke.be\/blog\/software\/can-dcb-event-sourcing-be-fast-and-flexible-a-postgres-benchmark\/\" \/>\n<meta property=\"og:site_name\" content=\"DenBeke\" \/>\n<meta property=\"article:published_time\" content=\"2026-06-25T19:12:52+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-06-25T19:13:58+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/denbeke.be\/blog\/wp-content\/uploads\/2026\/06\/DCB-benchmark-small-postgres-database-server-1024x695.png\" \/>\n<meta name=\"twitter:card\" content=\"summary\" \/>\n<meta name=\"twitter:creator\" content=\"@MthsBk\" \/>\n<meta name=\"twitter:site\" content=\"@MthsBk\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\">\n\t<meta name=\"twitter:data1\" content=\"8 minutes\">\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebSite\",\"@id\":\"https:\/\/denbeke.be\/blog\/#website\",\"url\":\"https:\/\/denbeke.be\/blog\/\",\"name\":\"DenBeke\",\"description\":\"Mathias Beke\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":\"https:\/\/denbeke.be\/blog\/?s={search_term_string}\",\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/denbeke.be\/blog\/software\/can-dcb-event-sourcing-be-fast-and-flexible-a-postgres-benchmark\/#primaryimage\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/denbeke.be\/blog\/wp-content\/uploads\/2026\/06\/DCB-benchmark-small-postgres-database-server.png\",\"width\":1462,\"height\":992},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/denbeke.be\/blog\/software\/can-dcb-event-sourcing-be-fast-and-flexible-a-postgres-benchmark\/#webpage\",\"url\":\"https:\/\/denbeke.be\/blog\/software\/can-dcb-event-sourcing-be-fast-and-flexible-a-postgres-benchmark\/\",\"name\":\"Can DCB event sourcing be fast and flexible? 
A Postgres benchmark &ndash; DenBeke\",\"isPartOf\":{\"@id\":\"https:\/\/denbeke.be\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/denbeke.be\/blog\/software\/can-dcb-event-sourcing-be-fast-and-flexible-a-postgres-benchmark\/#primaryimage\"},\"datePublished\":\"2026-06-25T19:12:52+00:00\",\"dateModified\":\"2026-06-25T19:13:58+00:00\",\"author\":{\"@id\":\"https:\/\/denbeke.be\/blog\/#\/schema\/person\/386878f712fe3fe22227216f087772dc\"},\"description\":\"Benchmarking Dynamic Consistency Boundaries (DCB) on Postgres: how to keep event sourcing flexible and still hit thousands of appends per second.\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/denbeke.be\/blog\/software\/can-dcb-event-sourcing-be-fast-and-flexible-a-postgres-benchmark\/\"]}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/denbeke.be\/blog\/#\/schema\/person\/386878f712fe3fe22227216f087772dc\",\"name\":\"Mathias Beke\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/denbeke.be\/blog\/#personlogo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/015ba35e6ce4f5859e3888ca99807575?s=96&d=mm&r=g\",\"caption\":\"Mathias Beke\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","_links":{"self":[{"href":"https:\/\/denbeke.be\/blog\/wp-json\/wp\/v2\/posts\/2368"}],"collection":[{"href":"https:\/\/denbeke.be\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/denbeke.be\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/denbeke.be\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/denbeke.be\/blog\/wp-json\/wp\/v2\/comments?post=2368"}],"version-history":[{"count":1,"href":"https:\/\/denbeke.be\/blog\/wp-json\/wp\/v2\/posts\/2368\/revisions"}],"predecessor-version":[{"id":2371,"href":"https:\/\/denbeke.be\/blog\/wp-json\/wp\/v2\/posts\/2368\/revisions\/2371"}],"wp:attachment":[{"href":"https:\/\/denbeke.be\/blog\/wp-json\/wp\/v2\/media?parent=2368"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/denbeke.be\/blog\/wp-json\/wp\/v2\/categories?post=2368"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/denbeke.be\/blog\/wp-json\/wp\/v2\/tags?post=2368"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}