· clickhouse / postgresql / analytics

ClickHouse vs Postgres for analytics

If dashboard queries on 100M+ rows are hitting 8 seconds, ClickHouse is worth looking at. Here is what the benchmarks show about the actual break-even point.

By Ethan

1,761 words · 9 min read

If your Postgres tables are under 50 million rows, stop here — Postgres with proper indexes will serve you fine. At 100 million rows and above, ClickHouse starts winning by margins you can feel: 8× to 361× faster on analytical queries, 10× less storage on the same dataset. The operational cost is real. This article tells you where the break-even sits and what the migration looks like.

Who this is for

Engineering teams running analytics on a Postgres table that used to be fast and now isn’t. Or teams planning an event pipeline that will hit 100M+ rows within the year and want to avoid a painful migration later.

If your primary concern is transactional integrity, complex joins across a normalized schema, or a team of fewer than five with no dedicated data engineer — stay on Postgres. The verdict has the full matrix.

What ClickHouse is

ClickHouse is a column-store database built for OLAP. The key difference from Postgres: instead of storing each row as a unit, it stores each column as a unit. A SUM(revenue) query never reads user_agent or session_id — it reads only the bytes it needs.

The default storage engine is MergeTree. Data lands in columnar parts on disk and is continuously merged in the background. Deduplication, pre-aggregation, and TTL expiration all happen during background merges rather than at query time. The tradeoff: very fast reads, but writes are append-only in practice. Row-level updates are batch rewrites, not point mutations.

ClickHouse also applies LZ4 compression by default, with optional domain-specific codecs (Delta for timestamps, Gorilla for floats). The same 100M-row dataset that takes ~100 GiB in Postgres takes ~9.26 GiB in ClickHouse — roughly 10× compression, measured on the ClickBench dataset (99,997,497 rows, AWS c6a.4xlarge, ClickHouse 25.7 vs PostgreSQL 17.5).

Head-to-head

Query speed

The fiveonefour.com micro-benchmark ran a 4-query analytical workload (aggregation, statistics, sampling) on aircraft tracking data across 100 iterations. Both databases on equivalent 4 GB / 2 CPU Docker containers:

Row countClickHousePostgres + indexWinner
10K27.9 ms12.9 msPostgres (2.2×)
50K75.5 ms60.5 msRoughly tied
1M121.1 ms1,067 msClickHouse (8.8×)
10M453 ms7,595 msClickHouse (16.8×)

Postgres wins on small datasets. The crossover is around 50K–100K rows for analytical workloads. ClickHouse accelerates from there.

On the UK house price dataset (~5 million rows), the gap is starker:

QueryPostgresClickHouseSpeedup
Average price (5.01M rows)28,535 ms79 ms361×
City postcodes lookup (424K rows)543 ms77 ms
London price change over time8,904 ms62 ms144×

At 5M rows for a full-table aggregation, Postgres takes 28 seconds. ClickHouse takes 79 ms. That’s not a marginal improvement — that’s a different experience category.

A note on schema design: the fiveonefour benchmark found that a single ORDER BY reorder dropped a ClickHouse query from 4,191 ms to 75 ms on 10M rows — a 55× improvement from one schema change. The engine rewards correct schema design and punishes the defaults if you get them wrong.

Ingestion throughput

ClickHouse is built for high-throughput batch inserts. Shopee’s distributed tracing system ingests 3 million rows per second on 20 CPU cores, searching 30B+ rows in seconds. Cloudflare inserts 11 million rows per second into ClickHouse across HTTP and DNS analytics pipelines. These are real production numbers, not lab benchmarks.

Postgres can handle bulk inserts efficiently, but at scale it falls behind: every write must maintain MVCC state, WAL records, and index updates in parallel. ClickHouse writes to a columnar part on disk and defers the rest to background merges.

For append-only workloads (event streams, logs, metrics), this is excellent. For workloads needing frequent UPDATE or DELETE, ClickHouse’s batch-mutation model will cause friction.

Storage

ClickBench figures on the same 100M-row dataset:

  • ClickHouse: 9.26 GiB (LZ4 columnar)
  • PostgreSQL: ~100 GiB (row-store with default TOAST)

At 10 TB of Postgres analytics data, expect roughly 1 TB in ClickHouse. Storage costs compound quickly at scale.

JOINs and normalized schemas

This is where ClickHouse loses. Postgres handles normalized schemas and multi-table JOINs naturally. ClickHouse lacks data shuffling for large distributed JOINs — more than 3–4 JOINs can cause RAM spikes or crashes. The idiomatic ClickHouse pattern is to denormalize into wide flat tables during ETL, which means the data model looks nothing like a normalized Postgres schema.

If your analytics queries are deeply joined, you either restructure the data model (work) or absorb the performance penalty (pain). Neither is free.

Operational complexity

Postgres is one daemon. ClickHouse on a single node is manageable. Distributed ClickHouse requires ClickHouse Keeper (a Raft-based coordination layer), separate monitoring, and a different operational model.

There’s also the ecosystem gap. Postgres has 30 years of tooling — pgAdmin, ORMs, Liquibase, PostgREST, and proven ACID guarantees. ClickHouse is a second database to run, not a Postgres replacement.

Postgres is not finished

Before moving to ClickHouse, check what Postgres can do on its own:

  • Range partitioning + BRIN indexes: BRIN indexes run ~100 KB for a timestamp column. Combined with time-based partitioning, range scans skip entire month/year partitions without a full table scan.
  • TimescaleDB: Columnar compression and automatic time partitioning inside Postgres. Claims 1,000× improvement on time-series workloads versus vanilla Postgres.
  • pg_mooncake: Columnstore tables within Postgres. Placed in the ClickBench top 10 as of 2025/2026.

With TimescaleDB or pg_mooncake, Postgres can stay competitive up to ~500M rows for well-structured time-series. The break-even isn’t row count alone — it’s row count combined with query complexity, concurrency, and team capacity.

Break-even analysis

FactorStay on PostgresAdd ClickHouse
Row count< 50M> 100M
Query patternPoint lookups, recent-N rows, complex JOINsGROUP BY, COUNT DISTINCT, time-series aggregation
Data patternNormalized schema, frequent updates/deletesAppend-only events, logs, clickstream
Team< 5 engineers, no dedicated data infra5+ with data/infra capacity
Query SLADashboard < 5s is acceptableSub-second dashboard required
BudgetSingle-DB simplicity preferredCan absorb 2× infra + maintenance cost

The 50M–100M row zone is genuinely ambiguous. Below 50M, ClickHouse overhead outweighs the benefit. Above 500M rows for aggregation-heavy queries, ClickHouse is usually the right call. In between, benchmark your specific queries before committing.

Migration paths

Dual-write

The application writes to Postgres (source of truth) and ClickHouse (analytics layer) simultaneously. Simple to implement. The risk: write logic complexity increases, and divergence on failures requires reconciliation logic.

CDC via PeerDB or ClickPipes

PeerDB (self-hosted) and ClickPipes (ClickHouse Cloud managed) consume Postgres logical replication and materialize changes into ClickHouse near-real-time. PeerDB moved 1 TB of Postgres data in 2 hours. Both handle INSERT, UPDATE, and DELETE replication.

This is the recommended path for most production migrations. Postgres stays as the operational system; ClickHouse serves the analytics layer.

Batch ETL

Periodic export from Postgres (COPY or custom queries) → flatten/transform → bulk INSERT into ClickHouse. Simpler toolchain, introduces lag. Acceptable if your analytics can tolerate a 15-minute delay.

Direct query via postgresql() table function

For one-time migrations or infrequent syncs:

INSERT INTO clickhouse_table
SELECT * FROM postgresql('host:5432', 'mydb', 'events', 'user', 'password')
WHERE created_at > '2024-01-01'

Useful for moving historical data without maintaining a permanent CDC pipeline.

Verdict

Use Postgres until you can’t. When dashboard queries consistently hit 5–10 seconds on 100M+ rows and you’ve already tried partitioning and TimescaleDB, ClickHouse is the right next step.

Use casePick
< 50M rows, mixed OLTP + OLAPPostgres
50M–500M rows, time-seriesPostgres + TimescaleDB (evaluate ClickHouse if queries are slow)
> 100M rows, aggregation-heavy, append-onlyClickHouse
Normalized schema with 5+ JOINsPostgres — ClickHouse will hurt you
Dedicated analytics pipeline with team supportClickHouse
Small team, single-DB simplicity requiredPostgres

Real-world outcomes align with this matrix. Skool went from minutes to seconds on 100M+ rows/day with no optimization effort. PostHog migrated away from Postgres after it couldn’t handle scale. Reco removed 90% of its Postgres workload and reduced cost. Shopee searches 30B+ rows in seconds for distributed tracing. The pattern is consistent: ClickHouse wins at scale for analytics, Postgres wins for operational data. Most production teams end up running both.

Hosting

If ClickHouse is the right call, ClickHouse Cloud handles the operational overhead — no Keeper setup, no cluster management. Pricing starts at ~$0.22/compute-unit-hour plus storage; note that January 2025 brought a ~30% increase over prior tiers, and there is no permanent free tier (14–30 day trial only).

Self-hosted ClickHouse on a single 16-core / 64 GB RAM server handles 10–50B rows without clustering overhead. For teams in Vietnam or Southeast Asia evaluating cost, a $100/month VPS is a viable entry point for most SaaS-scale workloads.

If you decide to stay on Postgres, Neon and Supabase offer production-grade managed Postgres. Neon’s serverless model suits bursty analytics workloads with scale-to-zero; Supabase bundles auth, realtime, and storage for teams that want a full backend platform.

Caveats

  • Benchmarks cited above come from public primary sources; hardware, query shape, and schema choices affect results. Benchmark your specific workload before migrating.
  • ClickHouse schema design matters more than in Postgres. The 55× improvement from an ORDER BY change in the fiveonefour benchmark is a warning: getting the schema wrong is expensive to fix later.
  • ClickHouse Cloud pricing changes over time — verify current rates before committing to a budget estimate.
  • This article contains affiliate links (ClickHouse Cloud, Neon, Supabase). Reviewed independently; affiliate status does not affect the verdict.

References