· clickhouse / postgresql / analytics
ClickHouse vs Postgres for analytics
If dashboard queries on 100M+ rows are hitting 8 seconds, ClickHouse is worth looking at. Here is what the benchmarks show about the actual break-even point.
By Ethan
1,761 words · 9 min read
If your Postgres tables are under 50 million rows, stop here — Postgres with proper indexes will serve you fine. At 100 million rows and above, ClickHouse starts winning by margins you can feel: 8× to 361× faster on analytical queries, 10× less storage on the same dataset. The operational cost is real. This article tells you where the break-even sits and what the migration looks like.
Who this is for
Engineering teams running analytics on a Postgres table that used to be fast and now isn’t. Or teams planning an event pipeline that will hit 100M+ rows within the year and want to avoid a painful migration later.
If your primary concern is transactional integrity, complex joins across a normalized schema, or a team of fewer than five with no dedicated data engineer — stay on Postgres. The verdict has the full matrix.
What ClickHouse is
ClickHouse is a column-store database built for OLAP. The key difference from Postgres: instead of storing each row as a unit, it stores each column as a unit. A SUM(revenue) query never reads user_agent or session_id — it reads only the bytes it needs.
The default storage engine is MergeTree. Data lands in columnar parts on disk and is continuously merged in the background. Deduplication, pre-aggregation, and TTL expiration all happen during background merges rather than at query time. The tradeoff: very fast reads, but writes are append-only in practice. Row-level updates are batch rewrites, not point mutations.
ClickHouse also applies LZ4 compression by default, with optional domain-specific codecs (Delta for timestamps, Gorilla for floats). The same 100M-row dataset that takes ~100 GiB in Postgres takes ~9.26 GiB in ClickHouse — roughly 10× compression, measured on the ClickBench dataset (99,997,497 rows, AWS c6a.4xlarge, ClickHouse 25.7 vs PostgreSQL 17.5).
Head-to-head
Query speed
The fiveonefour.com micro-benchmark ran a 4-query analytical workload (aggregation, statistics, sampling) on aircraft tracking data across 100 iterations. Both databases on equivalent 4 GB / 2 CPU Docker containers:
| Row count | ClickHouse | Postgres + index | Winner |
|---|---|---|---|
| 10K | 27.9 ms | 12.9 ms | Postgres (2.2×) |
| 50K | 75.5 ms | 60.5 ms | Roughly tied |
| 1M | 121.1 ms | 1,067 ms | ClickHouse (8.8×) |
| 10M | 453 ms | 7,595 ms | ClickHouse (16.8×) |
Postgres wins on small datasets. The crossover is around 50K–100K rows for analytical workloads. ClickHouse accelerates from there.
On the UK house price dataset (~5 million rows), the gap is starker:
| Query | Postgres | ClickHouse | Speedup |
|---|---|---|---|
| Average price (5.01M rows) | 28,535 ms | 79 ms | 361× |
| City postcodes lookup (424K rows) | 543 ms | 77 ms | 7× |
| London price change over time | 8,904 ms | 62 ms | 144× |
At 5M rows for a full-table aggregation, Postgres takes 28 seconds. ClickHouse takes 79 ms. That’s not a marginal improvement — that’s a different experience category.
A note on schema design: the fiveonefour benchmark found that a single ORDER BY reorder dropped a ClickHouse query from 4,191 ms to 75 ms on 10M rows — a 55× improvement from one schema change. The engine rewards correct schema design and punishes the defaults if you get them wrong.
Ingestion throughput
ClickHouse is built for high-throughput batch inserts. Shopee’s distributed tracing system ingests 3 million rows per second on 20 CPU cores, searching 30B+ rows in seconds. Cloudflare inserts 11 million rows per second into ClickHouse across HTTP and DNS analytics pipelines. These are real production numbers, not lab benchmarks.
Postgres can handle bulk inserts efficiently, but at scale it falls behind: every write must maintain MVCC state, WAL records, and index updates in parallel. ClickHouse writes to a columnar part on disk and defers the rest to background merges.
For append-only workloads (event streams, logs, metrics), this is excellent. For workloads needing frequent UPDATE or DELETE, ClickHouse’s batch-mutation model will cause friction.
Storage
ClickBench figures on the same 100M-row dataset:
- ClickHouse: 9.26 GiB (LZ4 columnar)
- PostgreSQL: ~100 GiB (row-store with default TOAST)
At 10 TB of Postgres analytics data, expect roughly 1 TB in ClickHouse. Storage costs compound quickly at scale.
JOINs and normalized schemas
This is where ClickHouse loses. Postgres handles normalized schemas and multi-table JOINs naturally. ClickHouse lacks data shuffling for large distributed JOINs — more than 3–4 JOINs can cause RAM spikes or crashes. The idiomatic ClickHouse pattern is to denormalize into wide flat tables during ETL, which means the data model looks nothing like a normalized Postgres schema.
If your analytics queries are deeply joined, you either restructure the data model (work) or absorb the performance penalty (pain). Neither is free.
Operational complexity
Postgres is one daemon. ClickHouse on a single node is manageable. Distributed ClickHouse requires ClickHouse Keeper (a Raft-based coordination layer), separate monitoring, and a different operational model.
There’s also the ecosystem gap. Postgres has 30 years of tooling — pgAdmin, ORMs, Liquibase, PostgREST, and proven ACID guarantees. ClickHouse is a second database to run, not a Postgres replacement.
Postgres is not finished
Before moving to ClickHouse, check what Postgres can do on its own:
- Range partitioning + BRIN indexes: BRIN indexes run ~100 KB for a timestamp column. Combined with time-based partitioning, range scans skip entire month/year partitions without a full table scan.
- TimescaleDB: Columnar compression and automatic time partitioning inside Postgres. Claims 1,000× improvement on time-series workloads versus vanilla Postgres.
- pg_mooncake: Columnstore tables within Postgres. Placed in the ClickBench top 10 as of 2025/2026.
With TimescaleDB or pg_mooncake, Postgres can stay competitive up to ~500M rows for well-structured time-series. The break-even isn’t row count alone — it’s row count combined with query complexity, concurrency, and team capacity.
Break-even analysis
| Factor | Stay on Postgres | Add ClickHouse |
|---|---|---|
| Row count | < 50M | > 100M |
| Query pattern | Point lookups, recent-N rows, complex JOINs | GROUP BY, COUNT DISTINCT, time-series aggregation |
| Data pattern | Normalized schema, frequent updates/deletes | Append-only events, logs, clickstream |
| Team | < 5 engineers, no dedicated data infra | 5+ with data/infra capacity |
| Query SLA | Dashboard < 5s is acceptable | Sub-second dashboard required |
| Budget | Single-DB simplicity preferred | Can absorb 2× infra + maintenance cost |
The 50M–100M row zone is genuinely ambiguous. Below 50M, ClickHouse overhead outweighs the benefit. Above 500M rows for aggregation-heavy queries, ClickHouse is usually the right call. In between, benchmark your specific queries before committing.
Migration paths
Dual-write
The application writes to Postgres (source of truth) and ClickHouse (analytics layer) simultaneously. Simple to implement. The risk: write logic complexity increases, and divergence on failures requires reconciliation logic.
CDC via PeerDB or ClickPipes
PeerDB (self-hosted) and ClickPipes (ClickHouse Cloud managed) consume Postgres logical replication and materialize changes into ClickHouse near-real-time. PeerDB moved 1 TB of Postgres data in 2 hours. Both handle INSERT, UPDATE, and DELETE replication.
This is the recommended path for most production migrations. Postgres stays as the operational system; ClickHouse serves the analytics layer.
Batch ETL
Periodic export from Postgres (COPY or custom queries) → flatten/transform → bulk INSERT into ClickHouse. Simpler toolchain, introduces lag. Acceptable if your analytics can tolerate a 15-minute delay.
Direct query via postgresql() table function
For one-time migrations or infrequent syncs:
INSERT INTO clickhouse_table
SELECT * FROM postgresql('host:5432', 'mydb', 'events', 'user', 'password')
WHERE created_at > '2024-01-01'
Useful for moving historical data without maintaining a permanent CDC pipeline.
Verdict
Use Postgres until you can’t. When dashboard queries consistently hit 5–10 seconds on 100M+ rows and you’ve already tried partitioning and TimescaleDB, ClickHouse is the right next step.
| Use case | Pick |
|---|---|
| < 50M rows, mixed OLTP + OLAP | Postgres |
| 50M–500M rows, time-series | Postgres + TimescaleDB (evaluate ClickHouse if queries are slow) |
| > 100M rows, aggregation-heavy, append-only | ClickHouse |
| Normalized schema with 5+ JOINs | Postgres — ClickHouse will hurt you |
| Dedicated analytics pipeline with team support | ClickHouse |
| Small team, single-DB simplicity required | Postgres |
Real-world outcomes align with this matrix. Skool went from minutes to seconds on 100M+ rows/day with no optimization effort. PostHog migrated away from Postgres after it couldn’t handle scale. Reco removed 90% of its Postgres workload and reduced cost. Shopee searches 30B+ rows in seconds for distributed tracing. The pattern is consistent: ClickHouse wins at scale for analytics, Postgres wins for operational data. Most production teams end up running both.
Hosting
If ClickHouse is the right call, ClickHouse Cloud handles the operational overhead — no Keeper setup, no cluster management. Pricing starts at ~$0.22/compute-unit-hour plus storage; note that January 2025 brought a ~30% increase over prior tiers, and there is no permanent free tier (14–30 day trial only).
Self-hosted ClickHouse on a single 16-core / 64 GB RAM server handles 10–50B rows without clustering overhead. For teams in Vietnam or Southeast Asia evaluating cost, a $100/month VPS is a viable entry point for most SaaS-scale workloads.
If you decide to stay on Postgres, Neon and Supabase offer production-grade managed Postgres. Neon’s serverless model suits bursty analytics workloads with scale-to-zero; Supabase bundles auth, realtime, and storage for teams that want a full backend platform.
Caveats
- Benchmarks cited above come from public primary sources; hardware, query shape, and schema choices affect results. Benchmark your specific workload before migrating.
- ClickHouse schema design matters more than in Postgres. The 55× improvement from an ORDER BY change in the fiveonefour benchmark is a warning: getting the schema wrong is expensive to fix later.
- ClickHouse Cloud pricing changes over time — verify current rates before committing to a budget estimate.
- This article contains affiliate links (ClickHouse Cloud, Neon, Supabase). Reviewed independently; affiliate status does not affect the verdict.
References
- ClickBench — official OLAP benchmark
- ClickBench methodology (GitHub)
- fiveonefour.com — Postgres vs ClickHouse micro-benchmark
- UK house price migration benchmark
- Update performance benchmark — 600M rows TPC-H
- Why ClickHouse is fast
- Skool case study
- PostHog: ClickHouse vs Postgres
- Reco case study
- Shopee distributed tracing with ClickHouse
- Cloudflare HTTP Analytics — 11M rows/sec with ClickHouse
- TimescaleDB vs vanilla PostgreSQL — 1,000× faster queries benchmark
- PeerDB 1TB migration in 2 hours
- JOIN limitations in ClickHouse
- ClickHouse pricing analysis
- PostgreSQL partitioning docs