Best Vector Database in 2026: Qdrant, Vectorize, Pinecone

Qdrant if you’re building at the edge or need production-grade hybrid search: it’s the only external database client in this roundup that runs on Cloudflare Workers without Node.js polyfills, and it ships DBSF fusion out of the box. Cloudflare Vectorize if you’re already on Workers and want the vector store co-located with zero infrastructure to operate. Pinecone if you want a managed namespace model with predictable per-GB billing and the smallest possible API surface. Weaviate if benchmark QPS is a real requirement. pgvector if you’re on Postgres and expect to stay at or below roughly 1M vectors. Chroma if you’re prototyping locally and want the same client to reach cloud later without touching your code. Here’s how each vector database compares across pricing, hybrid search, edge compatibility, and operational overhead.

Who this is for

Developers choosing a vector store for a RAG pipeline, semantic search feature, or recommendation system. If you’re evaluating keyword search tools, Best Search-as-a-Service is the right comparison. If you’re already on Postgres and only asking whether pgvector is sufficient, that question is covered more directly in How to set up vector search with pgvector.

How we evaluated

Five criteria, weighted by what production deployments actually break on:

Edge runtime: Does the JavaScript client run on Cloudflare Workers without Node.js polyfills?
Hybrid search: Dense + sparse fusion. How complete is it, and is it generally available?
Pricing: Per-GB, per-dimension, per-node — and what 1M × 1536d actually costs per month.
DX: SDK quality, free-tier usability, time from sign-up to first query.
Operational overhead: Serverless, managed cloud, or Kubernetes.

No neutral benchmark was run. Where benchmark numbers appear (Weaviate), they are vendor-produced — methodology disclosed in that section.

Vector database pricing at a glance

Provider	Free tier	~$/mo at 1M × 1536d	Model
Qdrant Cloud	0.5 vCPU / 1GB RAM / 4GB disk	~$25–40 (1 vCPU node)	Node-based
Weaviate Cloud	None	~$66 (Flex $45 + ~$21 storage)¹	Per-dimension + SLA tier
Pinecone	Starter: free / 2GB	~$20 (Builder / 10GB)	Per-GB storage
Cloudflare Vectorize	5M stored dims (~3,255 vectors)	~$6 (Paid $5 + ~$0.77 storage)²	Per-stored-dim + per-queried-dim
pgvector (Supabase)	Free (shared, 500MB DB)	~$410 (2XL: 8-core ARM / 32GB RAM)³	Compute + storage
Chroma Cloud	$5 credits	~$2 (storage ongoing)	Usage-based

¹ Weaviate Flex storage: 1M × 1536 = 1,536M dims × $0.0139/1M dims = ~$21/mo; plus $45/mo Flex plan. Verified at weaviate.io/pricing (May 2026).

² Vectorize: 1,536M stored dims, first 10M included on Paid plan, remainder × $0.05/100M = ~$0.77/mo. Workers Paid base is $5/mo. Verified at developers.cloudflare.com/vectorize/platform/pricing/ (May 2026).

³ pgvector Supabase: the 2XL tier (8-core ARM, 32GB RAM) is the smallest Supabase compute with enough RAM to keep a 1M × 1536d HNSW index warm in cache. Source: Supabase blog, May 2026.
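The per-dimension arithmetic in footnotes ¹ and ² can be reproduced with a few lines of TypeScript. This is only a sketch: the rates are copied from the footnotes (May 2026), and the function names are made up for illustration. Subtracting the included 10M dimensions before applying the Vectorize rate lands within a cent of the ~$0.77 figure quoted above.

```typescript
// Rates copied from the footnotes above (May 2026); re-check them before budgeting.
const VECTORS = 1_000_000;
const DIMS = 1536;
const storedDims = VECTORS * DIMS; // 1,536,000,000 stored dimensions

// Weaviate Flex: flat plan fee plus a storage rate per million dimensions.
function weaviateFlexMonthlyUsd(dims: number): number {
  const planFeeUsd = 45;
  const usdPerMillionDims = 0.0139;
  return planFeeUsd + (dims / 1_000_000) * usdPerMillionDims;
}

// Vectorize storage on Workers Paid: first 10M stored dims included,
// then $0.05 per 100M. The $5/mo Workers Paid base fee is not included here.
function vectorizeStorageMonthlyUsd(dims: number): number {
  const includedDims = 10_000_000;
  const usdPer100MDims = 0.05;
  const billableDims = Math.max(0, dims - includedDims);
  return (billableDims / 100_000_000) * usdPer100MDims;
}

console.log('Weaviate Flex, USD/mo:', weaviateFlexMonthlyUsd(storedDims).toFixed(2));
console.log('Vectorize storage, USD/mo:', vectorizeStorageMonthlyUsd(storedDims).toFixed(2));
```

Changing `VECTORS` to 10,000,000 shows how quickly the per-dimension model grows relative to a flat plan fee.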

Qdrant

Qdrant is the strongest all-around pick in 2026 if you need hybrid search or edge compatibility.

Hybrid search is mature. Since v1.11.0, Qdrant ships Distribution-Based Score Fusion (DBSF) alongside the older Reciprocal Rank Fusion (RRF). Most databases that claim hybrid search hand you RRF and stop there, or push fusion into your application layer entirely. RRF looks only at rank positions; DBSF normalizes the raw scores from each query before combining them, which tends to help when dense and sparse score distributions differ significantly (common in real retrieval workloads, less so in synthetic benchmarks). You configure it declaratively via the query API:

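// Fetch sparse and dense candidates, then fuse the two lists server-side.
const result = await client.query('collection-name', {
  prefetch: [
    { query: sparseVector, using: 'sparse', limit: 20 }, // sparse (keyword-style) candidates
    { query: denseVector, using: 'dense', limit: 20 }, // dense (embedding) candidates
  ],
  query: { fusion: 'dbsf' }, // 'rrf' selects Reciprocal Rank Fusion instead
  limit: 10,
});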

No external fusion logic. No extra round-trips.
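For intuition about what the server does when fusion is set to 'dbsf', here is a rough sketch of the idea: normalize each result list using its mean ± 3 standard deviations as bounds, then sum the normalized scores per document. This is an illustration under that assumption, not Qdrant’s actual implementation; the type, the function names, and the sample scores are all hypothetical.

```typescript
type Scored = { id: string; score: number };

// Normalize one result list to [0, 1], using mean ± 3 standard deviations as bounds.
function normalizeByDistribution(results: Scored[]): Scored[] {
  if (results.length === 0) return [];
  const mean = results.reduce((sum, r) => sum + r.score, 0) / results.length;
  const variance =
    results.reduce((sum, r) => sum + (r.score - mean) * (r.score - mean), 0) / results.length;
  const std = Math.sqrt(variance);
  const low = mean - 3 * std;
  const high = mean + 3 * std;
  return results.map((r) => ({
    id: r.id,
    // If every score is identical the range collapses; fall back to a neutral 0.5.
    score: high === low ? 0.5 : Math.min(1, Math.max(0, (r.score - low) / (high - low))),
  }));
}

// Sum the normalized scores per id across all lists, then sort descending.
function dbsfFuse(lists: Scored[][], limit: number): Scored[] {
  const totals: Record<string, number> = {};
  for (const list of lists) {
    for (const r of normalizeByDistribution(list)) {
      totals[r.id] = (totals[r.id] ?? 0) + r.score;
    }
  }
  return Object.keys(totals)
    .map((id) => ({ id, score: totals[id] }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Hypothetical inputs: cosine-style dense scores next to much larger sparse scores.
const denseHits: Scored[] = [
  { id: 'a', score: 0.82 },
  { id: 'b', score: 0.8 },
  { id: 'c', score: 0.41 },
];
const sparseHits: Scored[] = [
  { id: 'c', score: 14.2 },
  { id: 'a', score: 9.1 },
  { id: 'd', score: 3.3 },
];
const fused = dbsfFuse([denseHits, sparseHits], 3);
console.log(fused.map((r) => r.id));
```

Because each list is rescaled by its own distribution, the large sparse scores do not drown out the dense ones, which is the property rank-only fusion cannot use.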

The JS SDK runs on Workers. @qdrant/js-client-rest uses REST transport only — no gRPC, no @grpc/grpc-js, no Node.js http2 bindings. It works on Cloudflare Workers, Deno Deploy, and browsers. The Workers-specific integration demo lives at github.com/elithrar/qdrant-workers-demo. This is a non-trivial differentiator: the same application pattern that runs locally can deploy to the edge without a proxy or adapter layer.

The free tier (0.5 vCPU, 1GB RAM, 4GB disk, no credit card) is enough to test the API and build a proof of concept. It won’t hold 1M × 1536d vectors. For production scale you need a paid node, starting around $25–40/mo for a 1 vCPU / 4GB configuration.

Qdrant Hybrid Cloud targets self-hosted Kubernetes deployments and supports 17 named K8s platforms. If you go that route, expect K8s operational burden — Hybrid Cloud is not a serverless offering.

Weaviate

The headline number from Weaviate’s published ANN benchmark: 5,639 QPS at 97.24% Recall@10, 2.80ms mean latency, 4.43ms P99. That’s on GCP n4-highmem-16 (16 vCPU, 128GB RAM) with the DBPedia ada002 dataset (1M vectors, 1536 dims).

Required disclosure: this is vendor-produced, not independently replicated. The hardware used ($1,200+/mo equivalent GCP compute) costs more than most Weaviate Cloud plans. If your benchmark spreadsheet needs a QPS number and your team will run its own tests against your actual workload, the Weaviate benchmark is a plausible directional ceiling — not a baseline.

What it does confirm: Weaviate’s HNSW implementation sustains high concurrent query throughput when memory is abundant. If you’re running at scale with the budget for Premium ($400/mo, 99.95% SLA) or higher, the throughput headroom is real.

Pricing on Weaviate Cloud is per-dimension per month. Flex is $45/mo (99.5% SLA) with storage at $0.0139/1M dims — for 1M × 1536d that’s ~$21/mo in storage before the plan fee, ~$66/mo total. Scale to 10M vectors and the storage bill alone reaches ~$213/mo. The per-dimension model is predictable but can surprise you at 10M+ scale.

The TypeScript v3 client requires Node.js. It depends on @grpc/grpc-js, which binds to Node’s http2 implementation. Workers, Deno, and browser environments are not supported. GitHub issue #145 is open and unresolved as of May 2026. If your runtime isn’t Node, you fall back to the REST-only v2 API with a smaller feature surface.

Pinecone

Pinecone’s product thesis is managed simplicity. No infrastructure decisions, no replication config, no pod sizing. You create a serverless index, upsert vectors, query. The API surface is intentionally narrow — that’s the design, not an oversight.

Pricing tiers:

Starter: free, 2GB storage limit — enough for demos and small prototypes
Builder: $20/mo, 10GB — covers 1M × 1536d (~6GB) with room
Standard: $50/mo + $0.33/GB overage
Enterprise: $500/mo + $0.33/GB

At 1M vectors × 1536 dims (~6GB), Builder is the right tier at $20/mo. Standard makes sense at 15GB+.
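The ~6GB figure is just the raw size of the embeddings. A minimal sketch of that sizing, assuming float32 values (4 bytes per dimension), decimal gigabytes, and no metadata or index overhead; the helper names are illustrative and the tier limits are the ones listed above:

```typescript
// Raw embedding size: 4 bytes per float32 dimension, ignoring metadata and index overhead.
function rawVectorGB(vectors: number, dims: number): number {
  const bytesPerFloat32 = 4;
  return (vectors * dims * bytesPerFloat32) / 1e9;
}

// Only the tiers with a stated storage limit above (May 2026), smallest first.
const cappedTiers = [
  { name: 'Starter', includedGB: 2 },
  { name: 'Builder', includedGB: 10 },
];

// Return the first tier whose included storage covers the data set.
function smallestFittingTier(sizeGB: number): string {
  for (const tier of cappedTiers) {
    if (sizeGB <= tier.includedGB) return tier.name;
  }
  return 'Standard or higher';
}

const sizeGB = rawVectorGB(1_000_000, 1536); // about 6.1 GB
console.log(sizeGB.toFixed(2), smallestFittingTier(sizeGB));
```

Real usage will be somewhat larger once metadata is stored alongside each vector, so treat the result as a lower bound.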

Hybrid search (dense + sparse) is GA across all plans — Starter through Enterprise. Dense, Sparse, and Full-Text index types are listed as available on pinecone.io/pricing with no preview designation as of May 2026. Qdrant still has the more sophisticated fusion implementation (DBSF + RRF, configurable per query); Pinecone’s sparse index support covers standard dense+sparse hybrid retrieval but doesn’t expose fusion algorithm control.

A note on affiliate links: Pinecone links in this article are direct, not tracked. Pricing page: pinecone.io/pricing.

Cloudflare Vectorize

Vectorize went GA in September 2024. It’s fully serverless — no nodes, no pods, no index provisioning. You define an index in wrangler.toml, bind it to a Worker, and query with a single line:

const results = await env.MY_VECTORIZE_INDEX.query(queryVector, { topK: 10 });

No external HTTP request. The query runs within the same Cloudflare edge datacenter as your Worker. The co-location latency advantage is real for latency-sensitive pipelines: a round-trip to an external vector DB at the edge adds 20–80ms, and a Vectorize query avoids that hop entirely. For a broader look at where Vectorize fits alongside D1, Turso, and other edge data stores, see Edge database tradeoffs: when latency is a lie.

Pricing on Workers Paid ($5/mo base): first 10M stored dims included, then $0.05 per 100M; first 50M queried dims/mo included, then $0.01 per 1M. At 1M vectors × 1536 dims: ~$0.77/mo for storage. It’s the cheapest production option in this comparison by a significant margin.

The free tier is 5M stored vector dimensions, which is ~3,255 vectors at 1536 dims. That’s enough for a demo, not a production workload.

Where it falls short: no built-in hybrid search. Dense vector search only. Adding sparse + dense requires a parallel keyword index (Workers KV or a Hyperdrive-connected Postgres) and application-layer fusion. That’s implementation burden Qdrant and Weaviate absorb. If your retrieval quality depends on hybrid search, Vectorize is not ready as a standalone solution.
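If you do take on application-layer fusion, Reciprocal Rank Fusion is the simplest place to start because it needs only the ranked ids from each source, not comparable scores. A minimal sketch, assuming you already have one ranked id list from a Vectorize query and one from your keyword index; the function name and document ids are hypothetical, and k = 60 is the commonly used default constant:

```typescript
// Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank), with rank starting at 1.
function rrfFuse(rankedLists: string[][], k = 60, limit = 10): string[] {
  const scores: Record<string, number> = {};
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      scores[id] = (scores[id] ?? 0) + 1 / (k + index + 1);
    });
  }
  return Object.keys(scores)
    .sort((a, b) => scores[b] - scores[a])
    .slice(0, limit);
}

// Hypothetical ranked ids: dense results from Vectorize, keyword results from a separate index.
const denseIds = ['doc-3', 'doc-1', 'doc-7'];
const keywordIds = ['doc-1', 'doc-9', 'doc-3'];
const merged = rrfFuse([denseIds, keywordIds]);
console.log(merged);
```

Documents that appear in both lists rise to the top, which is usually the behavior you want from hybrid retrieval. The cost is a second query and a second index to keep in sync.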

pgvector

pgvector is the correct answer when vector search is not your primary feature and you’re not prepared to add and operate another system.

If you’re already on Neon, Supabase, or RDS, enabling the extension costs one line:

CREATE EXTENSION IF NOT EXISTS vector;

If you haven’t settled on a Postgres host yet, Neon vs Supabase covers the cold-start, pricing, and DX differences between the two most developer-friendly serverless options.

The cost of a Supabase 2XL instance (8-core ARM, 32GB RAM) capable of indexing 1M × 1536d vectors in RAM is ~$410/mo. That sounds expensive until you consider you’d be running a Postgres instance at scale anyway — the vector capability is a feature of compute you’re already paying for, not a new line item.

Two production caveats that matter:

The optimizer flaw. A documented cost model issue causes the query planner to prefer sequential scans over the HNSW index in certain filtered workloads (pgvector GitHub issues). The common workaround — SET enable_seqscan = off on affected query paths — works but requires you to monitor your EXPLAIN ANALYZE output actively. This isn’t a reason to avoid pgvector; it’s a reason to have observability on your query plans.

RAM is the ceiling. pgvector’s HNSW index is memory-resident. At 1M × 1536d you need 32GB of RAM to keep the index warm in cache. The 0.8.0 support for iterative index scans (CHANGELOG) helps filtered queries return enough rows, but it doesn’t reduce the RAM requirement. If you’re sizing compute around the vector index, that’s a cost driver that dedicated vector databases with on-disk HNSW (Qdrant, Weaviate) don’t impose in the same way.
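For the planner workaround in the first caveat, one way to limit the blast radius is to scope the override to a single transaction with SET LOCAL rather than changing it for the whole session. The sketch below only builds the SQL statements; you would run them in order on one connection with whatever Postgres client you use. The table name (items), the columns (tenant_id, embedding), and the function name are hypothetical.

```typescript
// Hypothetical schema: table "items" with "tenant_id" and a vector(1536) column "embedding".
// $1 is the tenant id; $2 is the query embedding as a pgvector literal such as '[0.1,0.2,...]'.
function filteredKnnTransaction(limit: number, explain = false): string[] {
  if (Math.floor(limit) !== limit || limit <= 0) {
    throw new Error('limit must be a positive integer');
  }
  // <=> is pgvector's cosine distance operator.
  const select =
    'SELECT id FROM items WHERE tenant_id = $1 ' +
    `ORDER BY embedding <=> $2 LIMIT ${limit}`;
  return [
    'BEGIN',
    // SET LOCAL reverts automatically at COMMIT or ROLLBACK.
    'SET LOCAL enable_seqscan = off',
    // With explain = true, inspect the plan instead of fetching rows.
    explain ? `EXPLAIN ANALYZE ${select}` : select,
    'COMMIT',
  ];
}

const statements = filteredKnnTransaction(10);
console.log(statements.join(';\n'));
```

Running the same path occasionally with `explain` set to true is a cheap way to confirm the HNSW index is actually being used.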

Pre-PMF and internal tools: use pgvector. When search quality becomes a user-facing differentiator, the migration to a dedicated vector DB is well-understood.

Chroma

Chroma is the prototyping-first option. The client API is identical whether you’re hitting an in-memory instance during tests, a local server, or Chroma Cloud. That API parity means no client code changes when you push from laptop to cloud.

Chroma Cloud pricing is usage-based: $2.50/GiB to write data, $0.33/GiB/mo for storage, $0.0075/TiB queried. For 1M × 1536d (~6GB): the initial write costs ~$15 one time; ongoing storage runs ~$2/mo. That’s the cheapest cloud option in this roundup at that vector count.
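The write and storage figures can be checked the same way. A small sketch, assuming float32 embeddings, billing in GiB (1024³ bytes), and no metadata; the rates are the ones quoted above and the function name is illustrative. Query charges are left out because they depend on traffic.

```typescript
const BYTES_PER_GIB = 1024 * 1024 * 1024;

// Rates copied from the paragraph above (May 2026).
function chromaCloudEstimate(vectors: number, dims: number) {
  const gib = (vectors * dims * 4) / BYTES_PER_GIB; // 4 bytes per float32 dimension
  return {
    gib,
    oneTimeWriteUsd: gib * 2.5, // $2.50 per GiB written
    monthlyStorageUsd: gib * 0.33, // $0.33 per GiB per month
  };
}

const estimate = chromaCloudEstimate(1_000_000, 1536);
console.log(estimate.oneTimeWriteUsd.toFixed(2), estimate.monthlyStorageUsd.toFixed(2));
```

The results come out slightly under the rounded ~$15 and ~$2 figures because 6GB is about 5.7 GiB.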

Hybrid search isn’t built in. The Team plan ($250/mo + usage) adds SLA and higher limits. For prototyping and internal tooling with a small-to-medium vector corpus, Chroma is a reasonable fit. For production RAG at scale, most teams graduate to a dedicated system before the pricing becomes a concern.

Current version: check github.com/chroma-core/chroma — the pricing page doesn’t surface release numbers.

Verdict

Use case	Pick	Why
Edge + Workers deployment	Qdrant	Only external database client here that runs on Workers as-is (REST, no Node.js)
Zero infra, already on Workers	Cloudflare Vectorize	~$0.77/mo storage at 1M vectors; native Workers binding
Smallest API surface, managed	Pinecone	Clean serverless index model; ~$20/mo at 1M × 1536d
Throughput benchmark priority	Weaviate	Highest published QPS (vendor-measured, not independently replicated)
Postgres-native, search secondary	pgvector	No added dependency; memory-resident HNSW has RAM ceiling
Local prototype → cloud	Chroma	Same client everywhere; ~$2/mo ongoing storage
Hybrid search, production	Qdrant	DBSF + RRF fusion, GA since v1.11.0; Pinecone also has GA sparse indexes but no fusion algorithm control

If hybrid search is a day-one requirement and fusion algorithm control matters, Qdrant is the most complete production option. Pinecone’s sparse indexes are now GA and cover the standard dense+sparse case if you prefer its managed model. If you’re already on Workers and only need dense search, Vectorize is the cheapest path with the least infrastructure debt.

Caveats

The Weaviate benchmark (5,639 QPS) is vendor-produced, tested on GCP n4-highmem-16 with the DBPedia ada002 corpus. Results on your hardware, document shape, and query mix will differ. Do not use this number as a planning baseline without running your own test.

pgvector costs are from a Supabase comparison post (May 2026) for the Supabase 2XL compute tier. Neon bills differently (compute-seconds + storage) and may be cheaper for bursty workloads — verify at neon.com/pricing.

Pinecone pricing verified at pinecone.io/pricing (May 2026). Affiliate partnership pending — links in this article are direct, not tracked.

Cloudflare Vectorize pricing verified at developers.cloudflare.com/vectorize/platform/pricing/ (May 2026).

Chroma current version was not confirmed at time of writing — check github.com/chroma-core/chroma.

No Milvus / Zilliz Cloud data is included: the research pass did not turn up pricing and feature data for Milvus 2.4+ that could be verified with sufficient confidence.