The Retention Problem#

Prometheus stores metrics on local disk with a default retention of 15 days. Most production teams extend this to 30 or 90 days, but local storage has hard limits. A single Prometheus instance cannot scale disk beyond the node it runs on. It provides no high availability – if the instance goes down, you lose scraping and query access. And each Prometheus instance only sees its own targets, so there is no unified view across clusters or regions.

These limitations matter as soon as you need to answer questions like: “What was our p99 latency six months ago?”, “How do request patterns compare across all three clusters?”, or “Can we survive a Prometheus instance failure without alerting gaps?”

The solution is a long-term metrics storage backend that sits behind Prometheus. The general architecture is consistent across all three options: Prometheus scrapes targets as usual, sends or exposes data to the backend, the backend handles durable storage and cross-instance querying, and Grafana queries the backend instead of (or in addition to) individual Prometheus instances.
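
Whichever backend you pick, Grafana points at its PromQL-compatible query endpoint rather than at each Prometheus instance. A minimal datasource provisioning sketch, assuming a Thanos Query reachable at thanos-query:10902 (hostname and port are placeholders; Mimir and VictoriaMetrics expose their own query URLs):

# e.g. /etc/grafana/provisioning/datasources/long-term-metrics.yaml
apiVersion: 1
datasources:
  - name: Long-Term Metrics
    type: prometheus
    access: proxy
    url: http://thanos-query:10902   # Mimir: http://<mimir>/prometheus, VictoriaMetrics: http://<vm>:8428
    isDefault: true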

Thanos#

Thanos extends an existing Prometheus deployment by bolting on components that provide long-term storage and global querying. It was designed to keep Prometheus central to the architecture while solving its scaling limitations.

Core components:

  • Sidecar: runs alongside each Prometheus instance. It uploads completed TSDB blocks to object storage (S3, GCS, Azure Blob) and serves real-time data that has not yet been uploaded. This is the lightest-touch integration – your Prometheus instances continue operating normally.
  • Store Gateway: reads historical blocks from object storage and serves them to queries. It caches block metadata and index data in memory for faster lookups.
  • Compactor: runs as a singleton. It compacts uploaded blocks (merging small blocks into larger ones) and performs downsampling. Raw data can be downsampled to 5-minute and 1-hour resolutions for efficient long-term queries.
  • Query (Querier): the PromQL gateway. It fans out queries to Sidecars (for recent data) and Store Gateways (for historical data), deduplicates results from HA Prometheus pairs, and returns a unified result set.
  • Ruler: evaluates recording rules and alerts against the global dataset in object storage, not just a single Prometheus instance’s data.

Data flow:

Prometheus + Sidecar --upload--> S3/GCS bucket <---- Compactor (background
                                      |              compaction + downsampling)
                                Store Gateway
                                      |
Sidecar (live data) ----------> Thanos Query <---- Grafana
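
The Sidecar, Store Gateway, and Compactor all point at the same bucket through a small object storage config file passed via --objstore.config-file. A minimal sketch for S3, with a hypothetical bucket name and placeholder credentials:

# bucket.yaml – shared by Sidecar, Store Gateway, and Compactor
type: S3
config:
  bucket: "thanos-metrics"
  endpoint: "s3.us-east-1.amazonaws.com"
  access_key: "<ACCESS_KEY>"
  secret_key: "<SECRET_KEY>"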

Choose Thanos when you want to keep your existing Prometheus setup largely unchanged and layer long-term storage on top. It is a strong fit for multi-cluster environments where each cluster runs its own Prometheus and you want a global query view. Downsampling is a significant advantage for long-range queries – a query spanning months can read the 5-minute or 1-hour blocks instead of raw samples, scanning a fraction of the data, and raw blocks can be given a shorter retention period to keep bucket costs in check.
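
Global querying and HA deduplication both hinge on Prometheus external labels: one label identifying the cluster, and one distinguishing HA replicas, which Thanos Query is told to strip with --query.replica-label. A sketch with placeholder label values:

# prometheus.yml on each instance
global:
  external_labels:
    cluster: "us-east-1"   # identifies this Prometheus in global queries
    replica: "a"           # "b" on the HA twin; Thanos Query deduplicates on this label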

Limitations: Thanos is operationally complex. You are running five or more additional components, each with its own configuration and failure modes. Object storage costs accumulate, and query latency for very old data depends on Store Gateway cache hit rates. The Compactor is a critical singleton – if it falls behind, object storage grows unbounded and queries slow down.

Grafana Mimir#

Grafana Mimir is a horizontally scalable, multi-tenant TSDB that replaces Prometheus storage entirely. Prometheus pushes data to Mimir via remote_write, and Mimir handles everything from there: ingestion, storage, compaction, and querying.

Mimir uses the same block storage format as Prometheus TSDB, so it is not a foreign system. It supports two deployment modes:

  • Monolithic: a single binary running all components. Simple to deploy and operate. Suitable for up to roughly 500K active series.
  • Microservices: each component (distributor, ingester, compactor, store-gateway, querier, query-frontend) runs independently and scales horizontally. Required above 500K series or for production multi-tenant deployments.
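
For the monolithic mode, a minimal configuration sketch (field names follow Mimir’s blocks_storage section; the filesystem path is a placeholder, and a production deployment would point at object storage instead):

# mimir.yaml – single-binary trial setup
target: all                  # run every component in one process
blocks_storage:
  backend: filesystem        # swap for s3/gcs/azure in production
  filesystem:
    dir: /data/mimir/blocks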

Data flow:

Prometheus --remote_write--> Mimir Distributor --> Ingester --> Object Storage
                                                                    |
                             Grafana <-- Query Frontend <-- Querier <-- Store Gateway

Configuration in Prometheus:

remote_write:
  - url: "http://mimir-distributor:9009/api/v1/push"
    headers:
      X-Scope-OrgID: "tenant-1"

Choose Mimir when you want a single scalable backend rather than bolting on to Prometheus. It is the best fit for multi-tenant platforms where different teams or customers need isolated metrics with per-tenant limits. The Grafana ecosystem integration is seamless – Grafana, Loki, and Tempo all share architectural patterns and operational tooling.
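
Per-tenant isolation is enforced through runtime overrides keyed by the X-Scope-OrgID value. A sketch with hypothetical limits (the option names come from Mimir’s limits configuration; verify them against your version):

# runtime configuration – per-tenant overrides
overrides:
  tenant-1:
    ingestion_rate: 100000                # samples per second
    max_global_series_per_user: 1500000   # active series cap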

Limitations: Mimir is resource-heavy for small deployments. The microservices mode involves many components, each needing resource tuning. It is newer than Thanos and, unlike Thanos, not a CNCF project, though it descends from Cortex, a CNCF project with years of production use.

VictoriaMetrics#

VictoriaMetrics takes a different approach: a single-binary or clustered TSDB that is Prometheus-compatible but built from scratch for efficiency. It accepts data via remote_write, Prometheus exposition format, InfluxDB line protocol, Graphite, OpenTSDB, and several other formats.

Key differentiators:

  • Compression: 7-10x better compression than Prometheus. A dataset requiring 100GB in Prometheus may take 10-15GB in VictoriaMetrics.
  • Performance: lower CPU and memory usage per series. Benchmarks consistently show 2-5x lower resource consumption than comparable backends.
  • MetricsQL: a superset of PromQL that adds functions like range_median, median_over_time, limitk, and label manipulation functions. All valid PromQL works unchanged.
  • Single-node deployment: one binary, one data directory. No object storage required (stores on local disk by default). Cluster mode adds vmstorage, vminsert, and vmselect components for horizontal scaling.

Data flow (single-node):

Prometheus --remote_write--> VictoriaMetrics (single binary) <---- Grafana (PromQL)
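
The Prometheus side mirrors the Mimir setup, minus the tenant header – single-node VictoriaMetrics accepts remote_write on its default port 8428 (the hostname below is a placeholder):

remote_write:
  - url: "http://victoriametrics:8428/api/v1/write"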

Choose VictoriaMetrics when you want the simplest possible deployment with the lowest resource cost. It is excellent for cost-conscious environments and high ingestion rates. A single-node VictoriaMetrics instance can handle millions of active series on modest hardware.

Limitations: MetricsQL extensions, while convenient, create vendor lock-in if you rely on them in dashboards and alerts. The community is smaller than Thanos’s. Multi-tenancy exists but is less mature than Mimir’s implementation.

Comparison Table#

| Criterion | Thanos | Grafana Mimir | VictoriaMetrics |
| --- | --- | --- | --- |
| Architecture | Bolt-on to Prometheus | Standalone scalable TSDB | Standalone efficient TSDB |
| Query language | PromQL | PromQL | MetricsQL (PromQL superset) |
| Storage backend | Object storage (S3/GCS) | Object storage (S3/GCS) | Local disk or object storage |
| Multi-tenancy | Limited (external label) | Native, per-tenant limits | Supported (cluster mode) |
| Downsampling | Yes (5m, 1h) | No (relies on compaction) | Yes (with vmagent) |
| Deduplication | Yes (HA pairs) | Yes (replication) | Yes (dedup flag) |
| Operational complexity | High (5+ components) | Medium-High (microservices) | Low (single binary) |
| Resource efficiency | Moderate | Moderate-High | High (best compression) |
| Ecosystem | CNCF, broad adoption | Grafana Labs ecosystem | Independent, growing |

Cost Modeling#

For a reference deployment of 1 million active series with 1-year retention:

Thanos: storage cost is dominated by the object storage bucket. At roughly 1.5 bytes per sample with 15-second scrape intervals, raw storage is approximately 3TB/year before compaction. After compaction and downsampling, expect 500GB-1TB in S3. S3 cost is roughly $23/TB/month, so $12-23/month for storage. Compute for Sidecar, Store Gateway, Compactor, and Querier adds 4-8 CPU cores and 16-32GB RAM total.
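
The raw-storage figure follows directly from the sample rate (using the numbers above):

1,000,000 series × 1 sample / 15 s   ≈ 66,700 samples/s
66,700 samples/s × 31.5M s/year      ≈ 2.1 × 10^12 samples/year
2.1 × 10^12 samples × 1.5 bytes      ≈ 3.2 TB/year raw, before compaction and downsampling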

Grafana Mimir: similar object storage costs to Thanos (same block format). Microservices mode typically needs more compute – 6-12 CPU cores and 24-48GB RAM across all components. Ingesters require fast local disks for the write-ahead log.

VictoriaMetrics: with 7-10x compression, the same 1M series dataset fits in 300-500GB on local disk. No object storage costs. A single-node instance handling this load needs 2-4 CPU cores and 8-16GB RAM. This is typically the lowest total cost.

Common Gotchas#

The Thanos Compactor is not optional. Without it running continuously, uploaded blocks accumulate without merging. Object storage grows unbounded, Store Gateway queries slow down because they scan more blocks, and downsampling never happens. Monitor thanos_compact_group_compactions_total and alert if it stops increasing.
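
A sketch of that alert as a standard Prometheus/Thanos Ruler rule (the rule name and the 24-hour window are arbitrary choices; tune the window to your block upload cadence):

groups:
  - name: thanos-compactor
    rules:
      - alert: ThanosCompactionStalled
        expr: sum(increase(thanos_compact_group_compactions_total[24h])) == 0
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Thanos Compactor has not completed a compaction in 24 hours"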

Mimir in monolithic mode works well for getting started, but do not run it that way past 500K active series. The transition to microservices mode requires re-architecting the deployment, so plan for it early if you expect growth. Use the Mimir sizing calculator to right-size component resources.

When migrating from Prometheus to any backend, do not attempt to backfill historical data as a first step. Start with remote_write for new data, run both systems in parallel for the retention period of your Prometheus instances, and then cut over Grafana queries to the new backend once it has enough history.