Time-Series Database Selection and Operations#
Time-series databases optimize for a specific access pattern: high-volume writes of timestamped data points, queries that aggregate over time ranges, and automatic expiration of old data. Choosing the right one depends on your data model, query patterns, retention requirements, and operational constraints.
When You Need a Time-Series Database#
A dedicated time-series database is justified when you have high write throughput (thousands to millions of data points per second), queries that are predominantly time-range aggregations, and data that has a defined retention period. Common use cases: infrastructure metrics, application performance monitoring, IoT sensor data, financial tick data, and log analytics.
InfluxDB#
Overview#
InfluxDB is purpose-built for time-series. InfluxDB 3.x (current) uses Apache Arrow and DataFusion under the hood, supports SQL and InfluxQL, and stores data in Parquet files on object storage.
InfluxDB 2.x (still widely deployed) uses the Flux query language and its own TSM storage engine. InfluxDB 1.x used InfluxQL and is considered legacy.
When to Choose InfluxDB#
- You want a standalone time-series database without managing PostgreSQL.
- Your team prefers a purpose-built solution with built-in retention and downsampling.
- Data model fits the tag/field/measurement paradigm.
- InfluxDB Cloud is acceptable (InfluxDB 3.x is cloud-first).
Data Model#
Data is written as line protocol: a measurement name, comma-separated tags, one or more fields, and a timestamp:

```
measurement,tag1=value1,tag2=value2 field1=1.0,field2="text" timestamp
```

Example line protocol:

```
cpu,host=server01,region=us-east usage_idle=95.2,usage_user=4.3 1708617600000000000
```

Tags are indexed (use them for filtering); fields are not indexed (use them for values). High-cardinality tags (unique user IDs, session IDs) cause performance problems – keep tag cardinality under control.
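To see the write path end to end, here is a minimal sketch that posts a point to a local InfluxDB 2.x instance over the HTTP write API; the org, bucket, and token are placeholders rather than values from this guide.

```bash
# Minimal write sketch against a local InfluxDB 2.x instance.
# "my-org", the "metrics" bucket and $INFLUX_TOKEN are placeholder values.
curl -X POST "http://localhost:8086/api/v2/write?org=my-org&bucket=metrics&precision=ns" \
  -H "Authorization: Token $INFLUX_TOKEN" \
  --data-binary 'cpu,host=server01,region=us-east usage_idle=95.2,usage_user=4.3 1708617600000000000'
```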
Retention and Downsampling (InfluxDB 2.x)#
```bash
influx bucket create --name metrics --retention 30d
```

Downsampling uses Flux tasks that read from one bucket and write aggregated data to another. In InfluxDB 3.x, downsampling is handled through processing rules.
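A sketch of such a task, assuming hypothetical source and destination buckets named metrics and metrics_downsampled; the task file is registered with the influx CLI:

```bash
# Sketch of a downsampling Flux task; bucket names and intervals are illustrative.
cat > downsample.flux <<'EOF'
option task = {name: "downsample-cpu", every: 1h}

from(bucket: "metrics")
    |> range(start: -task.every)
    |> filter(fn: (r) => r._measurement == "cpu")
    |> aggregateWindow(every: 5m, fn: mean)
    |> to(bucket: "metrics_downsampled")
EOF
influx task create --file downsample.flux
```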
Query Performance Characteristics#
InfluxDB excels at time-range queries with tag filters. It struggles with high-cardinality tag sets and ad-hoc joins between measurements. Query latency is typically sub-second for recent data with low-to-medium cardinality.
TimescaleDB#
Overview#
TimescaleDB is a PostgreSQL extension that adds automatic time-based partitioning (hypertables), columnar compression, continuous aggregates, and retention policies. It runs as part of PostgreSQL – you get full SQL, joins, transactions, and the PostgreSQL ecosystem.
When to Choose TimescaleDB#
- You already run PostgreSQL and want time-series capabilities without new infrastructure.
- Queries require SQL joins between time-series and relational data.
- You need ACID transactions on time-series data.
Setup#
```sql
-- Enable the extension
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- Create a regular table
CREATE TABLE metrics (
    time         TIMESTAMPTZ NOT NULL,
    host         TEXT NOT NULL,
    cpu_usage    DOUBLE PRECISION,
    memory_usage DOUBLE PRECISION,
    disk_io      DOUBLE PRECISION
);

-- Convert to a hypertable (auto-partitioned by time)
SELECT create_hypertable('metrics', 'time');

-- Optional: add a space dimension for multi-tenant partitioning
SELECT add_dimension('metrics', 'host', number_partitions => 4);
```

Retention Policies#
```sql
-- Automatically drop chunks older than 30 days
SELECT add_retention_policy('metrics', INTERVAL '30 days');

-- View active policies
SELECT * FROM timescaledb_information.jobs
WHERE proc_name = 'policy_retention';

-- Manually drop old data
SELECT drop_chunks('metrics', older_than => INTERVAL '90 days');
```

Continuous Aggregates (Downsampling)#
```sql
-- Create a materialized view that auto-refreshes hourly rollups
CREATE MATERIALIZED VIEW metrics_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time) AS bucket,
    host,
    avg(cpu_usage) AS avg_cpu,
    max(cpu_usage) AS max_cpu,
    avg(memory_usage) AS avg_mem
FROM metrics
GROUP BY bucket, host;

-- Add an auto-refresh policy
SELECT add_continuous_aggregate_policy('metrics_hourly',
    start_offset => INTERVAL '3 hours',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');

-- Query the aggregate (fast, reads pre-computed data)
SELECT * FROM metrics_hourly
WHERE bucket > now() - interval '7 days'
  AND host = 'server01'
ORDER BY bucket;
```

Compression#
```sql
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'host',
    timescaledb.compress_orderby = 'time DESC'
);

SELECT add_compression_policy('metrics', INTERVAL '7 days');
```

Compression ratios of 10-20x are common for metrics data. Compressed chunks are read-only; inserts go to uncompressed chunks.
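To verify those savings on your own data, TimescaleDB ships informational functions; a quick check might look like this sketch, assuming a hypothetical database named metrics_db:

```bash
# Sketch: compare before/after chunk sizes once the compression policy has run.
# "metrics_db" is a placeholder database name.
psql -d metrics_db -c \
  "SELECT * FROM hypertable_compression_stats('metrics');"
```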
Query Performance Characteristics#
TimescaleDB leverages PostgreSQL’s query planner with chunk exclusion – queries with time-range filters only scan relevant chunks. Performance is excellent for SQL-heavy analytical queries and joins. It is slower than purpose-built engines for simple tag-based lookups at extremely high cardinality.
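Chunk exclusion is easy to observe. As a sketch (reusing the metrics hypertable and the hypothetical metrics_db from above), EXPLAIN on a time-filtered query should list only the chunks that overlap the filter:

```bash
# Sketch: the plan should reference only chunks overlapping the last day,
# not every chunk in the hypertable.
psql -d metrics_db -c \
  "EXPLAIN SELECT avg(cpu_usage) FROM metrics WHERE time > now() - INTERVAL '1 day';"
```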
VictoriaMetrics#
Overview#
VictoriaMetrics is a high-performance, cost-effective time-series database that is wire-compatible with Prometheus. It accepts data via Prometheus remote_write, Graphite, InfluxDB line protocol, and OpenTSDB protocols. It uses its own storage engine optimized for time-series data with aggressive compression.
When to Choose VictoriaMetrics#
- You need long-term storage for Prometheus metrics.
- Storage efficiency matters – VictoriaMetrics typically uses 7-10x less disk than Prometheus.
- You want a simpler alternative to Thanos or Cortex for Prometheus scaling.
- You need to ingest metrics from multiple sources (Prometheus, Telegraf, Graphite).
- You want MetricsQL, which is a PromQL superset with additional functions.
Deployment Modes#
Single-node – handles up to 1 million data points per second on modest hardware:
```bash
# --retentionPeriod is interpreted as months unless a unit suffix is given
victoria-metrics \
  --storageDataPath=/var/lib/victoria-metrics \
  --retentionPeriod=12 \
  --httpListenAddr=:8428
```

Cluster mode – for higher throughput, separate write (vminsert), storage (vmstorage), and query (vmselect) components:
```bash
# Storage nodes
vmstorage --storageDataPath=/data --retentionPeriod=12

# Insert nodes (distribute writes across storage)
vminsert --storageNode=storage1:8400,storage2:8400,storage3:8400

# Query nodes (query across all storage)
vmselect --storageNode=storage1:8401,storage2:8401,storage3:8401
```

Prometheus Integration#
```yaml
# prometheus.yml -- point remote_write at VictoriaMetrics
remote_write:
  - url: http://victoriametrics:8428/api/v1/write
```

VictoriaMetrics serves as a drop-in Prometheus data source in Grafana. Point Grafana at http://victoriametrics:8428 with the Prometheus data source type.
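Reads go through the same Prometheus-compatible HTTP API. As a sketch (the metric name is a placeholder for whatever you actually ingest):

```bash
# Sketch: instant query via the Prometheus-compatible API.
# node_cpu_seconds_total is a placeholder metric name.
curl -s 'http://victoriametrics:8428/api/v1/query' \
  --data-urlencode 'query=avg(rate(node_cpu_seconds_total{mode!="idle"}[5m])) by (instance)'
```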
Retention and Downsampling#
Retention is configured at startup with --retentionPeriod. Data expiration happens at the partition level without impacting performance. For downsampling, VictoriaMetrics Enterprise offers built-in rules. For the open-source version, use recording rules in vmalert or Prometheus.
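A sketch of the recording-rule approach with vmalert, writing 5-minute averages back into VictoriaMetrics; the group name, metric, and interval are illustrative:

```bash
# Sketch: recording rules as open-source downsampling via vmalert.
cat > downsample-rules.yml <<'EOF'
groups:
  - name: downsample
    interval: 5m
    rules:
      - record: instance:node_cpu_rate:avg5m
        expr: avg(rate(node_cpu_seconds_total[5m])) by (instance)
EOF

vmalert -rule=downsample-rules.yml \
  -datasource.url=http://victoriametrics:8428 \
  -remoteWrite.url=http://victoriametrics:8428
```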
Query Performance Characteristics#
VictoriaMetrics is optimized for PromQL-style queries: rate calculations, aggregations over time, and label-based filtering. It handles high-cardinality series better than Prometheus. It does not support SQL or relational joins.
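One small illustration of the MetricsQL superset mentioned earlier: median_over_time has no direct PromQL equivalent (node_load1 is again a placeholder metric):

```bash
# Sketch: a MetricsQL-only function; stock PromQL would reject this query.
curl -s 'http://victoriametrics:8428/api/v1/query' \
  --data-urlencode 'query=median_over_time(node_load1[10m])'
```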
Prometheus Long-Term Storage#
Prometheus’s local TSDB is designed for short retention (15 days to a few months). For longer retention, use an external solution.
Thanos – sidecar on each Prometheus uploads blocks to object storage (S3, GCS). A querier provides a unified query interface. The compactor handles downsampling:
```bash
# Thanos compactor with tiered retention
thanos compact \
  --data-dir=/tmp/compact \
  --objstore.config-file=bucket.yml \
  --retention.resolution-raw=30d \
  --retention.resolution-5m=180d \
  --retention.resolution-1h=365d
```

Cortex / Mimir – receives data via remote_write and stores in object storage. Horizontally scalable. Grafana Mimir is the successor to Cortex.
VictoriaMetrics – as described above, accepts remote_write with excellent compression and simpler operations than Thanos.
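For reference, the bucket.yml passed to the Thanos compactor above is an object storage client config. A minimal S3 sketch, with placeholder bucket name and endpoint:

```bash
# Sketch: minimal S3 object storage config for Thanos components.
# Credentials are typically resolved from the environment or an instance profile.
cat > bucket.yml <<'EOF'
type: S3
config:
  bucket: thanos-metrics
  endpoint: s3.us-east-1.amazonaws.com
EOF
```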
Decision Matrix#
| Factor | InfluxDB | TimescaleDB | VictoriaMetrics | Thanos/Mimir |
|---|---|---|---|---|
| Best for | Purpose-built metrics/IoT | SQL + time-series hybrid | Prometheus long-term, high efficiency | Prometheus federation and long-term |
| Query language | SQL, InfluxQL, Flux | SQL (full PostgreSQL) | MetricsQL (PromQL superset) | PromQL |
| SQL joins | No | Yes (full SQL) | No | No |
| Storage efficiency | Good | Good (with compression) | Excellent (7-10x vs Prometheus) | Good (object storage) |
| Write throughput | High | High | Very high | Depends on backend |
| Built-in downsampling | Yes (tasks) | Yes (continuous aggregates) | Enterprise only | Yes (compactor) |
| Operational complexity | Medium | Low (it is PostgreSQL) | Low (single binary) | High (multiple components) |
| Existing Prometheus | Requires migration | Requires migration | Drop-in remote_write | Drop-in sidecar/remote_write |
Choosing Based on Your Situation#
You already run PostgreSQL and want time-series capabilities: TimescaleDB. No new infrastructure, full SQL, continuous aggregates handle downsampling. The tradeoff is that it is not as write-efficient as purpose-built engines at extreme scale.
You run Prometheus and need longer retention: VictoriaMetrics for simplicity and storage efficiency. Thanos if you need the sidecar model (no remote_write changes) or Grafana Mimir if you want a managed-feeling Cortex successor.
You are building a new metrics or IoT platform from scratch: InfluxDB if you want a purpose-built solution with its own ecosystem. TimescaleDB if SQL interoperability matters.
You need to ingest millions of data points per second: VictoriaMetrics in cluster mode or InfluxDB Cloud (3.x).
You need to join time-series data with relational data: TimescaleDB. The others do not support relational joins.
Retention Strategy Principles#
- Tiered retention: Keep raw data for days/weeks, downsampled data for months, heavily aggregated data for years (see the sketch after this list).
- Downsample before the retention boundary: Ensure aggregated data exists before raw data is deleted.
- Align retention with query patterns: If nobody queries data older than 90 days at full resolution, downsample after 7 days.
- Test retention policy changes in staging: Misconfigured retention can delete data you intended to keep.
- Monitor storage growth: Time-series data grows linearly – set alerts on disk usage trends.
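A concrete sketch of tiered retention in TimescaleDB terms, reusing the metrics hypertable and metrics_hourly aggregate from earlier; the intervals are illustrative:

```bash
# Sketch: raw data kept 14 days, hourly rollups kept a year.
# Continuous aggregates accept their own retention policies.
psql -d metrics_db <<'EOF'
-- Remove the earlier 30-day policy before attaching a new one:
-- SELECT remove_retention_policy('metrics');
SELECT add_retention_policy('metrics', INTERVAL '14 days');
SELECT add_retention_policy('metrics_hourly', INTERVAL '365 days');
EOF
```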