Prometheus Architecture
Prometheus pulls metrics from targets at regular intervals (scraping). Each target exposes an HTTP endpoint (typically /metrics) that returns metrics in a text format. Prometheus stores the scraped data in a local time-series database and evaluates alerting rules against it. Grafana connects to Prometheus as a data source and renders dashboards.
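For a sense of what a scrape actually returns, here is a rough illustration of the exposition format a target serves at /metrics (the metric names and values here are made up):

# HELP http_requests_total Total HTTP requests served.
# TYPE http_requests_total counter
http_requests_total{method="GET",status_code="200"} 1027
http_requests_total{method="GET",status_code="500"} 3
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 4.571136e+07

Each line is a metric name, an optional label set, and a sample value; Prometheus attaches the scrape timestamp itself.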
Scrape Configuration
The core of Prometheus configuration is the scrape config. Each scrape_config block defines a set of targets and how to scrape them.
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: "app"
    metrics_path: /metrics
    static_configs:
      - targets: ["app:8080"]
        labels:
          env: "production"

  - job_name: "node"
    static_configs:
      - targets: ["node-exporter:9100"]

  - job_name: "postgres"
    static_configs:
      - targets: ["postgres-exporter:9187"]

For dynamic environments, use service discovery instead of static configs. In Kubernetes, Prometheus discovers pods and services via the API server:
scrape_configs:
  - job_name: "kubernetes-pods"
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_port, __meta_kubernetes_pod_ip]
        action: replace
        target_label: __address__
        regex: (.+);(.+)
        replacement: $2:$1

This scrapes any pod with the annotation prometheus.io/scrape: "true". The relabel configs extract the metrics path and port from pod annotations.
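A pod template that opts into this scraping would carry annotations like the following; the port value is illustrative and must match whatever port the container actually serves metrics on:

# Pod template metadata (e.g. inside a Deployment's spec.template)
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: /metrics
    prometheus.io/port: "8080"   # illustrative; must match the container's metrics port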
PromQL Essentials
PromQL is Prometheus’s query language. An expression evaluates to an instant vector (one sample per matching time series at a single point in time), a range vector (a window of samples per series, used as input to functions like rate()), or a plain scalar.
# CPU usage rate per CPU and mode, excluding idle time (last 5 minutes)
rate(node_cpu_seconds_total{mode!="idle"}[5m])
# Total HTTP requests per second by status code
sum by (status_code) (rate(http_requests_total[5m]))
# 95th percentile request latency
histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
# Memory usage percentage
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100
# Disk space remaining percentage
(node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100
# Container CPU usage in a Kubernetes cluster
sum by (pod) (rate(container_cpu_usage_seconds_total{container!=""}[5m]))
# Container memory working set
sum by (pod) (container_memory_working_set_bytes{container!=""})

Key functions: rate() computes a per-second average rate from a counter, increase() gives the total increase over the window, sum by () aggregates across labels, and histogram_quantile() computes percentiles from histogram buckets.
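increase() does not appear in the examples above; a quick sketch of how it relates to rate(), using the same http_requests_total counter:

# Approximate number of requests in the last hour (extrapolated from samples)
increase(http_requests_total[1h])
# Roughly equivalent to the average per-second rate times the window length
rate(http_requests_total[1h]) * 3600

Prefer rate() for graphs and alert expressions, and increase() when a human-readable "how many in the last N minutes" number is wanted.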
Alerting Rules
Alerting rules are evaluated by Prometheus; an alert fires when its condition holds for a specified duration. Firing alerts are sent to Alertmanager, which handles routing, deduplication, and notification.
# alert-rules.yml
groups:
  - name: infrastructure
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High CPU on {{ $labels.instance }}"
          description: "CPU usage above 80% for 10 minutes. Current: {{ $value | printf \"%.1f\" }}%"
      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < 15
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Disk space low on {{ $labels.instance }}"
      - alert: HighErrorRate
        expr: sum by (job) (rate(http_requests_total{status_code=~"5.."}[5m])) / sum by (job) (rate(http_requests_total[5m])) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Error rate above 5% for {{ $labels.job }}"
      - alert: PodCrashLooping
        expr: rate(kube_pod_container_status_restarts_total[15m]) * 60 * 15 > 3
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Pod {{ $labels.pod }} is crash looping"

The for duration prevents alerting on brief spikes. A condition must hold continuously for the entire duration before the alert fires.
Grafana Data Sources and Dashboards
Connect Grafana to Prometheus by adding it as a data source. This can be provisioned declaratively:
# grafana/provisioning/datasources/prometheus.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: false

Dashboard provisioning loads JSON dashboards from files on startup:
# grafana/provisioning/dashboards/dashboards.yml
apiVersion: 1
providers:
  - name: default
    orgId: 1
    folder: ""
    type: file
    options:
      path: /var/lib/grafana/dashboards
      foldersFromFilesStructure: true

Place dashboard JSON files in the configured path. Export dashboards from the Grafana UI (Share > Export > Save to file) and commit them to version control for reproducibility.
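Assuming a Docker Compose deployment (the prometheus:9090 hostname above suggests one), the provisioning files and dashboard directory can be mounted into the Grafana container; the host paths are illustrative:

# docker-compose.yml excerpt (host paths are illustrative)
services:
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning
      - ./grafana/dashboards:/var/lib/grafana/dashboards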
USE and RED Methods
Structure your monitoring around established methodologies.
USE method (for infrastructure resources – CPU, memory, disk, network):
- Utilization: What percentage of the resource is in use? (node_cpu_seconds_total, node_memory_MemAvailable_bytes)
- Saturation: Is work queuing? (node_load1 vs CPU count, swap usage)
- Errors: Are there error conditions? (node_disk_io_errors, node_network_receive_errs_total)
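A rough mapping of the three USE dimensions onto node-exporter queries (exact metric names can vary with the exporter version):

# Utilization: percentage of CPU time spent non-idle, per instance
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
# Saturation: 1-minute load average relative to the number of CPUs
node_load1 / on (instance) count by (instance) (node_cpu_seconds_total{mode="idle"})
# Errors: network receive errors per second
rate(node_network_receive_errs_total[5m])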
RED method (for request-driven services – APIs, web servers):
- Rate: Requests per second (rate(http_requests_total[5m]))
- Errors: Error rate or ratio (rate(http_requests_total{status_code=~"5.."}[5m]))
- Duration: Latency distribution (histogram_quantile(0.95, ...))
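The RED equivalents for an HTTP service, reusing the metric names from the PromQL examples above:

# Rate: requests per second, per service
sum by (job) (rate(http_requests_total[5m]))
# Errors: fraction of requests that returned a 5xx status
sum by (job) (rate(http_requests_total{status_code=~"5.."}[5m])) / sum by (job) (rate(http_requests_total[5m]))
# Duration: 95th percentile latency, per service
histogram_quantile(0.95, sum by (job, le) (rate(http_request_duration_seconds_bucket[5m])))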
kube-prometheus-stack for Kubernetes
The kube-prometheus-stack Helm chart deploys Prometheus, Grafana, Alertmanager, node-exporter, and kube-state-metrics as a single release. It is the de facto standard way to monitor a Kubernetes cluster.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace \
  --set grafana.adminPassword=admin \
  --set prometheus.prometheusSpec.retention=30d \
  --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=50Gi

This deploys Prometheus scraping the Kubernetes API, kubelet, node-exporter, and kube-state-metrics. Grafana comes with dashboards for cluster health, node resources, and pod workloads. Access Grafana with kubectl port-forward -n monitoring svc/monitoring-grafana 3000:80.
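The same settings can live in a values file, which is easier to review and version than a string of --set flags; a sketch equivalent to the command above:

# values.yaml (equivalent to the --set flags above)
grafana:
  adminPassword: admin
prometheus:
  prometheusSpec:
    retention: 30d
    storageSpec:
      volumeClaimTemplate:
        spec:
          resources:
            requests:
              storage: 50Gi

Install with helm install monitoring prometheus-community/kube-prometheus-stack -n monitoring --create-namespace -f values.yaml.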
To add custom scrape targets, create a ServiceMonitor resource:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp
  namespace: monitoring
  labels:
    release: monitoring   # must match the Helm release label selector
spec:
  namespaceSelector:
    matchNames:
      - default
  selector:
    matchLabels:
      app: myapp
  endpoints:
    - port: metrics
      interval: 15s
      path: /metrics

The Prometheus Operator watches for ServiceMonitor resources and updates the scrape configuration automatically. The release: monitoring label is critical – without it, the Operator ignores the ServiceMonitor.
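A ServiceMonitor selects a Service (not pods directly), so the application also needs a Service whose labels and port name line up with the selector and endpoint above; a sketch with assumed names:

apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: default
  labels:
    app: myapp            # matched by the ServiceMonitor's selector
spec:
  selector:
    app: myapp            # pods backing this Service
  ports:
    - name: metrics       # matched by the ServiceMonitor's "port: metrics"
      port: 8080
      targetPort: 8080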