Why Monitor a POC Cluster#

Monitoring on minikube serves two purposes. First, it catches resource problems early – your app might pass its tests but get OOM-killed under load, and you will not know without metrics. Second, it validates that your monitoring configuration works before you deploy it to production. If your ServiceMonitors, dashboards, and alert rules work on minikube, they will work on EKS or GKE.

The Right Chart: kube-prometheus-stack#

There are multiple Prometheus-related Helm charts. Use the right one:

Chart                                         Status                       Use It?
prometheus-community/kube-prometheus-stack    Active, maintained           Yes
stable/prometheus-operator                    Deprecated                   No
prometheus-community/prometheus               Standalone Prometheus only   Only if you do not want Grafana

The kube-prometheus-stack bundles Prometheus, Grafana, node-exporter, kube-state-metrics, and the Prometheus Operator. The Operator is what makes ServiceMonitors work – it watches for ServiceMonitor custom resources and automatically configures Prometheus scrape targets.
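
Once the stack is installed (the install command comes later in this guide), you can see the custom resource definitions the Operator registers. A quick check, assuming the standard monitoring.coreos.com CRD group:

kubectl get crds | grep monitoring.coreos.com
# Expect entries such as servicemonitors.monitoring.coreos.com,
#   prometheusrules.monitoring.coreos.com, prometheuses.monitoring.coreos.com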

Namespace Strategy#

Always install monitoring into its own namespace:

kubectl create namespace monitoring

This isolates monitoring resources, makes RBAC cleaner, and lets you delete the entire stack without affecting your application namespaces.
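
As an example of that isolation, tearing the stack down later is two commands. One caveat: Helm does not remove CRDs on uninstall, so the Operator's cluster-scoped CRDs stay behind unless you delete them yourself.

helm uninstall monitoring -n monitoring
kubectl delete namespace monitoring
# The monitoring.coreos.com CRDs are cluster-scoped and survive this;
#   delete them manually if you want a fully clean slate.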

Custom Values for Minikube#

Production defaults for kube-prometheus-stack request too many resources for minikube. Create a monitoring-values.yaml that shrinks the resource footprint and disables what you do not need:

# monitoring-values.yaml
prometheus:
  prometheusSpec:
    retention: 24h
    resources:
      requests:
        cpu: 200m
        memory: 512Mi
      limits:
        memory: 1Gi
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 5Gi

grafana:
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      memory: 256Mi
  adminPassword: "admin"
  persistence:
    enabled: false

alertmanager:
  enabled: false

# node-exporter and kube-state-metrics are subcharts; their resources are set
# under the subchart keys, not under the top-level nodeExporter/kubeStateMetrics toggles
prometheus-node-exporter:
  resources:
    requests:
      cpu: 50m
      memory: 32Mi

kube-state-metrics:
  resources:
    requests:
      cpu: 50m
      memory: 64Mi

Key decisions in this file:

  • Alertmanager disabled – For a POC, you do not need alert routing to Slack or PagerDuty. You will look at Grafana dashboards directly. Enable Alertmanager when you move to production and need automated incident response.
  • Prometheus retention set to 24h – Minikube does not need weeks of history. Short retention keeps disk usage low.
  • Grafana persistence disabled – Dashboard changes are lost on restart, but minikube clusters are ephemeral anyway. Import dashboards via config rather than manual UI changes (see the ConfigMap sketch after this list).
  • Storage enabled for Prometheus – Even on minikube, Prometheus should write to a PVC so it survives pod restarts during your development session.
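
One way to do "dashboards via config": the bundled Grafana chart runs a dashboard sidecar that, with its default settings, loads any ConfigMap labeled grafana_dashboard: "1" in the monitoring namespace. A minimal sketch – the names and the stub JSON are placeholders; paste a real exported dashboard into the data field:

# my-app-dashboard.yaml (illustrative name)
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: "1"   # default label the Grafana dashboard sidecar watches for
data:
  my-app.json: |
    {"title": "My App", "panels": []}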

Install Command#

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

helm upgrade --install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  -f monitoring-values.yaml \
  --wait --timeout 300s

Verify everything is running:

kubectl get pods -n monitoring
# Should see: prometheus-monitoring-kube-prometheus-prometheus-0, monitoring-grafana-xxx,
#   monitoring-kube-state-metrics-xxx, monitoring-prometheus-node-exporter-xxx
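
If you kept the Prometheus storageSpec from the values file above, also confirm the volume claim bound – on minikube the default StorageClass normally handles this automatically:

kubectl get pvc -n monitoring
# The Prometheus claim should show STATUS Bound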

Accessing Grafana#

kubectl port-forward svc/monitoring-grafana -n monitoring 3000:80
# Open http://localhost:3000
# Login: admin / admin

Three Dashboards That Actually Matter#

The kube-prometheus-stack ships with dozens of pre-built dashboards. Focus on these three:

1. Kubernetes / Compute Resources / Cluster – Shows CPU and memory usage across all nodes and namespaces. This is where you spot resource pressure before pods start getting evicted. On minikube, it tells you if you gave the cluster enough resources.

2. Kubernetes / Compute Resources / Pod – Per-pod CPU and memory over time. Use this to right-size your resource requests and limits. If a pod consistently uses 50Mi but you requested 512Mi, you are wasting cluster capacity.

3. Kubernetes / Networking / Pod – Network bytes received and transmitted per pod. This reveals unexpected traffic patterns – a background worker suddenly receiving inbound traffic, or an app making far more outbound calls than expected.

Find these in Grafana under Dashboards > Browse > Kubernetes.

ServiceMonitors: How Apps Expose Metrics#

A ServiceMonitor tells Prometheus “scrape this Service’s pods on this port and path.” Your application must expose a /metrics endpoint in Prometheus text format.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  namespace: monitoring
  labels:
    release: monitoring
spec:
  namespaceSelector:
    matchNames:
      - default
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: http
      path: /metrics
      interval: 15s
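
A detail worth spelling out: the selector matches labels on the Service, not on the pods, and port: http refers to a named port on that Service. A sketch of a Service this ServiceMonitor would pick up – my-app and the port numbers are illustrative:

apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: default
  labels:
    app: my-app            # matched by the ServiceMonitor's selector
spec:
  selector:
    app: my-app            # routes traffic to the app's pods
  ports:
    - name: http           # the ServiceMonitor scrapes this named port
      port: 80
      targetPort: 8080     # assumed container port serving /metrics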

Critical detail: the release: monitoring label must match the Helm release name. The Prometheus Operator uses this label to discover which ServiceMonitors to load. If your monitors are not being picked up, this label mismatch is almost always the reason.
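
To confirm the target is actually being scraped, check that the ServiceMonitor exists and look at Prometheus's Targets page. prometheus-operated is the headless Service the Operator creates in front of the Prometheus pods, so it works regardless of release name:

kubectl get servicemonitors -n monitoring

kubectl port-forward svc/prometheus-operated -n monitoring 9090:9090
# Open http://localhost:9090/targets and look for the my-app endpoints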

From POC to Production#

When you move this stack to a real cluster, change three things:

  1. Enable Alertmanager and configure receivers (Slack, PagerDuty, email).
  2. Increase Prometheus retention to 15-30 days, or add Thanos/Cortex for long-term storage.
  3. Enable Grafana persistence and use provisioned dashboards from ConfigMaps so they survive pod restarts.

The ServiceMonitors, scrape configs, and dashboards you built on minikube carry over unchanged.
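
For reference, those three changes map onto the values file roughly like this – a sketch, with retention and storage sizes as placeholders to tune for your environment:

# production-values.yaml (sketch)
alertmanager:
  enabled: true
  # define receivers (Slack, PagerDuty, email) under alertmanager.config

prometheus:
  prometheusSpec:
    retention: 15d
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 100Gi   # placeholder size

grafana:
  persistence:
    enabled: true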