Why Monitor a POC Cluster#
Monitoring on minikube serves two purposes. First, it catches resource problems early – your app might work in tests but get OOM-killed under load, and you will not know without metrics. Second, it validates that your monitoring configuration works before you deploy it to production. If your ServiceMonitors, dashboards, and alert rules work on minikube, they will work on EKS or GKE.
The Right Chart: kube-prometheus-stack#
There are multiple Prometheus-related Helm charts. Use the right one:
| Chart | Status | Use It? |
|---|---|---|
| prometheus-community/kube-prometheus-stack | Active, maintained | Yes |
| stable/prometheus-operator | Deprecated | No |
| prometheus-community/prometheus | Standalone Prometheus only | Only if you do not want Grafana |
The kube-prometheus-stack bundles Prometheus, Grafana, node-exporter, kube-state-metrics, and the Prometheus Operator. The Operator is what makes ServiceMonitors work – it watches for ServiceMonitor custom resources and automatically configures Prometheus scrape targets.
Namespace Strategy#
Always install monitoring into its own namespace:
kubectl create namespace monitoring

This isolates monitoring resources, makes RBAC cleaner, and lets you delete the entire stack without affecting your application namespaces.
Custom Values for Minikube#
Production defaults for kube-prometheus-stack request too many resources for minikube. Create a monitoring-values.yaml that trims the resource footprint and disables what you do not need:
# monitoring-values.yaml
prometheus:
  prometheusSpec:
    retention: 24h
    resources:
      requests:
        cpu: 200m
        memory: 512Mi
      limits:
        memory: 1Gi
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 5Gi

grafana:
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      memory: 256Mi
  adminPassword: "admin"
  persistence:
    enabled: false

alertmanager:
  enabled: false

# node-exporter and kube-state-metrics are subcharts, so their resource
# settings live under the subchart keys, not nodeExporter / kubeStateMetrics
prometheus-node-exporter:
  resources:
    requests:
      cpu: 50m
      memory: 32Mi

kube-state-metrics:
  resources:
    requests:
      cpu: 50m
      memory: 64Mi

Key decisions in this file:
- Alertmanager disabled – For a POC, you do not need alert routing to Slack or PagerDuty. You will look at Grafana dashboards directly. Enable Alertmanager when you move to production and need automated incident response.
- Prometheus retention set to 24h – Minikube does not need weeks of history. Short retention keeps disk usage low.
- Grafana persistence disabled – Dashboard changes are lost on restart, but minikube clusters are ephemeral anyway. Import dashboards via config rather than manual UI changes (see the ConfigMap sketch after this list).
- Storage enabled for Prometheus – Even on minikube, Prometheus should write to a PVC so it survives pod restarts during your development session.
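One way to do config-driven dashboards is the Grafana sidecar that kube-prometheus-stack enables by default: it loads dashboard JSON from any ConfigMap carrying the grafana_dashboard label. A minimal sketch, assuming the chart's default sidecar settings; the ConfigMap name and the dashboard JSON are placeholders:

# my-app-dashboard.yaml (hypothetical)
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: "1"   # label the Grafana sidecar watches by default
data:
  my-app-dashboard.json: |
    { "title": "My App", "panels": [] }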
Install Command#
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm upgrade --install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  -f monitoring-values.yaml \
  --wait --timeout 300s

Verify everything is running:
kubectl get pods -n monitoring
# Should see: prometheus-monitoring-kube-prometheus-prometheus-0, monitoring-grafana-xxx,
# monitoring-kube-state-metrics-xxx, monitoring-prometheus-node-exporter-xxx

Accessing Grafana#
kubectl port-forward svc/monitoring-grafana -n monitoring 3000:80
# Open http://localhost:3000
# Login: admin / admin

Three Dashboards That Actually Matter#
The kube-prometheus-stack ships with dozens of pre-built dashboards. Focus on these three:
1. Kubernetes / Compute Resources / Cluster – Shows CPU and memory usage across all nodes and namespaces. This is where you spot resource pressure before pods start getting evicted. On minikube, it tells you if you gave the cluster enough resources.
2. Kubernetes / Compute Resources / Pod – Per-pod CPU and memory over time. Use this to right-size your resource requests and limits. If a pod consistently uses 50Mi but you requested 512Mi, you are wasting cluster capacity (a raw-query check follows just below).
3. Kubernetes / Networking / Pod – Network bytes received and transmitted per pod. This reveals unexpected traffic patterns – a background worker suddenly receiving inbound traffic, or an app making far more outbound calls than expected.
Find these in Grafana under Dashboards > Browse > Kubernetes.
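To put raw numbers behind dashboard #2, you can also port-forward Prometheus itself and compare actual usage against your requests. The service name below is what a release named monitoring typically produces (confirm with kubectl get svc -n monitoring), and exact label sets can differ between kube-state-metrics versions:

kubectl port-forward svc/monitoring-kube-prometheus-prometheus -n monitoring 9090:9090
# Open http://localhost:9090 and compare, per pod:
#   container_memory_working_set_bytes{namespace="default", container!=""}
#   kube_pod_container_resource_requests{namespace="default", resource="memory"}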
ServiceMonitors: How Apps Expose Metrics#
A ServiceMonitor tells Prometheus “scrape this Service’s pods on this port and path.” Your application must expose a /metrics endpoint in Prometheus text format.
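For reference, Prometheus text format is just plain lines of metric names, optional labels, and values, roughly like this (the metric name here is made up; most client libraries generate this output for you):

# HELP http_requests_total Total HTTP requests handled.
# TYPE http_requests_total counter
http_requests_total{method="GET",path="/healthz"} 1027

The ServiceMonitor below points Prometheus at that endpoint: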
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  namespace: monitoring
  labels:
    release: monitoring
spec:
  namespaceSelector:
    matchNames:
      - default
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: http
      path: /metrics
      interval: 15s

Critical detail: the release: monitoring label must match the Helm release name. The Prometheus Operator uses this label to discover which ServiceMonitors to load. If your monitors are not being picked up, this label mismatch is almost always the reason.
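Two more things have to line up before metrics appear. First, you can confirm which label the Operator actually expects by inspecting the Prometheus custom resource (the jsonpath below is a sketch; the relevant field is spec.serviceMonitorSelector):

kubectl get prometheus -n monitoring -o jsonpath='{.items[0].spec.serviceMonitorSelector}'

Second, there must be a Service whose labels match the selector and whose port name matches the endpoint. A minimal, hypothetical Service for the ServiceMonitor above, assuming your pods are labeled app: my-app and listen on 8080:

apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: default
  labels:
    app: my-app          # matched by the ServiceMonitor's selector
spec:
  selector:
    app: my-app          # pods backing the Service
  ports:
    - name: http         # must match the port name in the ServiceMonitor endpoint
      port: 80
      targetPort: 8080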
From POC to Production#
When you move this stack to a real cluster, change three things:
- Enable Alertmanager and configure receivers (Slack, PagerDuty, email).
- Increase Prometheus retention to 15-30 days, or add Thanos/Cortex for long-term storage.
- Enable Grafana persistence and use provisioned dashboards from ConfigMaps so they survive pod restarts.
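A rough sketch of those three changes expressed as Helm values overrides; the retention, storage size, and receiver wiring are placeholders to tune for your environment:

# production-values.yaml (hypothetical overrides layered on top of the POC values)
alertmanager:
  enabled: true           # then fill in alertmanager.config with your route and receivers
prometheus:
  prometheusSpec:
    retention: 15d
grafana:
  persistence:
    enabled: true
    size: 10Gi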
The ServiceMonitors, scrape configs, and dashboards you built on minikube carry over unchanged.