Knative: Serverless on Kubernetes#
Knative brings serverless capabilities to any Kubernetes cluster. Unlike managed serverless platforms, you own the cluster – Knative layers scale-to-zero autoscaling, revision-based deployments, and event-driven invocation on top of standard Kubernetes primitives. This gives you the serverless developer experience without vendor lock-in.
Knative has two independent components: Serving (request-driven compute that scales to zero) and Eventing (event routing and delivery). You can install either or both.
Installing Knative#
Knative requires a networking layer. The two primary options are Istio (full service mesh, heavier) and Kourier (lightweight, Knative-specific).
# Install Knative Serving CRDs and core
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.14.0/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.14.0/serving-core.yaml
# Install Kourier as the networking layer
kubectl apply -f https://github.com/knative/net-kourier/releases/download/knative-v1.14.0/kourier.yaml
# Configure Knative to use Kourier
kubectl patch configmap/config-network \
--namespace knative-serving \
--type merge \
--patch '{"data":{"ingress-class":"kourier.ingress.networking.knative.dev"}}'
# Install Knative Eventing
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.14.0/eventing-crds.yaml
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.14.0/eventing-core.yaml
Verify the installation:
kubectl get pods -n knative-serving
kubectl get pods -n knative-eventing
Knative Serving#
Services#
A Knative Service is the primary resource. It manages the full lifecycle: creating a Configuration, which creates Revisions, which are backed by Kubernetes Deployments. You work with the Service; Knative handles everything underneath.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
name: hello-app
namespace: default
spec:
template:
metadata:
annotations:
autoscaling.knative.dev/minScale: "0"
autoscaling.knative.dev/maxScale: "10"
autoscaling.knative.dev/target: "100"
spec:
containers:
- image: gcr.io/my-project/hello-app:v1
ports:
- containerPort: 8080
env:
- name: LOG_LEVEL
value: "info"
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 256Mi
readinessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 3
Apply it and Knative creates everything:
kubectl apply -f hello-app.yaml
# Check the service
kubectl get ksvc hello-app
# Output includes the URL
# NAME URL LATESTCREATED LATESTREADY READY
# hello-app http://hello-app.default.example.com hello-app-00001 hello-app-00001 True
Revisions#
Every change to the Service template creates a new Revision. Revisions are immutable snapshots of your configuration – the container image, environment variables, resource limits, and scaling annotations at the time of creation.
# List revisions
kubectl get revisions
# NAME CONFIG NAME GENERATION READY
# hello-app-00001 hello-app 1 True
# hello-app-00002 hello-app 2 True
Old revisions are not deleted automatically. They remain available for traffic splitting and rollback. To clean them up, delete them explicitly or configure a retention policy.
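A sketch of both options follows; the config-gc key names shown here follow recent Knative releases, so check the _example block in the ConfigMap for your version:
# Delete an old revision explicitly
kubectl delete revision hello-app-00001
# Or set cluster-wide retention in the garbage-collection ConfigMap
kubectl edit configmap config-gc -n knative-serving
data:
  retain-since-create-time: "48h"        # keep revisions at least 48h after creation
  retain-since-last-active-time: "15h"   # keep revisions 15h after they last received traffic
  max-non-active-revisions: "5"          # cap how many inactive revisions are kept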
Routes and Traffic Splitting#
Routes determine which revisions receive traffic and in what proportion. By default, 100% of traffic goes to the latest ready revision. You can split traffic across revisions for canary deployments or gradual rollouts.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
name: hello-app
spec:
template:
metadata:
name: hello-app-v2
spec:
containers:
- image: gcr.io/my-project/hello-app:v2
traffic:
- revisionName: hello-app-v1
percent: 80
- revisionName: hello-app-v2
percent: 20
This sends 80% of traffic to v1 and 20% to v2. To promote v2 fully:
traffic:
- revisionName: hello-app-v2
percent: 100
Tag-based routing gives named URLs to specific revisions for testing before routing real traffic:
traffic:
- revisionName: hello-app-v2
percent: 0
tag: staging
- revisionName: hello-app-v1
percent: 100
tag: current
This creates staging-hello-app.default.example.com pointing to v2 with zero production traffic. QA can validate the new version at that URL, then you adjust percentages to roll it out.
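For example, you can exercise the tagged revision directly before shifting any production traffic (the /healthz path here is just the readiness endpoint from the earlier example):
curl http://staging-hello-app.default.example.com/healthz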
Autoscaling#
Knative uses the Knative Pod Autoscaler (KPA) by default, which supports scale-to-zero. The alternative is the Kubernetes Horizontal Pod Autoscaler (HPA), which does not scale to zero but supports CPU and memory-based scaling.
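If you need CPU-based scaling instead, the annotations look roughly like this; the sketch assumes the optional HPA autoscaler extension (serving-hpa.yaml from the Serving release) is installed:
metadata:
  annotations:
    autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev"
    autoscaling.knative.dev/metric: "cpu"
    autoscaling.knative.dev/target: "80"    # target 80% of requested CPU
    autoscaling.knative.dev/minScale: "1"   # HPA cannot scale to zero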
Key autoscaling annotations:
metadata:
annotations:
# Autoscaler class: kpa.autoscaling.knative.dev (default) or hpa.autoscaling.knative.dev
autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"
# Target concurrent requests per pod (default: 100)
autoscaling.knative.dev/target: "50"
# Metric: concurrency (default) or rps (requests per second)
autoscaling.knative.dev/metric: "concurrency"
# Scale bounds
autoscaling.knative.dev/minScale: "0"
autoscaling.knative.dev/maxScale: "20"
# Minimum time to keep the last pod after the autoscaler decides to scale to zero
autoscaling.knative.dev/scale-to-zero-pod-retention-period: "60s"
# Scale-down delay (prevents flapping)
autoscaling.knative.dev/scale-down-delay: "30s"
# Initial number of pods when a new revision is created
autoscaling.knative.dev/initial-scale: "1"
Scale-to-zero behavior: When no requests arrive for the configured grace period, Knative scales the deployment to zero replicas. The next request hits the activator (a Knative component that buffers requests), which triggers pod creation. The first request after scale-to-zero experiences a cold start – container pull, startup, and readiness probe passing.
To reduce cold start impact, set minScale: "1" for latency-critical services so at least one pod stays warm, or use initial-scale to start each new revision with several pods already running, as sketched below.
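For a latency-critical service, the relevant annotations look like this (the values are illustrative):
metadata:
  annotations:
    autoscaling.knative.dev/minScale: "1"       # never scale below one warm pod
    autoscaling.knative.dev/initial-scale: "3"  # start each new revision with three pods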
Global Autoscaling Configuration#
Cluster-wide defaults are set in the config-autoscaler ConfigMap:
kubectl edit configmap config-autoscaler -n knative-serving
data:
container-concurrency-target-default: "100"
enable-scale-to-zero: "true"
scale-to-zero-grace-period: "30s"
scale-to-zero-pod-retention-period: "0s"
stable-window: "60s"
panic-window-percentage: "10"
panic-threshold-percentage: "200"
The panic mode kicks in when traffic suddenly spikes. If observed concurrency exceeds the panic threshold (2x the target by default) within the panic window, Knative scales aggressively using a shorter observation window to react faster.
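For example, with the defaults above: the target is 100 concurrent requests per pod, the stable window is 60 seconds, and the panic window is 6 seconds (10% of 60s). If average concurrency over the last 6 seconds exceeds 200 per pod (200% of the target), the autoscaler switches to panic mode and sizes the deployment from the 6-second window rather than the 60-second one; while panicking it will not scale down.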
Knative Eventing#
Eventing provides infrastructure for routing events from producers to consumers. The core abstractions are Sources (where events come from), Brokers (event routing hubs), and Triggers (subscriptions that filter and deliver events).
Brokers and Triggers#
A Broker is a named event bus within a namespace. Triggers filter events from the broker and route them to subscribers.
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
name: default
namespace: my-app
annotations:
eventing.knative.dev/broker.class: MTChannelBasedBroker
Create triggers that subscribe to specific event types:
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
name: order-processor
namespace: my-app
spec:
broker: default
filter:
attributes:
type: com.myapp.order.created
source: /orders/api
subscriber:
ref:
apiVersion: serving.knative.dev/v1
kind: Service
name: order-processor
---
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
name: notification-sender
namespace: my-app
spec:
broker: default
filter:
attributes:
type: com.myapp.order.created
subscriber:
ref:
apiVersion: serving.knative.dev/v1
kind: Service
name: notification-sender
Both triggers fire on the same event type. The order-processor trigger additionally filters on the source attribute. Events are delivered as HTTP POST requests in CloudEvents format.
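To test delivery end to end, you can POST a CloudEvent to the broker ingress from inside the cluster. The address below is the standard ingress for the MT channel-based broker; the event payload is illustrative:
# Run from a pod inside the cluster, e.g.:
# kubectl run curl --rm -it --image=curlimages/curl -- sh
curl -v "http://broker-ingress.knative-eventing.svc.cluster.local/my-app/default" \
  -X POST \
  -H "Ce-Id: test-0001" \
  -H "Ce-Specversion: 1.0" \
  -H "Ce-Type: com.myapp.order.created" \
  -H "Ce-Source: /orders/api" \
  -H "Content-Type: application/json" \
  -d '{"orderId": "1234"}'
An event like this matches both triggers: the type matches both filters, and the source matches the extra filter on order-processor.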
Event Sources#
Sources produce events and send them to a sink (typically a Broker or a Knative Service).
PingSource (cron-based):
apiVersion: sources.knative.dev/v1
kind: PingSource
metadata:
name: hourly-cleanup
spec:
schedule: "0 * * * *"
contentType: "application/json"
data: '{"action": "cleanup", "target": "expired-sessions"}'
sink:
ref:
apiVersion: eventing.knative.dev/v1
kind: Broker
name: default
ApiServerSource (Kubernetes events):
apiVersion: sources.knative.dev/v1
kind: ApiServerSource
metadata:
name: pod-events
spec:
serviceAccountName: pod-watcher
mode: Resource
resources:
- apiVersion: v1
kind: Pod
sink:
ref:
apiVersion: serving.knative.dev/v1
kind: Service
name: pod-event-handler
KafkaSource (consume from Kafka topics; requires the Knative Kafka source component, installed separately from eventing-core):
apiVersion: sources.knative.dev/v1beta1
kind: KafkaSource
metadata:
name: order-events
spec:
consumerGroup: knative-consumer
bootstrapServers:
- kafka-bootstrap.kafka:9092
topics:
- orders
sink:
ref:
apiVersion: eventing.knative.dev/v1
kind: Broker
name: default
Custom Domains#
By default, Knative generates URLs using {service}.{namespace}.{domain}. Configure your custom domain in the config-domain ConfigMap:
kubectl edit configmap config-domain -n knative-serving
data:
myapp.example.com: ""
This makes all services in the cluster available under myapp.example.com. For per-namespace or per-service overrides:
data:
myapp.example.com: |
selector:
app: production
Only services with the label app: production use this domain. Configure your DNS to point *.myapp.example.com to the Kourier or Istio ingress gateway’s external IP.
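With Kourier, that ingress is exposed as a LoadBalancer Service named kourier in the kourier-system namespace:
kubectl get svc kourier -n kourier-system
# Point the *.myapp.example.com wildcard DNS record at the EXTERNAL-IP shown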
For HTTPS, install cert-manager with an appropriate ClusterIssuer along with the Knative cert-manager integration, then set auto-tls: Enabled in the config-network ConfigMap. Knative will then provision TLS certificates for your services automatically.
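A sketch of enabling it, assuming cert-manager, a ClusterIssuer, and the cert-manager integration are already in place, and using the auto-tls key named above:
kubectl patch configmap/config-network \
  --namespace knative-serving \
  --type merge \
  --patch '{"data":{"auto-tls":"Enabled","http-protocol":"Redirected"}}'
Setting http-protocol to Redirected also redirects plain HTTP requests to HTTPS; omit it to keep serving HTTP alongside HTTPS.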
Practical Considerations#
When Knative makes sense: You have a Kubernetes cluster already. You want serverless scaling semantics (especially scale-to-zero) for some workloads. You need event-driven architecture with CloudEvents compatibility. You want to avoid vendor lock-in to a specific cloud’s serverless platform.
When Knative adds unnecessary complexity: You are running a single cloud provider and Lambda or Cloud Run meets your needs. You do not need scale-to-zero. Your team does not have Kubernetes expertise. The operational overhead of running Knative (CRDs, controllers, networking layer) is not justified by the workload.
Debugging tips:
# Check the Knative service status
kubectl get ksvc hello-app -o yaml | grep -A 20 status:
# Check the underlying pods
kubectl get pods -l serving.knative.dev/service=hello-app
# View activator logs (useful for scale-from-zero issues)
kubectl logs -n knative-serving -l app=activator -c activator
# View autoscaler logs
kubectl logs -n knative-serving -l app=autoscaler
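It also helps to inspect the intermediate resources the Service owns, since their status conditions usually pinpoint which layer is failing:
# Check the Configuration and Route created by the Service
kubectl get configuration,route hello-app
# Check revision readiness and conditions
kubectl get revisions -l serving.knative.dev/service=hello-app
kubectl describe revision hello-app-00001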