Knative: Serverless on Kubernetes#

Knative brings serverless capabilities to any Kubernetes cluster. Unlike managed serverless platforms, you own the cluster – Knative adds autoscaling to zero, revision-based deployments, and event-driven invocation on top of standard Kubernetes primitives. This gives you the serverless developer experience without vendor lock-in.

Knative has two independent components: Serving (request-driven compute that scales to zero) and Eventing (event routing and delivery). You can install either or both.

Installing Knative#

Knative requires a networking layer. The two primary options are Istio (full service mesh, heavier) and Kourier (lightweight, Knative-specific).

# Install Knative Serving CRDs and core
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.14.0/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.14.0/serving-core.yaml

# Install Kourier as the networking layer
kubectl apply -f https://github.com/knative/net-kourier/releases/download/knative-v1.14.0/kourier.yaml

# Configure Knative to use Kourier
kubectl patch configmap/config-network \
  --namespace knative-serving \
  --type merge \
  --patch '{"data":{"ingress-class":"kourier.ingress.networking.knative.dev"}}'

# Install Knative Eventing
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.14.0/eventing-crds.yaml
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.14.0/eventing-core.yaml

Verify the installation:

kubectl get pods -n knative-serving
kubectl get pods -n knative-eventing
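
All pods in both namespaces should reach Ready before you continue. One convenient way to block until they do:

kubectl wait pods --all --for=condition=Ready -n knative-serving --timeout=300s
kubectl wait pods --all --for=condition=Ready -n knative-eventing --timeout=300s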

Knative Serving#

Services#

A Knative Service is the primary resource. It manages the full lifecycle: creating a Configuration, which creates Revisions, which are backed by Kubernetes Deployments. You work with the Service; Knative handles everything underneath.

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello-app
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "0"
        autoscaling.knative.dev/maxScale: "10"
        autoscaling.knative.dev/target: "100"
    spec:
      containers:
      - image: gcr.io/my-project/hello-app:v1
        ports:
        - containerPort: 8080
        env:
        - name: LOG_LEVEL
          value: "info"
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 3

Apply it and Knative creates everything:

kubectl apply -f hello-app.yaml

# Check the service
kubectl get ksvc hello-app

# Output includes the URL
# NAME        URL                                    LATESTCREATED     LATESTREADY       READY
# hello-app   http://hello-app.default.example.com   hello-app-00001   hello-app-00001   True
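
If DNS for the domain is not set up yet, you can still exercise the service by pointing the Host header at the Kourier gateway directly (a sketch; assumes a LoadBalancer-backed cluster):

# Find the gateway's external address
kubectl get svc kourier -n kourier-system

# Call the service through the gateway, supplying the Knative host
curl -H "Host: hello-app.default.example.com" http://<EXTERNAL-IP>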

Revisions#

Every change to the Service template creates a new Revision. Revisions are immutable snapshots of your configuration – the container image, environment variables, resource limits, and scaling annotations at the time of creation.

# List revisions
kubectl get revisions

# NAME              CONFIG NAME   GENERATION   READY
# hello-app-00001   hello-app     1            True
# hello-app-00002   hello-app     2            True

Old revisions are not deleted automatically. They remain available for traffic splitting and rollback. To clean them up, delete them explicitly or configure a retention policy.
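
Retention is controlled by the config-gc ConfigMap in knative-serving. A sketch with illustrative values (key names as in recent releases):

kubectl edit configmap config-gc -n knative-serving
data:
  retain-since-create-time: "48h"
  retain-since-last-active-time: "48h"
  min-non-active-revisions: "5"
  max-non-active-revisions: "50"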

Routes and Traffic Splitting#

Routes determine which revisions receive traffic and in what proportion. By default, 100% of traffic goes to the latest ready revision. You can split traffic across revisions for canary deployments or gradual rollouts.

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello-app
spec:
  template:
    metadata:
      name: hello-app-v2
    spec:
      containers:
      - image: gcr.io/my-project/hello-app:v2
  traffic:
  - revisionName: hello-app-v1
    percent: 80
  - revisionName: hello-app-v2
    percent: 20

This sends 80% of traffic to v1 and 20% to v2. These names resolve because template.metadata.name was set explicitly when each revision was created; without it, Knative generates names like hello-app-00002, which work in traffic blocks just the same. To promote v2 fully:

  traffic:
  - revisionName: hello-app-v2
    percent: 100
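
If you would rather follow whatever revision is newest and ready instead of pinning a name, traffic targets also accept latestRevision:

  traffic:
  - latestRevision: true
    percent: 100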

Tag-based routing gives named URLs to specific revisions for testing before routing real traffic:

  traffic:
  - revisionName: hello-app-v2
    percent: 0
    tag: staging
  - revisionName: hello-app-v1
    percent: 100
    tag: current

This creates staging-hello-app.default.example.com pointing to v2 with zero production traffic. QA can validate the new version at that URL, then you adjust percentages to roll it out.
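
If you use the optional kn CLI, the same split can be driven imperatively – roughly:

# Shift 20% of traffic to v2
kn service update hello-app --traffic hello-app-v1=80 --traffic hello-app-v2=20

# Promote once validated
kn service update hello-app --traffic hello-app-v2=100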

Autoscaling#

Knative uses the Knative Pod Autoscaler (KPA) by default, which supports scale-to-zero. The alternative is the Kubernetes Horizontal Pod Autoscaler (HPA), which does not scale to zero but supports CPU and memory-based scaling.

Key autoscaling annotations:

metadata:
  annotations:
    # Autoscaler class: kpa.autoscaling.knative.dev (default) or hpa.autoscaling.knative.dev
    autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"

    # Target concurrent requests per pod (default: 100)
    autoscaling.knative.dev/target: "50"

    # Metric: concurrency (default) or rps (requests per second)
    autoscaling.knative.dev/metric: "concurrency"

    # Scale bounds
    autoscaling.knative.dev/minScale: "0"
    autoscaling.knative.dev/maxScale: "20"

    # Minimum time the last pod is kept around after the autoscaler
    # decides to scale to zero (the grace period itself is cluster-wide, see below)
    autoscaling.knative.dev/scale-to-zero-pod-retention-period: "60s"

    # Scale-down delay (prevents flapping)
    autoscaling.knative.dev/scale-down-delay: "30s"

    # Initial scale when the revision is first created (does not apply
    # when scaling back up from zero)
    autoscaling.knative.dev/initial-scale: "1"

Scale-to-zero behavior: When no requests arrive for the configured grace period, Knative scales the deployment to zero replicas. The next request hits the activator (a Knative component that buffers requests), which triggers pod creation. The first request after scale-to-zero experiences a cold start – container pull, startup, and readiness probe passing.

To reduce cold start impact, set minScale: "1" for latency-critical services so at least one pod is always warm, or lengthen scale-down-delay and the pod retention period so short idle gaps do not trigger a scale-down in the first place.
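
Scale-to-zero is easy to observe: leave the service idle past the grace period and its pods terminate; send a request and the activator brings one back:

kubectl get pods -l serving.knative.dev/service=hello-app -w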

Global Autoscaling Configuration#

Cluster-wide defaults are set in the config-autoscaler ConfigMap:

kubectl edit configmap config-autoscaler -n knative-serving
data:
  container-concurrency-target-default: "100"
  enable-scale-to-zero: "true"
  scale-to-zero-grace-period: "30s"
  scale-to-zero-pod-retention-period: "0s"
  stable-window: "60s"
  panic-window-percentage: "10"
  panic-threshold-percentage: "200"

The panic mode kicks in when traffic suddenly spikes. If observed concurrency exceeds the panic threshold (2x the target by default) within the panic window, Knative scales aggressively using a shorter observation window to react faster.
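To make that concrete: with the defaults above, the panic window is 10% of the 60-second stable window, i.e. 6 seconds. If two pods with a concurrency target of 100 suddenly see 500 in-flight requests within that window, demand is five pods' worth against two pods of capacity – over the 200% threshold – so the autoscaler panics and jumps straight to 5 pods instead of waiting for the 60-second average to justify it.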

Knative Eventing#

Eventing provides infrastructure for routing events from producers to consumers. The core abstractions are Sources (where events come from), Brokers (event routing hubs), and Triggers (subscriptions that filter and deliver events).

Brokers and Triggers#

A Broker is a named event bus within a namespace. Triggers filter events from the broker and route them to subscribers.

apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: default
  namespace: my-app
  annotations:
    eventing.knative.dev/broker.class: MTChannelBasedBroker

Create triggers that subscribe to specific event types:

apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: order-processor
  namespace: my-app
spec:
  broker: default
  filter:
    attributes:
      type: com.myapp.order.created
      source: /orders/api
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: order-processor
---
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: notification-sender
  namespace: my-app
spec:
  broker: default
  filter:
    attributes:
      type: com.myapp.order.created
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: notification-sender

Both triggers fire on the same event type. The order-processor trigger additionally filters on the source attribute. Events are delivered as HTTP POST requests in CloudEvents format.
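
To test delivery without a real producer, POST a CloudEvent to the broker's ingress. For the channel-based broker the URL follows http://broker-ingress.knative-eventing.svc.cluster.local/{namespace}/{broker} (it is also reported in the Broker's status); the payload below is illustrative:

# The ingress is cluster-local, so send from inside the cluster
kubectl run curl-test --rm -i --restart=Never --image=curlimages/curl -- \
  -v "http://broker-ingress.knative-eventing.svc.cluster.local/my-app/default" \
  -H "Ce-Id: test-1" \
  -H "Ce-Specversion: 1.0" \
  -H "Ce-Type: com.myapp.order.created" \
  -H "Ce-Source: /orders/api" \
  -H "Content-Type: application/json" \
  -d '{"orderId": "1234"}'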

Event Sources#

Sources produce events and send them to a sink (typically a Broker or a Knative Service).

PingSource (cron-based):

apiVersion: sources.knative.dev/v1
kind: PingSource
metadata:
  name: hourly-cleanup
spec:
  schedule: "0 * * * *"
  contentType: "application/json"
  data: '{"action": "cleanup", "target": "expired-sessions"}'
  sink:
    ref:
      apiVersion: eventing.knative.dev/v1
      kind: Broker
      name: default

ApiServerSource (Kubernetes events):

apiVersion: sources.knative.dev/v1
kind: ApiServerSource
metadata:
  name: pod-events
spec:
  serviceAccountName: pod-watcher
  mode: Resource
  resources:
  - apiVersion: v1
    kind: Pod
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: pod-event-handler
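
The pod-watcher service account must be able to watch the resources the source lists, or the source will never become ready. A minimal RBAC sketch:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: pod-watcher
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-watcher
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-watcher
subjects:
- kind: ServiceAccount
  name: pod-watcher
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-watcher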

KafkaSource (consume from Kafka topics):

apiVersion: sources.knative.dev/v1beta1
kind: KafkaSource
metadata:
  name: order-events
spec:
  consumerGroup: knative-consumer
  bootstrapServers:
  - kafka-bootstrap.kafka:9092
  topics:
  - orders
  sink:
    ref:
      apiVersion: eventing.knative.dev/v1
      kind: Broker
      name: default
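
Unlike PingSource and ApiServerSource, KafkaSource is not part of eventing-core. It ships from the eventing-kafka-broker project and is installed separately – the commands below reflect the v1.14 line; check the project's releases page for your version:

kubectl apply -f https://github.com/knative-extensions/eventing-kafka-broker/releases/download/knative-v1.14.0/eventing-kafka-controller.yaml
kubectl apply -f https://github.com/knative-extensions/eventing-kafka-broker/releases/download/knative-v1.14.0/eventing-kafka-source.yaml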

Custom Domains#

By default, Knative generates URLs using {service}.{namespace}.{domain}. Configure your custom domain in the config-domain ConfigMap:

kubectl edit configmap config-domain -n knative-serving
data:
  myapp.example.com: ""

This makes all services in the cluster available under myapp.example.com. For per-namespace or per-service overrides:

data:
  myapp.example.com: |
    selector:
      app: production

Only services with the label app: production use this domain. Configure your DNS to point *.myapp.example.com to the Kourier or Istio ingress gateway’s external IP.
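
Once the domain is configured, service URLs update to match; confirm with:

kubectl get ksvc hello-app
# NAME        URL                                          READY
# hello-app   http://hello-app.default.myapp.example.com   True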

For HTTPS, enable automatic TLS in the config-network ConfigMap (the key is auto-tls in older releases and external-domain-tls in newer ones) and install cert-manager with an appropriate ClusterIssuer, together with Knative's net-certmanager integration. Knative will then provision TLS certificates for your services automatically.
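
A sketch of the patch, using the older key name – substitute external-domain-tls on recent versions:

kubectl patch configmap/config-network \
  --namespace knative-serving \
  --type merge \
  --patch '{"data":{"auto-tls":"Enabled","http-protocol":"Redirected"}}'

Setting http-protocol to Redirected additionally sends plain-HTTP callers a redirect to HTTPS.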

Practical Considerations#

When Knative makes sense: You have a Kubernetes cluster already. You want serverless scaling semantics (especially scale-to-zero) for some workloads. You need event-driven architecture with CloudEvents compatibility. You want to avoid vendor lock-in to a specific cloud’s serverless platform.

When Knative adds unnecessary complexity: You are running a single cloud provider and Lambda or Cloud Run meets your needs. You do not need scale-to-zero. Your team does not have Kubernetes expertise. The operational overhead of running Knative (CRDs, controllers, networking layer) is not justified by the workload.

Debugging tips:

# Check the Knative service status
kubectl get ksvc hello-app -o yaml | grep -A 20 status:

# Check the underlying pods
kubectl get pods -l serving.knative.dev/service=hello-app

# View activator logs (useful for scale-from-zero issues)
kubectl logs -n knative-serving -l app=activator -c activator

# View autoscaler logs
kubectl logs -n knative-serving -l app=autoscaler