Cilium Deep Dive#

Cilium replaces the traditional Kubernetes networking stack with eBPF programs that run directly in the Linux kernel. Instead of kube-proxy translating Service definitions into iptables rules and a traditional CNI plugin managing pod networking through bridge interfaces and routing tables, Cilium attaches eBPF programs to kernel hooks that process packets at wire speed. The result is a networking layer that is faster at scale, can enforce policy at Layer 7, and provides built-in observability without application instrumentation.

What eBPF Changes#

Traditional Kubernetes networking relies on iptables for Service routing (kube-proxy) and various bridge/routing configurations for pod-to-pod communication (CNI plugin). This works, but iptables has a fundamental scaling problem: rules are evaluated linearly. A cluster with 5,000 Services produces tens of thousands of iptables rules, and every packet traverses the chain until a rule matches. At scale, this adds measurable latency and makes rule updates slow (iptables locks the entire table during updates).

eBPF (extended Berkeley Packet Filter) lets small programs attach to kernel hooks – network device ingress and egress, socket operations, system calls. The kernel verifier checks each program for safety before loading it (no unbounded loops, no out-of-bounds memory access). After verification, programs are JIT-compiled to native machine code and run at near-native speed.

For Kubernetes networking, this means:

  • Service routing is handled by eBPF programs that look up backends in eBPF maps (hash tables) instead of traversing iptables chains. A map lookup is O(1) regardless of the number of Services (see the inspection example after this list).
  • Connection tracking uses eBPF maps instead of the kernel conntrack table. Cilium manages its own connection state, avoiding conntrack table exhaustion under high connection rates.
  • Packet processing happens earlier in the kernel network stack. Cilium can short-circuit packets between pods on the same node without them traversing the full TCP/IP stack.
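
You can look at these maps directly. A minimal sketch that dumps the eBPF Service (load-balancer) map from one of the agent pods, assuming the default kube-system installation (on recent releases the in-pod binary may be named cilium-dbg):

# Dump the Service map that replaces the iptables KUBE-SERVICES chains
kubectl -n kube-system exec ds/cilium -- cilium bpf lb list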

Installation#

Cilium is typically installed via Helm or the cilium CLI tool. Managed variants also exist: GKE Dataplane V2 and Azure CNI Powered by Cilium are built on Cilium, and EKS Anywhere ships it as the default CNI.

# Install with Helm (replacing kube-proxy)
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=<API_SERVER_IP> \
  --set k8sServicePort=<API_SERVER_PORT>

The kubeProxyReplacement=true setting tells Cilium to handle all Service routing, eliminating the need for kube-proxy entirely. You can then remove the kube-proxy DaemonSet. If you want to run Cilium alongside kube-proxy (for a gradual migration), set kubeProxyReplacement=false.
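
If you do remove kube-proxy, the cleanup is a couple of kubectl commands; a sketch assuming a kubeadm-style cluster where kube-proxy runs as a DaemonSet in kube-system:

# Remove kube-proxy once Cilium has taken over Service routing
kubectl -n kube-system delete daemonset kube-proxy
kubectl -n kube-system delete configmap kube-proxy

# Stale iptables rules written by kube-proxy can be flushed with its cleanup
# mode (kube-proxy --cleanup) or by rebooting the nodes.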

# Verify installation
cilium status
cilium connectivity test

The connectivity test deploys a set of test workloads that verify pod-to-pod, pod-to-Service, and pod-to-external connectivity, as well as network policy enforcement. Run it after installation to confirm everything works.

Network Policies#

Standard Kubernetes NetworkPolicy#

Cilium fully implements the standard Kubernetes NetworkPolicy API. If you have existing NetworkPolicy resources, they work unchanged.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - port: 8080
      protocol: TCP

CiliumNetworkPolicy (Extended)#

CiliumNetworkPolicy extends the standard with features that the Kubernetes NetworkPolicy API cannot express.

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-l7-policy
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: GET
          path: "/api/v1/users"
        - method: GET
          path: "/api/v1/health"
        - method: POST
          path: "/api/v1/users"

This policy allows the frontend to make only specific HTTP requests to the API – GET and POST to specific paths. Any other HTTP method or path is denied. This is Layer 7 policy enforcement, something standard NetworkPolicy cannot do.
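
To see the enforcement in action, you can issue requests from a frontend pod; a sketch assuming a frontend Deployment with curl available (resource names are illustrative):

# Allowed by the policy (the response code itself depends on the API)
kubectl -n production exec deploy/frontend -- \
  curl -s -o /dev/null -w "%{http_code}\n" http://api:8080/api/v1/health

# Method/path not covered by the rules: the L7 proxy rejects it, typically with a 403
kubectl -n production exec deploy/frontend -- \
  curl -s -o /dev/null -w "%{http_code}\n" -X DELETE http://api:8080/api/v1/users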

L7 Protocol Support#

Cilium can enforce policies at the application protocol level for:

  • HTTP: Match on method, path, and headers
  • gRPC: Match on service name and method name
  • Kafka: Match on topic, role (produce/consume), and client ID
  • DNS: Match on queried domain names

For example, a toPorts fragment (embedded under an ingress or egress rule of a CiliumNetworkPolicy) that only allows consuming from specific topics:

# Kafka policy: only allow consuming from specific topics
toPorts:
- ports:
  - port: "9092"
    protocol: TCP
  rules:
    kafka:
    - role: consume
      topic: "user-events"
    - role: consume
      topic: "order-events"

FQDN-Based Egress Policies#

One of Cilium’s most practical features for production workloads: controlling egress traffic by DNS name rather than IP address.

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-aws-egress
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api
  egress:
  - toFQDNs:
    - matchPattern: "*.amazonaws.com"
    - matchName: "api.stripe.com"
  - toEndpoints:
    - matchLabels:
        io.kubernetes.pod.namespace: kube-system
        k8s-app: kube-dns
    toPorts:
    - ports:
      - port: "53"
        protocol: UDP
      rules:
        dns:
        - matchPattern: "*"

This policy allows the API pods to make outbound connections only to AWS services and the Stripe API. All other egress is denied. The DNS rule at the bottom is required – Cilium intercepts DNS queries to learn the IP addresses that correspond to the allowed FQDNs. Without allowing DNS, the FQDN rules cannot function.
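
To see which IP addresses Cilium has learned for the allowed names, you can inspect the FQDN cache on an agent pod; a sketch:

# Show the FQDN-to-IP mappings learned from intercepted DNS responses
kubectl -n kube-system exec ds/cilium -- cilium fqdn cache list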

Transparent Encryption#

Cilium provides node-to-node encryption without sidecars, application changes, or certificate management.

WireGuard#

helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --set encryption.enabled=true \
  --set encryption.type=wireguard

WireGuard is the recommended option. It uses modern cryptography, has minimal performance overhead (typically 3-5% throughput reduction), and is built into the Linux kernel since version 5.6. Cilium automatically manages WireGuard keys for each node and establishes encrypted tunnels between all nodes in the cluster.
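
To confirm that encryption is active, the agent reports its encryption state; a sketch:

# Reports the encryption mode (e.g. Wireguard) and peer status on this node
kubectl -n kube-system exec ds/cilium -- cilium encrypt status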

IPsec#

helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --set encryption.enabled=true \
  --set encryption.type=ipsec \
  --set encryption.ipsec.secretName=cilium-ipsec-keys

IPsec is the alternative for kernels that do not support WireGuard or for environments that require FIPS-compliant encryption. It has higher overhead than WireGuard.

With transparent encryption, all pod-to-pod traffic between nodes is encrypted. Traffic within the same node does not traverse the network and is not encrypted (it stays in kernel memory). This provides a simpler alternative to service mesh mTLS for the specific goal of encrypting traffic in transit between nodes.

Hubble Observability#

Hubble is Cilium’s built-in observability layer. It provides network flow logs, service dependency maps, and Prometheus metrics without any application instrumentation.

Enabling Hubble#

helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --set hubble.enabled=true \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,icmp,http}"

Hubble CLI#

# Enable Hubble in the cluster (if it was not already enabled via Helm)
cilium hubble enable

# The hubble CLI itself is a separate binary; install it from the Hubble
# releases page (https://github.com/cilium/hubble/releases), then run
# `cilium hubble port-forward` so the CLI can reach Hubble Relay.

# Observe all flows in a namespace
hubble observe --namespace production

# Filter by source and destination
hubble observe --from-pod production/frontend --to-pod production/api

# Filter by verdict (forwarded, dropped, error)
hubble observe --verdict DROPPED

# Filter by HTTP status code
hubble observe --http-status 500

# Filter by DNS query
hubble observe --protocol DNS

Hubble observe shows real-time flow data: source pod, destination pod (or external IP), port, protocol, L7 information (HTTP path, DNS query, Kafka topic), and verdict (forwarded or dropped with the reason).

Hubble UI#

The Hubble UI provides a visual service dependency map. Each node is a service (identified by Kubernetes labels), and edges represent observed network flows. The UI shows:

  • Which services communicate with which other services
  • Request rates and error rates per edge
  • Dropped flows (network policy denials)
  • DNS queries and responses

Access it via port-forward:

kubectl port-forward -n kube-system svc/hubble-ui 12000:80

Hubble Metrics#

Hubble exports Prometheus metrics for network flows without requiring application-level instrumentation:

# HTTP request rate by source, destination, and status code
rate(hubble_http_requests_total{destination=~"production/.*"}[5m])

# DNS query rate and errors
rate(hubble_dns_queries_total[5m])
rate(hubble_dns_responses_total{rcode!="No Error"}[5m])

# Dropped packets by reason
rate(hubble_drop_total[5m])

# TCP connection statistics
rate(hubble_tcp_flags_total[5m])

These metrics provide service-level network observability that would otherwise require a service mesh sidecar or application-level instrumentation.
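
As an example of putting these metrics to work, here is a sketch of a PrometheusRule that alerts on sustained drops (assumes the Prometheus Operator CRDs are installed; the threshold and namespace are placeholders):

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: hubble-drops
  namespace: monitoring
spec:
  groups:
  - name: hubble
    rules:
    - alert: HubblePacketDrops
      # Sustained drops over the last 5 minutes, broken out by drop reason
      expr: sum(rate(hubble_drop_total[5m])) by (reason) > 1
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "Hubble is reporting dropped flows (reason: {{ $labels.reason }})"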

Cluster Mesh#

Cluster Mesh connects multiple Cilium-managed clusters, enabling cross-cluster service discovery, load balancing, and network policy enforcement.

Use Cases#

  • Multi-region high availability: Services in cluster A can failover to backends in cluster B
  • Shared services: A central cluster runs shared services (databases, message queues) accessible from workload clusters
  • Migration: Gradually move workloads between clusters while maintaining connectivity

Setup#

Each cluster needs a unique name and ID, and the Cluster Mesh API server must be accessible between clusters:

# Enable Cluster Mesh on each cluster
cilium clustermesh enable --context cluster1
cilium clustermesh enable --context cluster2

# Connect the clusters
cilium clustermesh connect --context cluster1 --destination-context cluster2

# Verify
cilium clustermesh status --context cluster1

Cross-Cluster Service Discovery#

Once connected, a Service that exists with the same name and namespace in both clusters and is annotated as global is load-balanced across both clusters’ pods:

apiVersion: v1
kind: Service
metadata:
  name: api
  namespace: production
  annotations:
    service.cilium.io/global: "true"
    service.cilium.io/shared: "true"
spec:
  selector:
    app: api
  ports:
  - port: 8080

Pods in cluster1 connecting to api.production.svc.cluster.local will be load-balanced across api pods in both cluster1 and cluster2.

Global Network Policies#

CiliumNetworkPolicy resources can reference identities from remote clusters, allowing you to define network policies that span cluster boundaries.
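
For example, a sketch of a policy that only admits frontend traffic originating from a specific remote cluster (assumes the remote cluster is named cluster2; the io.cilium.k8s.policy.cluster label carries the cluster name):

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-remote-frontend
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend
        io.cilium.k8s.policy.cluster: cluster2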

Additional Features#

Bandwidth Manager#

eBPF-based rate limiting per pod, replacing traditional traffic control (tc) queueing disciplines. The Bandwidth Manager is enabled via Helm rather than through a network policy:

helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set bandwidthManager.enabled=true

Rate limits are then declared as pod annotations (Cilium’s Bandwidth Manager enforces the egress annotation; check your version’s documentation for ingress support):

metadata:
  annotations:
    kubernetes.io/egress-bandwidth: "10M"
    kubernetes.io/ingress-bandwidth: "10M"

Host Firewall#

Cilium can apply network policies to the node itself, not just pods. This protects the kubelet, SSH, and other node-level services using the same policy framework:

apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: host-firewall
spec:
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker: ""
  ingress:
  - fromEntities:
    - cluster
    toPorts:
    - ports:
      - port: "10250"
        protocol: TCP
  - fromCIDR:
    - "10.0.0.0/8"
    toPorts:
    - ports:
      - port: "22"
        protocol: TCP
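
Host policies only take effect when the host firewall is enabled; a sketch of the Helm flag (depending on your setup you may also need the devices option so Cilium attaches to the node’s interfaces):

helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set hostFirewall.enabled=true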

Performance Characteristics#

Benchmarks consistently show that Cilium’s eBPF datapath outperforms iptables-based networking at scale. The advantage is most pronounced in clusters with many Services (>500) where iptables rule chain traversal becomes a bottleneck. For small clusters (<50 nodes, <200 Services), the performance difference is negligible.

Specific improvements:

  • Service routing latency: O(1) hash table lookup vs O(n) iptables chain traversal
  • Connection tracking: eBPF-managed conntrack scales better under high connection rates than the kernel conntrack table
  • Rule updates: Adding or removing a Service updates an eBPF map entry, not the entire iptables ruleset. No lock contention during updates.

When NOT to Use Cilium#

Simple clusters with minimal policy needs. If you have fewer than 50 nodes, do not need L7 policies, and do not need Hubble-level observability, Calico is simpler to operate. Calico has been around longer, has a larger knowledge base, and does standard L3/L4 networking well.

Old kernels. Cilium requires a recent Linux kernel: roughly 4.19+ for basic functionality, 5.4+ for most features, and newer kernels for the full feature set (WireGuard encryption needs 5.6+, and some advanced eBPF features need 5.10+). Check your node kernel version before committing to Cilium. On managed Kubernetes services, verify the node OS and kernel version provided by default.
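
One quick way to check the kernel on every node without logging into them:

kubectl get nodes -o custom-columns=NAME:.metadata.name,KERNEL:.status.nodeInfo.kernelVersion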

Minimal operational overhead tolerance. Cilium adds operational complexity: the cilium-agent DaemonSet, the cilium-operator Deployment, Hubble relay, and potentially the Cluster Mesh API server. Each is another component to monitor, upgrade, and debug.

Common Gotchas#

Kernel version too old. Cilium degrades gracefully when eBPF features are unavailable, but the degraded mode may not provide the features you need. Run cilium status after installation – it reports which features are active and which are disabled due to kernel limitations.

L7 policies route traffic through an Envoy proxy. When you create an L7 (HTTP, gRPC, Kafka) network policy, Cilium sends the matched traffic through an Envoy proxy that runs per node (embedded in the Cilium agent or as a separate DaemonSet, depending on configuration), not as a per-pod sidecar. This adds CPU and memory overhead on each node. If you only need L3/L4 policies, traffic is never proxied through Envoy and the overhead is lower.

DNS policy requires allowing DNS egress. FQDN-based policies work by intercepting DNS responses to learn IP-to-FQDN mappings. If your DNS egress rule is too restrictive or missing, the FQDN policy cannot resolve domain names and will deny all traffic to those destinations. Always include a rule allowing DNS traffic to kube-dns when using toFQDNs.

Hubble flow data retention is limited. Hubble stores flows in a ring buffer on each node. Under high traffic, older flows are evicted quickly. For long-term storage, export Hubble flows to an external system (Elasticsearch, S3, or a time-series database) using Hubble Exporter.

Cluster Mesh requires network reachability between clusters. The Cluster Mesh API server in each cluster must be reachable from the other clusters’ Cilium agents. In cloud environments, this typically means setting up VPC peering, transit gateways, or exposing the API server via a LoadBalancer Service. Pod CIDR ranges must not overlap between connected clusters.