Service-to-Service Authentication and Authorization#
In a microservice architecture, services communicate over the network. Without authentication, any process that can reach a service can call it. Without authorization, any authenticated caller can do anything. Zero-trust networking assumes the internal network is hostile and requires every service-to-service call to be authenticated, authorized, and encrypted.
Mutual TLS (mTLS)#
Standard TLS has the client verify the server’s identity. Mutual TLS adds the reverse – the server also verifies the client’s identity. Both sides present certificates. This provides three things: encryption in transit, server authentication, and client authentication.
mTLS with Istio#
Istio handles mTLS transparently through Envoy sidecars. When both the source and destination pods are in the mesh, their sidecars negotiate mTLS automatically. The application code makes a plain HTTP call; the sidecar encrypts it.
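Being "in the mesh" usually means the pod was created with a sidecar injected. One common way to arrange that is to label the namespace. The following is a minimal sketch that assumes the default injector and a namespace named payments; revision-based installs use an istio.io/rev label instead.

```yaml
# Sketch: enable automatic sidecar injection for new pods in this namespace.
# The namespace name is an assumption for illustration.
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    istio-injection: enabled
```

Pods that already exist pick up the sidecar only after they are restarted.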
PeerAuthentication controls mTLS enforcement:
# Enforce mTLS for all services in the namespace
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: payments
spec:
  mtls:
    mode: STRICT
Modes:
- PERMISSIVE: Accepts both mTLS and plaintext. Use during migration when some services are not yet in the mesh.
- STRICT: Only accepts mTLS connections. Plaintext calls are rejected. Use in production once all services are in the mesh.
- DISABLE: No mTLS. Rarely useful.
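As a sketch of the migration path these modes imply, the same namespace-wide resource can be applied in PERMISSIVE mode first and switched to STRICT later. The payments namespace is reused from the example above; nothing else is assumed.

```yaml
# Migration step: accept both mTLS and plaintext while workloads join the mesh
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: payments
spec:
  mtls:
    mode: PERMISSIVE
```

Because the resource name stays the same, moving to STRICT later is a one-line change.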
To enforce mTLS mesh-wide, apply the PeerAuthentication in the mesh's root namespace (istio-system by default):
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
Per-port exceptions for services that must accept plaintext (health checks from load balancers outside the mesh):
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: payments-api
  namespace: payments
spec:
  selector:
    matchLabels:
      app: payments-api
  mtls:
    mode: STRICT
  portLevelMtls:
    8081:
      mode: PERMISSIVE
mTLS with Linkerd#
Linkerd enables mTLS by default for all meshed services. There is no configuration to turn it on – if both sides have the Linkerd proxy, traffic is encrypted and mutually authenticated.
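"Having the Linkerd proxy" normally comes from proxy injection, which is requested with an annotation. The following is a minimal sketch, assuming the payments namespace used elsewhere in this article; the annotation can also be set on an individual workload's pod template.

```yaml
# Sketch: ask Linkerd to inject its proxy into new pods in this namespace
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  annotations:
    linkerd.io/inject: enabled
```

Existing pods need a restart before they are meshed.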
# Verify mTLS is active between services (a resource type is required)
linkerd viz edges deployment -n payments
# Output shows whether connections are secured
SRC            DST            SRC_NS     DST_NS     SECURED
frontend       payments-api   frontend   payments   TRUE
payments-api   orders-db      payments   payments   TRUE
Linkerd uses its own identity system based on the cluster’s trust anchor. Install with:
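# Generate the trust anchor (root CA)
step certificate create root.linkerd.cluster.local ca.crt ca.key \
  --profile root-ca --no-password --insecure
# Generate the issuer certificate, signed by the trust anchor
step certificate create identity.linkerd.cluster.local issuer.crt issuer.key \
  --profile intermediate-ca --not-after 8760h --no-password --insecure \
  --ca ca.crt --ca-key ca.key
linkerd install --crds | kubectl apply -f -
linkerd install --identity-trust-anchors-file ca.crt \
  --identity-issuer-certificate-file issuer.crt \
  --identity-issuer-key-file issuer.key | kubectl apply -f -
linkerd check
Linkerd automatically rotates leaf certificates every 24 hours. Rotating the issuer certificate requires manual intervention or integration with cert-manager: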
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: linkerd-identity-issuer
  namespace: linkerd
spec:
  secretName: linkerd-identity-issuer
  duration: 48h
  renewBefore: 25h
  issuerRef:
    name: linkerd-trust-anchor
    kind: Issuer
  commonName: identity.linkerd.cluster.local
  dnsNames:
    - identity.linkerd.cluster.local
  isCA: true
  privateKey:
    algorithm: ECDSA
  usages:
    - cert sign
    - crl sign
    - server auth
    - client auth
SPIFFE and SPIRE for Workload Identity#
SPIFFE (Secure Production Identity Framework for Everyone) defines a standard for identifying workloads across heterogeneous environments. A SPIFFE ID is a URI:
spiffe://trust-domain/path
For example: spiffe://production.example.com/ns/payments/sa/payments-api
This identifies the workload by trust domain, namespace, and service account – independent of IP address, hostname, or network location.
SPIRE Architecture#
SPIRE (SPIFFE Runtime Environment) implements SPIFFE. It has two components:
- SPIRE Server: The central authority that issues SVIDs (SPIFFE Verifiable Identity Documents). It maintains the trust bundle and registration entries.
- SPIRE Agent: Runs on each node (as a DaemonSet in Kubernetes). Attests workloads, caches SVIDs, and exposes them via the Workload API.
# SPIRE Server deployment (simplified; serviceName and volumes are omitted)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: spire-server
  namespace: spire
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spire-server
  template:
    metadata:
      labels:
        app: spire-server  # must match spec.selector
    spec:
      containers:
        - name: spire-server
          image: ghcr.io/spiffe/spire-server:1.9
          args:
            - -config
            - /run/spire/config/server.conf
          volumeMounts:
            - name: spire-config
              mountPath: /run/spire/config
Workload Registration#
Register workloads with the SPIRE server so it knows which identity to issue to which pods:
# Register a workload by Kubernetes service account
spire-server entry create \
  -spiffeID spiffe://example.com/ns/payments/sa/payments-api \
  -parentID spiffe://example.com/spire/agent/k8s_psat/demo-cluster/node-1 \
  -selector k8s:ns:payments \
  -selector k8s:sa:payments-api
The selectors match against Kubernetes metadata. When a pod in the payments namespace running as service account payments-api requests an identity from the SPIRE agent, it receives the registered SPIFFE ID as an X.509 certificate (X509-SVID) or JWT token (JWT-SVID).
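Registration can also be expressed declaratively. The sketch below assumes the optional SPIRE Controller Manager is installed, which the setup above does not include, and uses its ClusterSPIFFEID resource to derive IDs from pod metadata. The resource name and the app label are illustrative.

```yaml
# Sketch, assuming spire-controller-manager is running in the cluster.
# Issues spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>
# to matching pods in the payments namespace.
apiVersion: spire.spiffe.io/v1alpha1
kind: ClusterSPIFFEID
metadata:
  name: payments-api
spec:
  spiffeIDTemplate: "spiffe://{{ .TrustDomain }}/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}"
  podSelector:
    matchLabels:
      app: payments-api
  namespaceSelector:
    matchLabels:
      kubernetes.io/metadata.name: payments
```

The controller then creates and removes registration entries as matching pods come and go, so no per-node parentID has to be maintained by hand.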
SPIFFE with Istio#
Istio natively uses SPIFFE-compatible identities. Every sidecar gets a certificate with a SPIFFE ID derived from the pod’s service account:
spiffe://cluster.local/ns/payments/sa/payments-api
You do not need to deploy SPIRE separately if you are using Istio. However, for multi-cluster or cross-platform identity (workloads outside Kubernetes), SPIRE provides federated trust across environments.
Authorization Policies#
Authentication answers “who is this?” Authorization answers “what can they do?” In a service mesh, authorization policies control which services can call which endpoints.
Istio Authorization Policy#
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: payments-api-access
  namespace: payments
spec:
  selector:
    matchLabels:
      app: payments-api
  action: ALLOW
  rules:
    - from:
        - source:
            principals:
              - "cluster.local/ns/frontend/sa/frontend-api"
              - "cluster.local/ns/orders/sa/order-processor"
      to:
        - operation:
            methods: ["GET", "POST"]
            paths: ["/api/payments", "/api/payments/*"]
    - from:
        - source:
            principals:
              - "cluster.local/ns/monitoring/sa/prometheus"
      to:
        - operation:
            methods: ["GET"]
            paths: ["/metrics", "/healthz"]
This policy says: only frontend-api and order-processor can call payment endpoints, and only Prometheus can scrape metrics. Everything else sent to payments-api is denied, because once an ALLOW policy selects a workload, any request that matches none of its rules is rejected. Matching on principals relies on mTLS, since the principal is taken from the caller's certificate.
Default deny: Apply a blanket deny policy to a namespace, then add specific ALLOW policies:
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: payments
spec:
  {}
An empty spec is an ALLOW policy with no rules. It matches nothing, so every request to workloads in the namespace is denied. Layer ALLOW policies on top of this.
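As a sketch of what a layered ALLOW policy can look like, the following re-opens traffic between workloads inside the payments namespace while callers from other namespaces stay blocked. The policy name is illustrative, and matching on source namespace assumes mTLS is in effect.

```yaml
# Sketch: with deny-all in place, allow only callers from the same namespace.
# No selector is set, so the policy applies to every workload in payments.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-same-namespace
  namespace: payments
spec:
  action: ALLOW
  rules:
    - from:
        - source:
            namespaces: ["payments"]
```

Cross-namespace callers still need their own explicit ALLOW rules.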
Linkerd Authorization Policy#
Linkerd uses Server and ServerAuthorization resources (or the newer AuthorizationPolicy resources, which can also target individual HTTPRoutes):
apiVersion: policy.linkerd.io/v1beta3
kind: Server
metadata:
  name: payments-api
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payments-api
  port: 8080
  proxyProtocol: HTTP/2
---
apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  name: frontend-to-payments
  namespace: payments
spec:
  server:
    name: payments-api
  client:
    meshTLS:
      serviceAccounts:
        - name: frontend-api
          namespace: frontend
JWT Validation at the Mesh Level#
For requests originating from outside the mesh (external users, third-party services), validate JWTs at the mesh ingress rather than in each service. This centralizes token validation and prevents unauthenticated requests from reaching application code.
Istio RequestAuthentication#
apiVersion: security.istio.io/v1
kind: RequestAuthentication
metadata:
  name: jwt-auth
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  jwtRules:
    - issuer: "https://auth.example.com"
      jwksUri: "https://auth.example.com/.well-known/jwks.json"
      audiences:
        - "api.example.com"
      forwardOriginalToken: true
      outputPayloadToHeader: "x-jwt-payload"
RequestAuthentication rejects requests that carry an invalid JWT, but it does not require a token: a request with no JWT at all passes through. Pair it with an AuthorizationPolicy to require a valid token:
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: require-jwt
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  action: DENY
  rules:
    - from:
        - source:
            notRequestPrincipals: ["*"]
      to:
        - operation:
            paths: ["/api/*"]
            notPaths: ["/api/public/*", "/healthz"]
This denies any request to /api/* that does not have a valid JWT (represented by notRequestPrincipals: ["*"]). Public endpoints and health checks are excluded.
Claims-Based Authorization#
Route or authorize based on JWT claims:
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: admin-only
  namespace: payments
spec:
  selector:
    matchLabels:
      app: payments-admin
  action: ALLOW
  rules:
    - from:
        - source:
            requestPrincipals: ["https://auth.example.com/*"]
      when:
        - key: request.auth.claims[role]
          values: ["admin", "super-admin"]
Zero-Trust Implementation Checklist#
Zero-trust service communication means no implicit trust based on network location. Every call is verified. Implementing it fully requires layering the patterns above:
1. Encrypt all traffic: mTLS everywhere. Start with PERMISSIVE mode to avoid breaking existing services, then move to STRICT once all services are in the mesh. Verify with mesh tooling that no plaintext connections remain.
2. Authenticate every service: Each service has a cryptographic identity (SPIFFE ID or equivalent). No service communicates anonymously.
3. Authorize every call: Default deny policies in every namespace. Explicitly allow only the communication paths that should exist. Review authorization policies as part of service deployment.
4. Validate external tokens at the edge: JWT validation at the ingress gateway. Do not trust tokens that have not been validated by the mesh.
5. Rotate credentials automatically: Certificate lifetimes should be short (24 hours or less for leaf certs). Automate rotation with the mesh’s identity system or cert-manager.
6. Audit everything: Log all authorization decisions. In Istio, enable access logging and authorization policy decision logging:
apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: mesh-logging
  namespace: istio-system
spec:
  accessLogging:
    - providers:
        - name: envoy
7. Segment the network: Even with mTLS and authorization policies, use Kubernetes NetworkPolicies as a defense-in-depth layer. If the mesh is misconfigured, network policies provide a fallback.
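For the network segmentation step, a minimal sketch with plain Kubernetes NetworkPolicies might look like the following. It assumes a CNI plugin that enforces NetworkPolicy, reuses the payments and frontend namespaces from the earlier examples, and assumes payments-api listens on port 8080. The policy names are illustrative.

```yaml
# Deny all ingress to pods in the payments namespace by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
    - Ingress
---
# Then allow only the frontend namespace to reach payments-api on its port
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-payments-api
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payments-api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: frontend
      ports:
        - protocol: TCP
          port: 8080
```

These policies operate on IP addresses and ports rather than workload identity, so they complement the mesh-level authorization policies and do not replace them.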
Zero-trust is not a product you install. It is an architecture you build by layering identity, authentication, authorization, encryption, and audit. The service mesh handles most of the heavy lifting, but it requires correct configuration and ongoing maintenance.