Kubernetes Operator Development#
Operators are custom controllers that manage applications and infrastructure through custom resources defined by CRDs. They encode operational knowledge – the kind of tasks a human operator would perform – into software that runs inside the cluster. An operator watches for changes to its custom resources and reconciles the actual state to match the desired state, creating, updating, or deleting child resources as needed.
Operator Maturity Model#
The Operator Framework defines five maturity levels:
| Level | Capability | Example |
|---|---|---|
| 1 | Basic install | Helm operator deploys the application |
| 2 | Seamless upgrades | Operator handles version migrations |
| 3 | Full lifecycle | Backup, restore, failure recovery |
| 4 | Deep insights | Exposes metrics, fires alerts, generates dashboards |
| 5 | Auto-pilot | Auto-scaling, auto-healing, auto-tuning without human input |
Most custom operators target Level 2-3. Levels 4-5 are typically reached by mature projects like the Prometheus Operator or Rook/Ceph.
Frameworks#
Kubebuilder (Go)#
The most widely used framework for production operators. Generates project scaffolding, RBAC manifests, Dockerfile, and test infrastructure using the controller-runtime library.
kubebuilder init --domain mycompany.io --repo github.com/mycompany/webapp-operator
kubebuilder create api --group apps --version v1 --kind WebApp
This generates:
- api/v1/webapp_types.go – your CRD Go types
- internal/controller/webapp_controller.go – your reconciliation logic
- config/crd/ – CRD YAML (auto-generated from Go types)
- config/rbac/ – RBAC roles for the operator
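For reference, a minimal webapp_types.go consistent with the reconciler examples below might look like the following sketch. The spec and status fields are inferred from the code later on this page; the controller snippets import this package under the alias webappsv1, and the deepcopy code is generated by make generate.

// api/v1/webapp_types.go (sketch)
package v1

import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

type WebAppSpec struct {
    Image         string `json:"image"`
    Replicas      *int32 `json:"replicas,omitempty"`
    CPURequest    string `json:"cpuRequest,omitempty"`
    MemoryRequest string `json:"memoryRequest,omitempty"`
}

type WebAppStatus struct {
    AvailableReplicas int32 `json:"availableReplicas,omitempty"`
    Ready             bool  `json:"ready,omitempty"`
}

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status

type WebApp struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec   WebAppSpec   `json:"spec,omitempty"`
    Status WebAppStatus `json:"status,omitempty"`
}

//+kubebuilder:object:root=true

type WebAppList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata,omitempty"`
    Items           []WebApp `json:"items"`
}

Run make manifests after editing the types so the CRD YAML in config/crd/ stays in sync.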
Operator SDK (Go/Ansible/Helm)#
Red Hat’s framework, built on Kubebuilder for Go operators, with additional support for Ansible and Helm-based operators. The Go path is functionally identical to Kubebuilder with some extra tooling for OLM (Operator Lifecycle Manager) integration.
Ansible and Helm operators are useful for wrapping existing automation without writing Go code, but they hit a ceiling at Level 2 maturity.
kopf (Python)#
A lightweight Python framework for operators. Good for teams without Go experience or for simpler operators:
import kopf
import kubernetes
@kopf.on.create('mycompany.io', 'v1', 'webapps')
def on_create(spec, name, namespace, **kwargs):
api = kubernetes.client.AppsV1Api()
deployment = build_deployment(name, namespace, spec)
api.create_namespaced_deployment(namespace, deployment)
return {'message': f'Deployment {name} created'}
@kopf.on.update('mycompany.io', 'v1', 'webapps')
def on_update(spec, name, namespace, **kwargs):
api = kubernetes.client.AppsV1Api()
deployment = build_deployment(name, namespace, spec)
    api.patch_namespaced_deployment(name, namespace, deployment)
Metacontroller#
Write controllers as webhooks in any language. Metacontroller handles the watch/cache/queue machinery, and calls your HTTP endpoint when reconciliation is needed. Good for polyglot teams, but adds an operational dependency (Metacontroller itself must be running).
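As a rough illustration only, a sync webhook in Go might look like the sketch below. The request/response field names follow the author's understanding of Metacontroller's CompositeController sync hook and should be treated as assumptions – verify them against the Metacontroller docs before relying on them. buildChildren is a hypothetical helper.

package main

import (
    "encoding/json"
    "net/http"
)

// Assumed shape of the sync hook payload: Metacontroller sends the parent object
// (plus its current children) and expects the desired children back.
type syncRequest struct {
    Parent map[string]interface{} `json:"parent"`
}

type syncResponse struct {
    Status   map[string]interface{}   `json:"status"`
    Children []map[string]interface{} `json:"children"`
}

// buildChildren is a hypothetical helper that maps the parent's spec to desired child
// objects (e.g. a Deployment and a Service expressed as unstructured maps).
func buildChildren(parent map[string]interface{}) []map[string]interface{} {
    return []map[string]interface{}{}
}

func syncHandler(w http.ResponseWriter, r *http.Request) {
    var req syncRequest
    if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }
    resp := syncResponse{
        Status:   map[string]interface{}{"observed": true},
        Children: buildChildren(req.Parent),
    }
    w.Header().Set("Content-Type", "application/json")
    _ = json.NewEncoder(w).Encode(resp)
}

func main() {
    http.HandleFunc("/sync", syncHandler)
    _ = http.ListenAndServe(":8080", nil)
}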
Core Concepts: controller-runtime#
The Reconciler#
The reconciler is the central function. It receives the name of a resource that changed and reconciles the world to match:
type WebAppReconciler struct {
client.Client
Scheme *runtime.Scheme
}
func (r *WebAppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
log := log.FromContext(ctx)
// 1. Fetch the WebApp instance
    var webapp webappsv1.WebApp // webappsv1 aliases the generated api/v1 package; appsv1 below is k8s.io/api/apps/v1
if err := r.Get(ctx, req.NamespacedName, &webapp); err != nil {
if apierrors.IsNotFound(err) {
// Resource was deleted, nothing to do (finalizers handle cleanup)
return ctrl.Result{}, nil
}
return ctrl.Result{}, err
}
// 2. Create or update the Deployment
deployment := &appsv1.Deployment{}
deployment.Name = webapp.Name
deployment.Namespace = webapp.Namespace
result, err := ctrl.CreateOrUpdate(ctx, r.Client, deployment, func() error {
// Set the desired state
deployment.Spec.Replicas = webapp.Spec.Replicas
deployment.Spec.Template.Spec.Containers = []corev1.Container{
{
Name: "app",
Image: webapp.Spec.Image,
Ports: []corev1.ContainerPort{{ContainerPort: 8080}},
Resources: corev1.ResourceRequirements{
Requests: corev1.ResourceList{
corev1.ResourceCPU: resource.MustParse(webapp.Spec.CPURequest),
corev1.ResourceMemory: resource.MustParse(webapp.Spec.MemoryRequest),
},
},
},
}
// Set labels for selector
labels := map[string]string{"app": webapp.Name}
deployment.Spec.Selector = &metav1.LabelSelector{MatchLabels: labels}
deployment.Spec.Template.Labels = labels
// Set owner reference for garbage collection
return ctrl.SetControllerReference(&webapp, deployment, r.Scheme)
})
if err != nil {
return ctrl.Result{}, err
}
log.Info("Deployment reconciled", "operation", result)
// 3. Create or update the Service
service := &corev1.Service{}
service.Name = webapp.Name
service.Namespace = webapp.Namespace
_, err = ctrl.CreateOrUpdate(ctx, r.Client, service, func() error {
service.Spec.Selector = map[string]string{"app": webapp.Name}
service.Spec.Ports = []corev1.ServicePort{
{Port: 80, TargetPort: intstr.FromInt(8080)},
}
return ctrl.SetControllerReference(&webapp, service, r.Scheme)
})
if err != nil {
return ctrl.Result{}, err
}
// 4. Update status
webapp.Status.AvailableReplicas = deployment.Status.AvailableReplicas
webapp.Status.Ready = deployment.Status.AvailableReplicas == *webapp.Spec.Replicas
if err := r.Status().Update(ctx, &webapp); err != nil {
return ctrl.Result{}, err
}
return ctrl.Result{}, nil
}
Manager Setup#
The Manager runs the controller, handles leader election, serves health checks, and exposes metrics:
func main() {
    mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
        Scheme:                 scheme,
        MetricsBindAddress:     ":8080",
        HealthProbeBindAddress: ":8081",
        LeaderElection:         true,
        LeaderElectionID:       "webapp-operator-leader",
    })
    if err != nil {
        log.Fatal(err)
    }
    if err := (&WebAppReconciler{
        Client: mgr.GetClient(),
        Scheme: mgr.GetScheme(),
    }).SetupWithManager(mgr); err != nil {
        log.Fatal(err)
    }
    if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
        log.Fatal(err)
    }
}
Controller Setup#
Register which resources the controller watches:
func (r *WebAppReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
        For(&webappsv1.WebApp{}).   // watch WebApp resources
Owns(&appsv1.Deployment{}). // also reconcile when owned Deployments change
Owns(&corev1.Service{}). // and owned Services
WithOptions(controller.Options{MaxConcurrentReconciles: 3}).
Complete(r)
}
For() triggers reconciliation when a WebApp changes. Owns() triggers reconciliation when a child resource (with matching ownerReference) changes. This means if someone manually edits the Deployment, the operator immediately corrects the drift.
Reconciliation Patterns#
Idempotency#
Every reconcile call must be safe to repeat. ctrl.CreateOrUpdate handles this by fetching the existing resource and applying changes, creating it only if it does not exist. Never assume the current call is the first – the work queue can deliver the same resource multiple times.
Finalizers#
Finalizers let you clean up external resources (cloud infrastructure, DNS records, database entries) before a custom resource is deleted:
const finalizerName = "mycompany.io/cleanup"
func (r *WebAppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    var webapp webappsv1.WebApp
if err := r.Get(ctx, req.NamespacedName, &webapp); err != nil {
return ctrl.Result{}, client.IgnoreNotFound(err)
}
// Handle deletion
if !webapp.DeletionTimestamp.IsZero() {
if controllerutil.ContainsFinalizer(&webapp, finalizerName) {
// Perform cleanup: delete DNS record, revoke certificates, etc.
if err := r.cleanupExternalResources(ctx, &webapp); err != nil {
return ctrl.Result{}, err
}
// Remove finalizer to allow deletion to proceed
controllerutil.RemoveFinalizer(&webapp, finalizerName)
if err := r.Update(ctx, &webapp); err != nil {
return ctrl.Result{}, err
}
}
return ctrl.Result{}, nil
}
// Add finalizer if not present
if !controllerutil.ContainsFinalizer(&webapp, finalizerName) {
controllerutil.AddFinalizer(&webapp, finalizerName)
if err := r.Update(ctx, &webapp); err != nil {
return ctrl.Result{}, err
}
}
// Normal reconciliation...
return ctrl.Result{}, nil
}
Event Recording#
Emit Kubernetes events so users can see what the operator is doing. Events are written through an event recorder on the reconciler.
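One way to wire the recorder up is to add a Recorder field to the reconciler struct and populate it from the manager. A minimal sketch using the manager's GetEventRecorderFor:

// Add a recorder to the reconciler struct (record is k8s.io/client-go/tools/record):
type WebAppReconciler struct {
    client.Client
    Scheme   *runtime.Scheme
    Recorder record.EventRecorder
}

// In main(), when constructing the reconciler:
if err := (&WebAppReconciler{
    Client:   mgr.GetClient(),
    Scheme:   mgr.GetScheme(),
    Recorder: mgr.GetEventRecorderFor("webapp-operator"),
}).SetupWithManager(mgr); err != nil {
    log.Fatal(err)
}

The RBAC markers under Operational Concerns already include the create/patch permissions on Events that the recorder needs. With the recorder in place, emit events from Reconcile: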
r.Recorder.Eventf(&webapp, corev1.EventTypeNormal, "Deployed",
"Created deployment %s with %d replicas", webapp.Name, *webapp.Spec.Replicas)
r.Recorder.Eventf(&webapp, corev1.EventTypeWarning, "ScalingFailed",
    "Failed to scale deployment: %v", err)
These events appear in kubectl describe webapp my-app and are critical for debugging.
Error Handling#
controller-runtime uses the return value from Reconcile to decide what happens next:
// Success -- do not requeue
return ctrl.Result{}, nil
// Transient error -- requeue with exponential backoff
return ctrl.Result{}, fmt.Errorf("failed to create deployment: %w", err)
// Requeue after specific duration (e.g., check external resource status)
return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
// Requeue immediately (use sparingly)
return ctrl.Result{Requeue: true}, nil
Returning an error triggers the work queue's exponential backoff: 5ms, 10ms, 20ms, … up to a configurable maximum (default 1000 seconds). For expected transient failures (API rate limits, temporary network issues), returning an error with backoff is the correct pattern.
For operations that require polling (waiting for an external resource to become ready), use RequeueAfter with a reasonable interval instead of tight-looping.
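A sketch of that polling pattern inside Reconcile (externalDBReady is a hypothetical helper that queries an external system):

// Wait for an external dependency before continuing reconciliation.
ready, err := r.externalDBReady(ctx, &webapp) // hypothetical check against an external API
if err != nil {
    return ctrl.Result{}, err // transient failure: let the work queue back off and retry
}
if !ready {
    log.Info("external database not ready yet; checking again in 30s")
    return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
}
// ...continue normal reconciliation once the dependency is ready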
Testing#
Unit Tests#
Mock the Kubernetes client and test reconciliation logic in isolation:
func TestReconcile_CreatesDeployment(t *testing.T) {
    webapp := &webappsv1.WebApp{
        ObjectMeta: metav1.ObjectMeta{Name: "test-app", Namespace: "default"},
        Spec: webappsv1.WebAppSpec{
Image: "nginx:latest",
Replicas: pointer.Int32(3),
},
}
fakeClient := fake.NewClientBuilder().
WithScheme(scheme).
WithObjects(webapp).
Build()
reconciler := &WebAppReconciler{Client: fakeClient, Scheme: scheme}
    ctx := context.Background()
    _, err := reconciler.Reconcile(ctx, ctrl.Request{
NamespacedName: types.NamespacedName{Name: "test-app", Namespace: "default"},
})
require.NoError(t, err)
// Verify the Deployment was created
var dep appsv1.Deployment
err = fakeClient.Get(ctx, types.NamespacedName{Name: "test-app", Namespace: "default"}, &dep)
require.NoError(t, err)
assert.Equal(t, int32(3), *dep.Spec.Replicas)
}
Integration Tests with envtest#
envtest runs a real API server and etcd locally (no kubelet, no scheduler). It is the standard for integration testing Kubebuilder operators:
func TestMain(m *testing.M) {
testEnv = &envtest.Environment{
CRDDirectoryPaths: []string{filepath.Join("..", "..", "config", "crd", "bases")},
}
cfg, err := testEnv.Start()
// ... setup manager, start reconciler ...
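    // A minimal sketch of that setup (assumes the usual kubebuilder suite layout where
    // k8sClient, ctx, and cancel are package-level variables; details vary by project):
    if err != nil {
        panic(err) // testEnv.Start() failed
    }
    scheme := runtime.NewScheme()
    _ = clientgoscheme.AddToScheme(scheme) // k8s.io/client-go/kubernetes/scheme: core, apps, ...
    _ = webappsv1.AddToScheme(scheme)      // generated AddToScheme for the WebApp API
    k8sClient, _ = client.New(cfg, client.Options{Scheme: scheme})
    mgr, _ := ctrl.NewManager(cfg, ctrl.Options{Scheme: scheme, MetricsBindAddress: "0"})
    _ = (&WebAppReconciler{Client: mgr.GetClient(), Scheme: mgr.GetScheme()}).SetupWithManager(mgr)
    ctx, cancel = context.WithCancel(context.Background())
    go func() { _ = mgr.Start(ctx) }() // call cancel() before testEnv.Stop() in real code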
code := m.Run()
testEnv.Stop()
os.Exit(code)
}
func TestWebApp_EndToEnd(t *testing.T) {
    webapp := &webappsv1.WebApp{ /* ... */ }
Expect(k8sClient.Create(ctx, webapp)).To(Succeed())
// Wait for the reconciler to create the Deployment
Eventually(func() bool {
var dep appsv1.Deployment
err := k8sClient.Get(ctx, types.NamespacedName{Name: "test-app", Namespace: "default"}, &dep)
return err == nil
}, timeout, interval).Should(BeTrue())
}
E2E Tests#
For full end-to-end testing, deploy the operator into a kind or minikube cluster and exercise it with real kubectl commands:
# Build and load the operator image
docker build -t webapp-operator:test .
kind load docker-image webapp-operator:test
# Deploy CRDs and operator
make install # installs CRDs
make deploy IMG=webapp-operator:test
# Create a test resource and verify
kubectl apply -f config/samples/apps_v1_webapp.yaml
kubectl wait --for=condition=Ready webapp/sample --timeout=60s
kubectl get deployment sample   # should exist with correct replicas
Operational Concerns#
Leader election: always enable in production. Without it, running multiple operator replicas causes conflicting reconciliations. Leader election uses a Lease resource – only the leader reconciles.
RBAC: the operator’s ServiceAccount needs explicit permissions for every resource type it reads or writes. Kubebuilder generates RBAC from marker comments:
//+kubebuilder:rbac:groups=apps.mycompany.io,resources=webapps,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=apps.mycompany.io,resources=webapps/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups="",resources=services,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups="",resources=events,verbs=create;patch
Metrics: controller-runtime automatically exposes Prometheus metrics at /metrics including reconciliation duration, queue depth, and error counts. These are invaluable for monitoring operator health.
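You can also publish operator-specific metrics on the same endpoint by registering collectors with controller-runtime's global registry. A sketch; the metric name and label here are made up for illustration:

import (
    "github.com/prometheus/client_golang/prometheus"
    "sigs.k8s.io/controller-runtime/pkg/metrics"
)

// Hypothetical counter tracking reconcile outcomes for WebApp resources.
var webappReconcilesTotal = prometheus.NewCounterVec(
    prometheus.CounterOpts{
        Name: "webapp_operator_reconciles_total",
        Help: "Number of WebApp reconciliations, partitioned by outcome.",
    },
    []string{"outcome"},
)

func init() {
    // metrics.Registry is the registry served at /metrics by the manager.
    metrics.Registry.MustRegister(webappReconcilesTotal)
}

// Inside Reconcile:
//   webappReconcilesTotal.WithLabelValues("success").Inc()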
Common Gotchas#
Reconciliation storms. A bug that always returns an error or always requeues causes the controller to reconcile the same resource thousands of times per second, hammering the API server. Add rate limiting to the work queue and set MaxConcurrentReconciles to a reasonable value (2-5 for most operators).
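Both knobs live on controller.Options. A sketch against older (non-generic) controller-runtime versions – newer releases use typed rate limiters, so adjust accordingly – with 1s/5m chosen as arbitrary example bounds:

// In SetupWithManager, tune the controller's work queue behaviour
// (workqueue is k8s.io/client-go/util/workqueue):
WithOptions(controller.Options{
    MaxConcurrentReconciles: 3,
    // Start per-item retries at 1s instead of 5ms and cap them at 5 minutes.
    RateLimiter: workqueue.NewItemExponentialFailureRateLimiter(1*time.Second, 5*time.Minute),
}).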
Watch scope too broad. Watching all namespaces when you only operate in one wastes memory on the informer cache. Use cache.Options.DefaultNamespaces to restrict the watch scope:
mgr, err := ctrl.NewManager(cfg, ctrl.Options{
Cache: cache.Options{
DefaultNamespaces: map[string]cache.Config{
"production": {},
},
},
})
Status update conflicts. If the reconciler reads a resource, does work, then updates status, another reconciliation might have modified the resource in between. Use r.Status().Update() (not r.Update()) to update only the status subresource, and handle Conflict errors by requeuing.
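A common way to handle that, reusing the status update from the reconciler above (sketch):

if err := r.Status().Update(ctx, &webapp); err != nil {
    if apierrors.IsConflict(err) {
        // Someone else modified the object since we read it; requeue and retry from a fresh read.
        return ctrl.Result{Requeue: true}, nil
    }
    return ctrl.Result{}, err
}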
Not setting owner references. Without ownerReferences on child resources, deleting the parent CRD instance leaves orphaned Deployments, Services, and other resources behind. Always use ctrl.SetControllerReference() when creating child resources so Kubernetes garbage collection can clean up automatically.