Developer Self-Service Workflows

The Cost of Not Having Self-Service

A developer needs a PostgreSQL database. They file a ticket. It sits in a backlog for two days. A DBA provisions it and sends credentials via Slack DM. Elapsed time: 3 days. Actual work: 5 minutes of configuration. Multiply that across every database, cache, queue, and namespace, and manual provisioning becomes one of the largest drags on velocity. Self-service lets developers provision pre-approved resources directly, within guardrails the platform team defines.

Infrastructure Request Automation

The core pattern: developer declares what they want, automation provisions it, credentials are delivered programmatically. Three approaches dominate:

GitOps-driven: Developer opens a PR adding a resource definition. CI validates it against policies. On merge, Argo CD syncs the change and Crossplane provisions the infrastructure.

Backstage scaffolder: Developer fills in a form; the scaffolder generates the resource definition and commits it to the GitOps repository. Same provisioning backend, UI-guided frontend.

API-driven: Developer calls a platform API (over REST or through a CLI). Works well for programmatic consumers like CI pipelines.

All three converge on declarative resource definitions reconciled by a controller.

Backstage Scaffolder for Self-Service

The Backstage scaffolder turns self-service requests into multi-step workflows. A scaffolder template for provisioning a Redis cache:

apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: redis-cache
  title: Provision Redis Cache
  description: Self-service Redis cache with automatic credential injection
spec:
  owner: platform-team
  type: resource
  parameters:
    - title: Cache Configuration
      required: [name, owner, environment, size]
      properties:
        name:
          type: string
          pattern: '^[a-z][a-z0-9-]{2,24}$'
          description: Cache instance name
        owner:
          type: string
          ui:field: OwnerPicker
        environment:
          type: string
          enum: [development, staging, production]
        size:
          type: string
          enum: [small, medium, large]
          enumNames: ['Small (1GB)', 'Medium (4GB)', 'Large (16GB)']
  steps:
    - id: generate
      name: Generate Crossplane Claim
      action: fetch:template
      input:
        url: ./skeleton
        targetPath: infrastructure/redis/${{ parameters.name }}
        values:
          name: ${{ parameters.name }}
          owner: ${{ parameters.owner }}
          environment: ${{ parameters.environment }}
          size: ${{ parameters.size }}
    - id: pr
      name: Create Pull Request
      action: publish:github:pull-request
      input:
        repoUrl: github.com?owner=myorg&repo=infrastructure
        branchName: provision-redis-${{ parameters.name }}
        title: 'Provision Redis cache: ${{ parameters.name }}'
        description: |
          Self-service Redis provisioning for ${{ parameters.owner }}.
          Size: ${{ parameters.size }}, Environment: ${{ parameters.environment }}
    - id: register
      name: Register in Catalog
      action: catalog:register
      input:
        catalogInfoUrl: https://github.com/myorg/infrastructure/blob/main/infrastructure/redis/${{ parameters.name }}/catalog-info.yaml

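The skeleton directory contains the Crossplane Claim template and a catalog-info.yaml for the resource. CI auto-merges the PR if policy checks pass (more on this below). Note that the register step reads catalog-info.yaml from main, so it can only succeed once the PR has merged. If the template runs its steps back to back, that step has to be deferred, or registration left to catalog discovery of the infrastructure repository.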
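
As an illustration, the two skeleton files might look like the sketch below. The file names, the owner annotation key, the fixed provider label, and the resource type are assumptions made for this example. Only the `${{ values.* }}` placeholders correspond to the values the fetch:template step passes in. The two documents are separate files, shown together for brevity.

```yaml
# skeleton/claim.yaml (hypothetical file name)
# Rendered by fetch:template; ${{ values.* }} come from the step's `values` input.
apiVersion: cache.platform.example.com/v1alpha1
kind: RedisInstance
metadata:
  name: ${{ values.name }}
  annotations:
    # Assumed annotation key, used to trace the claim back to its Backstage owner.
    platform.example.com/owner: "${{ values.owner }}"
spec:
  parameters:
    size: ${{ values.size }}
  compositionSelector:
    matchLabels:
      provider: aws   # assumption: a single cloud provider
      environment: ${{ values.environment }}
  writeConnectionSecretToRef:
    name: ${{ values.name }}-credentials
---
# skeleton/catalog-info.yaml (hypothetical file name)
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
  name: ${{ values.name }}
  description: Redis cache (${{ values.size }}, ${{ values.environment }})
spec:
  type: redis-cache   # assumed resource type label
  owner: ${{ values.owner }}
```

The claim omits metadata.namespace on purpose in this sketch, on the assumption that the GitOps tooling sets the target namespace when it applies the file.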

Crossplane Claims for Resource Provisioning

Crossplane separates the developer-facing API (Claim) from the infrastructure-specific implementation (Composition). Developers interact only with Claims:

apiVersion: cache.platform.example.com/v1alpha1
kind: RedisInstance
metadata:
  name: session-cache
  namespace: team-identity
spec:
  parameters:
    size: medium
    version: "7"
    highAvailability: true
  compositionSelector:
    matchLabels:
      provider: aws
      environment: production
  writeConnectionSecretToRef:
    name: session-cache-credentials

The platform team maintains Compositions that map these claims to provider-specific resources:

apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: redis-aws-production
  labels:
    provider: aws
    environment: production
spec:
  compositeTypeRef:
    apiVersion: cache.platform.example.com/v1alpha1
    kind: XRedisInstance
  resources:
    - name: elasticache
      base:
        apiVersion: elasticache.aws.upbound.io/v1beta1
        kind: ReplicationGroup
        spec:
          forProvider:
            automaticFailoverEnabled: true
            engine: redis
            engineVersion: "7.0"
            nodeType: cache.r7g.large
            numCacheClusters: 3
            atRestEncryptionEnabled: true
            transitEncryptionEnabled: true

Developers never see the Composition. They interact with size, version, and highAvailability. The platform team controls instance types, encryption, and networking inside the Composition.
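
The Composition above hard-codes nodeType and automaticFailoverEnabled, so something has to connect the Claim's parameters to those fields. In Crossplane's patch-and-transform style this is usually done with patches on the composed resource. A minimal sketch follows; the node type chosen for each size is an illustrative assumption, not a recommendation.

```yaml
# Fragment of spec.resources in the Composition (base abbreviated)
resources:
  - name: elasticache
    base:
      apiVersion: elasticache.aws.upbound.io/v1beta1
      kind: ReplicationGroup
      spec:
        forProvider:
          engine: redis
    patches:
      # Translate the developer-facing size into a vetted node type.
      - type: FromCompositeFieldPath
        fromFieldPath: spec.parameters.size
        toFieldPath: spec.forProvider.nodeType
        transforms:
          - type: map
            map:
              small: cache.t4g.small     # illustrative mapping
              medium: cache.t4g.medium   # illustrative mapping
              large: cache.r7g.large     # illustrative mapping
      # Pass the high-availability flag through to automatic failover.
      - type: FromCompositeFieldPath
        fromFieldPath: spec.parameters.highAvailability
        toFieldPath: spec.forProvider.automaticFailoverEnabled
```

Because the map lives in the Composition, the platform team can change what "medium" means without touching any Claim.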

Self-Service Databases, Queues, and Caches

A complete self-service resource catalog:

| Resource | Claim API | Backend | Credential Delivery |
| --- | --- | --- | --- |
| PostgreSQL | `PostgreSQLInstance` | RDS via Crossplane | K8s Secret via ExternalSecrets |
| Redis | `RedisInstance` | ElastiCache via Crossplane | K8s Secret via ExternalSecrets |
| RabbitMQ | `MessageQueue` | CloudAMQP or RabbitMQ Operator | K8s Secret directly |
| S3 Bucket | `ObjectStore` | S3 via Crossplane | IRSA (IAM Roles for Service Accounts) |
| Kafka Topic | `EventStream` | MSK via Crossplane or Strimzi | K8s Secret + ACLs |

Every resource type follows the same pattern: the developer creates a Claim, the Composition provisions the infrastructure, and credentials are delivered to the developer's namespace, in most cases as a Kubernetes Secret (the S3 bucket uses IRSA instead, so no secret is stored).
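
For the rows that deliver credentials through ExternalSecrets, the object the platform creates in the team's namespace could look roughly like this. The store name, the secret path, and the namespace are hypothetical, and the apiVersion depends on which External Secrets Operator release is installed.

```yaml
apiVersion: external-secrets.io/v1beta1   # newer operator releases serve external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: orders-db-credentials             # hypothetical name
  namespace: team-orders                  # hypothetical team namespace
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager             # assumed store managed by the platform team
  target:
    name: orders-db-credentials           # the Kubernetes Secret that workloads mount
  dataFrom:
    # Copy every key stored under this path into the target Secret.
    - extract:
        key: platform/team-orders/orders-db   # hypothetical path in the secret backend
```

Workloads then reference the resulting Secret by name, so developers never handle the credentials themselves.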

Guardrails Without Gates

Guardrails enforce standards without blocking developers behind approval queues. The distinction: a gate requires a human to say yes. A guardrail automatically rejects non-compliant requests and tells the developer why, so they can fix and re-submit immediately.

Policy-as-code with OPA/Gatekeeper or Kyverno:

# Kyverno policy: enforce resource limits
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-limits
      match:
        any:
          - resources:
              kinds: [Deployment, StatefulSet]
      validate:
        message: "CPU and memory limits are required"
        pattern:
          spec:
            template:
              spec:
                containers:
                  - resources:
                      limits:
                        memory: "?*"
                        cpu: "?*"
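
For contrast, here is a sketch of a Deployment that satisfies this policy. The application name and image are hypothetical. It reads its cache credentials from the session-cache-credentials Secret written by the Claim shown earlier.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: session-api                  # hypothetical workload
  namespace: team-identity
spec:
  replicas: 2
  selector:
    matchLabels:
      app: session-api
  template:
    metadata:
      labels:
        app: session-api
    spec:
      containers:
        - name: app
          image: registry.example.com/session-api:1.0.0   # hypothetical image
          envFrom:
            # Connection details published by the RedisInstance claim.
            - secretRef:
                name: session-cache-credentials
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            # Both limits must be present for the check-limits rule to pass.
            limits:
              cpu: 500m
              memory: 256Mi
```

Removing either entry under limits causes the admission request to be denied with the policy's message, which is the guardrail behaviour described above: an immediate, explained rejection rather than a queue.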

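Size-based guardrails: The schema of the Claim API, which the platform team defines in a CompositeResourceDefinition (XRD), validates parameters before any Composition runs. A size: xlarge request is rejected at the Claim level, ideally with an actionable message such as “Maximum allowed size is large. Contact platform-team for exceptions.”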
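
One way to get this rejection is an enum in the XRD that defines the Claim API. The sketch below assumes the group and kinds used in the earlier examples; the defaults are illustrative. With a plain enum the API server returns its generic "unsupported value" error, so the friendlier message quoted above would need an additional mechanism such as an admission policy.

```yaml
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
  name: xredisinstances.cache.platform.example.com
spec:
  group: cache.platform.example.com
  names:
    kind: XRedisInstance
    plural: xredisinstances
  claimNames:
    kind: RedisInstance
    plural: redisinstances
  versions:
    - name: v1alpha1
      served: true
      referenceable: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              required: [parameters]
              properties:
                parameters:
                  type: object
                  required: [size]
                  properties:
                    size:
                      type: string
                      # Anything outside this list is rejected when the Claim is applied.
                      enum: [small, medium, large]
                    version:
                      type: string
                      default: "7"          # illustrative default
                    highAvailability:
                      type: boolean
                      default: false        # illustrative default
```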

Cost guardrails: Tag resources with team identifiers. Set per-team budgets. Alert when spend approaches the threshold. This gives visibility and accountability without blocking anyone.
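
Tagging can be automated in the Composition so that developers cannot forget it. A minimal sketch, assuming one namespace per team and a tag key named team (both assumptions):

```yaml
# Fragment of a composed resource's patches in the Composition
patches:
  # The namespace the Claim was created in identifies the owning team.
  - type: FromCompositeFieldPath
    fromFieldPath: spec.claimRef.namespace
    toFieldPath: spec.forProvider.tags.team
```

Cloud cost reports can then be grouped by that tag to drive the per-team budgets and alerts.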

Approval-Free Workflows

The goal is to eliminate human approvals for standard operations. Here is what makes this safe:

Pre-approved resource definitions: The platform team pre-validates every option in the Claim API. If size: medium maps to a specific, vetted instance type, no approval is needed because the platform team already approved the configuration.
Policy enforcement in CI: PRs to the infrastructure repository are validated by OPA/Conftest before merge. Passing policy checks replaces human review for standard requests.
Auto-merge for policy-passing PRs: GitHub Actions can auto-merge PRs that pass all policy checks and were generated by the scaffolder:

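- name: Auto-merge if policy passes
  # Check the PR author: github.actor is whoever triggered the latest event, not necessarily the author.
  if: github.event.pull_request.user.login == 'backstage-bot' && steps.policy.outcome == 'success'
  # --auto requires auto-merge to be enabled in the repository settings.
  run: gh pr merge --auto --squash "${{ github.event.pull_request.number }}"
  env:
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}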

Exception path for non-standard requests: Anything outside the pre-approved parameters (custom instance types, cross-account networking, compliance-sensitive resources) routes to a human review queue. This is the only path that requires approval.
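
The policy check that the auto-merge condition depends on could be wired up as in the following sketch. The workflow file name, the policy/ directory holding the Rego rules, and the path filter are assumptions. The step id policy is what steps.policy.outcome refers to.

```yaml
# .github/workflows/validate-claims.yaml (hypothetical file name)
name: validate-claims
on:
  pull_request:
    paths:
      - 'infrastructure/**'
permissions:
  contents: write          # needed if a later step in this job merges the PR
  pull-requests: write
jobs:
  policy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - id: policy
        name: Validate claims with Conftest
        # Evaluates the Rego policies in ./policy (assumed location)
        # against the files under infrastructure/.
        run: |
          docker run --rm -v "$PWD:/project" openpolicyagent/conftest \
            test infrastructure/ --policy policy/
```

If the check fails, the job stops before any merge step runs, and the PR stays open for the developer to fix or for the exception path to pick up.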

The target: 90% or more of infrastructure requests provisioned in minutes with zero human involvement. The remaining non-standard requests get human review, which is where the platform team's expertise is actually needed. If more than 20% of requests end up in manual review, your self-service catalog is probably missing common use cases.