Why a Service Catalog Exists
A service catalog answers: “What do we have, who owns it, and what state is it in?” Without one, this information lives in tribal knowledge and stale wiki pages. When an incident hits at 3 AM, the on-call engineer needs to know who owns the failing service, what it depends on, and where to find the runbook. The catalog provides this in seconds.
The catalog is also the foundation for other platform capabilities. Golden paths register outputs in it. Scorecards evaluate catalog entities. Self-service workflows provision resources linked to catalog entries.
Backstage Catalog Model
Backstage organizes everything into entities defined by a kind and a type. The core entity kinds:
Component: A piece of software — service, library, website. Each has a type, owner, lifecycle (experimental, production, deprecated), and links to source code, CI/CD, docs, and API definitions.
API: An interface exposed by a component. The definition can be an OpenAPI, AsyncAPI, gRPC (protobuf), or GraphQL schema. APIs are first-class entities because they represent contracts between teams.
Resource: Infrastructure a component depends on — databases, caches, queues, S3 buckets.
System: A logical grouping of components and resources providing a business capability. The “orders system” includes the orders API, worker, database, and event stream.
Domain: Top-level business grouping (“Commerce,” “Payments,” “Identity”). Domains contain systems.
Group and User: Teams and individuals that own entities, typically synced from your identity provider.
The hierarchy flows: Domain > System > Component/API/Resource, mapping to domain-driven design concepts.
The catalog-info.yaml File
Every entity is defined by a YAML descriptor, conventionally placed at the repository root:

    apiVersion: backstage.io/v1alpha1
    kind: Component
    metadata:
      name: order-service
      description: Handles order creation, updates, and fulfillment tracking
      annotations:
        github.com/project-slug: myorg/order-service
        backstage.io/techdocs-ref: dir:.
        argocd/app-name: order-service
        pagerduty.com/service-id: P1234ABC
      tags:
        - go
        - grpc
        - postgresql
      links:
        - url: https://grafana.internal/d/order-service
          title: Grafana Dashboard
          icon: dashboard
    spec:
      type: service
      lifecycle: production
      owner: team-commerce
      system: orders
      providesApis:
        - order-api
      consumesApis:
        - inventory-api
        - payment-api
      dependsOn:
        - resource:orders-db
        - resource:orders-cache

Annotations drive plugin behavior: github.com/project-slug shows PRs and CI status, argocd/app-name shows deployment status, and pagerduty.com/service-id shows on-call info.
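Values such as resource:orders-db and team-commerce are entity references in the form [kind:][namespace/]name, where the namespace falls back to default when omitted. The following is a minimal Python sketch of how such a reference could be split into its parts. The function name parse_entity_ref is hypothetical, the caller supplies the default kind (for example group for spec.owner or api for spec.providesApis), and lowercasing the result is a choice made here for easy comparison, not a description of Backstage's own parser.

```python
from typing import NamedTuple


class EntityRef(NamedTuple):
    kind: str
    namespace: str
    name: str


def parse_entity_ref(ref: str, default_kind: str,
                     default_namespace: str = "default") -> EntityRef:
    """Split '[kind:][namespace/]name' into kind, namespace, and name.

    Hypothetical helper for illustration; parts are lowercased so that
    references can be compared as plain tuples.
    """
    kind, sep, rest = ref.partition(":")
    if not sep:  # no "kind:" prefix, so use the caller's default
        kind, rest = default_kind, ref
    namespace, sep, name = rest.partition("/")
    if not sep:  # no "namespace/" prefix, so use the default namespace
        namespace, name = default_namespace, rest
    if not kind or not namespace or not name:
        raise ValueError(f"invalid entity reference: {ref!r}")
    return EntityRef(kind.lower(), namespace.lower(), name.lower())
```

For example, parse_entity_ref("resource:orders-db", "component") yields the kind resource in the default namespace, while parse_entity_ref("team-commerce", "group") resolves the bare owner name to a group.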
An API entity alongside the component:

    apiVersion: backstage.io/v1alpha1
    kind: API
    metadata:
      name: order-api
      description: REST API for order management
    spec:
      type: openapi
      lifecycle: production
      owner: team-commerce
      system: orders
      definition:
        $text: ./api/openapi.yaml

The $text substitution pulls the OpenAPI spec from a file in the same repository. Backstage renders it as interactive API documentation.
Auto-Discovery
Manually registering every repository is impractical at scale. Backstage supports several discovery mechanisms:
GitHub discovery scans organizations for catalog-info.yaml files:

    # app-config.yaml
    catalog:
      providers:
        github:
          myorg:
            organization: myorg
            catalogPath: /catalog-info.yaml
            filters:
              repository: '.*'
            schedule:
              frequency: { minutes: 30 }
              timeout: { minutes: 3 }

This scans every repository in myorg every 30 minutes and registers the entities found in catalog-info.yaml. GitLab discovery works similarly. Kubernetes discovery can auto-register workloads but produces lower-quality entries, since K8s manifests lack semantic metadata.
Recommended approach: require catalog-info.yaml in every repository via golden paths, then use discovery to auto-register. Repositories without one show up as “unregistered” in governance dashboards.
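Discovery only sees repositories that already contain a descriptor, so an "unregistered" list has to be computed by comparing the full repository list against the catalog. A minimal Python sketch of that comparison follows. It assumes that every registered entity carries the github.com/project-slug annotation, that the repository list comes from elsewhere (for example your Git host's API), and that the entities are plain dictionaries in descriptor shape. The function name find_unregistered is hypothetical.

```python
from typing import Iterable, List


def find_unregistered(repo_slugs: Iterable[str],
                      entities: Iterable[dict]) -> List[str]:
    """Return repository slugs that no catalog entity points at.

    Assumption: registered entities identify their repository through
    the github.com/project-slug annotation.
    """
    registered = set()
    for entity in entities:
        annotations = (entity.get("metadata") or {}).get("annotations") or {}
        slug = annotations.get("github.com/project-slug")
        if slug:
            registered.add(slug.lower())
    # Compare case-insensitively and return a stable, sorted result
    return sorted(slug for slug in repo_slugs if slug.lower() not in registered)
```

The resulting list can feed a governance dashboard or a periodic report to the teams that own those repositories.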
Scorecards and Maturity Tracking
Scorecards evaluate catalog entities against a set of standards and show a maturity score. They answer “how production-ready is this service?” with concrete checks rather than opinions.
Example scorecard criteria for a production service:
| Check | Criteria | Points |
|---|---|---|
| Has owner | spec.owner is set and maps to an active team | 10 |
| Has description | metadata.description is non-empty and > 20 characters | 5 |
| TechDocs exist | backstage.io/techdocs-ref annotation present, docs build successfully | 10 |
| CI pipeline passes | Last GitHub Actions run on main is green | 10 |
| Has on-call | PagerDuty service linked and has an escalation policy | 15 |
| API spec defined | spec.providesApis is non-empty with valid API entities | 10 |
| Runs on Kubernetes | ArgoCD annotation present, app is synced and healthy | 10 |
| Has resource limits | Kubernetes Deployment has CPU and memory limits set | 10 |
| Dependency tracking | spec.dependsOn lists all consumed resources | 10 |
| Recent deploy | Last deployment was within 30 days | 10 |
Tools like Spotify’s Soundcheck, Cortex, and OpsLevel implement scorecards. You can also build custom scorecards with a cron job evaluating checks against the catalog API.
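As a rough illustration of the custom route, the Python sketch below evaluates the subset of checks from the table that can be answered from the entity descriptor alone (45 of the 100 points). It is a simplified approximation: it does not verify that the owner maps to an active team, that docs build, or that dependsOn is complete, and the remaining checks would need calls to CI, PagerDuty, ArgoCD, and Kubernetes. All function names are hypothetical, and entities are assumed to be plain dictionaries in descriptor shape.

```python
from typing import Callable, List, Tuple


def _spec(entity: dict) -> dict:
    return entity.get("spec") or {}


def _metadata(entity: dict) -> dict:
    return entity.get("metadata") or {}


def has_owner(entity: dict) -> bool:
    return bool(_spec(entity).get("owner"))


def has_description(entity: dict) -> bool:
    description = _metadata(entity).get("description") or ""
    return len(description.strip()) > 20


def has_techdocs_ref(entity: dict) -> bool:
    annotations = _metadata(entity).get("annotations") or {}
    return "backstage.io/techdocs-ref" in annotations


def provides_api(entity: dict) -> bool:
    return bool(_spec(entity).get("providesApis"))


def tracks_dependencies(entity: dict) -> bool:
    return bool(_spec(entity).get("dependsOn"))


# (check name, points, predicate), with points taken from the table above
CHECKS: List[Tuple[str, int, Callable[[dict], bool]]] = [
    ("Has owner", 10, has_owner),
    ("Has description", 5, has_description),
    ("TechDocs exist", 10, has_techdocs_ref),
    ("API spec defined", 10, provides_api),
    ("Dependency tracking", 10, tracks_dependencies),
]


def evaluate(entity: dict) -> Tuple[int, List[str]]:
    """Return the points earned and the names of the failed checks."""
    points = 0
    failed = []
    for name, value, check in CHECKS:
        if check(entity):
            points += value
        else:
            failed.append(name)
    return points, failed
```

Returning the failed check names alongside the score gives teams a concrete to-do list rather than just a number.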
Maturity levels derived from scores:
- Bronze (0-40 points): Basic registration only. Missing critical production readiness criteria.
- Silver (41-70 points): Has ownership, documentation, and CI. Missing some operational maturity.
- Gold (71-90 points): Production-ready. Has on-call, monitoring, and API definitions.
- Platinum (91-100 points): Fully mature. All checks pass.
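The mapping from points to level is a simple threshold lookup. A small Python sketch follows, using the boundaries listed above. The function names and the shape of the scores input (entity name mapped to points) are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict

# Inclusive upper bound of each level, in ascending order
LEVELS = [(40, "Bronze"), (70, "Silver"), (90, "Gold"), (100, "Platinum")]


def maturity_level(points: int) -> str:
    """Map a 0-100 score to its maturity level."""
    if not 0 <= points <= 100:
        raise ValueError(f"score out of range: {points}")
    for upper_bound, level in LEVELS:
        if points <= upper_bound:
            return level
    raise AssertionError("unreachable: 100 is the last upper bound")


def level_distribution(scores: Dict[str, int]) -> Counter:
    """Count how many entities fall into each level."""
    return Counter(maturity_level(points) for points in scores.values())
```

A distribution per team or per system is often more useful on a dashboard than individual scores, since it shows where the portfolio as a whole stands.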
Display maturity levels prominently in the catalog UI. Teams tend to work on their scores once the data is visible to everyone.
Ownership Enforcement
Ownership is the single most important catalog field. Without clear ownership, incidents escalate slowly, tech debt accumulates invisibly, and services become orphans.
Enforcement strategies:
Block unowned entities: A CI check on catalog-info.yaml changes rejects any entity where spec.owner is empty or refers to a non-existent group. This is a hard gate.
Orphan detection: Weekly report listing entities whose owning team has been dissolved or has zero members. These need re-assignment.
Ownership transfer: When a team is reorganized, run a script identifying all entities owned by the old team and open PRs to transfer ownership. This is not optional.
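The first two strategies can be sketched in a few lines of Python. The code below is an illustration under stated assumptions, not a drop-in CI step: entities are plain dictionaries parsed from catalog-info.yaml, the set of known groups and their member counts come from your identity provider or the catalog, namespaces in owner references are ignored for simplicity, and all function names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set


def owner_group(owner: str) -> Optional[str]:
    """Return the group name from an owner reference, or None if the
    reference points at another kind (for example 'user:jane').
    A bare name is treated as a group; the namespace is dropped."""
    kind, sep, rest = owner.partition(":")
    if not sep:
        kind, rest = "group", owner
    if kind.lower() != "group":
        return None
    return rest.rpartition("/")[2].lower()


def validate_owner(entity: dict, known_groups: Set[str]) -> Optional[str]:
    """Return an error message for the CI gate, or None if the owner is valid."""
    owner = (entity.get("spec") or {}).get("owner")
    if not owner:
        return "spec.owner is missing or empty"
    group = owner_group(owner)
    if group is None:
        return f"spec.owner must reference a Group, got {owner!r}"
    if group not in known_groups:
        return f"spec.owner refers to unknown group {group!r}"
    return None


def find_orphans(entities: Iterable[dict],
                 member_counts: Dict[str, int]) -> List[str]:
    """List entity names whose owning group is gone or has zero members."""
    orphans = []
    for entity in entities:
        owner = (entity.get("spec") or {}).get("owner") or ""
        group = owner_group(owner) if owner else None
        if group is None or member_counts.get(group, 0) == 0:
            orphans.append(entity["metadata"]["name"])
    return sorted(orphans)
```

In CI, a non-None result from validate_owner would fail the build. The output of find_orphans could feed the weekly report.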
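Ownership validation in Backstage. The catalog.rules section is standard Backstage configuration. Note that owner-validator is not a processor that ships with core Backstage; treat that part as the configuration shape for a custom processor registered in your backend:

    catalog:
      rules:
        - allow: [Component, API, Resource, System]
      processors:
        - type: owner-validator
          config:
            requireOwner: true
            allowedOwnerKinds: [Group]

Tech Debt Visibility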
The catalog surfaces tech debt where engineers already look. Concrete approaches:
Deprecation tracking: Set spec.lifecycle: deprecated on components that should be migrated away from. Track how many services still consume deprecated APIs.
Dependency age: Annotate components with framework/runtime versions. “47 services on Go 1.20 (EOL), 12 on Go 1.22 (current)” makes tech debt visible at the portfolio level.
Security findings: Integrate Snyk, Trivy, or Dependabot results into catalog entity pages alongside deployment status and on-call info.
Custom tech debt tags: Let teams tag entities with labels (needs-migration, legacy-auth, manual-deploy). Aggregate into a dashboard by team and severity to create organizational pressure without mandating timelines.
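Two of these approaches reduce to simple aggregations over catalog entities, sketched below in Python. The annotation key example.com/go-version is hypothetical (the catalog defines no standard annotation for runtime versions), entities are assumed to be plain dictionaries in descriptor shape, and API references in consumesApis are assumed to be bare names in the same namespace.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List

# Hypothetical annotation key; pick whatever convention your teams agree on
RUNTIME_ANNOTATION = "example.com/go-version"


def runtime_version_counts(entities: Iterable[dict]) -> Counter:
    """Count entities per runtime version, e.g. to report services on an EOL release."""
    counts: Counter = Counter()
    for entity in entities:
        annotations = (entity.get("metadata") or {}).get("annotations") or {}
        version = annotations.get(RUNTIME_ANNOTATION)
        if version:
            counts[version] += 1
    return counts


def deprecated_api_consumers(entities: Iterable[dict]) -> Dict[str, List[str]]:
    """Map each deprecated API to the components that still consume it."""
    entities = list(entities)  # iterated twice below
    deprecated = {
        e["metadata"]["name"]
        for e in entities
        if e.get("kind") == "API"
        and (e.get("spec") or {}).get("lifecycle") == "deprecated"
    }
    consumers = defaultdict(list)
    for e in entities:
        if e.get("kind") != "Component":
            continue
        for api in (e.get("spec") or {}).get("consumesApis") or []:
            if api in deprecated:
                consumers[api].append(e["metadata"]["name"])
    return {api: sorted(names) for api, names in consumers.items()}
```

Grouping either result by spec.owner gives the per-team view that the dashboard approach describes.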