Release Management Patterns#
Releasing software is more than merging to main and deploying. A disciplined release process ensures that every version is identifiable, every change is documented, every deployment is reversible, and failures are contained before they reach all users. This operational sequence walks through each phase of a production release workflow.
Phase 1 – Semantic Versioning#
Step 1: Adopt Semantic Versioning#
Semantic versioning (semver) communicates the impact of changes through the version number itself: MAJOR.MINOR.PATCH.
- MAJOR (1.0.0 to 2.0.0): Breaking changes. API contracts changed. Consumers must update their code.
- MINOR (1.1.0 to 1.2.0): New features, backward compatible. Existing consumers are unaffected.
- PATCH (1.2.0 to 1.2.1): Bug fixes, backward compatible. No new functionality.
Pre-release versions append a hyphen suffix: 2.0.0-rc.1, 2.0.0-beta.3. Build metadata uses a plus suffix: 1.2.3+build.456. Pre-release versions sort before their release counterpart (2.0.0-rc.1 < 2.0.0).
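The MAJOR/MINOR/PATCH bump rules above can be sketched as a small shell function (illustrative only, not part of any versioning tool; pre-release and build suffixes are out of scope here):

```shell
# Sketch: apply a semver bump to a plain MAJOR.MINOR.PATCH string.
bump() {
  local version=$1 part=$2 major minor patch
  IFS=. read -r major minor patch <<< "$version"
  case $part in
    major) echo "$((major + 1)).0.0" ;;        # breaking change: reset minor and patch
    minor) echo "${major}.$((minor + 1)).0" ;; # new feature: reset patch
    patch) echo "${major}.${minor}.$((patch + 1))" ;;
  esac
}

bump 1.2.3 major   # -> 2.0.0
bump 1.2.3 minor   # -> 1.3.0
bump 1.2.3 patch   # -> 1.2.4
```

Note that a MAJOR bump resets both lower components, and a MINOR bump resets PATCH.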
Step 2: Enforce Conventional Commits#
Conventional Commits provide a structured commit message format that automation tools parse to determine the version bump:
```
feat: add user export to CSV

BREAKING CHANGE: export endpoint now requires authentication
```

The format is type(optional-scope): description. Key types: feat (MINOR bump), fix (PATCH bump), docs, chore, refactor, perf, test, ci. A BREAKING CHANGE: footer or ! after the type (like feat!:) triggers a MAJOR bump.
Enforce the format with commitlint in CI using the wagoid/commitlint-github-action or equivalent. Configure it to extend @commitlint/config-conventional.
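A minimal CI job for this might look like the following (the action version, trigger, and workflow path are assumptions; adjust to your repository):

```yaml
# .github/workflows/commitlint.yml — sketch of commit-message enforcement
name: Lint commits
on: [pull_request]
jobs:
  commitlint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # commitlint needs the full history of the PR's commits
      - uses: wagoid/commitlint-github-action@v6
```

With @commitlint/config-conventional in place, any commit that does not match the type: description format fails the check.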
Step 3: Automate Version Determination#
Tools like semantic-release read the commit history since the last tag, determine the appropriate version bump, and create the release. Configure it with plugins for commit analysis, release notes generation, changelog updates, Git commits, and GitHub Release creation. The version number is determined entirely by the commit messages – no human decides whether it is a major, minor, or patch release. For Go projects, use go-semantic-release which operates directly on Git tags.
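The plugin chain described above could be configured roughly as follows (a sketch; semantic-release also accepts .releaserc.json and release.config.js, and plugin order matters — changelog before git so the updated file is committed):

```yaml
# .releaserc.yml — sketch of a semantic-release configuration
branches: [main]
plugins:
  - "@semantic-release/commit-analyzer"          # determines the bump from commits
  - "@semantic-release/release-notes-generator"  # builds the release notes
  - "@semantic-release/changelog"                # updates CHANGELOG.md
  - "@semantic-release/git"                      # commits the changelog back
  - "@semantic-release/github"                   # creates the GitHub Release
```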
Phase 2 – Changelog Generation#
Step 4: Generate Changelogs Automatically#
The changelog should be generated from commit history, not written by hand. Manual changelogs drift from reality and consume engineering time.
semantic-release generates changelogs as part of the release process. For standalone changelog generation, git-cliff parses conventional commits and groups them by type (features, bug fixes, performance) using a cliff.toml configuration. Run git-cliff --unreleased --prepend CHANGELOG.md to add unreleased changes, or git-cliff -o CHANGELOG.md to regenerate the full changelog.
Step 5: Attach Release Notes to Tags#
Release notes should appear where consumers look: GitHub/GitLab Releases, not buried in a changelog file. Create annotated tags with git tag -a v1.5.0 -m "Release notes here", or automate with git-cliff --latest --strip header | gh release create v1.5.0 --title "v1.5.0" --notes-file -.
Phase 3 – Release Branching#
Step 6: Choose a Branching Strategy#
Trunk-based development with release tags (recommended for most teams): All development happens on main. Releases are created by tagging commits on main. No long-lived release branches. Hotfixes merge to main and either ship immediately as a new patch tag or ride along with the next release.
```
main: A--B--C--D--E--F--G--H
            ^        ^     ^
         v1.0.0   v1.1.0 v1.1.1
```

Release branches (for teams supporting multiple versions): Create a branch when you need to maintain an older version while main advances:
```
main:        A--B--C--D--E--F--G--H--I
                    \           \
release/1.x:         C'--D'      \        (v1.0.1, v1.0.2)
                                  \
release/2.x:                       G'--H' (v2.0.0, v2.0.1)
```

Step 7: Implement Release Branches When Needed#
Create the branch from the release tag, not from main HEAD: git checkout -b release/2.x v2.0.0. Cherry-pick hotfixes from main into the release branch (never the reverse). Only bug fixes go to release branches – never backport features. Delete release branches when the version reaches end-of-life.
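The branch-from-tag and cherry-pick flow can be demonstrated end to end in a throwaway repository (all commits, tags, and file names below are illustrative):

```shell
# Self-contained sketch of the hotfix backport flow.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email dev@example.com
git config user.name Dev

echo "v2 feature" > app.txt
git add app.txt && git commit -qm "feat: v2 feature"
git tag -a v2.0.0 -m "Release 2.0.0"

echo "bug fix" >> app.txt
git add app.txt && git commit -qm "fix: critical bug"
fix_commit=$(git rev-parse HEAD)

echo "next feature" >> app.txt
git add app.txt && git commit -qm "feat: next feature"

# Branch from the release tag, not from main HEAD:
git checkout -q -b release/2.x v2.0.0
# Backport the fix from main (never merge the other direction):
git cherry-pick "$fix_commit" >/dev/null
git tag -a v2.0.1 -m "Backport: critical bug fix"
```

The release/2.x branch ends up with exactly two commits: the v2.0.0 release commit and the backported fix, while main keeps advancing independently.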
Phase 4 – Release Validation Gates#
Step 8: Define Pre-Release Gates#
Before any release reaches production, it must pass through validation gates. Structure your release pipeline as a sequence of gates that must all pass:
- Unit tests – run the full test suite against the tagged commit.
- Integration tests – verify service interactions and database migrations.
- Security scan – scan source and container image for CRITICAL/HIGH vulnerabilities (Trivy, Snyk, or equivalent).
- Image build – build and push the container image tagged with the version.
- Deploy to staging – depends on all four gates above passing.
- Smoke tests – validate the staging deployment with automated endpoint checks.
- Deploy to production – depends on smoke tests. Use a GitHub Actions environment with required reviewers for manual approval gates.
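The gate sequence above maps naturally onto GitHub Actions jobs with needs: dependencies (a sketch; the make targets and workflow trigger are illustrative, not prescribed):

```yaml
# Sketch: release gates as a GitHub Actions workflow
on:
  push:
    tags: ["v*"]
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test
  integration-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make integration-test
  security-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make security-scan
  image-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make image VERSION=${{ github.ref_name }}
  deploy-staging:
    needs: [unit-tests, integration-tests, security-scan, image-build]
    runs-on: ubuntu-latest
    steps:
      - run: make deploy ENV=staging
  smoke-tests:
    needs: deploy-staging
    runs-on: ubuntu-latest
    steps:
      - run: make smoke-test ENV=staging
  deploy-production:
    needs: smoke-tests
    runs-on: ubuntu-latest
    environment: production   # required reviewers enforce the manual approval
    steps:
      - run: make deploy ENV=production
```

A failure in any job blocks everything downstream of it, so a bad build can never reach staging, let alone production.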
Step 9: Implement Smoke Tests#
Smoke tests run against the deployed staging environment to verify basic functionality. They should be fast (under 2 minutes) and cover critical paths: health checks, authentication, core API endpoints, and database connectivity. They are not comprehensive test suites – they verify the deployment did not fundamentally break the application. Write a simple script that curls each critical endpoint and fails on unexpected status codes.
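Such a script can be as small as the following sketch (the base URL and endpoint paths are illustrative; adjust the expected status codes to your API):

```shell
# Minimal smoke-test sketch: probe each critical path and fail on any
# unexpected status code.
smoke() {
  local base=$1 path status failed=0
  shift
  for path in "$@"; do
    status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$base$path")
    if [ "$status" = "200" ]; then
      echo "OK   $path"
    else
      echo "FAIL $path -> $status"
      failed=1
    fi
  done
  return "$failed"
}

# Example invocation against staging:
# smoke https://staging.example.com /healthz /api/v1/status /login
```

The function exits non-zero if any endpoint misbehaves, which is exactly what a CI gate needs.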
Phase 5 – Rollback Procedures#
Step 10: Define Rollback Mechanisms#
Every deployment must have a documented rollback path. The method depends on your deployment tool:
Kubernetes native: kubectl rollout undo deployment/myapp -n production rolls back to the previous revision. Add --to-revision=3 to target a specific revision. Use kubectl rollout history to view available revisions.
Helm: helm rollback myapp -n production rolls back to the previous release. Specify a revision number to target a specific release. Use helm history to view available releases.
ArgoCD: argocd app rollback myapp <history-id> or sync to a specific tag with argocd app sync myapp --revision v1.4.2.
Step 11: Establish Rollback Decision Criteria#
Define clear, measurable criteria that trigger a rollback – do not leave this to judgment under pressure. Examples: error rate exceeds 1% for more than 2 minutes, P99 latency increases by more than 50%, health checks return non-200 for more than 30 seconds, or any critical alert fires. Encode these as monitoring alerts. When a deployment alert fires, the on-call engineer’s first action is rollback, not debugging.
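As a sketch, the error-rate criterion could be encoded as a Prometheus alerting rule like this (the metric name, labels, and alert name are illustrative):

```yaml
# Sketch: "error rate > 1% for 2 minutes" as a Prometheus rule
groups:
  - name: deployment-rollback
    rules:
      - alert: DeployErrorRateHigh
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[2m]))
            / sum(rate(http_requests_total[2m])) > 0.01
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Error rate above 1% for 2m - roll back the deployment"
```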
Step 12: Test Rollbacks Regularly#
An untested rollback is not a rollback plan. Include rollback testing in your release process: deploy the new version to staging, immediately roll back, verify the rollback completes, then re-deploy. This confirms the mechanism works before you need it under incident pressure.
Phase 6 – Progressive Rollouts#
Step 13: Implement Canary Deployments#
Progressive rollouts limit the blast radius of a bad release. Instead of routing all traffic to the new version immediately, route a small percentage first and increase gradually. Using Argo Rollouts, define a canary strategy with weight steps:
```yaml
strategy:
  canary:
    steps:
      - setWeight: 10
      - pause: { duration: 5m }
      - setWeight: 30
      - pause: { duration: 5m }
      - setWeight: 60
      - pause: { duration: 5m }
      - setWeight: 100
```

This sends 10% of traffic to the new version, waits 5 minutes, increases to 30%, and so on. If a problem is detected at any pause step, the rollout aborts and traffic shifts back to the stable version.
Step 14: Add Automated Analysis to Canary Steps#
Combine canary deployments with Argo Rollouts AnalysisTemplates to automate rollout decisions. Define an AnalysisTemplate that queries Prometheus for the success rate (e.g., successCondition: result[0] > 0.99). Insert analysis steps between weight increases in the canary strategy. If the success rate drops below the threshold, the rollout automatically aborts and rolls back without human intervention.
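A sketch of such a template follows (the metric query, app label, and Prometheus address are illustrative; adjust to your cluster):

```yaml
# Sketch: an AnalysisTemplate that gates the rollout on request success rate
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  metrics:
    - name: success-rate
      interval: 1m
      failureLimit: 1
      successCondition: result[0] > 0.99
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            sum(rate(http_requests_total{app="myapp",status!~"5.."}[2m]))
              / sum(rate(http_requests_total{app="myapp"}[2m]))
```

Reference this template from analysis steps in the canary strategy; a single failed measurement (failureLimit: 1) aborts the rollout.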
Step 15: Blue-Green for Zero-Downtime Switches#
Blue-green deployments maintain two complete environments. The new version deploys behind a preview service. Run validation against the preview. When satisfied, promote to switch the active service to the new version. Set scaleDownDelaySeconds (e.g., 300) to keep old pods running as a fast rollback option – if anything goes wrong in the first minutes, rolling back is instantaneous because the old pods are still available.
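In Argo Rollouts terms, the blue-green setup described above looks roughly like this (service names are illustrative):

```yaml
# Sketch: blue-green strategy with a preview service and delayed scale-down
strategy:
  blueGreen:
    activeService: myapp-active      # receives production traffic
    previewService: myapp-preview    # receives the new version for validation
    autoPromotionEnabled: false      # promote manually after checking preview
    scaleDownDelaySeconds: 300       # keep old pods 5 min as a fast rollback path
```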
Common Mistakes#
- Manually determining version numbers. Humans forget, disagree, and make mistakes. Automate version determination from commit messages.
- Writing changelogs by hand. Hand-written changelogs are always incomplete. Generate them from commit history. If your commit messages are not informative enough for a changelog, the problem is your commit messages.
- No rollback testing. Teams discover their rollback does not work during an incident. Test rollbacks as part of every release to non-production environments.
- Deploying 100% immediately. Progressive rollouts exist to contain failures. Even a 5-minute canary at 10% traffic catches issues that testing missed.
- Rollback criteria that require judgment. “Roll back if things look bad” is not a criterion. Define specific, measurable thresholds that any on-call engineer can evaluate in seconds.