Feature Flags: Decoupling Deployment from Release

Deployment and release are not the same thing. Deployment is shipping code to production. Release is enabling that code for users. Feature flags make this separation explicit. You deploy code that sits behind a conditional check, and you control when and for whom that code activates – independently of when it was deployed.

This distinction changes how teams work. Developers merge unfinished features to main because the code is behind a flag and invisible to users. A broken feature can be disabled in seconds without a rollback deploy. New features roll out to 1% of users, then 10%, then 50%, then 100%, with a kill switch available at every stage.

Flag Types

Boolean flags are the simplest: on or off. Use them for kill switches (disable a broken feature), feature gates (enable a feature for beta users), and operational toggles (switch between code paths).

Multivariate flags return one of several string or number variants. Use them for A/B testing (variant A shows a blue button, variant B shows a green button) and configuration management (variant “fast” uses a smaller batch size, variant “thorough” uses a larger one).
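As a rough illustration of the configuration-management case, a string variant can be mapped to concrete settings in application code. The batch sizes, the `BATCH_SIZES` table, and the `batch_size_for` helper below are assumptions made up for this sketch; only the variant names "fast" and "thorough" come from the text above.

```python
from typing import Dict

# Hypothetical variant-to-configuration table; the batch sizes are illustrative.
BATCH_SIZES: Dict[str, int] = {"fast": 100, "thorough": 1000}
DEFAULT_VARIANT = "fast"


def batch_size_for(variant: str) -> int:
    """Return the batch size for a variant, falling back to the default variant."""
    return BATCH_SIZES.get(variant, BATCH_SIZES[DEFAULT_VARIANT])
```

Falling back to a known variant means an unexpected value from the flag service degrades to safe behavior rather than raising an error.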

Percentage-based rollout splits users by percentage. 5% of users see the new checkout flow, 95% see the old one. Increase the percentage as confidence grows.
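One common way to implement a percentage split is to hash a stable user identifier into a bucket, so the same user always lands on the same side of the split. The following is a minimal sketch under that assumption; `rollout_bucket` and `is_enabled` are hypothetical names, and real platforms may use a different hash or bucket granularity.

```python
import hashlib


def rollout_bucket(flag_key: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(flag_key: str, user_id: str, percentage: int) -> bool:
    """Enable the flag for users whose bucket falls below the rollout percentage."""
    return rollout_bucket(flag_key, user_id) < percentage
```

Because the bucket depends only on the flag key and the user ID, raising the percentage adds users without removing anyone who already had the feature. Including the flag key in the hash keeps the same users from always being first in line for every flag.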

User-targeted flags enable features for specific users or groups. Enable the redesigned dashboard for the engineering team, the sales team, or a specific list of beta testers – regardless of the percentage rollout setting.
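The precedence described above, where explicit targeting wins over the percentage rollout, might be sketched as follows. `TargetingRule` and `evaluate` are hypothetical names, and `in_rollout` stands for whatever the percentage rollout decided for this user.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class TargetingRule:
    """Explicit allow-lists for a flag (hypothetical structure)."""
    allowed_users: Set[str] = field(default_factory=set)
    allowed_groups: Set[str] = field(default_factory=set)


def evaluate(rule: TargetingRule, user_id: str, groups: Iterable[str], in_rollout: bool) -> bool:
    """Targeted users and groups get the feature regardless of the rollout result."""
    if user_id in rule.allowed_users:
        return True
    if rule.allowed_groups & set(groups):
        return True
    return in_rollout


# Example rule: the engineering group and one named beta tester.
dashboard_rule = TargetingRule(allowed_users={"beta-tester-7"}, allowed_groups={"engineering"})
```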

Use Cases

Progressive rollout is the primary use case. Deploy a new payment processor integration behind a flag. Enable it for 1% of transactions. Monitor error rates and success rates. Increase to 10% if metrics look good. Continue until 100%. At any point, flip the flag off to instantly revert to the old processor.
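The decision at each stage can be expressed as a small function: advance when metrics are healthy, drop to zero when they are not. This is only a sketch; the 1% error-rate threshold and the `next_percentage` helper are assumptions, and the stages mirror the 1%, 10%, 50%, 100% progression mentioned earlier.

```python
# Rollout stages in percent, following the progression described in this article.
ROLLOUT_STEPS = [1, 10, 50, 100]


def next_percentage(current: int, error_rate: float, max_error_rate: float = 0.01) -> int:
    """Return the next rollout percentage, or 0 to revert if errors exceed the threshold."""
    if error_rate > max_error_rate:
        return 0  # kill switch: send all traffic back to the old code path
    for step in ROLLOUT_STEPS:
        if step > current:
            return step
    return 100  # already fully rolled out
```

In practice a human or an automated job would call something like this after each observation window, then write the result back to the flag service.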

Kill switches provide an instant escape valve. If your new search algorithm causes timeouts during peak traffic, disable it without deploying anything. The response time is measured in seconds, not the minutes or hours a deployment pipeline takes.

Trunk-based development becomes practical with flags. Instead of long-lived feature branches that diverge painfully from main, developers merge work-in-progress code behind flags daily. The code is in production but inactive. This keeps merge conflicts small and infrequent and keeps the codebase integrated.

A/B testing uses flags to show different experiences to different user segments and measure the impact on business metrics. Which version of the pricing page converts better? Ship both behind a multivariate flag and measure.

Platform Comparison

LaunchDarkly

LaunchDarkly is the most mature feature flag platform. It is SaaS-only, with SDKs for every major language and framework. The targeting engine supports complex rules: enable a flag for users in a specific geography, on a specific plan, who signed up after a specific date. The experimentation features integrate flag evaluation with analytics to measure the impact of feature variants.

Choose LaunchDarkly when you need advanced targeting rules, experimentation capabilities, enterprise SSO and audit logging, and you have the budget. Per-seat pricing in the range of $10-20 per month has been typical, which adds up for large teams; check the current pricing page, since plans and billing models change.

The SDK architecture uses a streaming connection to receive flag updates in real time. When you change a flag in the LaunchDarkly dashboard, connected SDKs typically receive the update in well under a second. Local evaluation means the SDK evaluates flags without making a network call for each evaluation – it maintains an in-memory cache of flag rules.
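Conceptually, local evaluation looks like the sketch below: a background listener writes updates into an in-memory store, and each evaluation is a dictionary lookup. This is not LaunchDarkly's implementation; `FlagStore` and its methods are hypothetical names used only to illustrate the idea.

```python
import threading
from typing import Dict


class FlagStore:
    """In-memory flag cache; updates arrive from a (hypothetical) streaming listener."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._flags: Dict[str, bool] = {}

    def apply_update(self, key: str, enabled: bool) -> None:
        """Called whenever the flag service pushes a change."""
        with self._lock:
            self._flags[key] = enabled

    def is_enabled(self, key: str, default: bool) -> bool:
        """Evaluate locally: no network call, just a lookup with a fallback."""
        with self._lock:
            return self._flags.get(key, default)


store = FlagStore()
store.apply_update("new-search", True)  # simulate an update pushed over the stream
```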

Unleash

Unleash is open-source with an optional cloud-hosted offering. It provides a solid feature set including gradual rollout strategies, user targeting, and Prometheus metrics integration. The self-hosted option runs as a single Node.js application with a PostgreSQL database.

Choose Unleash when you want to self-host, when cost is a factor, or when you need basic-to-intermediate flags without enterprise pricing. Built-in activation strategies include gradual rollout (percentage), user IDs, IPs, and flexible rollout (combine multiple criteria). Custom strategies let you implement arbitrary evaluation logic.

Flipt

Flipt is open-source, distributed as a single binary, and exposes both gRPC and REST APIs. It stores flag configuration in a local database (SQLite, PostgreSQL, MySQL) or can read from a Git repository or object storage for a GitOps workflow.

Choose Flipt when you want minimal infrastructure overhead and simple boolean or percentage-based flags. It has no user management – purely a flag evaluation engine. The GitOps mode is notable: define flags as YAML in a Git repository, and Flipt watches for changes, putting flag configuration through the same review process as code.

OpenFeature

OpenFeature is a vendor-neutral API standard for feature flags. Instead of coding against LaunchDarkly’s SDK or Unleash’s SDK directly, you code against the OpenFeature API and swap providers without changing application code:

from openfeature import api
from openfeature.contrib.provider.flagd import FlagdProvider

# Configure once at startup
api.set_provider(FlagdProvider())

# Evaluate flags using the standard API
client = api.get_client()
show_new_checkout = client.get_boolean_value("new-checkout", default_value=False)

Switching providers means changing the initialization, not every flag evaluation call in your codebase.

Implementation Patterns

SDK Initialization and Resilience

Initialize the flag SDK once at application startup and reuse the same client for every evaluation. The SDK maintains a local cache of flag rules so that evaluations work even if the flag service is temporarily unreachable:

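import (
    "log"
    "time"

    "github.com/launchdarkly/go-sdk-common/v3/ldcontext"
    ld "github.com/launchdarkly/go-server-sdk/v7"
    "github.com/launchdarkly/go-server-sdk/v7/ldcomponents"
)

config := ld.Config{
    Events: ldcomponents.SendEvents(),
}
// MakeCustomClient takes a Config; wait up to 10 seconds for initialization
client, err := ld.MakeCustomClient("sdk-key-xxx", config, 10*time.Second)
if err != nil {
    // Flag service unreachable at startup -- evaluations return defaults
    log.Printf("flag service unavailable, using defaults: %v", err)
}

// Evaluate with a fallback default ("user-key-123" is a placeholder key)
user := ldcontext.New("user-key-123")
showNewUI, _ := client.BoolVariation("new-ui", user, false)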

The fallback default (the false in BoolVariation) is critical. If the flag service is down, if the flag does not exist, or if evaluation fails for any reason, the application uses this default. Choose defaults that represent the safe, known-good behavior – typically the existing functionality.
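The same principle applies if you wrap or write your own flag client. A minimal sketch, assuming a hypothetical `evaluate` callable that may raise or return something unexpected:

```python
import logging
from typing import Callable

logger = logging.getLogger("flags")


def evaluate_with_default(evaluate: Callable[[str], object], flag_key: str, default: bool) -> bool:
    """Return the flag value, or the safe default if evaluation fails in any way."""
    try:
        value = evaluate(flag_key)
    except Exception as exc:  # service down, unknown flag, malformed rule, ...
        logger.warning("flag %s evaluation failed, using default: %s", flag_key, exc)
        return default
    # Guard against a non-boolean result (for example, a wrong flag type).
    return value if isinstance(value, bool) else default
```

The caller never sees an exception from flag evaluation, so a flag-service outage degrades to the existing behavior instead of taking the request down with it.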

Flag Evaluation in Hot Paths

Modern SDKs evaluate flags locally against cached rules – each evaluation is a microsecond-scale in-memory operation. Do not cache flag results in your own code. The SDK already does this, and your cache would prevent real-time updates from taking effect.

Flag Cleanup

Feature flags are technical debt. Every flag adds a conditional branch that developers must understand, test, and maintain. After a flag has been rolled out to 100% and baked for a reasonable period (one to two sprint cycles), remove it:

1. Confirm the flag is enabled for 100% of users and has been stable.
2. Remove the flag evaluation from application code, keeping only the “on” path.
3. Remove the flag definition from the flag service.
4. Deploy the cleanup.

Track flag age in your flag service. Flags older than 90 days that are still at 100% rollout are cleanup candidates. Some teams add lint rules that flag (no pun intended) stale flag evaluations in code.
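If your flag service can export flag metadata, finding cleanup candidates is a simple filter. The sketch below assumes a hypothetical `FlagInfo` record with a creation date and rollout percentage; the field names and sample flags are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class FlagInfo:
    key: str
    created: date
    rollout_percentage: int


def cleanup_candidates(flags: List[FlagInfo], today: date, max_age_days: int = 90) -> List[str]:
    """Return keys of flags at 100% rollout that are older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [f.key for f in flags if f.rollout_percentage == 100 and f.created < cutoff]


# Illustrative sample data.
sample_flags = [
    FlagInfo("new-checkout", date(2024, 1, 1), 100),  # old and fully rolled out
    FlagInfo("new-search", date(2024, 5, 1), 100),    # fully rolled out but recent
    FlagInfo("new-pricing", date(2024, 1, 1), 50),    # old but still rolling out
]
```

A scheduled job could run a check like this and open tickets or post alerts for each candidate.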

Testing with Flags

Test both states of every flag in your CI pipeline:

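import (
    "testing"

    // assert is assumed to be testify's assertion package
    "github.com/stretchr/testify/assert"
)

// testFlagClient, processCheckout, and order are assumed to be defined
// elsewhere in the test package.

func TestCheckout_NewFlow(t *testing.T) {
    // Test with flag enabled
    client := testFlagClient(map[string]bool{"new-checkout": true})
    result := processCheckout(client, order)
    assert.Equal(t, "v2", result.Flow)
}

func TestCheckout_OldFlow(t *testing.T) {
    // Test with flag disabled
    client := testFlagClient(map[string]bool{"new-checkout": false})
    result := processCheckout(client, order)
    assert.Equal(t, "v1", result.Flow)
}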

If you only test the “on” path, you will not discover that the “off” path (your default, which runs for the majority of users during rollout) has a bug.

Kubernetes Integration

ConfigMaps can serve as simple feature flags, but they lack targeting, gradual rollout, and real-time updates. For production use, run the flag service (Unleash, Flipt) as a deployment in your cluster. Application pods connect over the cluster network, keeping evaluation latency low.
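For the simple ConfigMap case, one approach is to expose ConfigMap keys as environment variables and parse them at read time. The sketch below assumes a hypothetical variable name such as `FEATURE_NEW_CHECKOUT`; the `env_flag` helper and the accepted truthy strings are illustrative choices.

```python
import os
from typing import Mapping

TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False, environ: Mapping[str, str] = os.environ) -> bool:
    """Read a boolean flag from the environment, falling back to the default if unset."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY
```

Environment variables sourced from a ConfigMap are fixed when the container starts, so changing the flag requires restarting the pods. That limitation is one reason a dedicated flag service is the better fit for production rollouts.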

Operational Concerns

Flag sprawl happens when teams create flags freely but never remove them. Enforce cleanup with automated alerts for flags at 100% rollout for more than 90 days.

Stale flags are flags that have been enabled for all users while the evaluation code remains in place. The “off” branch is dead code that still has to be read and maintained. Remove them.

Flag dependencies create subtle bugs. If flag A assumes flag B is enabled and someone disables flag B, flag A breaks unexpectedly. Document dependencies explicitly and use flag prerequisites where your platform supports them.
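Where the platform has no built-in prerequisites, the dependency can be made explicit in the evaluation itself. This is a sketch under that assumption; the `is_enabled` function and the `states` and `prerequisites` dictionaries are hypothetical structures, not any platform's API.

```python
from typing import Dict, FrozenSet, List


def is_enabled(
    flag_key: str,
    states: Dict[str, bool],
    prerequisites: Dict[str, List[str]],
    _visiting: FrozenSet[str] = frozenset(),
) -> bool:
    """A flag is on only if it is enabled and every prerequisite flag is also on."""
    if flag_key in _visiting:
        raise ValueError(f"circular flag dependency involving {flag_key!r}")
    if not states.get(flag_key, False):
        return False
    visiting = _visiting | {flag_key}
    return all(
        is_enabled(dep, states, prerequisites, visiting)
        for dep in prerequisites.get(flag_key, [])
    )


# Flag A depends on flag B (illustrative names).
prereqs = {"flag-a": ["flag-b"]}
```

With this shape, disabling flag B turns flag A off cleanly instead of leaving A running against assumptions that no longer hold.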

Common Gotchas

Treating flags as permanent configuration. Feature flags are temporary by design. They exist to de-risk a rollout. Once the rollout is complete and stable, the flag should be removed. Long-lived operational toggles (debug mode, maintenance mode) are a different pattern that should be managed differently – typically as application configuration rather than feature flags.

Not testing the off path. During a progressive rollout, the majority of users see the “off” path (the default). If that path has a bug while the flag is at 10%, you have broken production for the other 90% of your users. Always test both flag states.

Database schema changes behind a flag. If flag-enabled code expects a new database column and flag-disabled code does not know about it, you need the expand-contract pattern: add the column first (both code paths ignore it or handle it gracefully), enable the flag to start using it, then clean up the old code path after full rollout. This is the same principle as canary deployments with schema changes – both versions must work with the same database state.
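The expand step can be illustrated with SQLite from the Python standard library. The `orders` table, the `currency` column, and the helper functions are assumptions invented for this sketch; the point is that both the flag-on and flag-off paths work against the same expanded schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER NOT NULL)")

# Expand: add the new column as nullable so the old code path is unaffected.
conn.execute("ALTER TABLE orders ADD COLUMN currency TEXT")


def create_order(conn: sqlite3.Connection, total_cents: int,
                 use_currency_flag: bool, currency: str = "USD") -> int:
    """Insert an order; only the flag-on path writes the new column."""
    if use_currency_flag:
        cur = conn.execute(
            "INSERT INTO orders (total_cents, currency) VALUES (?, ?)",
            (total_cents, currency),
        )
    else:
        cur = conn.execute("INSERT INTO orders (total_cents) VALUES (?)", (total_cents,))
    return cur.lastrowid


def order_currency(conn: sqlite3.Connection, order_id: int, fallback: str = "USD") -> str:
    """Read the currency, tolerating rows written by the old path (NULL column)."""
    row = conn.execute("SELECT currency FROM orders WHERE id = ?", (order_id,)).fetchone()
    return row[0] if row and row[0] is not None else fallback
```

The contract step, such as backfilling old rows and making the column required, waits until the flag is at 100% and the old path has been removed.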