Does your team use them? If not, why not?
I can't imagine working without feature flags. Being able to enable new features in particular deployment rings (canary, dogfood, various production rings or regions), or per users / user groups, enabling gradually (percentage) and so on, is invaluable. I really can't overstate this.
Heck, we went as far as using feature flags for risky bugfixes even.
We also had internal tools to easily work with and track feature flags. A downside is that although you'd normally want to remove feature flags once they become obsolete, this wasn't done very often.
What I suggested, and what we started doing, was to tag each feature flag with the author's name and the date it was added, do the same for config updates, and usually include the ticket number and title in both cases. This did help with tracking obsolescence, but obviously there was still a need to plan and do the actual work. Automating this process further was out of the question, due to the high risks involved.
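A minimal sketch of the tagging idea, not the internal tooling described above. The `FlagRecord` type, its field names, the 90-day threshold and the sample tickets are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class FlagRecord:
    """Hypothetical metadata attached to a flag when it is added."""
    name: str
    author: str
    added_on: date
    ticket: str  # ticket number and title, free-form


def stale_flags(flags, today, max_age_days=90):
    """Return flags added more than max_age_days ago, oldest first."""
    cutoff = today - timedelta(days=max_age_days)
    old = [f for f in flags if f.added_on < cutoff]
    return sorted(old, key=lambda f: f.added_on)


# Made-up sample data.
flags = [
    FlagRecord("new_checkout", "alice", date(2023, 1, 10), "TICKET-1: New checkout flow"),
    FlagRecord("fast_sync", "bob", date(2023, 6, 1), "TICKET-2: Faster sync"),
]

for flag in stale_flags(flags, today=date(2023, 6, 15)):
    print(f"{flag.name}: added {flag.added_on} by {flag.author} ({flag.ticket})")
```

This only produces a list of candidates; deciding whether a flag can really go, and removing it, is still manual work.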
Edit: added the last paragraph.
Put simply, it disconnects merge and deploy from launch. This is very useful when your changes rely on other teams or third parties having gone live. Having a feature flag (we call them toggles) lets you get it into production turned off, without having to coordinate deploys with other systems.
The downsides are that it adds development overhead and tech debt (you should go back in and remove the toggles after a successful launch). Generally we try to devise a solution that can be deployed safely without a feature flag, but they are often required.
1) Validating you can handle production-scale
2) Ensuring integrations/environment-related issues don't happen when you deploy
3) Alpha/Beta groups of users
4) Quick reversions when something does not work as expected
Similar to other commenters I can't imagine not using feature flags. Some of these might have work-arounds like an artificial load tester, but nothing beats true production traffic & patterns.
One company created a category of feature flags that customers can opt into themselves from their settings page. This lets users selectively try out new features if they want. It's also helpful for gauging interest and getting feedback before rolling out to everyone. Though, that's a relatively small portion of the flags that are generally created.
One problem that is typically encountered is setting up and executing a plan to remove the feature flags when they're no longer needed. Once you roll a new feature out to everyone (assuming there isn't someone who doesn't want the change), you should remove the flag and any code that's checking for it to keep things clean.
This lets us slowly roll out new features to a subset of users in case there are any issues.
Using feature flags also requires testing the default (not enabled) state, ensuring you have a robust realtime configuration manager to control the knobs, and metrics for everything - not just how many requests/customers are opted-in, but also the progress & state of the configuration change.
It does no good to first enable a feature at 1% if only 1% of your servers have received the updated configuration - that's only .01% impact. It also tells you when your rollback is complete - you want to be sure when you disable something that there's not some stuck server with the feature still enabled...
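A rough sketch of the arithmetic above, assuming traffic is spread evenly across servers. `effective_exposure` and `rollback_complete` are hypothetical helper names, not part of any particular configuration manager.

```python
def effective_exposure(rollout_fraction, servers_with_config, total_servers):
    """Fraction of traffic that actually sees the feature.

    Assumes every server handles an equal share of traffic.
    """
    if total_servers <= 0:
        raise ValueError("total_servers must be positive")
    propagated = servers_with_config / total_servers
    return rollout_fraction * propagated


def rollback_complete(enabled_by_server):
    """True only when no server still reports the feature as enabled."""
    return not any(enabled_by_server.values())


# 1% rollout, but only 1 of 100 servers has picked up the new config.
print(f"{effective_exposure(0.01, 1, 100):.2%}")  # 0.01%

# One stuck server is enough to make the rollback incomplete.
print(rollback_complete({"web-1": False, "web-2": True}))  # False
```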
Let's give it 1/100 bad code change, and 100 devs. The probability of a clean push is:
(1-1/100)^100 ≈ 0.37. That's worse than a coin flip.
for 10 devs = 0.90. Still not great, one push in 10 will be bad.
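The numbers above can be reproduced with a few lines, assuming each dev contributes exactly one change per push and that bad changes are independent (both simplifications). `clean_push_probability` is a name made up for this sketch.

```python
def clean_push_probability(devs, bad_change_rate=1 / 100):
    """Probability that none of `devs` independent changes is bad."""
    return (1 - bad_change_rate) ** devs


for devs in (10, 100):
    print(f"{devs} devs: {clean_push_probability(devs):.3f}")
```

This gives roughly 0.904 for 10 devs and 0.366 for 100 devs.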
When you're trying to push daily (or even more frequently), this will kill the team's velocity. It's even worse when there are interleaving changes making rollback and re-release impossible before the next release. Rollback and freeze is a problem when you're trying to meet deadlines (either marketing or regulatory).
Feature flags allow the code to go in and each change to be rolled back independently. This lets the rest of the team make progress while the bug is debugged.
Short release pipelines could be a solution, but they aren't necessarily sufficient. Time to recovery when there isn't a feature flag is also a problem.
For example, some bugs take days/weeks to reveal themselves (think month end), so a "30 min build and full push" is not enough of a guarantee. Time to recovery needs to consider the time to find the change and roll it back, including dealing with any interleaving changes that happened in the meantime.
I personally think feature flags are useful if you're deploying very frequently, but they just add confusion to software that's meant to be released/stable, especially for those developing it (what's with all the half done code and TODOs everywhere?)
The biggest thing with feature flags is using them at an appropriate level of granularity for the stage of company you're at. Branching logic adds complexity.
Consider starting with an entire feature/route/page set to keep the branching logic consolidated and simple. Only move to more granularity as you need it.
We are also doing interesting stuff like controlling what features are in our Open Source Docker container via flags in the platform that are baked into our Docker images when they are built.
Once you start using and relying on flags, it's hard to go back, and they help with a bunch of 'good' engineering processes and patterns.
Another interesting aspect is that once you have gated a feature based on a flag, you can then AB test that feature, almost for free, as you have a way of bucketing users and showing or hiding the feature using most feature flag platforms.
[0] https://flagsmith.com/ & https://github.com/Flagsmith/flagsmith
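For anyone wondering what "bucketing users" might look like, here is a minimal sketch using a stable hash. This is a generic technique, not how Flagsmith or any other specific platform does it; `bucket`, `is_enabled` and `ab_variant` are hypothetical names.

```python
import hashlib


def bucket(user_id, flag_name, buckets=100):
    """Map a user to a stable bucket in [0, buckets) for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def is_enabled(user_id, flag_name, rollout_percent):
    """Gate a feature for roughly rollout_percent of users."""
    return bucket(user_id, flag_name) < rollout_percent


def ab_variant(user_id, flag_name):
    """Reuse the same bucketing for a 50/50 A/B split."""
    return "treatment" if is_enabled(user_id, flag_name, 50) else "control"


print(ab_variant("user-123", "new_onboarding"))
```

Including the flag name in the hash input means a user who lands in the enabled group for one flag isn't automatically in it for every other flag.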
Personally I think they should be used sparingly for things where you have to be able to configure the same software on different environments differently, for purposes of A/B testing or other such things.
Using them a lot increases your code complexity and is basically just tech debt imo. The gain is that you don't have to run a deployment to turn on a feature? If you invest the time to make your build/deploy process not suck, this isn't a very big win.
I'm honestly surprised to see so much love for them here. I'm going to take some time to read this thread, see if there are compelling reasons to take on the extra complexity that I haven't considered.
Some possible failures that can happen with feature flags:
1) "Accidentally perpetual" - Since feature flags are a part of the code, it's easy to create multiple dependencies on the flag value, which makes it difficult to remove from the code without mysterious null exceptions happening where you didn't expect them.
2) Cross-scope - Using multiple feature flags carelessly can result in situations where one flag value change doesn't do what's expected unless another flag's value is present or set to a certain value. Flags should always be independent from one another, even if they're controlling the same code. Instead of two flags whose values affect each other, you would instead create more flags (4, in the case here if using Boolean flags) to reflect each combined state.
3) Fallback - What happens if the systems or SaaS that supplies your feature flags becomes unavailable? Always consider this.
Feature flags are a great tool, and enabling your team to be able to "test in production" with them can be amazing. However, do watch out for the footguns.
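On the fallback point, one way to handle it is a hard-coded default for every flag, used whenever the provider can't be reached. This is only a sketch: `fetch_remote_flag` is a stand-in for whatever client you actually use, and `FlagServiceUnavailable` and `DEFAULTS` are made up for the example.

```python
# Hypothetical safe defaults, shipped with the code.
DEFAULTS = {"new_checkout": False}


class FlagServiceUnavailable(Exception):
    """Raised by the (hypothetical) flag client when the provider is down."""


def fetch_remote_flag(name):
    """Stand-in for a network call; always fails here to show the fallback."""
    raise FlagServiceUnavailable(name)


def flag_enabled(name, fetch=fetch_remote_flag, defaults=DEFAULTS):
    """Ask the provider first; fall back to the shipped default on failure."""
    try:
        return bool(fetch(name))
    except FlagServiceUnavailable:
        # Unknown flags default to off, the same as the "not enabled" state.
        return defaults.get(name, False)


print(flag_enabled("new_checkout"))  # False: the provider is "down"
```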
I've now worked at 3 different companies that built feature flags both internally and as a core part of their external product offering. I'm currently at Flagsmith (open source too).
Here are some of the more popular front-end feature flag use cases:
1. Gradual Roll Out: Build a feature and release it to 5% of your users, then increase as you see that it isn't "breaking anything". You might even do this AFTER a successful A/B Test concludes.
2. Test in Production: Build a feature and release it to only your internal team (or QA Team) to see how it works in a real production setting.
3. Feature Gating: Managing access to specific features based on a targeting condition. I've seen people do this for BETA features with key customers pretty often.
Most common reasons people don't use them:
1. They are concerned about feature flag creep. Managing them if they aren't deprecated can be a problem worth thinking through ahead of time.
2. They worry about giving access to important parts of their product in production. Thinking about your environment set-up and access control is smart.
Hope this helps!
To support per-user or per-group settings we have a `canary` role that can be set to allow access to new features that have been integrated but are not yet available to general users. The nice thing about having something tied to roles is that the changes can take effect immediately without the need for redeploying or reinitializing applications in our footprint. Also, the role-based model can be made as fine- or coarse-grained as suits the apps and users being served.
We tend to avoid encoding feature flags into URLs because users can bookmark them, revisit via history or navigate from old emails, messages, etc. and we'd rather not expose these flags or have them memorialized anywhere.
- enable percentage rollout only after validating for specific test accounts/ids
- when using percentage rollout, also have a 'killswitch' flag that can negate belonging to the enabled group
- if you don't need to test specific accounts/ids first, you can use only a 'killswitch' rollout flag starting at 100% and decreasing. It is still possible to remove particular ids from the feature-enabled group
- best experience was making a helper that everything goes through rather than querying the feature flag name directly. This lets you test things like what happens in CI/CD if I hard-code that return value to fully enabled. This gives you test coverage using the flag for all the tests that don't mention it at all. The helper can also do any combination of required flags/killswitches for something to be meaningfully active.
- and for expired flags, have a recurring nag mechanism, e.g. Slack post by team/channel owner
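A sketch of what such a helper could look like, combining the allow-list, percentage rollout, killswitch and a CI override from the list above. The `FLAGS` layout, the `FORCE_FLAGS` environment variable and the `feature_active` name are all assumptions for illustration, not the setup described in the list.

```python
import os
import zlib

# Hypothetical in-memory config; a real system would load this from a flag store.
FLAGS = {
    "new_search": {
        "percent": 10,            # percentage rollout
        "allow_ids": {"qa-1"},    # test accounts validated first
        "kill_ids": {"user-42"},  # killswitch: always excluded
    }
}


def feature_active(name, user_id, flags=FLAGS, env=os.environ):
    """Single entry point that all feature checks go through."""
    # CI override: force every flag on so the whole test suite exercises new paths.
    if env.get("FORCE_FLAGS") == "all":
        return True
    cfg = flags.get(name)
    if cfg is None:
        return False  # unknown flag: treat as off
    if user_id in cfg["kill_ids"]:
        return False  # killswitch negates membership in the enabled group
    if user_id in cfg["allow_ids"]:
        return True
    # Stable bucket in [0, 100) derived from flag name and user id.
    bucket = zlib.crc32(f"{name}:{user_id}".encode("utf-8")) % 100
    return bucket < cfg["percent"]


print(feature_active("new_search", "qa-1", env={}))     # True
print(feature_active("new_search", "user-42", env={}))  # False
```

Because callers never read the flag store directly, changing the rollout rules or forcing flags on in CI only touches this one function.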
1. To hide not-yet-ready features that have production dependencies.
2. To route special users to a different code path.
3. To test new ML code on a fraction of users.
4. After a refactor to a different language or a microservice, you may need a feature flag to route requests.
I could go on and on, it's simply mandatory.
Complex: when you roll out a complex feature, it's best to not make it available to all. Instead, focus on a small trusted subset of savvy users who will be easier to train. At the same time, you can use their experience with it to simplify it and make it easier for the rest of the userbase.
Money: we ran a marketplace so dealt with clients' money. We quickly realized that changes to the way payments are processed needed extensive feedback. Even if we assumed something was alright, chances are there would be objections. Rolling out changes in stages would allow our team to handle complaints and feedback without being overwhelmed.
Our React app just uses a simple home-rolled set of keys in localStorage; we have a "secret" route that lets users turn their own flags on or off, and we encourage developers to duplicate big chunks of code while working behind a flag instead of mixing old and new, to make cleanup easier.
(And yeah, someone does have to stomp around every few months and remind everyone to clean up their dead code again. Still worth it.)
We mostly use frontend feature flags for this, so we'd only show the link in the menu or the specific component if the feature flag is turned on.
In my new role, I'm curious what teams do with feature flags post-release. Do you have a good process for cleaning them up? Do they have long term usefulness as a failsafe or for customer/user configuration? Is it really an issue if they just stay in the code forever? Does this cause issues for you?
It's also a life saver when issues arise, though the correct term for this is "operational toggles". Flip a switch when functionality is causing issues and it's gone.
The product was a desktop file synchronization client. We had an API from which we would get administrator-controlled settings. Feature flags were usually part of that API.
We don't use them at my current job, but it's a much smaller company with far fewer customers.
I find that reduces the peak/emergency workload; important for small numbers of developers.
Also aids in developing the whole codebase cohesively -- instead of "don't touch the core to add the new feature". Or trying to wrangle separate branches that have a dependency.
Everything new is throttled and feature flagged. A new feature rollout can have 10+ throttles to slowly ramp up. And yes, they’re difficult to keep track of at times.
What we don’t have (but should) is better management of who experiences which combination of flags, since they’re done randomly by default.
They allow us to work on new code in the debug version, while ensuring stability and continuity in the production code.
{
EnableBackgroundSync: true,
EnableDebugLog: false,
...
}
This also ends up helping find different parts of the code that are related but live in multiple modules (eww, I know)...
The problem with feature flags is that (assuming a flag can only be "on" or "off") once you introduce the flags you have 2^n different possible states the system can be in. When you have a bug or a crash, you have to reason about all of those states. If you have even 10 flags, that's over 1,000 combinations!
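To make the 2^n point concrete, a small sketch that enumerates every on/off combination. The flag names are made up and `flag_states` is a hypothetical helper.

```python
from itertools import product


def flag_states(flag_names):
    """Yield every possible on/off assignment for the given flags."""
    for values in product((False, True), repeat=len(flag_names)):
        yield dict(zip(flag_names, values))


names = [f"flag_{i}" for i in range(10)]
print(sum(1 for _ in flag_states(names)))  # 1024
```

The same generator could drive a test matrix, though in practice that only stays affordable for a handful of flags.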
Does anyone have a different way of enabling "experiments" or quickly rolling back bad changes?
Anyone got any good guides for building a feature flag system from scratch? ( I'm not interested in just importing a dependency, I want to properly understand how this works)
It's almost a requirement for doing continuous deployment and working without long-lived branches.
Don't forget to clean them up when you don't need them anymore. There's work involved with that too.
It also allows us to keep PRs smaller and roll stuff out to prod confidently.
Our feature flags are nice to work with in that you can just add them in code. If the flag doesn't exist in the DB, it is created with a default value. This makes them pretty painless to work with for us.
You can activate feature flags one server at a time as well to roll things out gradually if you want.
We have a simple web UI in our admin site where you can see the flags, what they are set to, when they were last updated, etc. A good idea, which we haven't implemented yet, is to log who changed a flag each time, and why.
Being able to find flags that haven't changed in a long time is useful to identify ones you can clean up.
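A rough sketch of the pattern described above: flags that are created on first use with a default value, a record of who changed them and why, and a query for flags that haven't changed in a long time. It uses an in-memory SQLite database for illustration; the table layout and function names are assumptions, not our actual implementation.

```python
import sqlite3
import time


def open_store():
    """Create an in-memory flag table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flags (name TEXT PRIMARY KEY, enabled INTEGER NOT NULL, "
        "updated_at REAL NOT NULL, updated_by TEXT, reason TEXT)"
    )
    return conn


def get_flag(conn, name, default=False, now=None):
    """Return the flag value, creating the row with `default` if it is missing."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR IGNORE INTO flags (name, enabled, updated_at) VALUES (?, ?, ?)",
        (name, int(default), now),
    )
    row = conn.execute("SELECT enabled FROM flags WHERE name = ?", (name,)).fetchone()
    return bool(row[0])


def set_flag(conn, name, enabled, who, why, now=None):
    """Set a flag and record who changed it and why."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO flags (name, enabled, updated_at, updated_by, reason) "
        "VALUES (?, ?, ?, ?, ?)",
        (name, int(enabled), now, who, why),
    )


def unchanged_for(conn, days, now=None):
    """Names of flags not updated in the last `days` days, oldest first."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    rows = conn.execute(
        "SELECT name FROM flags WHERE updated_at < ? ORDER BY updated_at", (cutoff,)
    )
    return [r[0] for r in rows]


DAY = 86400
conn = open_store()
print(get_flag(conn, "background_sync", default=True, now=0))  # True, row created
set_flag(conn, "debug_log", False, who="alice", why="noisy logs", now=100 * DAY)
print(unchanged_for(conn, days=30, now=100 * DAY))  # ['background_sync']
```

This keeps only the latest change per flag; a full history would need a separate audit table.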