HACKER Q&A
📣 withinboredom

When did multiple environments become standard practice?


I recently worked for a company that had a single environment - yes, even development was done against production databases. I worked there for a while, and at first I was aghast at how weird it was. Then when I left and went back to multiple environments, I am again aghast at how complex it is -- not to mention that there isn't much 'force' to push you to learn how to recover from mistakes in production (leading to more downtime than I ever saw with one environment).

I was curious about when multiple environments became 'standard', but it appears to have been a thing for as long as the internet can remember. Can someone who has been writing software since before the internet remember why we started to do things this way?


  👤 Dove Accepted Answer ✓
Totally off the cuff, I would say 1995. As to why? Because developing in production often breaks production, and about that time, we started to care about that. Before that, we were happy when things were working at all and didn't really expect them to work all the time.

As you say, it doesn't necessarily always work, but that's the notion.


👤 patrick451
> I am again aghast at how complex it is -- not to mention that there isn't much 'force' to push you to learn how to recover from mistakes in production (leading to more downtime than I ever saw with one environment).

It's interesting that most of the responses say we moved to multiple environments "because production matters". Yet your experience says that a single environment leads to less downtime in production, not more.

Is there any actual data on this in general? E.g., some study of downtime in single vs. multiple environments? So many of software engineering's "best practices" seem to amount to little more than herd opinion, rarely backed by anything more substantive than somebody's experience.


👤 anonzzzies
I still work straight in prod. Clients are happy with how much faster we are than the competition - usually by weeks, from feature request to live. Sometimes things go wrong, but that happens with dev/test/staging/prod as well.

We first tried dev/test/staging/prod in the 90s, when the internet arrived with cgi-bin and later Java, but it was too much hassle without much benefit, so we stuck with dev-prod or just prod.


👤 onion2k
I was first paid to write a proper website in 1998, and having dev and prod was normal then.

Fast forward to today, and I believe developers shouldn't have access to prod data, ever, even to debug things. The privacy and security of users' data are more important than a nice dev experience, every time. If you work in an ISO 27001 environment, you don't really have much choice.


👤 drewcoo
> When did multiple environments become standard practice?

What's "standard practice?" It's not standard practice because we don't have standard practices. We just have common misconceptions.

Once upon a time, deploys happened (at most) as frequently as engineers received paychecks. To test anything in an environment between paychecks, they needed to put the code somewhere else - a different environment.

Code likes to be deployed, though. It stays healthier the more frequently it's deployed. So keeping deploys tied to paycheck frequency ought to be a thing of the past.

But people will claim "we have to test it before production!" Ok. Contract testing is actually manageable now, thanks to Pact/PactFlow (love ya, SmartBear!). That means most environment-bound end-to-end testing can finally be replaced with service-level tests (with trivially reachable 100% coverage) plus contract tests . . . none of which require environments. So we can test all of it before production, test it better than before, and also (at least mostly) get rid of those crummy environments.
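For the curious, a consumer-side contract test in Pact JS looks roughly like this. This is a sketch, not production code: the service names, endpoint, and fields are all made up, and I'm assuming a Jest-style runner and Node 18+ for fetch.

    import { PactV3, MatchersV3 } from "@pact-foundation/pact";

    const { integer, string } = MatchersV3;

    // One contract per consumer/provider pair; both names here are hypothetical.
    const provider = new PactV3({ consumer: "checkout-web", provider: "user-service" });

    describe("GET /users/:id", () => {
      it("returns the user when it exists", () => {
        provider
          .given("a user with id 1 exists")           // provider state
          .uponReceiving("a request for user 1")
          .withRequest({ method: "GET", path: "/users/1" })
          .willRespondWith({
            status: 200,
            headers: { "Content-Type": "application/json" },
            // Matchers pin the shape of the response, not the exact values.
            body: { id: integer(1), name: string("Alice") },
          });

        // Pact spins up a local mock server; no shared environment involved.
        return provider.executeTest(async (mockServer) => {
          const res = await fetch(`${mockServer.url}/users/1`);
          expect(res.status).toBe(200);
        });
      });
    });

The generated pact file then gets verified against the real provider in its own CI, which is how the two sides stay honest without a shared environment.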

For anyone doing lean, those end-to-end tests are a huge source of waste for test teams. They require constant environment maintenance, data maintenance for that environment, and, in companies with multiple repos, a way to know state across repos - and we haven't even talked about the test code itself and how brittle code running across an entire stack can be. Get rid of those and suddenly you free up test cycles for things like improving testability or covering previously overlooked areas - things that proactively make the product better instead of reactively hoping to catch regressions.

Most software engineers don't think about solving the biggest source of waste in testing because they're focused on features. Most "SDETs" today don't have the curiosity or imagination to change the system and they can't do it with Selenium anyway. I don't think most managers are concerned with actual process improvements so much as measuring the right KPIs and managing up. And no one else is in a position to even realize there's a problem. So . . . for the most part, nobody's looking to fix the problem.

To be clear, that's a warning: touching this kind of problem won't do your career any favors. You're better off solving problems that get you promos until you jump to the next company and get a real raise.


👤 BerislavLopac
As the old saying goes, every team has a testing environment. Some teams, however, also have a completely separate production environment.

👤 dehrmann
Once your product matters, you start thinking about a safe testing environment.

👤 w10-1
(fuzzy...)

Mainframes and mini-computers had test batches run on the same hardware as the real batches.

Later, for OLTP/continuous banking (ATM) systems, the big variable was storage: when to attach a disk you'd been hoarding (it could take a quarter or more to get another one) while flirting with capacity issues. So there was some relatively ad-hoc reconfiguration of production machines.

eBay was probably the first internet company to really scale to national/international always-on services, first with Perl, then Java. They had (have?) two-week release "trains", with hundreds of branches being merged in topological order (in parallel, with merge processes happening for multiple trains). What train you were on determined what services and resources were available. Only the best of the best actually did the merges. Google watched eBay get tangled up in IBM ClearCase and decided: one repo, no branches. The now-common 3-stage setup (test, staging, production) is relatively simple, depending on your data pipelines.


👤 jensenbox
What does your company use?

1) 0 non-production + production

2) 1 non-production + production

3) 2 non-production + production

4) 3 non-production + production

5) N non-production + production

6) Something entirely different.

I was thinking about branch-based environments or per-developer environments but was not sure how to depict that in a simple poll. Feel free to chat amongst yourselves.


👤 jhoelzel
I work with a lot of SMBs, and usually it's at the first production-breaking bug ;)

No, but for real, I think the mid-90s is right for this.

With Kubernetes, dev environments are a given now, but when we used bare metal it was expensive; we would even provision a box in the office as staging =)


👤 biztos
We had separate test and production instances in about 1994/95, but that was pharma (regulated)… when I moved to the web it was very cowboy-style for a few years; I don't think we had a safe testing or dev environment until maybe 2001, and then the whole thing tanked anyway.

Funny thing is, after that I worked for a big company with a lot of "best practices", and, except for me for about six glorious months, nobody was ever able to run the full product on their dev machine.

So I assume there is still a long way to go in most companies. The bugs I see in FAANG software are also, as often as not, of the “clearly the dev never really ran the code” variety.


👤 pestatije
I remember in my second job senior developers would check their bank balances from their terminals, and they would admit they actually had write permissions... so I think standard practice might not be so standard after all.

👤 gladiatr72
Hrm... probably the early-to-mid '90s. Systems that could reliably provide internet service started to appear at prices low enough that you could justify a microcosm called dev.

👤 jiehong
In the airline world, multiple environments outside of production have been standard since 1989 or 1990 as far as I know (maybe earlier, but idk).

That was before airlines used the internet to talk to each other; they used their own networks.

I think it was mostly motivated by letting airlines train their agents on a dedicated system, but also by letting them rehearse the migration from their own system to a GDS without perturbing any other airline's flights (or their own, for that matter).


👤 eisfresser
I started my first job as a software developer in 1990. We ran only the production environment and of course had lengthy downtimes. I changed jobs frequently, and by about 1994 I found dev and prod environments everywhere. By 1998, dev/test/prod setups were the norm.

👤 bokohut
As with the other comments, I cannot attest to the "standard"; however, I was involved in the early, foundational days of the multiple-environments evolution.

I was the first programming hire in 1997 for a small electronic commerce company; being the first hire, they had no systems, and I was tasked with figuring it all out. Having written and managed, in its entirety, one of the first API payment servers on the internet, a problem surfaced near immediately: how do people learn to comprehend what it does without seeing it do it? Since there was nothing to look at (that mysterious 'black box' system behind swiping your payment card), the only solution was to have a 'fake' one that just didn't talk to the card networks. The word 'fake' was not very well received in business circles, and almost immediately another issue arose: what do people code against to validate their own API client payment calls? Simple: I refactored the 'fake' API payment server to process API development test requests too. For those who have lived it and can see where this is going, the 'fake' payment server quickly became its own living, breathing platform to support.

This led to the creation of four distinct, separate environments, all of which matched from hardware up through software versioning, in a staggered, staged rollout window originating around 1999. I labeled the environments DEV, QA, UAT, and PRD, and each one had its own designated critical business flow reasoning. DEV was the environment the internal LAN technical parties tested against; to clarify, this was NOT some developer's personal system they kept running, but a match of PRD. QA was internal LAN quality assurance for all business test cases that needed stability outside of DEV, along with load testing. UAT was external WAN User Acceptance Testing for new customer development as well as existing customers' new feature development. PRD was, of course, external WAN and live production.

This approach was tried and tested leading up to Y2K and greatly alleviated countless PRD issues, as I had established a code promotion procedure: from development to DEV, DEV to QA, QA to UAT, and then UAT to PRD. That's three distinct code pushes before reaching PRD, and it significantly increased uptime when such things were just becoming a topic.

"Can someone who has been writing software since before the internet remember why we started to do things this way?"

Clearly I am "old" as I sit here trying to remember how I remember this. :) Been through, done, and seen some crazy things in the last 30 years but what I'm building now…



👤 chadcmulligan
Development against production? No, no, no. What happens when you're testing your delete code? Oops. The complexity of multiple environments has increased, but that's no excuse to develop on production.

👤 1123581321
Multiple environments are an old practice. I would roughly correlate their popularity with the growth of Git, though. It's easier for smaller/newer/more stretched companies to manage multiple environments when it's easier to move code and assets to them.

As for your current place: a single environment won't make them stop breaking things; deploying much more often will. A staging environment, or gradual rollout in prod, doesn't help if you have a lengthy manual QA process, so you need to force deployments into a cadence where there's no time for slow and manual steps.
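To be concrete about the gradual rollout bit: it can be as simple as deterministic percentage bucketing behind a flag. A minimal sketch (the function names and feature key are made up; real setups usually use a feature-flag service, but the core idea is the same):

    import { createHash } from "crypto";

    // Map a user id to a stable bucket in [0, 100) so the same user
    // always gets the same answer for a given feature.
    function bucket(userId: string, feature: string): number {
      const digest = createHash("sha256").update(`${feature}:${userId}`).digest();
      return digest.readUInt32BE(0) % 100;
    }

    // Enable the feature for a slice of users; ramp the percentage up
    // across frequent deploys instead of gating everything on staging.
    function isEnabled(userId: string, feature: string, rolloutPercent: number): boolean {
      return bucket(userId, feature) < rolloutPercent;
    }

    // e.g. start at 5%, watch prod error rates, then ramp to 25, 50, 100.
    console.log(isEnabled("user-123", "new-checkout", 5));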