I was curious about when multiple environments became 'standard', but it appears to have been a thing for as long as the internet can remember. Can someone who has been writing software since before the internet remember why we started to do things this way?
As you say, it doesn't necessarily always work, but that's the notion.
It's interesting that most of the responses say we moved to multiple environments "because production matters". Yet your experience says that a single environment leads to less downtime in production, not more.
Is there any actual data on this in general? E.g., some study of downtime in single vs multiple environments? So much of software engineering "best practices" seems to amount to little more than herd opinion, with rarely anything substantive to back it up except somebody's experience.
We first tried dev, test, staging, and prod in the 90s when the internet arrived with cgi-bin and later Java, but it was too much hassle without much benefit, so we stuck with dev-prod or just prod.
Fast forward to today and I believe developers shouldn't have access to prod data, ever, even to debug things. Privacy and security of users' data are more important than having a nice dev experience every time. If you work in an ISO 27001 environment you don't really have much choice.
What's "standard practice?" It's not standard practice because we don't have standard practices. We just have common misconceptions.
Once upon a time, deploys happened (at most) as frequently as engineers received paychecks. To test anything in an environment between paychecks, they needed to put the code somewhere else - a different environment.
Code likes to be deployed, though. It stays healthier the more frequently it's deployed. So keeping deploys tied to paycheck frequency ought to be a thing of the past.
But people will claim "we have to test it before production!" Ok. Contract testing is actually manageable now, thanks to Pact/PactFlow (love ya, SmartBear!). That means most environment-bound end-to-end testing can finally be replaced with service-level tests (with trivially reachable 100% coverage) plus contract tests . . . none of which require environments. So we can test all of it before production, test it better than before, and also (at least mostly) get rid of those crummy environments.
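To make that concrete, here's a minimal sketch of a consumer-side contract test using Pact's TypeScript API, run under Jest on Node 18+ (for global fetch). The service names, endpoint, and payloads are hypothetical, not anyone's actual setup; the point is that the consumer records its expectations against a local mock and the resulting contract gets verified against the real provider later, with no shared environment involved.

```ts
// A hedged sketch: Pact consumer test in TypeScript (Jest, Node 18+).
// Consumer/provider names, the /payments endpoint, and payloads are hypothetical.
import path from "path";
import { PactV3, MatchersV3 } from "@pact-foundation/pact";

const provider = new PactV3({
  consumer: "checkout-service",              // hypothetical consumer
  provider: "payment-service",               // hypothetical provider
  dir: path.resolve(process.cwd(), "pacts"), // where the contract file is written
});

describe("payment contract", () => {
  it("authorizes a payment without a shared environment", () => {
    provider
      .given("a valid card is on file")
      .uponReceiving("a request to authorize a payment")
      .withRequest({
        method: "POST",
        path: "/payments",
        headers: { "Content-Type": "application/json" },
        body: { amount: 1000, currency: "USD" },
      })
      .willRespondWith({
        status: 201,
        headers: { "Content-Type": "application/json" },
        // like() matches on shape/type, so the provider isn't pinned to exact values
        body: MatchersV3.like({ id: "pay_123", status: "authorized" }),
      });

    // The consumer code runs against a local mock server; the recorded contract
    // is later verified against the real provider (e.g. via a Pact broker).
    return provider.executeTest(async (mockServer) => {
      const res = await fetch(`${mockServer.url}/payments`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ amount: 1000, currency: "USD" }),
      });
      expect(res.status).toBe(201);
    });
  });
});
```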
For anyone doing lean, those end-to-end tests are a huge source of waste for test teams. They require constant environment maintenance, data maintenance for that environment, and, in companies with multiple repos, a way to know state across repos; and we haven't even talked about the test code itself and how brittle code running across an entire stack can be. Get rid of those and suddenly you free up test cycles to focus on things like improving testability or covering previously overlooked areas - things that proactively make the product better instead of reactively hoping to catch regressions.
Most software engineers don't think about solving the biggest source of waste in testing because they're focused on features. Most "SDETs" today don't have the curiosity or imagination to change the system and they can't do it with Selenium anyway. I don't think most managers are concerned with actual process improvements so much as measuring the right KPIs and managing up. And no one else is in a position to even realize there's a problem. So . . . for the most part, nobody's looking to fix the problem.
To be clear, that's a warning that touching this kind of problem isn't doing any favors to your career. You're better off focusing on solving problems that get you promos until you jump to the next company and get a real raise.
Mainframes and mini-computers had test batches run on the same hardware as the real batches.
Later, for OLTP/continuous bank (ATM) systems, the big variable was storage: when to attach a disk you'd been hoarding, since it could take a quarter or more to get another one but you were flirting with capacity issues. So there was some relatively ad hoc reconfiguration of production machines.
eBay was probably the first internet company to really scale to national/international always-on services, first with Perl, then Java. They had (have?) two-week release "trains", with hundreds of branches being merged in a topological order (in parallel: merge processes happening for multiple trains). What train you were on determined what services and resources were available. Only the best of the best actually did the merges. Google watched eBay get tangled in IBM ClearCase and decided: one repo, no branches. The now-common 3-stage environment (test, staging, production) is relatively simple, depending on your data pipelines.
1) 0 non-production + production
2) 1 non-production + production
3) 2 non-production + production
4) 3 non-production + production
5) N non-production + production
6) Something entirely different.
I was thinking about branch based environments or per developer environments but was not sure how to depict that in a simple poll. Feel free to chat amongst yourselves.
No, but for real, I think the mid-90s is right for this.
With Kubernetes, dev environments are a given now, but back when we used bare metal that was expensive; we would even provision a box in the office as staging =)
Funny thing is after that I worked for a big company with a lot of “best practices” and except for me, for about six glorious months, nobody was ever able to run the full product on their dev machine.
So I assume there is still a long way to go in most companies. The bugs I see in FAANG software are also, as often as not, of the “clearly the dev never really ran the code” variety.
That was before airlines used the internet to talk to each other; they used their own networks.
I think it was mostly motivated by allowing airlines to train their agents on a dedicated system, but also to rehearse their migration from their system to a GDS without perturbing any other airline's flights (or their own, for that matter).
I was the first programming hire in 1997 for a small electronic commerce company. Being the first hire, they had no systems and I was tasked with figuring it all out. Having written and managed in its entirety one of the first API payment servers on the internet, a problem surfaced near immediately: how do people learn to comprehend what it does without seeing it do it? Since there was nothing to look at, that mysterious 'black box' system when you swipe your payment card, the only solution was to have a 'fake' one that just didn't talk to the card networks. The word 'fake' was not very well received in business circles, and almost immediately another issue arose: what do people code against to validate their own API client payment calls? Simple, I'll refactor the 'fake' API payment server to process API development test requests too. For those that have lived it and comprehend where this is going, one can see the 'fake' payment server quickly becoming its own living and breathing platform to support.

This led to the creation of four distinct, separate environments, all matched from hardware up through software versioning, in a staggered, staged rollout window originating around 1999. I labeled the environments DEV, QA, UAT, and PRD, and each one had its own designated critical business flow reasoning. DEV was the environment the internal LAN technical parties tested against; to clarify, this was NOT some developer's personal system they kept running, but a match of PRD. QA was internal LAN quality assurance for all business test cases that needed stability outside of DEV, along with load testing. UAT was external WAN User Acceptance Testing for new customer development as well as existing customers' new feature development. PRD was of course external WAN and live production.

This approach was tried and tested leading up to Y2K and greatly alleviated countless PRD issues, as I had established a code promotion procedure from Development to DEV, DEV to QA, QA to UAT, and then UAT to PRD. One can see three distinct code pushes before reaching PRD, and this significantly increased uptime when such things were just becoming a topic.
"Can someone who has been writing software since before the internet remember why we started to do things this way?"
Clearly I am "old" as I sit here trying to remember how I remember this. :) Been through, done, and seen some crazy things in the last 30 years but what I'm building now…
As for your current place: a single environment won't make them stop breaking things. Deploying much more often will. A staging environment, or gradual rollout in prod, doesn't help if you have a lengthy manual QA process, so you need to force deployments into a situation where there's no time for slow and manual steps.