Unless you're 100% in a single cloud and a single region, there's always going to be something that broke because it had a dependency on whatever part of the cloud went down, and you can work to mitigate that.
Events like this create increased workload for external and internal customer service teams, so people who would normally be doing engineering can pitch in there. Internally especially, hearing from senior dev/ops people helps.
Shit-post memes about how there's nothing you can do because the cloud is down.
If the outage is on AWS, make up plans for how you should move the infrastructure off AWS onto GCP because AWS has so many outages. If the outage is on GCP, make up plans for how you should move the infrastructure off GCP onto AWS because GCP has so many outages.
Make up plans about how you can scale to multi-cloud multi-region deployments to mitigate against the next cloud outage. Once you finish doing the math, realize that you're actually making things more brittle and thus causing more outages, and that nobody wants to pay to avoid the hour or three of downtime per year. Realize that the developer who pushed the bad ALTER TABLE statement into production three weeks ago caused a bigger outage than the cloud ever will, and begin to question your life choices.
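The "doing the math" step really is a back-of-envelope calculation. Here is a rough sketch of it in Python; every number (the SLA, the cost per hour of downtime, the multi-cloud price tag) is a made-up placeholder for illustration, not a real figure:

```python
# Back-of-envelope: is multi-cloud worth it? All numbers are hypothetical.
HOURS_PER_YEAR = 365 * 24

def downtime_hours(sla_pct: float) -> float:
    """Expected downtime per year implied by an availability SLA."""
    return HOURS_PER_YEAR * (1 - sla_pct / 100)

# A 99.95% single-region SLA allows roughly 4.4 hours of downtime a year.
single_cloud_downtime = downtime_hours(99.95)

cost_per_hour = 50_000  # hypothetical revenue lost per hour of downtime
annual_outage_cost = single_cloud_downtime * cost_per_hour

# Hypothetical ongoing cost of building and running a second cloud:
# extra engineers, duplicated infrastructure, cross-cloud egress fees.
multi_cloud_cost = 2_000_000

print(f"expected downtime: {single_cloud_downtime:.1f} h/yr")
print(f"outage cost: ${annual_outage_cost:,.0f}/yr "
      f"vs multi-cloud: ${multi_cloud_cost:,.0f}/yr")
```

Plug in your own numbers; for most shops the outage cost comes out an order of magnitude below the multi-cloud bill, which is the point of the paragraph above.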
Play some games with the team where you try to guess what sites/services are impacted. Loser has to stay late to close all the tickets generated when the cloud comes back up.
Shit-post more memes.
We can be just as affected by an AWS infrastructure outage as anyone. However, we also have the possibility of seeing some of the things that are going on internally to fix the problems. So, we can at least feel better about not being able to do our work for the day.
I can tell you this — being multi-region is a lot harder to get right than anyone gives it credit for. Even just plain multi-AZ is harder than most people give it credit for. There’s a lot of stuff going on under the hood, and all sorts of unintentional dependencies between systems that you didn’t realize existed until things broke in just the right way.
Trying to do multi-cloud? I wouldn’t wish that on my worst enemy.
The way cloud-scale works, there’s always something weird going on somewhere. Always.
A well-architected system will try to be rugged and resilient to those failures, but there’s always going to be a limit to the kinds of failure modes you can predict and prepare for.
We saw that when I was the Sr. Internet Mail Administrator at AOL from 95-97, we see this today in our group at AWS, and my friends at Google have reported the same.