HACKER Q&A

What are some examples of cloud lock-in?


We are shaping a cloud strategy and would like to understand what typical examples of cloud vendor lock-in look like. We are considering OpenShift as a way to reduce potential vendor dependency and would like to hear the community's opinion.


  👤 zwkrt Accepted Answer ✓
Having worked at AWS and then a litany of other companies in Seattle that mostly use AWS or Google Cloud, here's my perspective on some lock-in that might not be actively on your mind:

  * Larger companies generally have contracts with cloud providers to pay lower rates. Sometimes these contracts include obligations to use a technology for a certain period of time to get the reduced rate.
  * Any technology that isn't completely lift-and-shift from one cloud provider to another. It used to be that a JAR deployed to a 'real' host (say EC2), reading config from environment variables and communicating over HTTP, was the gold standard here. Now Docker broadens the possibilities a bit.
  * All the cloud providers have annoyingly different queueing/streaming primitives (SQS, Kinesis, Kafka wrappers...), so if you are using those you might find it annoying to switch.
  * Even for tried-and-true technologies like compute, MySQL, and K/V stores, cloud providers offer lots of "Embrace and Extend" features.
  * If you are wise then you will have back-ups of your data in cold storage. Getting these out can be expensive. Generally getting your data out of the cloud and into another cloud is expensive, depending on your scale.
IMO the only way to truly avoid lock-in is to use bog-standard boring technologies deployed to compute instances, with very few interaction patterns other than TCP/HTTP communication, file storage, and DB access. For all but the largest companies and the most perverse scaling patterns, this will get you where you are going, and is probably cheaper than using all the fancy bells and whistles offered by whichever cloud provider you are using.
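To make that concrete, here's a minimal sketch of the "boring" pattern in Python (the variable names are invented for illustration): all provider-specific wiring arrives through environment variables, so the same artifact runs unchanged on any vendor's compute.

```python
import os

# Every provider-specific detail comes in via environment variables;
# the variable names below are illustrative, not a standard.
DB_URL = os.environ.get("APP_DB_URL", "mysql://user:pw@localhost/app")
QUEUE_URL = os.environ.get("APP_QUEUE_URL", "http://localhost:8080/queue")
STORAGE_DIR = os.environ.get("APP_STORAGE_DIR", "/var/data")

def healthcheck() -> dict:
    # The application itself never names a cloud provider.
    return {"db": DB_URL, "queue": QUEUE_URL, "storage": STORAGE_DIR}

if __name__ == "__main__":
    print(healthcheck())
```

Moving providers then means re-pointing a few environment variables, not rewriting application code.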

👤 sparker72678
I find the biggest source of lock-in is that it's really hard to move a bunch of data. The longer you stay, the harder it is to leave, because the impact (downtime, slowness, resources allocated to moving data instead of building your business) on your product/service is likely to be bigger.

As you research this, don't neglect the cost of attempting to remain vendor agnostic. Every level of abstraction adds new costs (for some people it's worth it, no doubt!). Sometimes it's more efficient to just go all-in with a vendor.


👤 jrudolph
Other commenters have covered the workload lock-in angle pretty well. Using Kubernetes as a target platform for your application already gives you a decent shot at workload portability. Keep in mind, though, that some K8s APIs are leaky abstractions. You pay with lock-in to K8s, of course. At the end of the day, lock-in is a matter of trade-offs.

An often overlooked angle is the "organizational lock-in" to the cloud. Adopting the cloud in any serious capacity, with more than a handful of teams/applications, means that you will eventually have to build up some basic organizational capabilities: setting up a resource hierarchy (e.g. an AWS Organization with multiple accounts), an account provisioning process, federated authentication, chargeback... See https://cloudfoundation.org/maturity-model/ for an overview of these topics.

To be honest, I have seen quite a few enterprise organizations that went through so much organizational pain integrating their first cloud provider that implementing a second provider is not really that exciting anymore. Now of course, if you plan on eventually leveraging multi-cloud anyway, you can save yourself a lot of pain by setting things up with multi-cloud in mind from day one.

A good read on the topic is "Cloud Strategy" by Gregor Hohpe: https://architectelevator.com/book/cloudstrategy/


👤 sokoloff
My 2¢ is "most companies should not avoid vendor lock-in, but rather should lean in and make maximum productive use of the tools and features that their chosen cloud vendor provides". Engineering time and attention are more expensive than most people give them credit for, and designing for a future, seamless cross-cloud migration is building a gold-plated pyramid of YAGNI for most companies.

👤 ricksebak
Engineering staff, for sure.

If my company decided all of a sudden to move from AWS to some other cloud (and nearly all of my experience as an engineer is with AWS), that's a big headache for me and a lot of technology I would have to re-learn. Or I could just go find another shop that is staying on AWS, and probably get a pay bump for myself too.


👤 starik36
Not necessarily programming related, though I used programming to get myself out of this pickle.

Over the years I've created many Google Photos accounts for various trips and events (and to keep things free, since 15 GB comes with each account). Now I wanted to consolidate everything into a single account. You would think it would be easy, but it's not. You can move photos from one account to another, but not the painstakingly created albums and other customizations.

I've had to use a combination of Google Takeout and the Google Photos API (which is, in itself, such a half-assed implementation) to move everything intact to a new account.


👤 kkfx
Much depends on what you count as "real" lock-in. Let's start with a dummy example: all cloud vendors offer some APIs, and there is no formal lock-in there; you can use them to push and pull data as you wish. BUT wait a minute: you USE THEM, which means you craft something of yours on top of certain third-party APIs. If they change, you have to change. If they do not work, your service will not work (at least not normally) either. You might say, "hey, but almost ALL software uses someone else's code." Yes, sure, but cloud APIs mean code you do not run; it's code that runs on someone else's iron. That's a TERRIBLE lock-in, even if in formal terms the data and the logic on top are at your disposal.

Habits are another soft but, in practice, VERY HARD form of lock-in. Let's say your employees already know, on average, at least a bit of Zoom or Teams or Meet. There is no lock-in in choosing one of those platforms, formally. In practice you get a certain UI for users, they get to know it, and they are lost in case of a change, MOST users at least. Oh, just look at the sorry state of development of open VoIP tech; most sysadmins nowadays even HATE deskphones...

There are many examples like these that are NORMALLY not called lock-in but are, in practice, among the hardest soft lock-ins to break.

Oh, another form: you decide to quit, let's say, Backblaze. OK, no lock-in formally... Just... where do you want to put your gazillions of PB of crap, ahem, backups?


👤 Sebb767
One thing I haven't seen mentioned yet: the infrastructure setup. If you're only running a VM or a container, this won't be a problem, but if you have any setup that creates stacks on demand, needs dynamic DNS entries, does service discovery, or similar (which you will most likely have in any medium-to-large setup), you'll discover that switching cloud providers involves a lot of friction in your application. This is especially fun if some steps are synchronous (i.e. a single API call) with cloud provider A but callback-based with cloud provider B, as in the sketch below.
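To illustrate that sync-vs-callback friction in Python (the client objects and method names are hypothetical stand-ins, not any real SDK):

```python
import time

def create_dns_record_sync(client, name: str, ip: str) -> None:
    # Provider A style: a single synchronous call; the record is live on return.
    client.create_record(name=name, value=ip)

def create_dns_record_polled(client, name: str, ip: str) -> None:
    # Provider B style: the call only starts an operation, so the caller
    # has to poll (or register a callback) before continuing the stack build.
    op = client.start_create_record(name=name, value=ip)
    while not client.get_operation(op["id"])["done"]:
        time.sleep(2)
```

Orchestration code that assumed the first shape tends to need real restructuring, not just a renamed API call, to fit the second.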

Terraform can help a bit, but a lot of the examples I've seen are very AWS-dependent, and it won't be as simple as changing an API key to deploy somewhere else (however, you'll still have a somewhat documented infrastructure, so there's that). OpenShift and Kubernetes help a lot, but you'll be paying extra for using abstractions that aren't native to your specific cloud and, at least in my experience, some quirks will still end up in your app, most likely somewhere in inbound routing and monitoring.

That being said, vendor lock-in is a big topic, but depending on your situation, you really need to look at how much risk you are mitigating for your effort. None of the big clouds is likely to shut down unexpectedly (not even GCP), and no matter how much you prepare, moving a large infrastructure from cloud A to cloud B is always going to be both expensive and time-consuming; you will not do it for a minor reduction in the bill. If you really want to avoid lock-in, the actual way is to go multi-cloud, but that is a lot of extra effort, and I'd wager the expense is not worth it for most companies (except for backups).


👤 alphabettsy
Data systems or services like DynamoDB on AWS, which are not compatible with, or available on, other platforms.

Things like Security Groups can be another one; they don't necessarily translate directly to what you'll find on other platforms.

You don’t mention what clouds you’re evaluating, but avoiding services that aren’t available elsewhere or at least don’t have wire compatibility with those available elsewhere would be my recommendation.

e.g. Kafka on AWS (Amazon MSK) over SQS or Kinesis.
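To show what wire compatibility buys you, a sketch with the kafka-python client: the broker address is the only provider-specific input, so the same producer can point at Amazon MSK, a self-hosted cluster, or another vendor's Kafka-compatible service (auth/TLS settings omitted for brevity).

```python
import os
from kafka import KafkaProducer  # pip install kafka-python

# Only the bootstrap address changes between providers; the Kafka
# wire protocol itself is the portable interface.
producer = KafkaProducer(
    bootstrap_servers=os.environ.get("KAFKA_BROKERS", "localhost:9092"),
)
producer.send("events", b"hello")  # topic name is illustrative
producer.flush()
```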


👤 samtho
I've been consulting exclusively on contract for the last 2.5 years, and the biggest issue I've seen is over-reliance on AWS Lambda. People tend to go crazy because functions are so easy to spin up; however, they quickly find themselves with runaway AWS costs. The problem of infrastructure cost, which the hyper-scaling cloud providers were supposed to make insignificant next to development cost, becomes salient once again. When your AWS spend, relative to your revenue, starts to impact your ability to add headcount, something is really wrong. The problem with how the teams I've worked with use Lambda is that they tend to use all the latest AWS-specific features and reject any abstraction framework. This makes it hard to move shop to a cheaper provider, so they instead opt to reunite the functions into a single executable application, and we're back to an Express app deployed on EC2.
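A sketch of the kind of abstraction being rejected here (names are illustrative): keep the business logic in a plain function and make the Lambda handler a thin adapter, so the same code can later sit behind an ordinary HTTP framework on EC2.

```python
import json

def process_order(payload: dict) -> dict:
    # Pure business logic: no AWS types, no Lambda context object.
    return {"order_id": payload.get("id"), "status": "accepted"}

def lambda_handler(event, context):
    # Thin AWS-specific adapter (API Gateway proxy event shape);
    # trivially replaceable by a Flask/Express-style route later.
    body = json.loads(event.get("body") or "{}")
    return {"statusCode": 200, "body": json.dumps(process_order(body))}
```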

👤 adra
OpenShift is just k8s with batteries included, so the batteries are, in a real sense, a lock-in to the OpenShift community; take that trade-off as you like. I wouldn't mind, but since your initial post was about the fear of lock-in, this may be a place to start.

Other examples:

- full-spec S3 (the basics are well copied at this point)
- GraphDBs, which I've found differ in many ways between cloud vendors
- k8s load balancer bridges, which are different for each vendor, though the big three have more or less feature parity with one another, just different impls

Just as with OpenShift, you'll start to see trade-offs between vendor convenience, 'lock-in', and cost, and you'll ultimately have to choose what's most important to your business.


👤 hdjjhhvvhga
Just have a look at the list of AWS services and you'll find over a hundred examples. But when you look deeper, it turns out people get locked in because they want to, not because they have to. A good example is ECR. You really don't have to use it to use Docker effectively on AWS, but it's slightly more convenient than spinning up an EC2 instance with a private Docker registry. It's also well documented, many people use it, and so on. So you get hooked, you write scripts, and when you finally start thinking about switching, you realize the sheer amount of work needed to modify all of this is just scary. So you say, "I can't afford the downtime," and continue with AWS.
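As a sketch of how those scripts accumulate, here is the ECR-specific authentication dance with boto3; every pipeline that contains a block like this has to change if you move registries:

```python
import base64
import boto3  # pip install boto3

# ECR-specific: exchange AWS credentials for a short-lived Docker login.
ecr = boto3.client("ecr")
auth = ecr.get_authorization_token()["authorizationData"][0]
user, password = base64.b64decode(auth["authorizationToken"]).decode().split(":")
registry = auth["proxyEndpoint"]
# ...followed by `docker login {registry}`, `docker push`, and so on.
```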

👤 lee101
A micro example is that you can't easily scale RDS/cloud DB disk size downward, only upward. So if you haven't sorted out your archiving beforehand, you may be stuck paying more for storage until you have both migrated data away and done the extra legwork to scale back down.

Macro examples: cases where there are incentives to use something for a period, e.g. sustained-use or yearly discounts; incentives to use proprietary technology, such as S3 or DynamoDB being cheap; and situations where migrations are hard (data), expensive (cold storage), or dangerous and slow to recover from (such as changing DNS).


👤 philip1209
Another one: IP addresses. If you offer custom domain functionality, it's prohibitively hard to reach out to all of your customers to coordinate "switch your A record from this IP to that one."

👤 LinuxBender
At a former company, our development teams addressed this by building their own API gateways in front of the APIs in AWS and Azure, so that the same primary codebase could run in multiple clouds. I don't have the specific details, but this is absolutely a concern, and I am not aware of a turn-key solution for it. OpenShift would still need code to talk to all the API interfaces of each cloud vendor; I have no idea if that is baked in at this point. It is a bit of work up front, but worthwhile in my opinion.
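A minimal sketch of the gateway idea (using the real boto3 and azure-storage-blob client libraries, with an invented interface): application code depends only on the small `BlobStore` surface, and each cloud gets its own adapter.

```python
from typing import Protocol

class BlobStore(Protocol):
    def put(self, key: str, data: bytes) -> None: ...

class S3Store:
    def __init__(self, bucket: str):
        import boto3  # pip install boto3
        self.s3, self.bucket = boto3.client("s3"), bucket

    def put(self, key: str, data: bytes) -> None:
        self.s3.put_object(Bucket=self.bucket, Key=key, Body=data)

class AzureBlobStore:
    def __init__(self, conn_str: str, container: str):
        from azure.storage.blob import BlobServiceClient  # pip install azure-storage-blob
        self.svc = BlobServiceClient.from_connection_string(conn_str)
        self.container = container

    def put(self, key: str, data: bytes) -> None:
        self.svc.get_blob_client(self.container, key).upload_blob(data, overwrite=True)
```

The trade-off, as other commenters note, is that the interface has to stay at the lowest common denominator of the clouds you target.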

👤 sam0x17
Monitoring and logging dashboards and the like are almost entirely vendor-specific unless you roll your own via structured logging, and even then, how you consume the logs can be a migration pain point.
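A sketch of the structured-logging hedge with only the Python standard library: emit JSON lines to stdout and let whichever vendor's agent ship them, so the dashboards are rebuildable rather than load-bearing.

```python
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        # Vendor-neutral shape: any log shipper can ingest JSON lines.
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        })

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])

logging.getLogger("app").info("checkout completed")
```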

👤 Sirikon
The most obvious one is exclusive products or features that only exist within a single cloud provider.

Binding your scripts to a non-standard API will complicate any migration away from it and involve a lot of work (example: migrating away from Azure Resource Manager templates).

AWS outbound data is expensive. That complicates any data migration out of AWS, as well as communication between machines inside and outside AWS (example: cheaper machines at another infrastructure provider exchanging a lot of data with AWS machines).


👤 heax
If you want the cloud to be a powerful tool, you will use the providers' higher-level services. These are incompatible between clouds and cannot reasonably be abstracted away.

The simple services, like file storage, that you could reasonably abstract away in your code are also the ones you can migrate when needed.

If you don't want lock-in, you might be better off with a traditional hosting provider. The cloud is only really useful if you go in with a mindset of embracing what it offers.


👤 yellow_lead
Some OpenShift specific advice:

There are OpenShift components that are not present in native k8s, e.g. the OpenShift router, the OpenShift dashboard, and some management tools. All OpenShift commands also use `oc` instead of `kubectl`. If you rely heavily on this stuff in build scripts, processes, or running applications, migrating at some later point could be a good amount of engineering work.


👤 l0b0
Having worked briefly with OpenStack and some years with AWS, I'd say OpenStack is great at what it does, but it does only a tiny fraction of what AWS does. For example, one of the more annoying findings was that its security controls were utter garbage compared to AWS's policies. That said, if it does enough for you, then avoiding lock-in is worth quite a lot in the long run.

👤 tester756
I've heard that every year we have fewer people with "traditional" admin skills, or fewer willing to work for non-devops salaries,

so it seems like in a few years that's gonna be the biggest vendor lock-in,

cuz admins (or, actually, their new name: devops) will only be able to operate on the abstractions.yaml provided by $cloud.


👤 comprev
The cost of "lock-in" is often less than the cost of building a "cloud agnostic" product.

Trying to work around cloud-vendor-specific nuances can increase the risk of that component/feature failing. The increased risk and longer development time might not be acceptable to management.


👤 khalidx
The two largest cloud lock-ins, hands down:

- IAM (think Active Directory, AWS IAM, SSO)
- Data egress (how much does it cost to get all my GBs out again? Answer: a lot)


👤 andrewstuart
Cloud is very expensive and very complex.

Evaluate costs against self hosting.


👤 MarcoSanto
So, I am contributing work to a product that solves this exact problem. Would you mind giving it a look to see if it fits your requirements? www.nuvolaris.io

👤 andrewstuart
It costs 9 cents per gigabyte to get data out of the big clouds: AWS, Azure, Google.

If you've got a lot of data in there, it may never come out.
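Back-of-envelope at that list price: 10 TB of egress is roughly $900, 100 TB roughly $9,000, and a petabyte on the order of $90,000, before any negotiated discounts.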


👤 Cullinet
The most vexing secondary lock-in effect with Azure comes when you need Microsoft licensing: Microsoft discounts its licences on Azure and offers only palliative adjustments to smaller resellers of cloudy systems bundling Microsoft licence agreements. I could almost ignore that, using up my quota of enterprise-licensing cynicism, except that margin extraction from competing cloudy resellers can't help but affect the level of hardware those resellers purchase for Microsoft customers, who I believe are getting depressed specs as a consequence of this squeeze. That inevitably has a compounding effect on platform renewal schedules and planned performance purchase points, which can only push the package customers get downwards.

I am not convinced the impact is so directly causal, simply because of the relatively small scale of independent clouds selling Microsoft contracts. Nevertheless, this preferential self-dealing could easily be the motor behind slower upgrade cycles at lower-budget configurations, leading to increasing compression of the options available for Microsoft capacity. Anecdotally, I've found it increasingly difficult to find equivalent instances outside of Azure. If that isn't an anti-competitive practice, it is certainly a very harsh environment for resellers, with real effects on customer independence, and I surmise that Microsoft probably sees its position in ten years as a much bigger and more attractive single source by default, like Oracle. (If being much more attractive than Oracle sounds attractive to you, I would like to hear how.) At least for Oracle, now that installs are nearly all hard mission-critical, F500-budget, full-metal-jacket affairs, I can rationalise the position, because Oracle at size is going to run on Oracle hardware. But the thought of Microsoft lurching down the same tortuous path, redeemed only by the fact that it's almost impossible for competition to follow, taking the whole intensity of x86 competition off the table, and with it a huge part of the value proposition Redmond ought to be nurturing (passing on economies of scale plus the benefits of platform competition, less some reasonable vig), is simply abhorrent. Not least because a dominant swing-volume customer becoming insensitive to the benefits of innovation is tremendously bad for the entire industry ecosystem. This won't have to carry on for long before I conclude that Microsoft is going to be an ARM vertical within the next ten years.


👤 VLM
I was involved in a long-term project/service that moved from private bare metal, to private VMware, to private OpenStack, to vastly newer private VMware, to AWS.

You'll probably laugh, but the biggest problem we had was security: moving from one thing to another is possible, but the "in between" needs to be as secure as normal operation, so you write a LOT more security-group-style access lists than you'd think. If your process involves five middleware servers, there is a large number of possible partial moves, and each combination requires a security review of its set of access lists.

The second biggest problem we had with "always be moving" was latency. "Sure, we can move the MySQL server today and the front-end servers next month." Then insert a surprise 30 ms of latency that nobody ever expected, and suddenly round trips go from immeasurable and unimportant to a pretty big deal while only some parts have been moved. It's also funny watching front-end people who did "SELECT *" and filtered in their front end, because it was a 10G connection, discover that things are a little slower when the DB is far away on the internet.

The third biggest problem we had was documentation. "Hey, it's Tuesday, so does anyone know where the 'ohio' database is today?" Anybody can move stuff; the bigger question is whether everyone can find it, LOL. Whatever you use, a wiki or some more elaborate project planner, everyone needs to be able to find where "stuff" currently is. How many people need to know where things are, and what's the labor cost of moving something?

The fourth biggest problem, which has mostly gone away with Docker, was the old-fashioned version bump. "Well, we're moving it anyway, and it would take extra work to either upgrade the old xyz-server from 3.99 to 4.00 or install 3.99 on the new cloud, so what could possibly go wrong?" Turns out: a lot, mostly performance-tuning related, although occasionally "ha ha, we deprecated that feature two years ago, so we removed it last week because nobody would notice." So try not to merge an upgrade process with a cloud conversion process if at all possible.

The fifth biggest problem we had was budget. The bean counters liked how the electric bill was about constant and the bare-metal servers cost a fraction of HVAC and lighting, even if you had to replace an HDD every couple of YEARS. Suddenly "operations" became bean-countable with the cloud, and different clouds count different beans, so someone else is now driving "how we're going to save money": overtime and weekend labor is "free" if it's salaried, but god help anyone provably "wasting" 25 cents on the AWS bill, and the best way to get a promotion is to force someone else's team to work Saturday and Sunday, unpaid but salaried, to save fifty cents of AWS budget. The lock-in is that your internal procedures will come to depend on some cloud provider's crazy, arbitrary billing ideas, and they're not all on the same page. The end users will never accept a truthful explanation along the lines of "sorry, we can't change the font on that webpage because of AWS," which in full technical detail would run to an entire page of tech and billing nonsense.

I think the real meta-observation about cloud lock-in is that front-line salaried employees don't like putting in 30+ hours of unpaid maintenance-window overtime just to lower the cloud bill by $5 so some manager can get a promotion. "Technically possible to move and reduce the bill" doesn't save money if employees quit or essentially rebel.