HACKER Q&A
📣 htneak

Why is data-related cloud architecture so complex?


In the years before the cloud became the new and exciting default for big (and not-so-big) data solutions, we had a relatively simple way to design and build data management projects. You had a database layer, maybe a data manipulation tool on top, and your effort went into understanding and modelling the data. This held for all but the largest data volumes.

Now, cloud providers seem to have moved on to selling us super-complex, multi-service "modern" architectures. Implementing one means juggling six or seven, sometimes ten or more, completely different technologies: file storage pulled into the mix (and rebranded as a data/delta lake), clusters of compute instances on top of it (e.g. Spark), an MPP system beside those, and so on. We seem to be making it all quite complex when neither the volume, variety, nor velocity of the data demands such treatment.

It feels like we are just throwing more and more software at non-existent problems and feeding an army of professionals along the way. Sure, for huge setups it makes sense, but I'm seeing companies whose datasets would fit in SQLite buying into this and just going with it for dubious reasons.
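To make the contrast concrete, here is a minimal sketch of the single-database setup the first paragraph describes, using only Python's standard library. The schema, table, and values are invented for illustration; the point is that ingest, modelling, and querying all fit in one tool:

  # Minimal sketch: the whole "stack" is one file plus the standard library.
  # Schema and data are made up for illustration.
  import sqlite3

  conn = sqlite3.connect("analytics.db")  # the database layer: a single file

  # The modelling effort goes here, not into wiring services together.
  conn.executescript("""
      CREATE TABLE IF NOT EXISTS orders (
          id         INTEGER PRIMARY KEY,
          customer   TEXT NOT NULL,
          amount_usd REAL NOT NULL,
          placed_at  TEXT NOT NULL  -- ISO 8601 timestamp
      );
      CREATE INDEX IF NOT EXISTS idx_orders_placed ON orders(placed_at);
  """)

  # Ingest: no separate pipeline service needed at this scale.
  conn.executemany(
      "INSERT INTO orders (customer, amount_usd, placed_at) VALUES (?, ?, ?)",
      [("acme",   120.0, "2023-01-05T10:00:00"),
       ("acme",    80.0, "2023-01-06T09:30:00"),
       ("globex",  45.5, "2023-01-06T11:15:00")],
  )
  conn.commit()

  # The "analytics layer": plain SQL.
  for customer, total in conn.execute("""
      SELECT customer, SUM(amount_usd) AS total
      FROM orders
      GROUP BY customer
      ORDER BY total DESC
  """):
      print(f"{customer}: {total:.2f}")

  conn.close()

A single SQLite file comfortably handles databases into the tens of gigabytes, which covers a lot of the workloads those multi-service stacks get sold for.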

It does make sense from the point of view of cloud providers and consultancies, but I am wondering: does all this seem a little over the top to everyone, or is it just me?


  👤 brodouevencode Accepted Answer ✓
There's nothing stopping you from reproducing your legacy/monolithic architecture in the cloud. But to your point, the options are endless, often confusing, and sometimes overlapping.
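For instance, "the monolith in the cloud" can be as plain as pointing the same SQL at one managed database. A minimal sketch, assuming a managed Postgres instance (the endpoint, credentials, and orders schema are hypothetical placeholders, not a real deployment):

  # Hedged sketch: the legacy single-database pattern, pointed at a managed
  # cloud Postgres (e.g. RDS / Cloud SQL) instead of an on-prem box.
  import os

  import psycopg2  # third-party driver: pip install psycopg2-binary

  conn = psycopg2.connect(
      host=os.environ["DB_HOST"],  # hypothetical managed-instance endpoint
      dbname="analytics",
      user=os.environ["DB_USER"],
      password=os.environ["DB_PASSWORD"],
  )

  # Same modelling, same SQL as before the migration.
  with conn, conn.cursor() as cur:
      cur.execute("""
          SELECT customer, SUM(amount_usd) AS total
          FROM orders
          GROUP BY customer
          ORDER BY total DESC
          LIMIT 10
      """)
      for customer, total in cur.fetchall():
          print(f"{customer}: {total:.2f}")

  conn.close()

One managed instance, one driver, the same modelling discipline as before; the rest of the service catalogue stays optional.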