HACKER Q&A
📣 __warlord__

Best practices for multi-datacenter infrastructure?


I'm deploying a set of applications on top of Kubernetes, including GitLab, the Argo suite (Workflows, CD, Rollouts), Harbor, Prometheus, and more, and I need an active-active solution distributed across two data centers for (if possible) all of these tools.

One approach I'm investigating is having Kafka queue the commits (and other requests), with one consumer in each data center writing the commit into GitLab to execute CI/CD pipelines and deploy applications based on that commit. (CI/CD will only execute once.)

This way I don't have to worry about replicating each application's data from one data center to the other, or about where data is being written, and so on.
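
Concretely, I'm picturing each data center running a small consumer along these lines (a rough Python sketch using kafka-python; the topic name, brokers, GitLab URL, project ID, and trigger token are just placeholders). Both consumers would share one consumer group, so Kafka hands each commit event to only one of them:

    from kafka import KafkaConsumer  # pip install kafka-python
    import json
    import requests

    # All of these values are placeholders for my setup.
    GITLAB_URL = "https://gitlab.dc-local.example.com"
    PROJECT_ID = 42
    TRIGGER_TOKEN = "example-trigger-token"

    # Both data centers run this with the same group_id, so Kafka delivers
    # each commit event to exactly one of the two consumers.
    consumer = KafkaConsumer(
        "git-commits",
        bootstrap_servers=["kafka-1:9092", "kafka-2:9092"],
        group_id="gitlab-ci-dispatch",
        enable_auto_commit=False,  # commit the offset only after the trigger succeeds
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )

    for msg in consumer:
        event = msg.value  # e.g. {"ref": "main", "sha": "..."}
        # Trigger a pipeline on the local GitLab for this commit.
        resp = requests.post(
            f"{GITLAB_URL}/api/v4/projects/{PROJECT_ID}/trigger/pipeline",
            data={"token": TRIGGER_TOKEN, "ref": event["ref"]},
            timeout=30,
        )
        resp.raise_for_status()
        consumer.commit()  # at-least-once: a crash between trigger and commit replays the event

Since offsets are committed only after the trigger call succeeds, this is at-least-once delivery; the pipeline would still need to deduplicate on the commit SHA to really guarantee it runs only once.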

I want to ask HN: are there best practices or documentation on this kind of architecture that you can point me to?

Or is there a better way to approach this?


  👤 NicoJuicy Accepted Answer ✓
I did a quick check, but I'm not an expert.

Make one the main and keep the backup read-only.

https://docs.gitlab.com/ee/administration/read_only_gitlab.h...

You can stop the read-only instance from running CI.

Do a health check on the main instance and put a gateway in front of it.

If the main is down, lift the read-only restriction on the backup.
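
Very roughly, the health check could be a small loop like this (a Python sketch; the readiness URL is a placeholder and lift-readonly.sh stands in for whatever steps from the doc above you use to undo the read-only setup):

    import subprocess
    import time

    import requests

    MAIN_READINESS = "https://gitlab-main.example.com/-/readiness"  # placeholder URL
    FAILURES_BEFORE_FAILOVER = 3

    def lift_readonly_on_backup():
        # Placeholder: run whatever reverses the read-only setup from the doc above,
        # e.g. a site-specific script.
        subprocess.run(["/usr/local/bin/lift-readonly.sh"], check=True)

    failures = 0
    while True:
        try:
            requests.get(MAIN_READINESS, timeout=5).raise_for_status()
            failures = 0  # main is healthy, reset the counter
        except requests.RequestException:
            failures += 1
            if failures >= FAILURES_BEFORE_FAILOVER:
                lift_readonly_on_backup()  # and repoint the gateway to the backup
                break  # what to do when the main comes back is the open question
        time.sleep(30)

The /-/readiness endpoint is GitLab's built-in health check, so the gateway could poll the same thing.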

I'm not 100% sure what to do when the main comes back up though.

That aside, I'm not sure you want this. How much downtime per year has a data center outage actually blocked deployments or coding?


👤 wmf
Modifying each application to add active-active replication sounds like a nightmare. I would seriously question where these requirements are coming from.