HACKER Q&A
📣 cloudnewbie

Best Tools for High Availability PostgreSQL?


PostgreSQL has lots of different tools available for replication, though in various states of support and maintenance. What are your current recommendations for a small-scale Postgres cluster (e.g. 3 nodes on modest VMs)?

My use case is a relatively standard web app with a database backend. One write master is fine, but it should fail over automatically if the primary is unavailable. As a solo dev, I'd like to avoid the overhead that comes with tools like Kubernetes, if there are suitable alternatives.


  👤 hitpointdrew Accepted Answer ✓
The best stack I found for this is:

Keepalived -> pgbouncer -> postgresql

Then repmgr for managing replication and barman for backups.

The stack is nice because keepalived gives you a virtual ip that you point your apps to, then you can promote a standby to primary (or have one auto promote on a failure) and the VIP will flip to the new primary. All in all you get like 5-10 seconds of “down” time when it flips (depending on how aggressive or conservative you want to be with the rise and fall settings).

Edit: one caveat: you won’t get keepalived to work on AWS if you spread your Postgres servers across AZs; they would have to be in the same AZ.

Edit 2: You can simplify the setup if you don’t need connection pooling; in that case, skip pgbouncer.
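
A rough way to reason about the rise and fall settings mentioned above: a keepalived check script runs every `interval` seconds and is only treated as failed after `fall` consecutive failures (and as healthy again after `rise` consecutive successes). The sketch below estimates just that detection window. It ignores the check's own timeout, VRRP advertisement timing, and the time repmgr needs to promote a standby, and the sample numbers are illustrative assumptions, not measurements.

```python
from typing import Tuple


def detection_window(interval_s: float, fall: int) -> Tuple[float, float]:
    """Return (best, worst) seconds until a check is declared failed.

    Best case: the node dies just before a check runs, so the first
    failure is seen almost immediately and fall - 1 more checks follow.
    Worst case: the node dies just after a successful check, so a full
    interval passes before the first failure is seen.
    """
    if interval_s <= 0 or fall < 1:
        raise ValueError("interval_s must be > 0 and fall must be >= 1")
    best = interval_s * (fall - 1)
    worst = interval_s * fall
    return best, worst


# Illustrative settings only: an aggressive and a conservative profile.
aggressive = detection_window(interval_s=1, fall=2)
conservative = detection_window(interval_s=2, fall=5)
```

Lower values flip the VIP sooner but make a false failover on a brief hiccup more likely, which is the trade-off behind "aggressive or conservative".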


👤 moehm
Do you need HA, or do you want to minimize downtime? At work we have something like an "error budget", where we accept downtime but try to minimize it. As such we have two nodes with one floating IP and a shared disk. A switchover takes as long as stopping the database on the first node, starting it up on the second one, and switching over the IP. Stuff like kernel updates takes us <1 minute of scheduled downtime, which is good enough for us.

Here is a good talk which resonated with me from the last pgconf: https://www.youtube.com/watch?v=_rYP6xVymtI

If you want more, I think Patroni (by Zalando) is the current best option for you. Patroni handles automatic leader election if the master goes down, and it is open source. Read more here:

https://github.com/patroni/patroni
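
To let an app or load balancer find the current leader, Patroni exposes a REST API on each node, and `GET /primary` answers 200 only on the node holding the leader lock. Here is a minimal sketch of probing for it, assuming the API listens on port 8008 (check your own `restapi` setting). The host names `pg1` to `pg3` and both helper functions are hypothetical, made up for illustration.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def http_status(url: str, timeout: float = 2.0) -> Optional[int]:
    """Return the HTTP status for url, or None if the node is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # a non-2xx answer, e.g. 503 from a replica
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, timeout, ...


def find_primary(
    hosts: Iterable[str],
    probe: Callable[[str], Optional[int]] = http_status,
    port: int = 8008,  # assumed Patroni REST API port
) -> Optional[str]:
    """Return the first host whose /primary endpoint answers 200."""
    for host in hosts:
        if probe(f"http://{host}:{port}/primary") == 200:
            return host
    return None
```

Calling `find_primary(["pg1", "pg2", "pg3"])` would return the leader's host name, or `None` while an election is in progress. In practice the same endpoint is usually wired into a load balancer health check rather than polled from application code.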


👤 yen223
I used Amazon RDS in my past job, which supports automatic failover and read replication. It was fine. Fairly low-maintenance on my part, which is always good.

👤 ahoka
Do you really need HA?

👤 Malidir
Run in Docker?