HACKER Q&A
📣 kbench

Is DNS Failover a Problem?


I've been in IT for quite a while, mostly in a single shop, and I'm exploring the idea of a small SaaS product for automatic DNS failover. Before diving in, I’d love to gauge how much demand there might be for something like this.

The service would monitor the status of your primary and backup endpoints. If the primary fails, it would automatically update the DNS record via API to point to a backup. It would also send alerts and continue monitoring backup endpoints. Naturally, API credentials would be encrypted for security.
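A minimal sketch of the failover decision described above, to make the moving parts concrete. Everything here is an assumption for illustration: `FailoverMonitor`, `update_dns`, and the three-strikes threshold are hypothetical names and values, and `update_dns` stands in for whatever provider API call would actually rewrite the record. Health checks are passed in as booleans so the logic stays independent of how probing is done.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FailoverMonitor:
    primary: str
    backup: str
    # Hypothetical callback: in a real service this would wrap the DNS
    # provider's API and point the record at the given address.
    update_dns: Callable[[str], None]
    # Assumed value: consecutive failed checks required before switching.
    failure_threshold: int = 3
    active: str = field(default="", init=False)
    failures: int = field(default=0, init=False)

    def __post_init__(self) -> None:
        self.active = self.primary

    def observe(self, primary_up: bool, backup_up: bool) -> Optional[str]:
        """Feed one round of check results; return the new target if we switched."""
        if self.active == self.primary:
            if primary_up:
                self.failures = 0
                return None
            self.failures += 1
            # Only fail over once the threshold is hit and the backup is healthy.
            if self.failures >= self.failure_threshold and backup_up:
                return self._switch(self.backup)
            return None
        # Currently on the backup: fail back as soon as the primary recovers.
        if primary_up:
            return self._switch(self.primary)
        return None

    def _switch(self, target: str) -> str:
        self.active = target
        self.failures = 0
        self.update_dns(target)
        return target
```

Failing back on the first healthy check is the simplest choice but can flap; a real service would probably want a recovery threshold as well.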

I’d offer a free tier, with a paid monthly plan for advanced features. While some DNS providers (like Cloudflare and Route 53) already offer similar functionality, others—such as GoDaddy—don’t seem to. I’m thinking this could be useful for smaller MSPs, solo developers, or businesses that need a simple, independent solution.

Does this sound like something people would find useful? I'd appreciate any feedback!


  👤 LinuxBender Accepted Answer ✓
Just my take, based on my own experience implementing GSLB DNS. You would be starting out with competition: UltraDNS has been doing GSLB DNS for some time, as has Dyn (now Oracle). Your customers would be small to small-evolving-to-mid-sized businesses. Larger businesses use anycast with active monitoring that withdraws down edge load balancers from advertisement. Most others are moving this responsibility to CDNs that run their own anycast, and some businesses mix CDNs with their own corporate anycast routing. Whether this is the right time given the current state of the market is outside my wheelhouse, though. It seems to me businesses are tightening their belts. Even the US government is finally tightening its belt in preparation for something.

Just opinions from a cranky old retired admin, though. Don't let me tarnish your dreams.


👤 mtmail
Nice coincidence, I was looking for such a solution last week. We use Cloudflare DNS and their traffic feature. It monitors whether our (redundant) endpoints are up; if one is down, it removes that DNS entry and alerts us. DNS is free, the traffic feature isn't. I was hoping DNSimple, a smaller SaaS provider, would offer this, but they don't. We thought about building a script ourselves, but that's another piece of our infrastructure that could fail, so it's better to outsource it. Cloudflare works for us, but I think their traffic feature is too expensive long-term for our use case. If I understand correctly, your solution would be independent and would update, for example, Cloudflare DNS via their API.
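The "remove the entry for a down endpoint" behaviour can be sketched as a pure function over check results. This is an illustration only, not how Cloudflare implements it: `healthy_records` is a hypothetical name, and the fallback of publishing every address when all checks fail is an assumed design choice (an empty record set would take the name offline even if the monitor itself is the thing that is broken).

```python
from typing import Dict, List


def healthy_records(endpoints: Dict[str, bool]) -> List[str]:
    """Return the addresses that should stay in the DNS record set.

    `endpoints` maps each address to its latest health-check result.
    """
    up = sorted(addr for addr, ok in endpoints.items() if ok)
    # Assumed policy: if nothing looks healthy, keep all records rather
    # than publish an empty set.
    return up if up else sorted(endpoints)
```

The result would then be pushed to the DNS provider through its API, which is the part an independent service would have to do per provider.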

👤 p_ing
Solved problem via [geo] load balancers such as Azure Front Door.

The thing to keep in mind is that you need to assume any given DNS server doesn't respect TTL. TTL should be considered a value that is ignored unless you operate both the DNS server and the clients. So while you update your TTL, some client in a remote region in the middle of nowhere is using a DNS server that sets all TTL values to 10 days. Now you've got a client that cannot connect.
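To put rough numbers on the TTL point, here is a back-of-the-envelope sketch. The function name, parameters, and the model itself are assumptions: detection time is taken as check interval times failure threshold, a misbehaving resolver is modelled as one that raises every TTL to some minimum, and DNS provider API latency is ignored.

```python
def worst_case_stale_seconds(
    check_interval: int,
    failure_threshold: int,
    record_ttl: int,
    resolver_min_ttl: int = 0,
) -> int:
    """Upper bound on how long a client may keep hitting the dead endpoint."""
    # Time for the monitor to notice the outage.
    detection = check_interval * failure_threshold
    # A resolver that clamps TTLs upward caches the old answer for longer
    # than the record's own TTL asks for.
    effective_ttl = max(record_ttl, resolver_min_ttl)
    return detection + effective_ttl


# Well-behaved resolver: 30 s checks, 3 strikes, 60 s TTL.
print(worst_case_stale_seconds(30, 3, 60))
# Resolver that forces every TTL to 10 days (864000 s).
print(worst_case_stale_seconds(30, 3, 60, resolver_min_ttl=10 * 24 * 3600))
```

Under these assumptions the low TTL only helps clients whose resolver honours it; for the rest, the resolver's own cache lifetime dominates.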