HACKER Q&A
📣 mekster

What's the longest uptime you have for a server?


I once had a server I set up for a company's internal use, and the other day I looked at its uptime: it was around 3000 days, which was interesting.

What are your tips to keep a server running for a very long time?

Do you ignore restart warnings after kernel updates?

Do you actively maintain it, or do you just semi-abandon it and let it keep going?

What OS are those servers running?


  👤 glawre Accepted Answer ✓
Early in my career I definitely worked with a few systems that had 5+ years uptime. I was always particularly impressed by mainframe / NonStop architectures.

These days I'd much rather have my VMs rebuilt nightly than chase a long uptime badge of honour.


👤 icedchai
I once did some contracting work for a company that had 1400-day uptimes on a couple of FreeBSD VMs. This was a $100 million (in revenue) firm. You’d think they’d have applied a patch and rebooted at least once in almost five years. Nope. It was mostly a Windows “enterprise” shop and they were scared to even touch Unix systems. The guy who set it up originally either quit or was fired, I forget. Apparently the key is doing nothing.

👤 Spooky23
Company I worked with had a rack with two functioning DEC terminal concentrators with an uptime just shy of 10,000 days. They were used 1-2 times annually because of a legacy mainframe print operation, but weren’t actually needed anymore. In their prime, they were part of a system that probably managed 40-50k terminals and printers.

They were powered down and scrapped when the datacenter was shut down.


👤 eb0la
About 800 days. It was a Silicon Graphics machine running Gaussian 24/7 for calculations. Once Gaussian started you couldn't touch the machine, because the researcher might have forgotten to add a checkpoint... We were only able to stop it to upgrade IRIX because the OS wasn't Y2K compliant and, without the patch, nobody would be able to log in.

👤 nickt
I’m sure this thread is related to the impressive DOS uptime thread [1], though in 1994 we had slightly better-spec machines running NetWare 3.11, IIRC, with almost 3 years of uptime.

[1] https://news.ycombinator.com/item?id=36731566


👤 iDemonix
3227 days, running RHEL 5, took a screenshot first, then powered it off a couple of months ago. Amazing what can lurk in big corp networks.

👤 toast0
If individual server OS uptime is important to you, you need either a system where you can hot-load updates or one where your service application can run across many versions of your chosen OS (or OSes).

If it's unfathomable to run an application compiled today on both an OS build from today and an OS build from eight years ago, you can't approach eight years of uptime.

I've run systems with the expectation that they'd hit 1000-day uptimes, and some hit 1500, IIRC, but I don't remember any reaching 2000. Most of this was on FreeBSD, partly because that's what we ran at my major employers, but also because where I've seen Linux there tends to be much more churn: it's hard to run the same application on three-year-old Linux and current Linux, so the bias is to roll everything forward with more force.
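For reference, the uptime everyone is quoting is just seconds since boot. A minimal sketch of reading it in days, assuming a Linux host that exposes /proc/uptime (on FreeBSD you'd derive it from the kern.boottime sysctl instead); the function name is only illustrative:

    # Minimal sketch: report uptime in whole days.
    # Assumes Linux, where /proc/uptime's first field is seconds since boot;
    # on FreeBSD you'd compute "now minus kern.boottime" instead.
    def uptime_days() -> float:
        with open("/proc/uptime") as f:
            seconds = float(f.read().split()[0])
        return seconds / 86400

    if __name__ == "__main__":
        print(f"up {uptime_days():.0f} days")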

> What are your tips to keep a server running for a very long time?

Quality components, solid utility power, proper design and maintenance of backup power. But you also need to be ready for failures: automatic transfer switches fail, individual server power supplies fail, etc. Don't let rebooting become a common troubleshooting step, which means investigating problems even when a reboot fixes them.

Try to reduce dependencies, and choose dependencies with long-term stability. If many of your dependencies have frequent large updates, and so do their dependencies, you're stuck integrating all of that churn and you don't have a stable base.

> Do you ignore restart warnings after kernel updates?

I would rarely install a kernel update without intending to reboot to load it. All kernel updates (whether from upstream or internal changes) need to be considered for all machines, but at least for me the decision was more often 'that's not needed for these servers at this time' or 'we'll need to reboot all of this kind of server to get this update (maybe ASAP, maybe in the next month)'; it was very rare to land on 'let's install this and they'll get it when they get it'. There's certainly a class of issues where a fix would be nice but there's no urgency; I found those rare, though, and it's also kind of weird to install an update and only see the results months later. Still, for a big fleet upgrade it sometimes makes sense to do a wide installation, and a couple of machines might reboot for other reasons before they get their sequential update.
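To make the install-versus-reboot bookkeeping concrete, here's a minimal sketch that flags a pending kernel reboot by comparing the running kernel release against the newest installed one. It assumes a Linux-style layout where installed kernels show up as directories under /lib/modules, and the sort is plain string order, so treat it as illustrative rather than robust:

    # Minimal sketch: is the newest installed kernel the one we're running?
    # Assumes installed kernels appear as directories under /lib/modules
    # (a Linux convention); plain string sort can misorder versions like
    # 5.9 vs 5.10, so this is illustrative only.
    import os

    def reboot_pending() -> bool:
        running = os.uname().release            # e.g. "5.15.0-91-generic"
        installed = sorted(os.listdir("/lib/modules"))
        newest = installed[-1] if installed else running
        return newest != running

    if __name__ == "__main__":
        if reboot_pending():
            print("reboot pending")
        else:
            print("running newest installed kernel")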

> Do you actually actively maintain it or you just semi abandon it to let it keep going?

It's active maintenance, but when you consider updates and decide to skip almost all of them, it can look like abandonment.

> What are the OS of those servers?

As mentioned, FreeBSD.


👤 yuppie_scum
Uptime is a Gen X metric.

What’s really impressive is low MTTR.