What are your tips to keep a server running for a very long time?
Do you ignore restart warnings after kernel updates?
Do you actively maintain it, or do you just semi-abandon it and let it keep going?
What OS do those servers run?
These days I'd much rather have my VMs rebuilt nightly than chase a long uptime badge of honour.
They were powered down and scrapped when the datacenter was shut down.
If it's unfathomable to run your application, compiled today, on both an OS build from today and an OS build from eight years ago, you can't approach eight years of uptime.
I've run systems with an expectation that they'll hit 1000-day uptimes, and some hit 1500, IIRC, but I don't remember any reaching 2000. Most of this was on FreeBSD, partly because that's what we ran at my major employers. Where I've seen Linux, there tends to be much more churn, and it's hard to run the same application on three-year-old Linux and current Linux, so the bias is to roll everything forward with more force.
> What are your tips to keep a server running for a very long time?
Quality components, solid utility power, and proper design and maintenance of backup power. But you also need to be ready for failures: automatic transfer switches fail, individual server power supplies fail, etc. Don't let rebooting be a common troubleshooting step, which means investigating problems even if a reboot fixes them.
Try to reduce dependencies, and choose dependencies with long-term stability. If many of your dependencies have frequent large updates, and so do their dependencies, you're stuck integrating all of that and you don't have a stable base.
> Do you ignore restart warnings after kernel updates?
I would rarely install a kernel update if I didn't intend to reboot to load it. All kernel updates (whether from upstream or internal changes) need to be considered for all machines, but at least for me the decision was more often 'that's not needed for these servers at this time' or 'we'll need to reboot all of this kind of server to get this update (maybe asap, maybe in the next month)'; it was very rare to get to 'let's install this and they'll get it when they get it'. There's certainly a class of issues where a fix would be nice but not urgent, though I found it rare, and it's also kind of weird when you update and only see the results months later. For a big fleet upgrade, though, sometimes it makes sense to do a wide installation, and a couple of machines might reboot for other reasons before they get their sequential update.
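As a rough illustration of the "don't install a kernel you don't intend to boot" policy, here is a minimal Python sketch that flags a host whose installed kernel differs from the running one. It assumes FreeBSD's `freebsd-version` utility, whose `-r` and `-k` options report the running and installed kernel versions; the function names `reboot_pending` and `kernel_versions` are hypothetical, not from any existing tool.

```python
import subprocess


def reboot_pending(running: str, installed: str) -> bool:
    """Return True when the installed kernel differs from the running one."""
    return running.strip() != installed.strip()


def kernel_versions() -> tuple[str, str]:
    """Query a FreeBSD host for (running, installed) kernel versions.

    Assumes the freebsd-version utility is on PATH; this will raise
    FileNotFoundError on other operating systems.
    """
    running = subprocess.run(
        ["freebsd-version", "-r"], capture_output=True, text=True, check=True
    ).stdout
    installed = subprocess.run(
        ["freebsd-version", "-k"], capture_output=True, text=True, check=True
    ).stdout
    return running.strip(), installed.strip()
```

On a FreeBSD host, `reboot_pending(*kernel_versions())` returning True would mean an update was installed but not yet booted, which is the state I tried to keep rare.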
> Do you actively maintain it, or do you just semi-abandon it and let it keep going?
It's active maintenance; but when you consider updates and decide not to do almost all of them, that might look like abandonment.
> What OS do those servers run?
As mentioned, FreeBSD.
What’s really impressive is low MTTR.
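For anyone unfamiliar with the metric: MTTR (mean time to recovery) is the total time spent recovering divided by the number of incidents. A minimal Python sketch, where the `mttr` helper and the incident timestamps are made up purely for illustration:

```python
from datetime import datetime, timedelta


def mttr(incidents: list[tuple[datetime, datetime]]) -> timedelta:
    """Mean time to recovery over (down_at, restored_at) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((restored - down for down, restored in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incidents, for illustration only.
incidents = [
    (datetime(2024, 1, 3, 2, 0), datetime(2024, 1, 3, 2, 10)),    # 10 minutes
    (datetime(2024, 3, 9, 14, 0), datetime(2024, 3, 9, 14, 30)),  # 30 minutes
]
print(mttr(incidents))  # 0:20:00
```

A server with a 1500-day uptime says nothing about this number; a fleet that is back in service within minutes of a failure does.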