Would you suggest this route? It seems to me that there's little reason to upgrade if there have been no issues in the past.
What have your experiences been with running your workstations with ECC RAM? Is it worth the additional cost?
[0] https://news.ycombinator.com/item?id=34470900
One of the benefits of ECC RAM is the ability to detect single and double-bit memory errors. In the absence of that capability, how would you know if you have a problem with your RAM?
> What have your experiences been with running your workstations with ECC RAM?
For AMD desktop processors, there is a little more effort involved in building an ECC-capable system, because ECC is a feature provided by the motherboard maker, rather than a feature guaranteed to be present as part of AMD's platform. As such, you need to verify that your motherboard actually supports it.
Also, ECC memory largely lacks memory auto-overclocking features like XMP, EXPO, or whatever AMD called it in the past. As such, the memory will default to conservative bandwidth and latency settings (rather than the faster settings typically offered on enthusiast RAM), and it will be on you to overclock it if you wish to do so.
(On the positive side, ECC memory also largely lacks the "heat spreaders" and other plastic/RGB bits found on enthusiast memory, which means it's more likely to fit properly underneath a tower-style CPU air cooler.)
Lastly, unbuffered ECC can be somewhat difficult to find at typical consumer electronics retailers, and you'll likely have better luck getting it from vendors that specialize in server memory.
As an anecdotal reference, my home "NAS", with only 16GB of cheap ADATA DDR4 2666MHz RAM, had been running non-stop for half a year as of a few weeks ago, and I didn't see a single RAS event in dmesg. It could be survivorship bias, though.
---
On a typical home/personal workstation, we normally run bursty workloads (i.e. the machine is used from 9 to 6, with heavy usage for 2-3 hours of that). It is also okay to shrug off a bit of bitrot, or even tolerate a system crash if the butterfly effect hits.
So another reason I don't use ECC RAM on workstations: remember that ECC RAM is notoriously slow at memory initialization and memory training.
Not sure if it's an AMD thing, but my friend's EPYC server with 256GB of DDR4 ECC RAM took half an hour to boot.
You are using a "work station", so getting the work done as quickly as possible is more important than having an extremely safe and reliable system. Speed and throughput are of the utmost importance, so I'd say it's okay to sacrifice ECC.
That said, I’ve seen a not insignificant number of computers in the wild that couldn’t calculate valid sha256 checksums when utilizing vector optimized implementations. Who knows how bad other hardware issues could be. You just wouldn’t know. I would pay a little extra for ECC memory given the choice.
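For anyone who wants a quick sanity check along these lines, here is a minimal sketch in Python. It assumes only the standard `hashlib` module (whether that uses a vector-optimized SHA-256 depends on the underlying build), and the function name `sha256_self_check` is made up for illustration. It compares against the well-known "abc" test vector and then hashes the same buffer repeatedly to look for inconsistent results. Passing it does not prove the hardware is healthy; failing it is a strong hint that something is wrong.

```python
import hashlib

# Standard SHA-256 test vector for the ASCII string "abc".
ABC_DIGEST = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"


def sha256_self_check(rounds=1000, size=1 << 20):
    """Return True if SHA-256 results are correct and repeatable.

    Hypothetical helper: hashes a known vector once, then hashes the
    same buffer `rounds` times and checks every digest is identical.
    """
    if hashlib.sha256(b"abc").hexdigest() != ABC_DIGEST:
        return False  # wrong answer on a known input

    # Deterministic buffer of roughly `size` bytes.
    buf = bytes(range(256)) * max(1, size // 256)
    reference = hashlib.sha256(buf).hexdigest()

    # Any mismatch means the same input produced different output.
    return all(
        hashlib.sha256(buf).hexdigest() == reference for _ in range(rounds)
    )


if __name__ == "__main__":
    print("consistent" if sha256_self_check() else "MISMATCH detected")
```

Running it in a loop while the machine is under load is more likely to surface marginal hardware than a single pass.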
In fact, I have two Dell Precision T7810 workstations at home with 144 GB of ECC memory and dual Xeons totaling 36 cores on each machine as my two primary personal computers.
Some higher-end Dell Precisions and, at least at one time, Lenovo ThinkPads had models with ECC RAM, and if they still do, that's what I'm getting next time I need a new machine.
However, as both software complexity and RAM capacity increase, the likelihood of memory corruption increases as well, even with memory reliability being quite high.
I don't run ECC on my gaming computer (I also run its hard drives in RAID0; there I care about performance, capacity, and cost-effectiveness).
I do run ECC on my file storage (ZFS server) and on my workstation. It might not be strictly needed on my workstation, but I just don't want to think about it, and it's one more factor that improves reliability.
For personal use, I can't justify the increased mobo + RAM costs for a minimal improvement in reliability. RAID for magnetic drives and UPS are better investments for increased reliability.
On my personal machine I like to live dangerously - it is running RAID0 and has non-ECC RAM - because everything is reproducible and I'm happy to rebuild it periodically.
On my home NAS, I do the complete opposite (running RAID1, ZFS and have ECC RAM), because it is the primary data store for things I care about.
ECC is usually reserved for high-end CPUs like Threadripper/EPYC.
> What have your experiences been with running your workstations with ECC RAM?
Correctable bit-flips.
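
If you want to see those counts yourself on Linux, here is a rough sketch. It assumes ECC is actually active and an EDAC driver is loaded, in which case the kernel exposes per-memory-controller counters as `ce_count` (corrected) and `ue_count` (uncorrected) under `/sys/devices/system/edac/mc/`. The function name `read_edac_counts` is made up for illustration, and on a machine without EDAC it simply returns nothing.

```python
from pathlib import Path


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from EDAC sysfs.

    Hypothetical helper. Returns an empty dict when the directory is
    missing (no EDAC driver loaded, ECC not active, or not Linux).
    """
    counts = {}
    root = Path(base)
    if not root.is_dir():
        return counts

    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            corrected = int((mc / "ce_count").read_text().strip())
            uncorrected = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers whose counters are unreadable
        counts[mc.name] = (corrected, uncorrected)
    return counts


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found")
    for name, (ce, ue) in counts.items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

A nonzero corrected count that keeps climbing is usually the cue to go find the bad DIMM.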