HACKER Q&A
📣 specktr

Do you use ECC RAM in your home/personal workstations?


After coming across this[0] thread yesterday, I'm considering upgrading the RAM on my Ryzen 9 3900X workstation to ECC to protect myself from data integrity issues in the future.

Would you suggest this route? It seems to me that there's little reason to upgrade if there have been no issues in the past.

What have your experiences been with running your workstations with ECC ram? Is it worth the additional cost?

[0] https://news.ycombinator.com/item?id=34470900


  👤 theevilsharpie Accepted Answer ✓
> It seems to me that there's little reason to upgrade [to ECC RAM] if there have been no issues in the past.

One of the benefits of ECC RAM is the ability to detect single and double-bit memory errors. In the absence of that capability, how would you know if you have a problem with your RAM?
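
On Linux, one place those detected errors show up is the kernel's EDAC sysfs counters. A minimal sketch, assuming an EDAC driver is loaded for your memory controller and exposes `ce_count`/`ue_count` under `/sys/devices/system/edac/mc/`; the function name `read_edac_counts` is made up for illustration:

```python
from pathlib import Path


def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from Linux EDAC sysfs.

    Returns an empty dict when the directory is missing, e.g. no EDAC
    driver is loaded or the system is not Linux.
    """
    counts = {}
    root = Path(edac_root)
    if not root.is_dir():
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers whose counters cannot be read
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found; ECC reporting may not be active.")
    for name, (ce, ue) in counts.items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

An empty result is only a hint that ECC reporting is inactive, not proof; the board's firmware settings and documentation are still the thing to check.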

> What have your experiences been with running your workstations with ECC ram?

For AMD desktop processors, there is a little more effort involved in building an ECC-capable system, because ECC is a feature provided by the motherboard maker, rather than a feature guaranteed to be present as part of AMD's platform. As such, you need to verify that your motherboard actually supports it.

Also, ECC memory largely lacks memory auto-overclocking features like XMP, EXPO, or whatever AMD called it in the past. As such, the memory will default to conservative bandwidth and latency settings (rather than the faster settings typically offered on enthusiast RAM), and it will be on you to overclock it if you wish to do so.

(On the positive side, ECC memory also largely lacks the "heat spreaders" and other plastic/RGB bits found on enthusiast memory, which means it's more likely to fit properly underneath a tower-style CPU air cooler.)

Lastly, unbuffered ECC can be somewhat difficult to find at typical consumer electronics retailers, and you'll likely have better luck getting it from vendors that specialize in server memory.


👤 stevefan1999
I don't. In fact, we shouldn't need to. ECC RAM is meant for services that require extremely high RAS (Reliability, Availability and Serviceability). Any normal DDR4 RAM out there already has good enough RAS for a relatively long time.

As an anecdote, my home "NAS" had, as of a few weeks ago, run non-stop for half a year on only 16GB of cheap ADATA DDR4 2666MHz RAM, and I didn't see a single RAS event in dmesg. It could be survivorship bias though.
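
For anyone who wants to repeat that dmesg check, here is a rough sketch of a filter for RAS-looking lines. The pattern list is an assumption based on common kernel message prefixes (`EDAC`, `[Hardware Error]`, `mce:`), not an exhaustive or official list, and `find_ras_events` is a hypothetical helper name:

```python
import re
import sys

# Assumed markers of RAS-related kernel messages; adjust for your platform.
RAS_PATTERN = re.compile(
    r"EDAC|\[Hardware Error\]|Machine check|\bmce:", re.IGNORECASE
)


def find_ras_events(log_lines):
    """Return the log lines that look like RAS / machine-check reports."""
    return [line.rstrip("\n") for line in log_lines if RAS_PATTERN.search(line)]


if __name__ == "__main__":
    # Example: dmesg | python3 ras_scan.py   (the script name is hypothetical)
    for event in find_ras_events(sys.stdin):
        print(event)
```

Driver load messages that mention EDAC will match too, so hits need a manual look, and an empty result only means nothing was reported.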

---

On a typical home/personal workstation we normally run bursty workloads (i.e. the machine is used from 9 to 6, with heavy usage for 2-3 hours of that). It is also okay to shrug off a bit of bitrot, or even tolerate a system crash if the butterfly effect hits.

So here's another reason I don't use ECC RAM on workstations: remember that ECC RAM is notoriously slow at memory initialization and memory training.

Not sure if it's an AMD thing, but my friend's EPYC server with 256GB of DDR4 ECC RAM took half an hour to boot.

You are using a "work station", so getting the work done as quickly as possible is more important than having an overly safe and reliable system. Speed and work throughput are of utmost importance, so I'd say ECC is okay to sacrifice.


👤 jodoherty
I don’t think it’s worth paying a premium as long as you regularly checksum important data and look for changes over time, keep redundant backups, and regularly check for integrity failures. You should see the inconsistencies as an early warning sign. Just don’t set up your workflows so it’s too late by the time you see them.
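
A minimal sketch of that kind of check, assuming you keep a manifest of SHA-256 hashes per file and compare it against a fresh scan later. The function names and the "hash changed but mtime did not" heuristic are illustrative choices, not a standard tool:

```python
import hashlib
import os


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its hash and mtime."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            manifest[rel] = {
                "sha256": sha256_file(path),
                "mtime_ns": os.stat(path).st_mtime_ns,
            }
    return manifest


def suspicious_changes(old, new):
    """Files whose content hash changed although the mtime stayed the same."""
    return sorted(
        rel
        for rel, entry in new.items()
        if rel in old
        and entry["sha256"] != old[rel]["sha256"]
        and entry["mtime_ns"] == old[rel]["mtime_ns"]
    )
```

Store the manifest (for example as JSON) next to your backups, rebuild it on a schedule, and treat anything returned by `suspicious_changes` as the early warning sign.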

That said, I’ve seen a not insignificant number of computers in the wild that couldn’t calculate valid sha256 checksums when utilizing vector optimized implementations. Who knows how bad other hardware issues could be. You just wouldn’t know. I would pay a little extra for ECC memory given the choice.

In fact, I have two Dell Precision T7810 workstations at home with 144 GB of ECC memory and dual Xeons totaling 36 cores on each machine as my two primary personal computers.


👤 chunk_waffle
All of the machines in my home lab do but sadly none of my workstations do (ThinkCentre and a ThinkPad.) I wish they did.

Some higher-end Dell Precisions and, at least at one time, Lenovo ThinkPads had models with ECC RAM, and if they still do, that's what I'm getting next time I need a new machine.


👤 dusted
It used to be the case, when RAM was less reliable, that all workstation RAM was ECC. It fell out of favor due to cost and performance priorities.

However, as both software complexity and RAM availability increase, the likelihood of memory corruption increases as well, even with memory reliability being quite high.

I don't run ECC on my gaming computer. (I also run its hard drives in RAID0; I care about performance, capacity and cost-effectiveness.)

I do run ECC on my file storage (zfs server) and on my workstation. It might not be strictly needed on my workstation, but I just don't want to think about it, and it's just another parameter towards more reliability.


👤 GianFabien
I have only used ECC memory with IBM Power and HP PA-RISC & Itanium systems in cluster configurations. Typically ECC errors only started appearing when one or more fans stopped working. Generally, other system and application errors lead to fail-overs being triggered.

For personal use, I can't justify the increased mobo + RAM costs for a minimal improvement in reliability. RAID for magnetic drives and UPS are better investments for increased reliability.


👤 abrookewood
It all depends on what each machine is doing and whether you care to pay the premium (and it is a premium unfortunately).

On my personal machine I like to live dangerously - it is running RAID0 and has non-ECC RAM - because everything is reproducible and I'm happy to rebuild it periodically.

On my home NAS, I do the complete opposite (running RAID1, ZFS and have ECC RAM), because it is the primary data store for things I care about.


👤 eternityforest
I wish everything had ECC, but sadly it does not. I do, however, have code on Pi-based servers that allocates a buffer of random size (a few MB) every once in a while, fills it with a random value, and checks it a bit later, to catch the worst always-present errors. I've never seen it catch anything though, and would be very surprised if it did.
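
A rough sketch of what such a canary check could look like. This is not the actual code described above; the sizes, hold time, and function names are assumptions for illustration:

```python
import random
import time


def allocate_canary(min_bytes=1 << 20, max_bytes=8 << 20):
    """Allocate a buffer of random size filled with one random byte value."""
    size = random.randint(min_bytes, max_bytes)
    value = random.randrange(256)
    return bytearray([value]) * size, value


def count_corrupted(buf, value):
    """Count bytes that no longer hold the value they were filled with."""
    return len(buf) - buf.count(value)


if __name__ == "__main__":
    buf, value = allocate_canary()
    time.sleep(60)  # hold the buffer for a while before checking it
    bad = count_corrupted(buf, value)
    print(f"{len(buf)} bytes checked, {bad} corrupted")
```

A check like this only covers the few MB it happens to hold, so it can catch grossly bad RAM but says little about rare bit flips elsewhere.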

👤 mobilio
I'm not sure that you can upgrade to ECC: https://community.amd.com/t5/processors/problems-with-my-ryz...

ECC is usually reserved for high-end CPUs like Threadripper/EPYC.


👤 psychphysic
No, the performance per penny penalty is not worth it in the slightest for a workstation.

👤 physhster
I did when I had a cheese grater Mac Pro because Xeon, otherwise no. Unless I'm mistaken, if your memory controller doesn't support error correction, it is pointless to buy ECC memory.

👤 nix23
Yes, for 12 years now. Use ECC whenever you can.

>What have your experiences been with running your workstations with ECC ram?

Correctable bit-flips.


👤 classichasclass
Yes and supported (dual-8 POWER9, Raptor Talos II). Very stable. I'd consider it worth it.