HACKER Q&A
📣 c0brac0bra

400% increase in GCP node preemption rate?


We use preemptible compute nodes in our beta GKE clusters to save on costs. Historically this has worked fine, since our beta environment doesn't need to be incredibly stable.

However, starting on the 16th we began seeing a 4x increase in the number of daily node preemption events. Preemptible nodes are being rebooted/recreated up to 10 times a day, sometimes several times within a 10-30 minute span.

I know there are no guarantees on preemptible nodes (hence the savings), but I'm curious whether something like a Black Friday resource allocation crunch could be causing this.

For now we've upgraded to non-preemptible nodes so we can get work done. It does mean a 15% per-tenant cost increase, though.


  👤 deadstyle Accepted Answer ✓
This is a concerning question. Not sure what would cause a spike like this.
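
One way to pin the spike down before guessing at causes might be to tally preemption events per day and compare them against the days before the 16th. The sketch below assumes you have already exported the event timestamps as ISO 8601 strings (for example from `gcloud compute operations list --filter="operationType=compute.instances.preempted"`); the function names `daily_counts` and `spike_ratio` are made up for illustration and are not part of any GCP tooling.

```python
from collections import Counter
from datetime import date, datetime


def daily_counts(timestamps):
    """Count events per calendar day from ISO 8601 timestamp strings.

    Assumption: each string is in a form datetime.fromisoformat accepts,
    e.g. "2023-11-16T02:14:07+00:00". The day is taken in the timestamp's
    own offset, with no timezone conversion.
    """
    return Counter(datetime.fromisoformat(ts).date() for ts in timestamps)


def spike_ratio(counts, baseline_days, spike_days):
    """Mean daily events over spike_days divided by the mean over baseline_days."""
    baseline = sum(counts[d] for d in baseline_days) / len(baseline_days)
    spike = sum(counts[d] for d in spike_days) / len(spike_days)
    if baseline == 0:
        raise ValueError("no baseline events to compare against")
    return spike / baseline


if __name__ == "__main__":
    # Hypothetical sample data: 2 events/day before the 16th, 8 on the 16th.
    sample = (
        ["2023-11-14T01:00:00+00:00", "2023-11-14T13:00:00+00:00"]
        + ["2023-11-15T02:00:00+00:00", "2023-11-15T14:00:00+00:00"]
        + [f"2023-11-16T{hour:02d}:00:00+00:00" for hour in range(8)]
    )
    counts = daily_counts(sample)
    ratio = spike_ratio(
        counts,
        baseline_days=[date(2023, 11, 14), date(2023, 11, 15)],
        spike_days=[date(2023, 11, 16)],
    )
    print(f"spike is {ratio:.1f}x the baseline")
```

Grouping the same counts by zone or machine type would show whether the increase is spread evenly or concentrated in one place, which is probably the first thing worth knowing.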