I have a problem where random nodes in a large cluster perform far out of spec. What's the correct way to find them? There's a huge diversity of workloads, and the boxes are large, so I was considering something really trivial: run a task that eats up some consistent percentage of CPU (calculating Fibonacci up to some low-ish number, for instance) and graph the machines that land some n standard deviations away from normal.
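To make that concrete, here's roughly the kind of thing I had in mind (a minimal sketch; the Fibonacci depth, run count, and 3-SD threshold are all arbitrary placeholders, not values I've validated):

```python
import statistics
import time

def fib(n):
    # Deliberately naive recursive Fibonacci: consistent, CPU-bound work.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def benchmark(runs=5, n=28):
    # Time several runs and keep the minimum to reduce scheduling noise.
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fib(n)
        samples.append(time.perf_counter() - start)
    return min(samples)

def find_outliers(timings, threshold=3.0):
    # timings: {hostname: seconds}, collected fleet-wide. Flag hosts more
    # than `threshold` standard deviations slower than the fleet mean.
    mean = statistics.mean(timings.values())
    sd = statistics.stdev(timings.values())
    return {host: (t - mean) / sd
            for host, t in timings.items()
            if (t - mean) / sd > threshold}

if __name__ == "__main__":
    # Each node would run this and report its timing to a central collector.
    print(f"fib benchmark: {benchmark():.4f}s")
```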
Is there a clean, elegant way to solve this problem, or is an ad-hoc benchmark like the above as good as it gets?