What's the best personal ML setup these days?

Question

I'm looking to do some fine-tuning on SDXL, Llama, and Mixtral, as well as play with some different RAG ideas and setups. Should I get a 4090 or use a cloud host? If cloud, which one? Will my 4x 2080 Tis still be useful?

azeirah · Accepted Answer

Check out https://reddit.com/r/LocalLLama for better discussion on this topic.
General rules of thumb
* VRAM is king
* Newer GPUs are better than older GPUs
* Nvidia has better software support than AMD
* Cloud is probably cheaper than buying your own hardware. Only buy your own hardware if you want to tinker a LOT, or if you care about privacy or have other reasons to stay local.
* The best affordable hardware is Apple's M-series machines. Their CPU and GPU share one pool of unified memory, so system RAM is effectively VRAM, and memory bandwidth is really high.
4x 2080 Ti gives you 44GB of VRAM (4 x 11GB), which is really nice for this kind of purpose. Mixtral (quantised) should run very well on that, and SDXL should run without a problem too. In general I wouldn't recommend upgrading from that set-up unless you're running into issues with cooling or electricity costs.
The be-all and end-all bottleneck in AI right now is memory capacity and memory bandwidth; whether the GPU itself is a bit faster or slower matters much less. A single 4090 (24GB) is a clear downgrade in memory from your 4x 2080 Ti (44GB), for instance.
For Llama and other LLMs you can run basically any model on that up to 70B. 70B itself should fit quantised at around 4 bits (roughly 35GB of weights); at 6 bits the weights alone come to about 52GB, which is more than 44GB. I'm not super sure how far 44GB gets you in practice, but it should get you pretty far.
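To make the "what fits in 44GB" arithmetic concrete, here is a rough back-of-the-envelope sketch. It only counts the memory for the weights (parameters x bits per weight / 8) and then adds a flat overhead for the KV cache and activations. The 10% overhead figure and the function names are my own assumptions for illustration, not measured values; real usage depends on context length, runtime, and how the model is split across cards.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * bits / 8 bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, vram_gb: float,
         overhead_fraction: float = 0.10) -> bool:
    """Rough check: do the weights plus an ASSUMED flat overhead fit in VRAM?"""
    needed = weight_memory_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed <= vram_gb


VRAM_GB = 4 * 11  # 4x 2080 Ti at 11GB each

if __name__ == "__main__":
    for bits in (16, 8, 6, 4):
        size = weight_memory_gb(70, bits)
        verdict = "fits" if fits(70, bits, VRAM_GB) else "does not fit"
        print(f"70B at {bits}-bit: ~{size:.1f} GB of weights, {verdict} in {VRAM_GB} GB")
```

Treat the result as a first filter only: a model that "fits" by a narrow margin here may still run out of memory with a long context.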
I personally got a 7900 XTX because it sits in a nice sweet spot between affordable and high-end, and I don't have the money for a multi-GPU set-up. Second-hand 3090s are in a similar position.
These are the upgrade paths I'd consider in your position:
* Get at least 2 but preferably 3 3090s (48GB is not a big upgrade over 44GB; 72GB is).
* Get an M2 Ultra/Max with at least 64GB of memory. If you want to run the highest-end models (Goliath 120B, Falcon 180B, ...), get the 128GB variant.
* Play with cloud until you get an idea of where your needs lie
You can check out some of these cloud providers; they're all fine:
- runpod.io
- tensordock.com
- lambdalabs
- vast.ai
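
If you want to sanity-check the cloud-versus-buy rule of thumb for your own situation, a simple break-even estimate helps. The sketch below is only that: every number in the example call is a placeholder I made up, not a real price from any of the providers above, and the function name is hypothetical. Plug in the actual hourly rate, hardware price, power draw, and electricity tariff you see.

```python
def break_even_hours(hardware_cost: float, cloud_rate_per_hour: float,
                     power_kw: float, electricity_per_kwh: float) -> float:
    """Hours of GPU use after which buying becomes cheaper than renting.

    Ignores resale value, cooling, and your own time; all inputs are in
    the same currency.
    """
    local_rate_per_hour = power_kw * electricity_per_kwh  # running cost at home
    saving_per_hour = cloud_rate_per_hour - local_rate_per_hour
    if saving_per_hour <= 0:
        # Running locally costs at least as much per hour as renting,
        # so the purchase never pays for itself.
        return float("inf")
    return hardware_cost / saving_per_hour


if __name__ == "__main__":
    # PLACEHOLDER numbers for illustration only, not real quotes.
    hours = break_even_hours(
        hardware_cost=1600.0,       # hypothetical price of a GPU
        cloud_rate_per_hour=0.50,   # hypothetical rental rate
        power_kw=0.40,              # hypothetical draw under load
        electricity_per_kwh=0.25,   # hypothetical tariff
    )
    print(f"Break-even after roughly {hours:.0f} hours of use")
```

If your expected usage is well below the break-even figure, renting while you figure out your needs is the cheaper route.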