If you're only interested in inference, not training, it's not really worth investing in cards; use the online inference tools. And for training, even a pair of 4090s won't perform well without a good CPU and plenty of RAM to keep the cards fed as much as possible.
For example, Llama comes in sizes that still need 32GB of VRAM even after quantization (compression):
https://old.reddit.com/r/LocalLLaMA/comments/1806ksz/informa...
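As a rough back-of-envelope (a sketch with assumed numbers, not exact figures for any particular model or quantization format): the weights alone take roughly parameter count × bits per weight / 8 bytes, plus some overhead for the KV cache and activations.

    # Rough VRAM estimate. The ~20% overhead for KV cache/activations is an
    # illustrative assumption; real usage varies with context length and format.
    def estimate_vram_gb(params_billion: float, bits_per_weight: int) -> float:
        weight_bytes = params_billion * 1e9 * bits_per_weight / 8
        return weight_bytes * 1.2 / 1e9  # add ~20% overhead, convert to GB

    print(estimate_vram_gb(70, 4))  # ~42 GB: a 4-bit 70B model won't fit on one 24GB card
    print(estimate_vram_gb(13, 4))  # ~8 GB: a 4-bit 13B model fits comfortably
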
There are smaller versions too, though, if you're VRAM-constrained.