Code Llama 70B on a dedicated server
I want to rent a server with 128 GB of RAM for my web projects, but primarily for running CodeLlama 70B models. Is this possible without a GPU, i.e. with no video memory at all?
Yes, take a look at Ollama. You'll be able to run CodeLlama 70B at Q6_K with plenty of memory to spare. It won't be very fast though, since all inference runs on the CPU from system RAM.
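To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes about 70 billion parameters and about 6.56 bits per weight for Q6_K (the figure usually quoted for llama.cpp's k-quants), which works out to roughly 57 GB for the weights alone. The 50 GB/s memory bandwidth is purely a placeholder, so substitute the real figure for the server you rent. The speed estimate is only an upper bound: it assumes every weight is read from RAM once per generated token and ignores the KV cache, prompt processing, and other overhead.

```python
# Rough sizing for a quantized model running CPU-only from system RAM.
# All constants below are assumptions for illustration, not measurements.

def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

def tokens_per_second_upper_bound(model_gb: float, bandwidth_gb_s: float) -> float:
    """Bandwidth-bound ceiling: each token requires reading all weights once."""
    return bandwidth_gb_s / model_gb

N_PARAMS = 70e9            # assumed parameter count for CodeLlama 70B
Q6_K_BITS = 6.5625         # assumed average bits per weight for Q6_K
RAM_GB = 128               # server RAM from the question
BANDWIDTH_GB_S = 50.0      # hypothetical memory bandwidth; check your hardware

size_gb = model_size_gb(N_PARAMS, Q6_K_BITS)
headroom_gb = RAM_GB - size_gb
max_tps = tokens_per_second_upper_bound(size_gb, BANDWIDTH_GB_S)

print(f"weights: {size_gb:.1f} GB, headroom: {headroom_gb:.1f} GB")
print(f"upper bound: {max_tps:.2f} tokens/s at {BANDWIDTH_GB_S:.0f} GB/s")
```

With these assumptions the ceiling is under one token per second, so the memory bandwidth of the server matters much more than how much RAM is left over.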