How well do LLMs scale? Can we expect one LLM instance per person in the future?
👤 brucethemoose2 Accepted Answer ✓
Yeah. LLaMA-7B can already run on a phone, and mobile RAM and AI compute are scaling pretty well.
I dunno if it will be "common," as most companies will keep their models and tuning in the cloud. There needs to be a concerted community effort to run stuff locally.
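To see why a 7B model is already phone-sized, here's a back-of-the-envelope memory estimate, assuming 4-bit weight quantization (a common setup for local inference; the function and numbers are illustrative, not from the original post):

```python
def model_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB, ignoring KV cache and runtime overhead."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

print(model_memory_gb(7, 16))  # fp16: 14.0 GB -- too big for most phones
print(model_memory_gb(7, 4))   # 4-bit: 3.5 GB -- fits in 8 GB of phone RAM
```

The gap between those two numbers is why quantization, not just hardware scaling, is what makes on-device 7B models practical today.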