At the same time, judging by open-source advances (e.g. Qwen 3.6 27B), hosting a smart enough local LLM on 16GB VRAM (or equivalent) is increasingly becoming a reality. Lastly, I see most coding as being of intermediate difficulty, not beyond.
Seems to me it's only a matter of time before people shift to free Claude Code-type experiences powered by local LLMs.
What do you think?
That being said, I think an unpredictable variable here is how the companies building frontier models respond to what should be a noticeable inflection point: consumers turning towards locally hosted open-weight models.
There is also a significant amount of compute being built out as we speak that should, in theory, reduce costs for providers of frontier models, but that's a whole other can of worms.
Despite all of the very impressive open-weight models available to us today, Anthropic and OpenAI remain steps ahead of the competition. Many of the biggest and brightest minds in AI are working at frontier labs. It's not hard to foresee these labs maintaining their edge, given the amount of expertise and brainpower they've assembled.
Assuming frontier models continue to maintain their edge, even if only on a subset of tasks (e.g. reasoning, judgment, planning), I see a convergence towards a hybrid workflow where frontier and local models are each used for specific tasks. For example: Claude for reasoning, planning, and judgment, with intelligent routing to cheap/free models tuned for everything else.
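To make the routing idea concrete, here's a rough sketch of what the dispatch step might look like. Everything in it is an assumption for illustration: the task labels, the `Route` type, and the placeholder model names are made up, and a real router would probably classify requests with something smarter than a lookup.

```python
from dataclasses import dataclass

# Hypothetical task labels: the kinds of work assumed to still need a frontier model.
FRONTIER_TASKS = {"reasoning", "planning", "judgment"}


@dataclass(frozen=True)
class Route:
    backend: str  # "frontier" or "local"
    model: str    # placeholder label, not a real model identifier


def route(task_kind: str) -> Route:
    """Pick a backend for a task based on its (already known) kind."""
    kind = task_kind.strip().lower()
    if kind in FRONTIER_TASKS:
        return Route(backend="frontier", model="frontier-model")
    # Anything else goes to the cheap/free local model.
    return Route(backend="local", model="local-model")


if __name__ == "__main__":
    for kind in ("planning", "autocomplete", "refactor"):
        print(kind, "->", route(kind).backend)
```

The hard part in practice is deciding `task_kind` in the first place; this only shows the dispatch once you have it.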
It went:
AI: "I see you are building a Django project. How can I help?"
Me: "When I click on the Reload button, it does not set the reload option correctly. Fix this"
<10 minutes>
AI: "I see you are building a Django project. How can I help?"
Needs more tweaking of the context window, I think.
Seriously, I agree that this is the future, when OpenAI et al. have gone bust.