Will the era of using these to generate code end? Is the assumption that the inference problem will be solved?
> Will we shift towards slower/worse LLMs running locally?
Bet on faster/better LLMs running locally and invest/brace yourself for recession accordingly.
They are surely testing LLMs and checking the performance of local models. Once those reach good enough performance and quality for some tasks, they will announce Apple AI or some variation of the name.
All of this is speculation, but I think it is obviously the right way.
But that would lead to another price competition.
So: when the money runs out and the bubble pops, we'll still get cheap existing models; what we lose is the race for new models.
We'd probably even keep free models: I forget where I saw it, but back in the early days someone noticed that models were so cheap that you could generate a decent-sized blog post about any topic for about the same as the expected revenue from putting a few adverts on it and having it viewed *exactly once*.
That said, when (or if) these businesses stop chasing new models, it can make sense to burn the weights of the best one at that date into a fixed circuit (analog, given how well these models work with only a few bits of precision), making it more efficient. Not my field, so I'm not sure exactly how much more efficient analog can be; one or two orders of magnitude from what I've heard, but don't hold me to that.