This is already starting to happen. I have a Coral Edge TPU connected to a Raspberry Pi. Yes, it is limited, but considering the whole getup costs less than $200 and does stuff I couldn't possibly do 6 months ago, it is pretty amazing.
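For context, the kind of thing this setup does is on-device image classification with a quantized TFLite model compiled for the Edge TPU, via Google's pycoral library. A minimal sketch (the model, label, and image filenames are placeholders):

```python
from pycoral.utils.edgetpu import make_interpreter
from pycoral.utils.dataset import read_label_file
from pycoral.adapters import common, classify
from PIL import Image

# Load a model compiled for the Edge TPU (placeholder filenames).
interpreter = make_interpreter('mobilenet_v2_edgetpu.tflite')
interpreter.allocate_tensors()
labels = read_label_file('imagenet_labels.txt')

# Resize the input image to the model's expected dimensions and run inference.
image = Image.open('photo.jpg').resize(common.input_size(interpreter), Image.LANCZOS)
common.set_input(interpreter, image)
interpreter.invoke()

# Print the top 3 predicted classes with their scores.
for c in classify.get_classes(interpreter, top_k=3):
    print(labels.get(c.id, c.id), c.score)
```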
In the future, at least this aspect will be more commoditized. Edge devices will feel pretty much like they do today, but without needing a connection to cloud metal to do similar work. But we'll still have to verify the model's output all the time, because so-called hallucinations are not going away unless we program the models to self-check and verify their own output.
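By self-checking I mean something like a second verification pass before an answer is returned. A toy sketch of the idea, assuming a hypothetical local model client with a `generate()` method (the prompt wording and retry loop are illustrative, not a real API):

```python
def ask_with_verification(llm, question, max_retries=3):
    """Query a model, then ask it to judge its own answer before returning it."""
    for _ in range(max_retries):
        answer = llm.generate(question)
        # Second pass: the model critiques its own output.
        verdict = llm.generate(
            f"Question: {question}\nAnswer: {answer}\n"
            "Does this answer contain unsupported claims? Reply YES or NO."
        )
        if verdict.strip().upper().startswith("NO"):
            return answer
    return None  # could not produce a self-verified answer
```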
What will keep this at a slow, steady pace, though, is that the cloud vendors really want everyone using their services so they can make money. So there will be some really souped-up capabilities that you'll still want or need to pay for through them.
But edge TPUs will be the norm very soon.
I think terms will emerge around querying LLMs, similar to how "Googling" became a term. People's behaviors will change and they'll query models more regularly for more use cases. AI-generated contributions will become more culturally acceptable.
Also way more red tape.