I'm also not exactly convinced when some CEO fires 100 employees in these market conditions and then talks about how AI is helping them. Sorry, but these aren't exactly companies that are moving the needle (mostly BPO and low-margin non-tech businesses, or companies that have been around a while and still don't seem to get anywhere). It isn't that sexy to say "we fired 100 people to save $5 million in opex"; I'm sure they'd rather raise more money by slapping "AI" onto their brand.
So what's next?
Are you working on something interesting that pushes the boundaries? Or what are you following that some of us don't know much about?
Then we had RNNs, which feed their output back into their input. This gave the networks some form of memory.
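To make that concrete, here's a minimal sketch (plain numpy, names and sizes all made up) of what "output fed back into the input" means: the hidden state from the previous step is mixed into the current step, and that state is the network's only memory.

    # Minimal vanilla-RNN step; everything here is illustrative, not from any library.
    import numpy as np

    def rnn_step(x_t, h_prev, W_x, W_h, b):
        # The new hidden state depends on the current input AND the previous state,
        # which is what gives the network its "memory".
        return np.tanh(x_t @ W_x + h_prev @ W_h + b)

    rng = np.random.default_rng(0)
    d_in, d_hidden, seq_len = 4, 8, 5
    W_x = rng.normal(size=(d_in, d_hidden)) * 0.1
    W_h = rng.normal(size=(d_hidden, d_hidden)) * 0.1
    b = np.zeros(d_hidden)

    h = np.zeros(d_hidden)                    # memory starts empty
    for x_t in rng.normal(size=(seq_len, d_in)):
        h = rnn_step(x_t, h, W_x, W_h, b)     # previous output fed back in at every step
    print(h.shape)                            # (8,)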
Then we had Transformers, which are basically parallel processors: i.e. generate three outputs in parallel (queries, keys, values) and multiply them together. This was basically just a better form of compression, applicable to everything.
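Roughly what that looks like, as a hedged numpy sketch (names and sizes invented): the three parallel outputs are the queries, keys and values, and "multiplying them together" is the softmax-weighted matrix product. The contrast with the RNN above is that nothing is fed back sequentially; all tokens are processed at once.

    # Minimal single-head self-attention; purely illustrative.
    import numpy as np

    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    def self_attention(x, W_q, W_k, W_v):
        q, k, v = x @ W_q, x @ W_k, x @ W_v        # the three outputs, computed in parallel
        scores = q @ k.T / np.sqrt(k.shape[-1])    # every token attends to every token at once
        return softmax(scores) @ v                 # weighted mix of the values

    rng = np.random.default_rng(0)
    n_tokens, d = 6, 16
    x = rng.normal(size=(n_tokens, d))
    W_q, W_k, W_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
    print(self_attention(x, W_q, W_k, W_v).shape)  # (6, 16)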
The general trend here is that someone discovers some architecture that works out nicely and then everyone builds something around it. That's probably how the future will go as well. Google has some neat things with automated robotics, and OpenAI has their A* stuff that's supposed to be "accurate" instead of probabilistic.
Then there is the hardware piece, which I know much less about, but I'm hoping companies like Tinycorp or Tenstorrent give us a way to reliably run something like a full-parameter GPT-3 model at home.
Also, preferably something that wouldn't trigger a third contest over who can waste the most energy (after crypto's proof-of-work insanity, and then trillion-parameter models that needed a datacenter full of H100 GPUs just for training). The climate-change trajectory is already becoming more and more dire, and we all really ought to reduce energy usage as much as possible.
I am not sure I get exactly what you are pointing to, but the concise, logical answers are:
-- the lucidity after the fever, and
-- striving to implement what was missed. Plus,
-- understanding why what somehow works does work, and building knowledge on that.
For the first: LLMs will be placed in dangerous places as flexible but improper frameworks, a practice that will be deprecated by the community; in parallel, they will be implemented in the places where they actually fit (we discussed a few possibilities yesterday in these pages, for example).
For the second: the implementation of artificial morons (a handful of months ago) makes the "real thing" - something that reasons, that thinks organically, that criticizes its own content, that is reliable - more sorely missed, so more investment will go towards that progress.
For the third: research will continue exploring why LLMs can produce surprising results (that some have linked to "scale"); knowledge built in this effort will eventually lead to progress.
Meanwhile, real needs will be tackled by businesses.
(If you can clarify what I missed in the question, please do.)
- non-generative hierarchical architectures (see LeCun's JEPA)
- mixing deep learning with symbolic methods (usually search over token or latent space; it can also be a coupling with symbolic engines, and coupling LLMs with Wolfram was a very early example)
- the ability to perform unbounded computation per token inside a deep network, something like Universal Transformers (see the sketch after this list)
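For that last point, here's a rough sketch (plain numpy, all names and sizes invented) of the Universal-Transformer idea: one shared block is applied repeatedly, and each token accumulates a halting probability so it can stop after a variable number of steps instead of a fixed depth. This is a toy illustration of adaptive computation, not anyone's actual implementation.

    # Toy adaptive-depth forward pass with a single shared block.
    import numpy as np

    def shared_block(x, W):
        # Stand-in for a full transformer layer; the same weights are reused at every step.
        return np.tanh(x @ W)

    def halting_score(x, w_halt):
        # Per-token probability of halting at this step.
        return 1.0 / (1.0 + np.exp(-(x @ w_halt)))

    def adaptive_depth_forward(x, W, w_halt, threshold=0.99, max_steps=16):
        n_tokens, _ = x.shape
        cumulative = np.zeros(n_tokens)        # accumulated halting probability per token
        output = np.zeros_like(x)              # halting-weighted sum of intermediate states
        running = np.ones(n_tokens, dtype=bool)

        for _ in range(max_steps):
            x = shared_block(x, W)
            p = halting_score(x, w_halt)
            # Tokens about to cross the threshold only spend their remaining budget.
            p = np.where(cumulative + p > threshold, threshold - cumulative, p)
            output += (p * running)[:, None] * x
            cumulative += p * running
            running &= cumulative < threshold
            if not running.any():
                break                          # every token has decided to halt
        return output

    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8))                # 4 tokens, 8-dim states
    W = rng.normal(size=(8, 8)) * 0.1
    w_halt = rng.normal(size=8)
    print(adaptive_depth_forward(x, W, w_halt).shape)  # (4, 8)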
I am not entirely sure if this'll be the future, but having one model with relatively good performance over a huge range of data types feels pretty good, tbh.
A trust/truth apocalypse is the most concerning near-term risk.
Color me skeptical, but the probabilistic nature of LLMs is at their core and is a hard limit on how useful they can be in wide applications. Currently their input and influence are limited to bullshitting - useful for mass cheap propaganda, SEO spam, and writing soulless student essays or slightly wrong boilerplate code.
In an informational landscape that is the equivalent of an infinite garbage dump, purity becomes priceless — the new unobtanium.