- The output can be much better with richer input (multiple text blocks, images, video, audio, and so on).
- The collective training dataset is much better when video, audio, still images, and text are labeled and associated with one another.
- I suspect previous models can be used to augment human datasets without too much risk of "AI inbreeding." For example, imagine asking an LLM to reword a text page, or an image model to generate a variant of an image (a rough sketch follows below). Now picture this in a multimedia dataset, with (for instance) a model labeling video more accurately, or producing image variations with associated audio/text as inputs.
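
To make the augmentation idea concrete, here is a minimal sketch, assuming the Hugging Face `transformers` library with `google/flan-t5-small` as a stand-in paraphraser; the model choice and prompt are illustrative assumptions, not a recommendation:

```python
# Minimal sketch: use an existing model to augment a (text) dataset
# with paraphrased variants while keeping the human originals.
# Assumes the Hugging Face `transformers` library; the model name and
# prompt are stand-ins chosen for illustration only.
from transformers import pipeline

# Small instruction-tuned model used here purely as a stand-in paraphraser.
paraphraser = pipeline("text2text-generation", model="google/flan-t5-small")

def augment_captions(captions, variants_per_caption=2):
    """Return the original captions plus model-generated rewordings.

    Keeping the originals alongside the variants (and capping how many
    variants each original spawns) is one simple guard against the
    "AI inbreeding" problem: synthetic text never replaces the
    human-written source, it only supplements it.
    """
    augmented = []
    for caption in captions:
        augmented.append({"text": caption, "source": "human"})
        for _ in range(variants_per_caption):
            out = paraphraser(
                f"Rewrite this sentence in different words: {caption}",
                max_new_tokens=60,
                do_sample=True,
            )
            augmented.append({"text": out[0]["generated_text"], "source": "model"})
    return augmented

if __name__ == "__main__":
    captions = ["A dog catches a frisbee on the beach at sunset."]
    for row in augment_captions(captions):
        print(row["source"], "->", row["text"])
    # The same loop generalizes to other modalities: swap the paraphraser
    # for an image-variation or captioning model and attach the output to
    # the same record so text/image/audio stay associated.
```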
There is also the very slim possibility of emergence, that is, the idea that combining many simple, random systems can produce an intelligent one. Given how fast computer systems have become, there might be something there, but that's more wishful thinking than anything else.