HACKER Q&A
📣 ianbutler

Has the LLM/transformer architecture hit its limit?


Have we hit the limit for performance increases on the current architecture of LLMs?

I’ve heard some agreement among professionals that we have, and papers showing that Chain of Thought isn’t a silver bullet call into question how valuable models like o1 really are. That slightly tilts my thinking as well.

What seems to be the consensus here?


  👤 razodactyl Accepted Answer ✓
I think there's still a lot of room, relatively speaking, to move around, but my current opinion is that hardware isn't where we'd ideally want it to be for next-level LLMs to be everywhere.

We've seen a trend of distilling models, at what seems to be the cost of the more nuanced ability to iterate toward correct results.

I'm convinced LLMs can go much further than what we've achieved so far, but I'd also welcome newer techniques that improve accuracy, efficiency, and adaptability.


👤 ianbutler
Maybe better expressed as: are we at the tail end of an optimization phase, where we’ll see long-tail improvements but nothing generational?

👤 meiraleal
Hopefully. Then we can start to develop solid software on top of it.

👤 cranberryturkey
I think o1 is worse than o4 for coding.