Complex systems can arise from the composition of simple processes that are easy to explain or model. Sometimes the overall system exhibits mysterious emergent properties even when we can explain each component. Other times, through investigation, science, and engineering, we eventually manage to explain the entire system. It might lose a little of its magic or mystery as a result, but the system itself didn't change; our perspective and understanding did.
On that note, until we can fully explain some of the workings of our own minds, I'm reluctant to write off "just predicting the next token" as an unimportant process. It's a simple way to describe LLM inference, but the simplicity of the description doesn't diminish the importance of what's happening. It also doesn't account for the as-yet unexplained things that may be going on as part of training.
Terms are grouped according to their similarity, forming clustered islands of concepts and semantic relationships.
These clusters are then situated within a shared high-dimensional space, which is what we call an embedding space.
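A minimal sketch of the idea, using tiny hand-made vectors in place of learned embeddings (real models use thousands of dimensions produced by training, not three numbers picked by hand):

    import numpy as np

    # Toy, hand-made vectors standing in for learned embeddings.
    embeddings = {
        "apple":  np.array([0.9, 0.8, 0.1]),
        "banana": np.array([0.8, 0.9, 0.2]),
        "car":    np.array([0.1, 0.2, 0.9]),
        "truck":  np.array([0.2, 0.1, 0.8]),
    }

    def cosine(a, b):
        """Cosine similarity: near 1.0 means same direction, near 0.0 means unrelated."""
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Pairwise similarities reveal the "islands": fruit terms sit close to
    # each other, vehicle terms sit close to each other, and cross-cluster
    # similarity is low.
    terms = list(embeddings)
    for i, t1 in enumerate(terms):
        for t2 in terms[i + 1:]:
            print(f"{t1:>6} ~ {t2:<6} {cosine(embeddings[t1], embeddings[t2]):.2f}")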
To put the LLM's power to work, we play a statistical game of hopscotch, hopping swiftly from island to island, grouping to grouping, one token at a time.
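And a sketch of the hopscotch itself, with an invented four-word vocabulary and made-up logits standing in for a real model's output; each sampled token is one hop, after which the context grows and the process repeats:

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical tiny vocabulary and made-up logits for the next token
    # after "The capital of France is". A real LLM scores tens of
    # thousands of tokens; these numbers are invented for illustration.
    vocab  = ["Paris", "London", "nice", "the"]
    logits = np.array([6.0, 2.0, 1.5, 0.5])

    # Softmax turns the logits into a probability distribution.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # One "hop": sample the next token from the distribution, append it
    # to the context, then repeat with the model's fresh logits.
    next_token = rng.choice(vocab, p=probs)
    print(dict(zip(vocab, probs.round(3))), "->", next_token)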
[0] https://satisologie.substack.com/p/ways-to-understand-the-op...