How does an LLM perform addition?
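It doesn't. It assigns a probability to each possible next token, loosely in a Bayesian sense, and the highest-probability one is chosen (or one is sampled from that distribution). This is why it's not great at it: it never runs an addition routine, it only picks from a discrete set of output tokens.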
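To make the "pick the most probable option" idea concrete, here is a minimal Python sketch. The candidate tokens and their scores are made up for illustration and do not come from any real model; `softmax` and `greedy_pick` are hypothetical helper names. It only shows the selection step, not how a model arrives at the scores.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - m) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def greedy_pick(logits):
    """Return the token with the highest probability."""
    probs = softmax(logits)
    return max(probs, key=probs.get)

# Hypothetical scores for the token that follows "17 + 25 = ".
# These numbers are invented for the example.
toy_logits = {"42": 3.1, "41": 2.4, "43": 1.9, "32": 0.5}

probs = softmax(toy_logits)
answer = greedy_pick(toy_logits)

for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{tok}: {p:.3f}")
print("chosen:", answer)
```

In this toy case the right answer wins, but the near misses still carry probability mass, and nothing in the selection step checks the arithmetic.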