Has anybody explored diffusion models as a basis for LLMs?
If you think of the characters as pixels, then you should be able to apply a similar process, right?
It is, as always, not quite that easy, because what is the discrete equivalent of smooth continuous Gaussian noise for an ASCII character? What does a letter like 'z' jitter to/from?
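One common workaround in the discrete-diffusion literature is to replace Gaussian perturbation with a categorical corruption process: at each forward step, every token is independently resampled from a uniform (or otherwise fixed) distribution over the alphabet with some probability. Here is a toy sketch of that forward process (my own illustration of the general idea, not the method of any particular paper; `ALPHABET` and the schedule are arbitrary choices):

```python
import random

# Restricted alphabet purely for illustration.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def corrupt(text: str, beta: float) -> str:
    """Forward noising step: with probability beta, replace each
    character by one drawn uniformly from ALPHABET; otherwise keep it.
    This is the discrete analogue of 'adding noise' to a pixel."""
    return "".join(
        random.choice(ALPHABET) if random.random() < beta else c
        for c in text
    )

random.seed(0)
x = "the quick brown fox"
# A crude corruption schedule: by beta=1.0 the string is pure noise,
# analogous to the final fully-Gaussian step of image diffusion.
for t, beta in enumerate([0.1, 0.3, 0.6, 1.0]):
    print(t, corrupt(x, beta))
```

The reverse model is then trained to undo these categorical corruptions step by step, which sidesteps the question of what 'z' jitters to: it doesn't jitter continuously at all, it flips.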
But here is a bibliography of some relevant papers on diffusion models for discrete data which you might find useful: https://gwern.net/doc/ai/nn/diffusion/discrete/index