HACKER Q&A
📣 Mockapapella

Has anybody explored diffusion models as a basis for LLMs?


If you think of the characters as pixels, then you should be able to apply a similar process, right?


  👤 gwern Accepted Answer ✓
It is, as always, not quite that easy: what is the equivalent of smooth, continuous Gaussian noise for an ASCII character? What does a letter like 'z' jitter to or from?

But here is a bibliography of some relevant papers on diffusion models for discrete data which you might find useful: https://gwern.net/doc/ai/nn/diffusion/discrete/index
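
To make the "what does 'z' jitter to?" problem concrete, here is a minimal sketch of one simple stand-in for Gaussian noise on discrete data: instead of nudging a value slightly, each character is swapped for a uniformly random one with some probability. The alphabet, the `corrupt` function, and the noise levels are all illustrative assumptions, not taken from any particular paper in the bibliography, and this shows only the forward (noising) half of a diffusion process.

```python
import random
import string

# Assumed toy alphabet: lowercase letters plus space.
ALPHABET = string.ascii_lowercase + " "

def corrupt(text, noise_level, rng):
    """Replace each character with a uniformly random one with probability noise_level.

    Unlike Gaussian noise on a pixel, there is no "small" step here:
    a character is either kept as-is or resampled from the whole alphabet.
    """
    out = []
    for ch in text:
        if rng.random() < noise_level:
            out.append(rng.choice(ALPHABET))  # jump to any symbol, possibly the same one
        else:
            out.append(ch)
    return "".join(out)

if __name__ == "__main__":
    rng = random.Random(0)
    text = "the quick brown fox"
    # Higher noise levels destroy more of the original string.
    for level in (0.0, 0.25, 0.5, 1.0):
        print(level, repr(corrupt(text, level, rng)))
```

At noise level 0.0 the text comes back unchanged, and at 1.0 every position is resampled, so the output carries no information about the input. The in-between levels are where the analogy with pixels strains: a pixel at 0.25 noise is still close to its original value, while a character is either intact or replaced outright.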