HACKER Q&A
📣 valty

Why are query, key, and value in LLMs so hard to explain?


I have watched so many YouTube videos on this and no one seems to be able to explain it properly.

The explanations also differ dramatically from one another.

I feel like it's another infamously difficult-to-explain topic, like "monads".

I am desperately waiting for a 3Blue1Brown video on transformers to hopefully resolve this ambiguity.

I am looking for a visual intuition: something that answers the common questions and ambiguities that arise, and explains the history and why we do things this way.

The best approach I have found so far is Serrano.Academy: https://www.youtube.com/watch?v=UPtG_38Oq8o&pp=ygUUdHJhbnNmb3JtZXIgbmV0d29ya3M%3D. They visualize things in 2 dimensions with examples and show the linear transformations.

Karpathy had a unique way of conceptualizing it as a directed graph with a "communication phase", which confused me further.

For such a historic topic, I think we need a better explanation!


  👤 neximo64 Accepted Answer ✓
Try HeduAI on YouTube.