HACKER Q&A
📣 AETackaberry

How to Learn LLM Fundamentals


I'm a software engineer interested in learning LLM fundamentals. What course of action would you recommend? Are there any courses or curricula you would suggest?


👤 hislaziness Accepted Answer ✓
I really liked this series: https://karpathy.ai/zero-to-hero.html

👤 TroyZ
For a high-level overview of transformers/GPT, you can check out this article: https://medium.com/design-bootcamp/how-chatgpt-really-works-...

The article provides references to the original papers and other articles explaining the subject, which can be great sources to dive deeper.


👤 bfg_damien
Here are a few resources that might help get you started:

1. A thread explaining the inner workings of transformers: https://twitter.com/hippopedoid/status/1641432291149848576?s...

2. A paper by DeepMind that provides pseudocode for the important algorithms in Transformer models: https://arxiv.org/pdf/2207.09238.pdf

3. Another thread specifically on large language models: https://twitter.com/cwolferesearch/status/164044611134855577...

Once again, these are not courses per se, but they do provide intuitive explanations of how transformers work. There is also Karpathy's nanoGPT series of videos on YouTube. The first video is here: https://www.youtube.com/watch?v=kCc8FmEb1nY



👤 jytechdevops
What a coincidence: I ran across this GitHub repo for LLM fundamentals just today: https://gist.github.com/rain-1/eebd5e5eb2784feecf450324e3341...

Good luck on your journey.


👤 is_true
I think this blog post is a nice overview that will help you go deeper into the topics mentioned: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...

👤 seydor
There's this guy who has made in-depth videos about transformers: https://www.youtube.com/watch?v=O3xbVmpdJwU