HACKER Q&A
📣 glimshe

Is there a great book for understanding LLMs?


I'm looking for a book that would help me understand ChatGPT/Bard etc. Amazon is flooded with low-quality texts and complex, theoretical works about LLMs. I'm an experienced software developer with a broad understanding of CS, so I'm looking for something that won't hold my hand too much but also won't overwhelm me with equations. Maybe something like SICP (a wonderful book that manages complexity while keeping depth), but for LLMs.

Any suggestions?


  👤 thebuilderjr Accepted Answer ✓
If you appreciate the clarity and depth of SICP and are looking to navigate the vast sea of books on LLMs, I'd suggest "Deep Learning" by Goodfellow, Bengio, and Courville. While it does get into the math, it's renowned for being approachable for those with a technical background. It lays a solid foundation in deep learning without treating you like a beginner and builds up through topics like neural networks, which are instrumental in understanding the underpinnings of LLMs.

For something more tailored to LLMs and less math-intensive, "Artificial Intelligence: A Guide for Thinking Humans" by Melanie Mitchell provides a solid overview of the field while touching on the societal implications and philosophy behind AI, which might offer a refreshing perspective next to the more technical aspects.

Lastly, if you want the latest on GPT-3, OpenAI has a comprehensive paper that you can dig into. It's not a book, but it's written by the creators and gives you an in-depth look at the model's architecture, capabilities, and limitations without the clutter of third-party interpretations or excessive simplification. You can find it on the arXiv repository titled "Language Models are Few-Shot Learners."
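
To get a feel for what the paper's title means before diving in: "few-shot" refers to specifying a task entirely inside the prompt with a handful of examples, with no fine-tuning or gradient updates. Here's a toy sketch (adapted from the paper's English-to-French translation example; this is just a string, not real API code):

    # Toy illustration of few-shot prompting: the task is demonstrated
    # inside the prompt itself, and the model continues the pattern.
    few_shot_prompt = (
        "Translate English to French.\n"
        "sea otter => loutre de mer\n"
        "peppermint => menthe poivrée\n"
        "cheese =>"
    )
    # The model is expected to continue the text with "fromage".
    print(few_shot_prompt)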

Pair these with the original papers and blog posts from OpenAI regarding ChatGPT and you'll have a well-rounded view that's both deep and accessible. Remember to occasionally check out new content on arXiv and follow relevant AI researchers on Twitter for the latest insights. Best of luck in your endeavor to master the knowledge of LLMs!


👤 sk11001
No, LLMs move fast and the people who write good technical books

1) are rare

2) take time to write them

I also don't know if there is much incentive to spend a lot of time on an elaborate piece of work given how likely it is to be obsolete within 1-2 years. There's certainly an incentive to look like you have written "the book" on LLMs, which is why the space is flooded with rushed, low-quality books.

The good thing is that you can learn everything you need to know without a book on the topic: papers, tutorials, videos, code repositories, etc.


👤 houseatrielah
Andrej Karpathy's video "Let's build GPT: from scratch, in code, spelled out," in which he builds a Generatively Pretrained Transformer (GPT) following the paper "Attention Is All You Need" and OpenAI's GPT-2/GPT-3: https://www.youtube.com/watch?v=kCc8FmEb1nY
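
If you want a taste of what that lecture builds before committing a couple of hours, here's a minimal sketch of a single causal self-attention head in PyTorch. This is my own illustration, not code from the video; the class name, dimensions, and example values are arbitrary:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CausalSelfAttentionHead(nn.Module):
        def __init__(self, embed_dim: int, head_dim: int, block_size: int):
            super().__init__()
            self.key = nn.Linear(embed_dim, head_dim, bias=False)
            self.query = nn.Linear(embed_dim, head_dim, bias=False)
            self.value = nn.Linear(embed_dim, head_dim, bias=False)
            # Lower-triangular mask: each position may only attend to itself
            # and earlier positions (the "causal" part of a GPT).
            self.register_buffer("tril", torch.tril(torch.ones(block_size, block_size)))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            B, T, C = x.shape                    # batch, sequence length, embedding dim
            k = self.key(x)                      # (B, T, head_dim)
            q = self.query(x)                    # (B, T, head_dim)
            v = self.value(x)                    # (B, T, head_dim)
            # Scaled dot-product attention scores.
            scores = q @ k.transpose(-2, -1) / (k.shape[-1] ** 0.5)   # (B, T, T)
            scores = scores.masked_fill(self.tril[:T, :T] == 0, float("-inf"))
            weights = F.softmax(scores, dim=-1)  # attention weights over past tokens
            return weights @ v                   # (B, T, head_dim)

    # Example: 2 sequences of 8 tokens, 32-dim embeddings, 16-dim head.
    head = CausalSelfAttentionHead(embed_dim=32, head_dim=16, block_size=8)
    out = head(torch.randn(2, 8, 32))
    print(out.shape)  # torch.Size([2, 8, 16])

The video stacks several of these heads, adds feed-forward layers and residual connections, and trains the result as a character-level language model.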

👤 MrCoffee7
Look at this book preview and see if it fits your needs: https://www.google.com/books/edition/Introduction_to_Large_L...

👤 max_
Read Stephen Wolfram's book on ChatGPT, "What Is ChatGPT Doing ... and Why Does It Work?"

👤 nrivoli
"How AI Works: From Sorcery to Science" by Ronald T. Kneusel

Might give you some answers


👤 behnamoh
meanwhile, books are getting outdated thanks to LLMs. the irony...