HACKER Q&A
📣 glimshe

Is there a great book for understanding LLMs?


I'm looking for a book that would help me understand ChatGPT/Bard etc. Amazon is flooded with low-quality texts and complex, theoretical works about LLMs. I'm an experienced software developer with a broad understanding of CS, so I'm looking for something that won't hold my hand too much but also won't overwhelm me with equations. Maybe something like SICP (a wonderful book that manages complexity while keeping depth), but for LLMs.

Any suggestions?


  👤 thebuilderjr Accepted Answer ✓
If you appreciate the clarity and depth of SICP and are looking to navigate the vast sea of books on LLMs, I'd suggest "Deep Learning" by Goodfellow, Bengio, and Courville. While it does get into the math, it's renowned for being approachable for those with a technical background. It lays a solid foundation in deep learning without treating you like a beginner and builds up through topics like neural networks, which are instrumental in understanding the underpinnings of LLMs.

For something more tailored to LLMs and less math-intensive, "Artificial Intelligence: A Guide for Thinking Humans" by Melanie Mitchell provides a solid overview of the field while touching on the societal implications and philosophy behind AI, which might offer a refreshing perspective next to the more technical aspects.

Lastly, if you want the latest on GPT-3, OpenAI has a comprehensive paper that you can dig into. It's not a book, but it's written by the creators and gives you an in-depth look at the model's architecture, capabilities, and limitations without the clutter of third-party interpretations or excessive simplification. You can find it on the arXiv repository titled "Language Models are Few-Shot Learners."
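
To get a feel for what the paper's title means before diving in: "few-shot" refers to specifying a task entirely inside the prompt with a handful of examples, with no fine-tuning or gradient updates. Here's a toy sketch (adapted from the paper's English-to-French translation example; this is just a string, not real API code):

    # Toy illustration of few-shot prompting: the task is demonstrated
    # inside the prompt itself, and the model continues the pattern.
    few_shot_prompt = (
        "Translate English to French.\n"
        "sea otter => loutre de mer\n"
        "peppermint => menthe poivrée\n"
        "cheese =>"
    )
    # The model is expected to continue the text with "fromage".
    print(few_shot_prompt)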

Pair these with the original papers and blog posts from OpenAI regarding ChatGPT and you'll have a well-rounded view that's both deep and accessible. Remember to occasionally check out new content on arXiv and follow relevant AI researchers on Twitter for the latest insights. Best of luck in your endeavor to master the knowledge of LLMs!


👤 sk11001
No, LLMs move fast and the people who write good technical books

1) are rare

2) take time to write them

I also don't know if there is much incentive to spend a lot of time on an elaborate piece of work given how likely it is to be obsolete within 1-2 years. There's certainly an incentive to look like you have written "the book" on LLMs, which is why the space is flooded with rushed, low-quality books.

The good thing is that you can learn everything you need to know without a book on the topic: papers, tutorials, videos, code repositories, etc.


👤 houseatrielah
Andrej Karpathy's video "Let's build GPT: from scratch, in code, spelled out," in which he builds a Generatively Pretrained Transformer (GPT) following the paper "Attention Is All You Need" and OpenAI's GPT-2/GPT-3: https://www.youtube.com/watch?v=kCc8FmEb1nY
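
If you want a taste of what that lecture builds before committing a couple of hours, here's a minimal sketch of a single causal self-attention head in PyTorch. This is my own illustration, not code from the video; the class name, dimensions, and example values are arbitrary:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CausalSelfAttentionHead(nn.Module):
        def __init__(self, embed_dim: int, head_dim: int, block_size: int):
            super().__init__()
            self.key = nn.Linear(embed_dim, head_dim, bias=False)
            self.query = nn.Linear(embed_dim, head_dim, bias=False)
            self.value = nn.Linear(embed_dim, head_dim, bias=False)
            # Lower-triangular mask: each position may only attend to itself
            # and earlier positions (the "causal" part of a GPT).
            self.register_buffer("tril", torch.tril(torch.ones(block_size, block_size)))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            B, T, C = x.shape                    # batch, sequence length, embedding dim
            k = self.key(x)                      # (B, T, head_dim)
            q = self.query(x)                    # (B, T, head_dim)
            v = self.value(x)                    # (B, T, head_dim)
            # Scaled dot-product attention scores.
            scores = q @ k.transpose(-2, -1) / (k.shape[-1] ** 0.5)   # (B, T, T)
            scores = scores.masked_fill(self.tril[:T, :T] == 0, float("-inf"))
            weights = F.softmax(scores, dim=-1)  # attention weights over past tokens
            return weights @ v                   # (B, T, head_dim)

    # Example: 2 sequences of 8 tokens, 32-dim embeddings, 16-dim head.
    head = CausalSelfAttentionHead(embed_dim=32, head_dim=16, block_size=8)
    out = head(torch.randn(2, 8, 32))
    print(out.shape)  # torch.Size([2, 8, 16])

The video stacks several of these heads, adds feed-forward layers and residual connections, and trains the result as a character-level language model.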

👤 MrCoffee7
Look at this book preview and see if it fits your needs: https://www.google.com/books/edition/Introduction_to_Large_L...

👤 max_
Read Stephen Wolfram's book on ChatGPT, "What Is ChatGPT Doing ... and Why Does It Work?"

👤 nrivoli
"How AI Works: From Sorcery to Science" by Ronald T. Kneusel

Might give you some answers


👤 behnamoh
meanwhile, books are getting outdated thanks to LLMs. the irony...