I'm interested in understanding more about transformers/GPT/LLMs and more 'media-rich' generative AI like DALLE and Midjourney etc. (I assume they're linked, because they seemed to have breakthroughs and blow up at the same time, but I don't understand that at all) - but not 'prompt engineering' or specifics about tuning model parameters etc.
Can anyone recommend any resources for 'writing a CMS' vs. 'how to configure Wordpress and install a plugin', as it were?
(Prefer text, but understand it might be too new for good ones to have established themselves. In that case, something like OCW preferred to 'screamface'.)
Cheers!
Yeah, it's YouTube, I know... but it's hands-on too.
For GPT/LLMs/ML in general: https://karpathy.ai/zero-to-hero.html
It starts with writing backprop from scratch and takes you through writing everything you need, ending with training a GPT-2 equivalent model.
I also thought his Stanford lectures (CS231n 2016, on YouTube) were really good. They cover GANs for generation, but I think after that you can read the papers behind the diffusion models that DALL-E and Midjourney use.
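For a taste of what "writing backprop from scratch" looks like, here's a minimal sketch of a scalar autograd. This is my own illustration, not code from the course: the `Value` class and its method names are made up for the example, and it only supports `+` and `*`.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back.

    Hypothetical illustration class, not taken from any course or library.
    """

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward_fn = None  # pushes this node's grad into its parents

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward_fn = backward_fn
        return out

    def backward(self):
        # Build a topological order so each node is processed
        # only after everything that depends on it.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # d(output)/d(output)
        for node in reversed(order):
            if node._backward_fn is not None:
                node._backward_fn()


x, w, b = Value(2.0), Value(3.0), Value(1.0)
y = x * w + b
y.backward()
print(y.data, x.grad, w.grad, b.grad)
```

Each operation records its local derivative, and `backward()` applies the chain rule in reverse order. A real framework does the same bookkeeping on tensors instead of single numbers.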
It balances theory and code, and builds from the foundation up, so you're never typing something without understanding it. Teaching method is text, diagrams, and code. Most lessons have optional videos, too.
It focuses on text models over image models (RNNs, transformers, etc.).
It's not 100% finished, but has enough to get you very far.
This is probably what you’re looking for.
Also two books: Data Science from Scratch and Deep Learning from Scratch. They are more hands-on: you'll build all the low-level pieces in Python and learn a lot.