Is there a resource that explains, at a technical level, the thought process behind a full implementation of models like these? For example, why they chose to add an X-type layer here, or which layers they may have tweaked, and why, to improve quality from GPT-3.5 to GPT-4?
I understand OpenAI is closed source. But I'm sure there isn't some secret sauce that only employees there know about that makes these models happen, so where does one learn this?
If you're satisfied with a cursory understanding, you can read "derivative" material. Otherwise, you need to start with the basics of how modern deep learning models work and go deeper until you can read and understand more advanced material such as transformers (and thus GPT).