Layers: 24
Hidden Size: 1024
Attention Heads: 16
Using the formulas:
Neurons = Hidden Size * Attention Heads * Layers
Parameters = Attention Heads * (Hidden Size^2 / Attention Heads) * Layers
Plugging in the values:
Neurons = 1024 * 16 * 24 = 393,216
Parameters = 16 * (1024^2 / 16) * 24 = 1024^2 * 24 = 25,165,824
Note that the parameter formula simplifies to Hidden Size^2 * Layers, which counts only a single H x H projection matrix per layer. That is a significant undercount: each transformer layer actually holds roughly 12 * Hidden Size^2 parameters (four H x H attention projections for Q, K, V, and output, plus about 8 * H^2 in the MLP), and the token and position embeddings add more on top. That is why the commonly quoted parameter count for GPT-2 Medium is approximately 345 million, not the 25,165,824 the simplified formula gives.
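As a sanity check, here is a short sketch of the fuller count. The vocabulary size (50257) and context length (1024) are assumptions taken from the published GPT-2 configuration, not from the numbers above:

```python
# Approximate GPT-2 Medium's parameter count, layer by layer.
LAYERS = 24
HIDDEN = 1024
VOCAB = 50257    # assumption: GPT-2 BPE vocabulary size
CONTEXT = 1024   # assumption: GPT-2 position-embedding length

# Per layer: 4*H^2 for the Q, K, V, and output projections,
# plus 8*H^2 for the two MLP matrices (H -> 4H -> H).
per_layer = 12 * HIDDEN ** 2

# Token embeddings plus learned position embeddings.
embeddings = VOCAB * HIDDEN + CONTEXT * HIDDEN

total = LAYERS * per_layer + embeddings
print(f"per-layer:  {per_layer:,}")    # 12,582,912
print(f"embeddings: {embeddings:,}")   # 52,511,744
print(f"total:      {total:,}")        # 354,501,632
```

The total lands near 354M, which is consistent with the ~345M figure usually quoted for GPT-2 Medium (biases and LayerNorm parameters are ignored here, so this is an approximation).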
Parameters = A * (H^2 / A) * L
Is that OK?