HACKER Q&A
📣 behnamoh

Are LLMs really a blurry image of the web?


GPT-3 has 175B parameters and takes about 800 GB of disk space to store [0].

In comparison, it was trained on 570 GB of text data [1].

[0] https://en.wikipedia.org/wiki/GPT-3#:~:text=It%20uses%20a%202048%2Dtokens,shot%20learning%20on%20many%20tasks.

[1] https://balkaninnovation.com/how-large-is-gpt-3-dataset/#:~:text=In%20comparison%2C%20GPT%2D3%20is,570%20gigabytes%20of%20text%20data.
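A quick back-of-the-envelope check on the first figure, assuming the weights are stored as 32-bit floats (the actual on-disk format isn't public, so this is only an estimate):

```python
# Rough sanity check of the ~800 GB storage figure.
# Assumption: one fp32 value (4 bytes) per parameter.
params = 175e9          # GPT-3 parameter count
bytes_per_param = 4     # fp32

weights_gb = params * bytes_per_param / 1e9
print(f"fp32 weights alone: ~{weights_gb:.0f} GB")   # ~700 GB
# The cited ~800 GB is in the same ballpark once you add checkpoint and
# serialization overhead; at fp16 it would be closer to ~350 GB.
```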


  👤 ftxbro Accepted Answer ✓
In a way yes, and in a way no.

Yes: it would be theoretically possible (though OpenAI wouldn't allow it for business reasons, and it would be expensive to host and run) to load the weights onto a large inference cluster, prompt the model to act as a simulacrum of the internet, and have it serve fake web pages that are imperfect versions of the real ones. That would be a very literal implementation of lossy compression of the web, though not a useful or practical one.
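A minimal sketch of what that "simulacrum of the internet" setup could look like, assuming access to the OpenAI chat completions API; the model name and prompt wording here are illustrative, not anything specific:

```python
# Sketch: ask a chat model to "be" a web server and reconstruct a page
# from its weights. The result is a lossy, partly hallucinated copy of
# whatever the real page looked like in the training data.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def fetch_blurry_page(url: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "You are a simulacrum of the web as of your training "
                        "cutoff. When given a URL, respond with the full HTML "
                        "of that page as you remember it. Output HTML only."},
            {"role": "user", "content": url},
        ],
    )
    return resp.choices[0].message.content

print(fetch_blurry_page("https://en.wikipedia.org/wiki/GPT-3"))
```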

No: the important part of GPT isn't the compression per se (although it has long been argued that compression, and equivalently next-token prediction, is AI-complete), but rather the vast amount and variety of cognitive capabilities it unlocks. These include passing 'theory of mind' tests at human level and performing at or above human level on many standardized tests, bar exams, and medical certification exams. It would be fair to say this is not 'just compression', except in the sense that literally every computation could also be described as 'just compression'.
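On the "compression and token completion are equivalent" point: an ideal arithmetic coder driven by a model's next-token probabilities spends roughly -log2 p(token) bits per token, so a better predictor is literally a better compressor. A toy illustration with a hypothetical character-level bigram model (not how GPT works, just the principle):

```python
# Toy illustration of "prediction == compression": a coder driven by a
# model spends about -log2 p(symbol) bits per symbol, so the model's
# cross-entropy is approximately the achievable compressed size.
import math
from collections import Counter, defaultdict

corpus = "the cat sat on the mat. the cat sat on the hat. " * 50
text   = "the cat sat on the mat."

# Hypothetical model: character-level bigram with add-one smoothing.
counts = defaultdict(Counter)
for prev, cur in zip(corpus, corpus[1:]):
    counts[prev][cur] += 1
alphabet = sorted(set(corpus))

def prob(cur: str, prev: str) -> float:
    c = counts[prev]
    return (c[cur] + 1) / (sum(c.values()) + len(alphabet))

bits = sum(-math.log2(prob(cur, prev)) for prev, cur in zip(text, text[1:]))
print(f"model code length: {bits:.1f} bits vs {8 * (len(text) - 1)} bits raw")
# A stronger predictor assigns higher probability to what actually comes
# next, which directly shrinks the encoded size.
```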