HACKER Q&A
📣 nsfwais

Are AIs gonna get dumber?


I'm not much of an AI kind of tech guy, but I've had this question for some time: content on the internet these days is just full of AI-generated stuff. I'm not saying that would definitely be a problem, but won't LLMs get more and more overfitted if that's all they have to eat in the future? A kid would not likely grow up to be a nice kid if the only food he eats is produced by himself. Am I right? Did I get anything wrong here?


  👤 wtbdqrs Accepted Answer ✓
I'm neither smart nor a dev, but there is no need to feed the kid the internet's data anymore, is there? The kid gets enough data fed to it by direct user input, and that makes the kid preconfigured well enough to recognize trash and leave it where it finds it, except if it can be up- or recycled, but that's a long story.

Corporate data, research, books, blogs, any tokens the kid will train itself on will "feel" right in its stomach and not "too heavy" or "too light" for its semantic mass. What the rest of the internet might have to offer in the future (comment sections) is so predictable that it would be a duplication of effort the next gen of AI won't waste any RAM on.


👤 verdverm
Some things to note:

- the builders are well aware of the situation

- they are not training on the full internet; they are actually training on less than previously, because a filtered subset produces better models

- training involves much more than text on the internet; textbooks are a great addition to the training set. Multi-modal data, especially video, is expected to give them better world understanding. I suspect this will unlock the household robot

- they now have all the actual interactions (and feedback) with the LLM to add to the training, which is much more relevant and direct training data
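
To make the "filtered subset" point a bit more concrete, here is a toy sketch of what filtering a text corpus could look like: drop exact duplicates, then drop documents that are very short or highly repetitive. The function names (`normalize`, `looks_low_quality`, `filter_corpus`) and the thresholds are made up for illustration; this is not how any particular lab filters its data, and real pipelines are far more elaborate.

```python
import hashlib


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def looks_low_quality(text: str, min_words: int = 5,
                      max_repeat_ratio: float = 0.5) -> bool:
    """Hypothetical heuristic: flag very short or highly repetitive text.

    Both thresholds are arbitrary values chosen for this sketch.
    """
    words = normalize(text).split()
    if len(words) < min_words:
        return True
    # Fraction of words that repeat an earlier word in the same document.
    repeat_ratio = 1 - len(set(words)) / len(words)
    return repeat_ratio > max_repeat_ratio


def filter_corpus(docs):
    """Keep the first copy of each document that passes the quality check."""
    seen = set()
    kept = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate after normalization
        seen.add(digest)
        if looks_low_quality(doc):
            continue
        kept.append(doc)
    return kept


docs = [
    "The quick brown fox jumps over the lazy dog.",
    "the quick  brown fox jumps over the lazy dog.",  # duplicate after normalization
    "buy buy buy buy buy buy buy now",                # repetitive spam
    "ok",                                             # too short
]
print(filter_corpus(docs))
```

With these four sample documents, only the first one survives. The idea is just that a smaller, cleaner set can be produced mechanically before any training happens.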