HACKER Q&A
📣 andrewstuart

Why doesn’t ChatGPT declare its source material?




  👤 jerojero Accepted Answer ✓
You mean as opposed to what Bing does?

ChatGPT is not a search engine. What it does is complete the text you give it. It doesn't "search"; it simply activates neurons in its immense neural network to produce an output, which is of course influenced by what it learned during training. But there isn't any form of "source" material that it's pulling from per se. Everything is encoded in the weights of the neural network.

Presumably, something like Bing simply uses ChatGPT as an interface between the search engine and the user. So, in a way (and this is something these LLMs are capable of), it's searching for you. I imagine it's more a matter of parsing the content it finds online than of drawing on something already encoded in its neural network.
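
To make that guess concrete, here is a minimal sketch of a "search first, then hand the results to the model" flow. It is not how Bing is actually implemented; `DOCS`, `search`, `build_prompt`, and the example.com URLs are all made up for illustration, and the keyword-overlap ranking is a stand-in for a real search engine. The point is only that the citations come from the search step, not from the model's weights.

```python
# Hypothetical in-memory "web": URL -> page text (made-up data).
DOCS = {
    "https://example.com/llm": "language models predict the next token from learned weights",
    "https://example.com/search": "search engines index pages and return ranked links",
}

def search(query, docs, top_k=1):
    """Rank pages by how many query words they share (toy stand-in for a search engine)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        docs.items(),
        key=lambda item: len(query_words & set(item[1].split())),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(query, hits):
    """Put the retrieved pages, with their URLs, in front of the question."""
    sources = "\n".join(
        f"[{i}] {url}: {text}" for i, (url, text) in enumerate(hits, 1)
    )
    return (
        "Answer using only these sources and cite them by number.\n"
        f"{sources}\n\nQuestion: {query}"
    )

hits = search("how do search engines return links", DOCS)
prompt = build_prompt("how do search engines return links", hits)
print(prompt)  # this text is what would be sent to the language model
```

The model that receives `prompt` can cite `[1]` because the URL is sitting right there in its input. Without the search step there is nothing comparable for it to point at.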


👤 drKarl
I'm not an expert on LLMs or on ML, but I think that's not the way it works: it doesn't search sources, read them, and write the answer like a human would. I believe it processes millions of documents and tokenizes them, that is, it splits them into tokens. Since it has seen millions, billions, or trillions of documents, it can predict the most statistically likely token to follow a given context, and by iterating on that it can generate full text.
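
A toy version of that loop, as a rough sketch: this is a bigram counter, not a neural network, and the whitespace tokenizer, the three-sentence corpus, and the function names are all assumptions made up for the example. Real LLMs use subword tokenizers and learned weights rather than raw counts, but the "predict the likeliest next token, append it, repeat" shape is the same.

```python
from collections import Counter, defaultdict

def tokenize(text):
    # Crude whitespace tokenizer; real models use subword tokenizers.
    return text.lower().split()

def train_bigram(corpus):
    """Count, for each token, which tokens followed it in the corpus."""
    counts = defaultdict(Counter)
    for doc in corpus:
        tokens = tokenize(doc)
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def generate(counts, prompt, max_new_tokens=5):
    """Repeatedly append the most frequent follower of the last token."""
    tokens = tokenize(prompt)
    if not tokens:
        return ""
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # last token was never followed by anything in training
        tokens.append(followers.most_common(1)[0][0])
    return " ".join(tokens)

corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "a dog sat on the rug",
]
model = train_bigram(corpus)
print(generate(model, "a dog", max_new_tokens=4))  # a dog sat on the cat
```

Note that the output "a dog sat on the cat" appears in none of the three training sentences. Once training is done, `model` holds only merged counts, so there is no record of which document contributed what.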

👤 leros
As I understand it, these models derive high-level patterns from the source material and then combine those patterns to produce the output.

It's like going to a painter and asking them to list everything that inspired their painting. The answer would basically be "everything I've ever seen".


👤 qup
Why don't you declare yours?