Instant answers, or whatever they're called, already produce direct answers while citing sources and providing links, which is what everyone seems to think is the solution to the "LLMs make stuff up" problem.
Not to mention they're faster and cheaper to run.
The only truly practical use case I can think of is summarizing or writing articles, which makes more sense as a word processor or browser add-on.
People who want to get rich will tell you it's the next greatest thing that will revolutionize the industry.
Personally, I've been annoyed at how confidently wrong ChatGPT can be. Even when you point out the error and ask it to correct the mistake, it comes back with an even-more-wrong answer. And it frames it like the answer is completely, 100% correct and accurate. Because it's essentially really deep auto-complete, it's designed to generate text that sounds plausible. This isn't useful in a search context, where you want to find sources and truth.
I think there are useful applications for this technology, but I think we should leave that to the people who understand LLMs best and keep the charlatans out of it. LLMs are really interesting and have advanced by leaps and bounds... but I don't see how replacing entire institutions and processes with something that is only well understood by a handful of people is a great idea. It's like watering plants with Gatorade.
Ideally Google search would have a flag to "follow my intent to the letter" and return empty if nothing is found. When you are searching for a specific thing, a response with other things feels like bullshit, Google trying to milk more clicks wasting your time. I don't mean exact phrase search, I mean exact semantic search.
This causes issues when searching for bug fixes, where it ignores the exact version; when shopping, where it ignores some of your filters and responds with bad product suggestions; and when searching for something specific that looks like some other popular keyword, where it gives you the popular one, as if it has an exclusion zone and you cannot search for anything else around it.
"Minimum weight of a scooter with back suspension" -> matches information about carrying capacity. Of course more people discuss about max passenger weight than minimum scooter weight, but I really don't care about the other one.
Among other things, a LLM can be seen as a store which you query and get results from. A chatbot is cute because it formats output text to look like conversation, and the recent applications are nice because the query (now known as prompt) can be complicated and long, and can influence the format and length of the results.
But the cool stuff is being able to link the relatively small amount of text you input as a query, into many other chunks of texts that are semantically similar (waves hands around like helicopter blades). So, an LLM is a sort of "knowledge" store, that can be used for expanding queries, and search results, to make it more likely that a good result seems similar to the input query.
What do I mean by similar? Well, the first iteration of this idea is vector similarity (e.g. https://github.com/facebookresearch/DPR). The second iteration is to store the results in the model itself, so that the search operation is performed by the model itself.
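The vector-similarity idea can be sketched in a few lines. This is a toy: bag-of-words count vectors stand in for the learned dense encoders a system like DPR actually uses, and the documents and query are invented for illustration; only the cosine-ranking step resembles the real thing.

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Toy stand-in for a learned dense encoder (e.g. DPR's BERT
    encoders): a bag-of-words count vector over lowercase tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "minimum weight of a lightweight scooter frame",
    "maximum passenger weight a scooter can carry",
    "how to water plants on a schedule",
]

query = "scooter minimum weight"

# Rank documents by similarity to the query, most similar first.
ranked = sorted(docs, key=lambda d: cosine(embed(query), embed(d)), reverse=True)
```

With real learned embeddings, "semantically similar" texts land near each other even without shared words; the toy version only sees word overlap, which is exactly the limitation dense retrieval was built to fix.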
This second iteration will lead, IMHO, to a different sort of search engine. Not one over "all the pages" as, in theory at least, google and the like currently work. Instead, it will be restricted to the "well learnt pages", those which, because of volume of repetition, structure of text, or just availability to the training algorithm, get picked up and encoded into the weights.
To make an analogy, it's like asking a human who the Knights of the Round Table are and getting back the usual "Percival, Lancelot and Galahad", but only because the other thousand knights mentioned in various works are not popular enough for that given human to know them.
This is a different sort of search engine than we are used to, one which might be more useful for many (most?) applications. The biases and dangers of it are things we are only starting to imagine.
However, it's been observed that people are using chatbots for informational searches, the kinds of searches where you want to learn about a specific fact. This isn't all searches, but it's an important subset for web search. For better or worse, people perceive, probably rightly even if with a high degree of inaccuracy, that this is how people will seek information.
There's also the generational use case - "write me a program that does X". Is this something people would use a search bar for? We don't know, and won't know, until it's been out there for a while.
For the longest time, the one natural-language interface was the search bar. So search vendors surmise it's important both to defend their turf and to use it as a natural way to get regular users familiar with this kind of informational interaction...
1) Integration with voice assistants. Links/sources are irrelevant.
2) Models tuned against a particular body of work don't care if links go stale, or websites get SlashdottedHackerNewshuggedDDOSed, or etc. Links/sources are irrelevant.
3) Inbound "service requests" processed by something that can better understand the question and the available answers/solutions. Links don't matter much.
4) When "Okay what are some good websites to read more about this?" can be answered, too, bang.
5) Ever asked somebody a question and just rolled with their answer instead of demanding citations? I mean, you're doing it here. So, again, yes.
Another point is that I am either creative or productive at any one time, never both... but I'm at least aware of which state I'm in. ChatGPT has proven surprisingly good at taking over the other part, e.g.:
- when I am in a productive mood and stumble upon a thinking problem, generative AI is like on-the-spot creativity for "good enough" solutions, like naming a programming thingy or writing some filler text around a few keywords instead of me hunting for words.
- when I am in a creative mindset, I increasingly feed some code snippets into the bot and ask some questions to "fill in the gaps", like writing a specific function using library X, then to write a documentation explaining how it works, then to also write some unittests, and sometimes I even derail a bit or let the bot explain parts that stand out in some way so I can maybe learn a trick.
... And I've used ChatGPT in kinda emergency situations already, like when I learn 5 minutes in advance that I have to speak in front of a crowd or in a meeting. It gave me extremely useful outlines to quickly adapt, even in a panicked mental state, calming me down with a given structure that sounded okay-ish; and there it doesn't much matter if the response is right or wrong.
Generating bullshit text off the cuff is not the only use of LLMs. LLMs can perform very well at classification, regression, ranking, coloring proper names red, and other tasks. You could, for instance, use LLMs to encode a query and documents and rank them with a siamese network, something not too different from how a conventional search engine works.
If there is one thing wrong with the current crop of LLMs, it is that they can only attend over a limited number of tokens. BERT can attend over 512 tokens, ChatGPT over 4096, where a token is shorter than a word on average. It's easy to process the headline of an HN submission with BERT, but I classify a few hundred abstracts of scientific papers a day. A long abstract is about 1800 words, which is too much for Longformer but would fit in ChatGPT if there aren't too many $10 words.
Unless you can recast a problem as "does this document have a short window in it with this attribute?" (maybe "did the patient die?" in a clinical case report or "who won the game?" in a sports article) there is no way to cut a document up into pieces and feed it into an LLM, then combine the output vectors in a way that doesn't break the "magic" behavior the LLM was trained to do.
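The chunking workaround described above can be sketched as a sliding window over the token sequence. This is a minimal sketch; the window size and overlap are illustrative, and, per the point above, it only helps when each window can be scored independently (e.g. "does this chunk mention the patient dying?") rather than requiring the model to attend over the whole document at once.

```python
def window_chunks(tokens, size=512, overlap=64):
    """Split a long token sequence into overlapping windows that each
    fit a model's attention limit (e.g. BERT's 512 tokens). The stride
    is size - overlap, so adjacent windows share boundary context."""
    stride = size - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break
    return chunks

# Roughly the scale of a long abstract after subword tokenization:
# 1800 words can expand to ~2400 tokens.
tokens = [f"tok{i}" for i in range(2400)]
chunks = window_chunks(tokens, size=512, overlap=64)
```

Each chunk can then be classified separately and the per-chunk labels combined (e.g. by max or vote), which is exactly the "short window with this attribute" recasting; anything needing whole-document reasoning breaks under this scheme.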
You'd imagine ChatGPT would produce accurate results if you could tell it "Write a review of topic T with citations", but if you try that you'll find it writes citations that look real but don't actually exist when you look them up. You'd imagine at minimum that such a system would have to read the papers it cites, maybe being able to attend over all of them at the same time, which would take an attention window 100-1000x larger.
That's by no means cheap and it might be Kryptonite for Google in that Google's model involves indexing a huge amount of low quality content and financing it by ads that are a penny a click. A business or individual might get a larger amount of value per query out of a much smaller task-oriented document set.
When you're hunting for a particular fact, like "that bit of code I half remember seeing on a page 15 years ago", I don't see anything for an LLM to add. Google had a pretty good index for that purpose about 15 years ago, but they've chosen to prioritize other goals since then. I dunno if anyone treats "find the thing you're searching for" as a market now.
Which is an answer to your question: Does it matter if an LLM helps search the web? That's not what people are doing, that's not what these companies are selling.
LLMs have a chance to offer an oracle that doesn’t answer in cryptic or evasive ways, but attempts to just give an answer. The hallucinations are a huge flaw, but one that I’m confident will be addressed with other non-LLM AI approaches. But it’s the right user interface for answering questions - it answers questions with answers, not a pile of potentially relevant documents to sort. The augmentation with citations, especially if they’re semantically relevant rather than symbolically matched, is a huge plus.
I don't know if that strictly complies with your definition of "web search application". It's definitely going to save time for me, and not seeing a bunch of ads during the process is wonderful - to the point that I really could see myself paying for it if they decide to go that route and take away the "free" version.
The basic idea was to have enough metadata about web sites so that you could get programs to do something approaching Prolog-style reasoning about the content and meaning of the web pages.
With more advanced LLMs, it looks like a slightly different approach to achieve something like the semantic web idea.
I think the idea is to constantly feed the model with updates from crawling the web and have the LLM "digest" the content, apply some filters to remove bad stuff, and then provide a meaningful result to whatever queries it might be asked.
The issue I see with the chat approach is trust. I've seen so many examples of these models just making shit up now that I reckon regular use of them will eventually lead to mistrust between the human and the chat-bot. If you can't trust the answers and have to go and check yourself, it's dead as an idea IMHO.
> Search systems, like many other applications of machine learning, have become increasingly complex and opaque. The notions of relevance, usefulness, and trustworthiness with respect to information were already overloaded and often difficult to articulate, study, or implement. Newly surfaced proposals that aim to use large language models to generate relevant information for a user’s needs pose even greater threat to transparency, provenance, and user interactions in a search system. In this perspective paper we revisit the problem of search in the larger context of information seeking and argue that removing or reducing interactions in an effort to retrieve presumably more relevant information can be detrimental to many fundamental aspects of search, including information verification, information literacy, and serendipity. In addition to providing suggestions for counteracting some of the potential problems posed by such models, we present a vision for search systems that are intelligent and effective, while also providing greater transparency and accountability.
Shah & Bender (2022) "Situating Search". In Proc. CHIIR '22 https://dl.acm.org/doi/abs/10.1145/3498366.3505816
For a quick overview answer, LLMs are great. The answer isn't 100% correct, but mostly it is, and that is good enough for a quick answer. Currently Google tries to show that, and people object that it's stealing traffic from websites. I just need an answer, a coherent usable one. E.g.: "What were the movies Scorsese got an Oscar nomination for?"
For suggestions, LLMs are just one more of those blogs and listicles already showing up in search - if the LLM is kept updated, that is. The difference is that an LLM would customize the answer according to the query, unlike pre-written content. So, yes, useful. Same goes for stuff like "how to build an email list?" or "What is an effective sales strategy?"
For research, Google is more useful. I think we have all done that.
Another application, one not realized yet simply because we've never had it before, is the ability to ask follow-up questions (which a chat format enables well). Suppose you get an overview of how a quantum computer works; it would take a lot of effort to ask a follow-up question and get a direct answer via a search engine. E.g.: "Why is there no point in going beyond a thousand qubits?"
There could be modifications like voice-to-text (a Jarvis-like interface), or a personal assistant thingy. But those are far-fetched.
It will help immensely, and for places it does not, we will still google like we have done before.
While what they’re doing currently isn’t perfect, it does provide results that are at least traceable. I could imagine an alternate universe where they doubled down on marketing themselves as “the search engine that doesn’t lie to you” or “where answers are found, not stories”.
On the other hand, I do agree with people speculating that LLM-AI interfaces will seriously hurt Google's bottom line, e.g. reducing the space for search ads, which represent the majority of its revenue.
[0] https://www.nytimes.com/2022/09/16/technology/gen-z-tiktok-s...
Mostly this is just to calm me down because ChatGPT gives me the illusion that I'm interacting with a human. The current voice systems are infuriatingly bad.
It would be nice if CVS's phone system would actually listen to me and modify its output accordingly. "I already gave you my birth date. And NO, I don't need a COVID booster."
edit: I'd like to meet the person who sold CVS its prescription web-site and its voice system. Simply to marvel at them and the swindle they pulled off, delivering absolute trash and probably walking away with a king's ransom.
I could imagine the interface being similar to what we have today, but with it being much better at taking in full descriptions of what you want. If you want pictures of teddy bears, it could provide search results and AI generated ones. If you want programming answers, it could link to StackOverflow or just give you an AI generated answer with an explanation. Perhaps I am looking for a lively bit of free music to add to an indie game - it could generate that too.
I feel that this will eventually end search as we know it, but it will hurt the sites that are behind the search results far more than it will hurt Google. Google (or Bing, ...) can become the one-stop-shop for so much more than it is today
Imagine asking someone with cursory knowledge of a subject matter to perform a Google search for you. This person would dig through thousands of results and weed out the junk/SEO/content-farm sites, so you'd get information that's more relevant. LLMs could potentially do this quickly, separating the wheat from the chaff. Would it be perfect? No, but it would be a significant improvement over what you see on Google today.
When doing research I need a good search engine. Find me the official docs, not the SEO’d blogs. Find me that podcast episode. Find me the exact article I remember reading 10 years ago. Don’t try to guess and half-arse a result just because it has more ads or better SEO. I’ll do the hard work of synthesis because I’m looking to understand something deeply.
Current search engines have gotten meh at this use-case. Or at least Google has.
When looking for a quick answer, I need a smart-enough agent. How old is this celebrity? What’s the air speed of an unladen swallow? Give me a deity that starts with G. What the hell is a “nepo baby”? What does this random emoji mean when sent by a 20 year old and what is it by a 40 year old? Who’s that actor in that show with the thing?
I don’t care about the source and I’m not looking to do research. Just tell me a good enough answer so I can get back to my conversation or whatever. Current search engines are pretty okay with this, but GPT is better.
The two use-cases are fundamentally different and trying to merge them is where things went wrong.
(Voice chat)
- LLM, find three articles from HN frontpage which I would find insightful based on my recent evaluations, summarize them in under half a minute each and then I’ll choose the order, while I commute.
- (…)
- Okay, read me a second one first.
- (…)
- That was a good one because it was well-written and compared alternatives. Now find the funniest article of all time based on my long-time preferences.
- (…)
- - -
While it’s dumb enough to forget what and how you’ve evaluated recently, a hidden prompt could(?) fetch that out, e.g.:
- (system) please convert my previous article ratings into json objects consisting of article url, article id (…how to get it…), 1..10 rating, your summary and a string of tags.
Then these ratings may be saved for later and fed into a chat secretly as:
- (system) If I gave you this prompt: "(…)". My recent ratings were (…).
- - -
I have no clue if this could work, but if it does, well, that would be useful. Edit: it may be wrong, but we have enough mundane tasks that are better done wrong than not done at all. It has great potential as an "occasionally bright secretary" archetype.
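The rating record the hidden system prompt asks for might look something like the sketch below. Everything here is invented for illustration: the field names, the example URL, and the ID scheme are assumptions, not any real API; the point is only that a JSON round-trip makes the ratings easy to save and secretly re-feed into a later chat.

```python
import json

# Hypothetical shape of one saved article rating; all fields invented.
rating = {
    "article_url": "https://example.com/some-article",
    "article_id": "hn-12345",   # however IDs end up being derived
    "rating": 8,                # the 1..10 scale from the transcript
    "summary": "Well-written; compared alternatives.",
    "tags": ["voice", "commute", "summaries"],
}

# Serialize for storage, then restore to splice into a hidden prompt later.
payload = json.dumps(rating)
restored = json.loads(payload)
```

A list of such objects could be dumped wholesale into the hidden "my recent ratings were (…)" prompt, which sidesteps the model forgetting past evaluations between sessions.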
As an example, someone on HN posted a tweet of a guy who asked ChatGPT to draft a letter announcing a layoff while also announcing several executive promotions and quoting MLK Jr. Obviously, the example is facetious but the results were actually pretty good. Certainly good enough for a starting point or template for a real layoff announcement.
I'm sure this is a minuscule amount of total search volume, but there is a category of searches for letter templates (think cover letters, resignation letters, etc.) that ChatGPT could seriously replace today. And ChatGPT is actually better because of how specific you can get (e.g. "with an MLK quote").
I don't think LLMs are a threat to traditional search today or even in the short term but what will ChatGPT 50 (or equivalent) look like in 20 years...
That being said, I don't think ChatGPT or any single LLM can replace mainstream Internet search use cases in the immediate future. They might enhance the search experience for users.
It would be useful if a search engine could find the top 10 different interpretations of the phrase in latent space, and offer the top results (with a means to pursue more) in each of those separate meanings.
For example: "hypertext markup" matches way too many things about HTML, and not enough things about marking up (annotating) hypertext
LLMs could make search much more powerful in this manner.
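The interpretation-splitting idea above can be sketched crudely. In this toy, two hand-picked seed-term sets stand in for senses of "hypertext markup"; a real system would discover the senses by clustering result embeddings in latent space rather than from word lists, and the results shown are invented for illustration.

```python
def sense_overlap(text, sense_terms):
    """Count how many of a sense's seed terms appear in the text -
    a crude stand-in for similarity to a cluster centroid in latent space."""
    return len(set(text.lower().split()) & sense_terms)

# Two hand-picked "senses" of the query "hypertext markup" (assumed, not learned).
senses = {
    "html-the-language": {"html", "tag", "element", "browser"},
    "annotating-hypertext": {"annotate", "annotation", "highlight", "notes"},
}

results = [
    "intro to html element and tag syntax",
    "how to annotate and highlight hypertext notes",
    "browser support for the html dialog element",
]

# Assign each result to its best-matching sense, giving per-meaning buckets.
grouped = {name: [] for name in senses}
for r in results:
    best = max(senses, key=lambda name: sense_overlap(r, senses[name]))
    grouped[best].append(r)
```

The payoff is the grouping itself: a search UI could then show the top hits per bucket, so the minority "annotating hypertext" sense isn't drowned out by the dominant HTML one.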
ChatGPT, on the other hand... is not a search engine, even before you consider its tendency towards BS.
The workflow I have now with ChatGPT, and what I imagine it will look like in the new era of search, is: query -> read a result written for a human, not a search engine -> (10% of the time) check whether the result is hallucinated.
Especially for basic questions where I know what the answer should look like, I'm really enjoying the new workflow.
That use case is the one that makes the most sense. But ChatGPT frequently hallucinates the wrong answer and confidently insists it is correct, and its inability to cite sources or transparently explain how it got to an answer tells you that its results are untrustworthy and that this AI bubble is, again, pure hype created by VCs.
The only worthy AI hype that will change everything is open-source LLMs: smaller, more transparent models.
"Can you provide a reference for that? or What should I google to confirm that this is accurate?"
It'll give you something pretty close to a final citation. This has saved me literal days of work traversing documentation in the last 2 months.
I think it works amazingly well at least for instances when you can immediately verify whether the answer is correct (e.g. coding, drafting letters) and instances where it is a starting point for further research. These use cases are a significant portion of my searches, so I think it will be very useful.
But I think that "suggest me something" is going to be a big selling point too. Decision-making is tiring, and people are willing to hand that power to a machine. Look at how TikTok, YouTube, and even Facebook work now: it's slowly becoming a TV stream that you passively watch. "Tell me what to do tomorrow" is going to be a common question in a year or so.
There's more to generative models than just the above, but how much of this hype cycle is substantive for end users or developers? You always had a query. Now you'll get answers in a different way.
Skeptical overall.
Doesn't feel as simple as the search world where you can just up rank some approved sources, unless there is somehow a way to just generate the language from approved sources and ideas.
It responded immediately with some working code.
With Google, I could have found the result but it would have taken a dozen clicks and probably 15 minutes of my time. (To be fair, I might have learned more in the process).
Siri or Google Assistant wouldn't have given an answer at all.
People want to believe this is as amazing as it appears, but it’s window dressing; we still can’t intentionally separate relevant information from requests.
Good luck wasting hours of your time googling for that by hand.
That’s not using the search service for search
...until the rise of the new ecclesiastical class? These models are biased in several ways with the training set defining their world view and those who train them intervening in specific areas to bend the output to their will. They can be made to negate the old dictum of garbage in, garbage out to garbage in, gospel out - where the gospel follows whatever the (small-c) creator thinks the populace should know and (in extension) think.
In short I don't trust these models any further than I can throw them.