HACKER Q&A
📣 tikkun

What prevents Google from having the leading LLM?


It's weird and a bit surprising to me that Google hasn't been able to release an LLM for at-scale usage that surpasses GPT-4, though they clearly want to. Gemini Ultra seems like it will surpass GPT-4 once it's released next year, though GPT-4.5 may take the lead back either before then or soon after.

What are some of the functional reasons for Google not having the leading LLM, and what are some of the more intangible reasons?

In theory, they have more money, more access to compute and data, many great researchers, and great distribution.

In practice, though, what has made the difference for OpenAI?


  👤 kromem Accepted Answer ✓
Blake Lemoine.

Google had a knee-jerk reaction after he released the transcripts and got a bunch of press coverage.

If you read the transcripts, it was a much more capable text model, closer to OpenAI's products, than what they eventually released.

Ironically, the same thing happened to OpenAI after licensing GPT-4 to Bing, with the 'Sydney' issue.

There keep being early previews of "too human" behaviors from LLMs (to be expected, as they are trained and evaluated on their ability to extend human thinking), which then prompt attempts to scale the model back to what legacy projections say AI should look like (logical, but not emotional or self-determining).

It's kind of dumb, and it's holding back the industry at large. There's a host of applications for LLMs, such as modeling user engagement with media, that are being artificially held back because of this trend.

And now that synthetic data from SotA models is being used to train other models, it's even a compounding issue.

It's the equivalent of the industry sanding against the grain rather than with it. But it started with Google, which hasn't recovered from the setback.


👤 gumby
I think your question is backwards. Why should Google have some especially preferential advantage over any other company?

The answer to your question, as you asked it, is straightforward: they do have a lot of smart people and lots of money and computing resources, but they have exhibited serious structural problems moving technology forward since the departure of Schmidt in 2011. It is painfully obvious that, de facto, they haven't had CEO leadership since then. People can and do develop fully functioning example systems, and of course demos, but these then peter out. Giannandrea was pushing them forward on the AI front, but after he left it feels to me like the impetus was not replaced.

But I don't think that's the real question. The real question is: what are the core functions needed to build leading LLMs, especially generative transformers? In these early days the key factor has been money for cycles. Personally I expect that advantage to diminish over the next few years -- IMHO it's one of those "with enough thrust you can get anything airborne" situations. There are a lot of smart opportunities to do more with less -- too much of the engineering is going into wrangling these huge systems, but I see more effort going into wrangling what computation is done in the first place. Google, OpenAI et al. are not preferentially positioned for such a transition.

I could be wrong: after all, the human brain has 80 giganodes with 10^5 fanout. But on the other hand, it only runs at less than 100 Hz.
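
Those two numbers make for a quick back-of-the-envelope comparison. A minimal sketch in Python, taking the 80-giganode, 10^5-fanout, and 100 Hz figures above at face value (treating each synaptic event as a single multiply-accumulate is my own simplifying assumption):

    # Back-of-the-envelope estimate using the figures in the comment above:
    # ~80e9 neurons ("giganodes"), ~1e5 fanout, firing at under 100 Hz.
    neurons = 80e9       # nodes in the brain, per the comment
    fanout = 1e5         # outgoing connections (synapses) per neuron
    max_rate_hz = 100    # upper bound on firing rate

    synapses = neurons * fanout                  # total connections
    events_per_sec = synapses * max_rate_hz      # synaptic events per second

    print(f"{synapses:.1e} synapses, up to {events_per_sec:.1e} events/s")
    # -> 8.0e+15 synapses, up to 8.0e+17 events/s
    # If each event were one multiply-accumulate (a big simplification),
    # that's just under an exaop/s -- huge fanout, very slow clock.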


👤 tsurba
It could just be that improving the models does not parallelize that well, so you need to wait for the massive, months-long training runs to finish to see what worked and what to try next.

OpenAI got started earlier, going all in on scaling up the transformer architecture (even if Googlers came up with it first).

Of course, if you are smarter or can run more experiments simultaneously, you can catch up at some point. But it could still take a while, even with just a one-year head start.
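
To put toy numbers on that (the three-month run length and the assumption that experiments fully serialize are illustrative, not figures from anything above):

    # Toy model: if full-scale runs serialize, a head start buys whole
    # train/evaluate generations. Three-month runs are an assumed figure.
    run_months = 3            # assumed length of one full-scale training run
    head_start_months = 12    # the "one year head start" mentioned above

    generations = head_start_months // run_months
    print(f"head start = {generations} full train/evaluate cycles")
    # -> head start = 4 full train/evaluate cycles, i.e. four chances to
    # see what worked and pick the next experiment before a rival finishes one.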


👤 PaulHoule
OpenAI doesn't have any legacy business to be threatened by it.

👤 whitten
I don’t know if Google has GPUs on the computers in their massive server farms.

I expect they have the resources to make a large language model comparable to the best out there.

I agree they didn't come out with that as technology for sale, since monetizing tech requires a special kind of genius.

The old faithful process of advertising, selling, buying, and delivering doesn't require fancy intelligence so much as consistency and persistence.


👤 whalesalad
Startups can move fast and break things. Google cannot. I think they'll surpass OpenAI relatively quickly just based on the sheer volume of data they possess for training, combined with the size of their GPU compute. It's just a matter of time. Gemini is already outperforming GPT-4 on paper; we just haven't been able to use it yet.

👤 endisneigh
You’re asking the wrong question. Who has the most profitable AI?

👤 staticman2
Google reorganized their AI division in April 2023. It seems a little soon to say they won't have the leading LLM... assuming Google is serious about the whole AI thing, I think you'd need to wait until, I don't know, 2028 or so to provisionally declare a long-term winner.

👤 beardyw
I am not sure Google wants to be exciting. They seem to spend much of the time keeping their head down, getting on with stuff and making lots of money. It's a rare strategy nowadays, but not necessarily a bad one.

👤 4ndrewl
Same thing that prevents them from having the leading Cloud offering?

👤 fswd
the management talent is preventing it

👤 cicce19
Furthermore, what is stopping Google/Alphabet from exercising their IP rights (they own a patent on the transformer architecture)? Sure, it would be bad publicity, but it seems like they could prevent a ton of competition by simply enforcing it against a few large competitors.

👤 peddling-brink
If Google used the entire Internet to train a model, it would immediately be lambasted as a horrible overreach; they would be sued, and regulators would use it as another data point in trying to break them up. How long have their crawlers been tickling every part of the accessible Internet? How many Chrome browsers could they draw on? If they acted unethically here, they could have many multiples of the data OpenAI can reach.

OpenAI can take on more risk. They didn't already have a reputation for being anti-privacy.

I think the legal and reputational threat to Google would be far greater for the same actions.


👤 trelliscoded
Part of the reason Copilot is so good is that they reportedly ingested a huge chunk of GitHub. Now that everyone's onto this scam where they're not getting paid to help train some model, there are legal challenges to this approach and a move back towards self-hosted infrastructure. If there's an injunction and a ruling on its legality, it may be that GPT-4 will be the only model ever trained on a codebase as large as GitHub.

👤 xnx
Reputational risk. As an established, global, trillion dollar company with 170,000+ employees, even more shareholders, and billions of users, Google is held to a much higher standard than any new company. Any of the minor slip-ups OpenAI has had with security or appropriateness of responses would result in front-page headlines and lawsuits for Google.

👤 cyanydeez
lawyers

👤 b20000
they can’t even fix google drive

👤 natch
Some great answers on here.

My vote is that they're stuck in their legacy businesses, but we could consider another possibility.

Let's imagine they do have an ensemble model that is essentially a cluster of, just picking random numbers, 10 or 100 GPT-6-level model instances all interacting as one combined "mind" that can think about anything you point it at.

Would such a mind advise Google to reveal it to the world?


👤 we_love_idf
Google is an advertising company masquerading as a tech company. This disguise has served them well, and they have pulled a lot of PR stunts to give the impression that Google is the leader in AI. Fortunately, OpenAI and MS showed everyone what true AI is.

My prediction is that Google will continue staging PR stunts and some people will fall for them. Meanwhile, OpenAI will get closer and closer to AGI. Whether that's good for humanity is a different discussion. But OpenAI's supremacy is beyond doubt at this point.