HACKER Q&A
📣 mrtranscendence

Why do so many assume we’re on the cusp of super-intelligent AI?


It may just be the circles of the internet I’ve been hanging around in, but it seems very commonly assumed that the pace of progress in the field of AI is so rapid that — unless we take steps to avoid it — someone (or more likely, some large organization) will soon develop a super-intelligent AI.

And it’s not just the fringe. We can see such apparently sober voices as Geoffrey Hinton sounding the alarm about super-smart AI.

But … why? All the recent exciting progress has been due to large language models, which are basically hacks that allow us to use large volumes of normally-intelligent text to train a statistical model for next-token prediction. It’s not even particularly complex to do this in principle (though wrangling all the compute you need can be quite difficult). It’s all very clever, yes, but at bottom it’s just a brute force approach.
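
For concreteness, here is roughly what that next-token objective looks like in code (a toy PyTorch sketch with a stand-in model and random token ids, not what any lab actually runs):

    import torch
    import torch.nn as nn

    # Toy illustration of next-token prediction: shift the sequence by one
    # position and minimize cross-entropy between predictions and the shifted
    # tokens. The "model" here is a trivial embedding + linear layer; real
    # LLMs use transformers, but the training objective is the same idea.
    vocab_size, dim = 1000, 64
    model = nn.Sequential(nn.Embedding(vocab_size, dim), nn.Linear(dim, vocab_size))
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    tokens = torch.randint(0, vocab_size, (8, 128))  # fake batch of token ids
    inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict token t+1 from tokens up to t

    optimizer.zero_grad()
    logits = model(inputs)                           # (batch, seq-1, vocab)
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    loss.backward()
    optimizer.step()

Scaled up by many orders of magnitude, that loop is essentially the whole trick.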

These solutions get us neat tools but I don’t see how they bring us even one step closer to super-intelligence. You can’t just train an LLM with more and more parameters and more and more tokens and expect it to be smarter than the data it was trained on. And such models don’t bring us any real understanding of what it would take to make super-intelligent machines.

But if Geoffrey Hinton is worried surely I’ve gone wrong somewhere. What am I not seeing?


  👤 digging Accepted Answer ✓
I'm not deeply in this space, but from what I understand, the problems are thus:

1. Progress in LLMs has come much more rapidly than expected. This means that when [arbitrary threshold] is crossed, we probably won't have much, if any, advance warning.

2. Nobody on earth knows what the path to AGI (or even narrow-but-still-superhuman intelligence with enough agency to be dangerous) looks like. So, it's not currently possible to know if LLMs are a path to existential threat. (No, it is not correct to say that LLMs are too simple to ever be a threat, as far as I can tell.) (Recall also that we don't know at all where we get our own consciousness or how it works.)

So it seems like #2 is more where you're hung up, and frankly, it's just unknown. If we knew what the path to AGI looked like, we would be in a very different world. When you combine that with #1, it becomes very scary, because we might be on the cusp of an irreversible change, so it's useful to assume we're there if doing so allows you to potentially alter or avert disaster.


👤 ftxbro
> "You can’t just train an LLM with more and more parameters and more and more tokens and expect it to be smarter than the data it was trained on."

This is wrong.

I've seen so many people who think this, though, even smart people. But it's just clearly not true if you think about it. It's like saying that you can't train a model to predict a trend from a scatter plot because the model can't be smarter than the average point in the scatter plot (or even the smartest one), and points in a scatter plot aren't smart at all.
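
To make the scatter-plot analogy concrete (a toy numpy sketch with made-up data, just to illustrate the point):

    import numpy as np

    # Noisy samples around y = 2x + 1. No individual point is "smart": each
    # one is off by random noise. The fitted model still recovers the
    # underlying trend and can predict far outside the observed data.
    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, 200)
    y = 2 * x + 1 + rng.normal(0, 3, 200)

    slope, intercept = np.polyfit(x, y, 1)  # comes out close to (2.0, 1.0)
    print(slope * 25 + intercept)           # a prediction well beyond any point it saw

The model ends up knowing something (the trend) that none of its training points contain on their own.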

I think the gap in understanding is that people aren't used to 'models' being treated themselves as 'data points'. So when they imagine a model being trained over model-like data points, they start getting confused between what is a model and what is a data point, and they start thinking that the model being trained can't be more capable than the smartest (or some even say average lol) data point (which is itself a model) in its training set.

Another reason this kind of thinking is unintuitive is the raw scale of these LLMs. The good ones, the first ones that people are saying might become super-intelligent, are going to be entire data centers, or data-center-sized supercomputers like Aurora, and they will cost billions of dollars to train. They will have more than a trillion parameters and be trained on more than tens of trillions of tokens, so more than 10,000,000,000,000,000,000,000,000 numerical updates during training. That number is just very large and outside the realm of human intuition for things like running through a for-loop in your mind when you are imagining the algorithm.
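
The back-of-envelope arithmetic behind that number (using the ballpark figures above, not published specs):

    parameters = 1e12               # "more than a trillion parameters"
    tokens = 1e13                   # "tens of trillions of tokens"
    updates = parameters * tokens   # each parameter nudged for roughly every token seen
    print(f"{updates:.0e}")         # 1e+25, i.e. 10,000,000,000,000,000,000,000,000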


👤 sharemywin
I think it's very hard to see what happens when these things can handle video and become truly multi-modal.

you would get a very powerful world model, I think. There don't seem to be any sensors you can't hook them up to.

could it learn to infer left from right, object permanence, things like falling and gravity?

Another thing to look at is that large organizations accomplish much more than any individual in them, even the CEO. Revenue per employee in tech companies has continued to grow; how far does that ratio increase?


👤 arisAlexis
It's not only Hinton but Yoshua Bengio too, who wrote a great article that was downvoted into oblivion here, even though he has the highest h-index in computer science right now. That should make you suspicious that there is some psychological bias in tech right now, because the alternative is very scary. This bias is called normalcy bias.

👤 qazwsxedchac
> sober voices as Geoffrey Hinton sounding the alarm about super-smart AI

Can someone please explain to me what exactly the danger is / the dangers are of "super-intelligent" AI?

AIUI, an AI is a combination of hardware, software and parametrization. In broad terms, it exists as a black box that supplies responses to the token sets humans feed it.

Even if an AI has the launch codes for ICBMs somewhere in its training data, it doesn't have an interface to the nearest missile silo to use them. It cannot commandeer the resources (hardware, space, cooling, electricity) it needs to operate; it is dependent on humans to supply those. So humans can pull the plug on it at any time.

Even if an AI were to become both sentient and nefarious, by what mechanism would it harm humans?

I'm genuinely looking for concrete examples of such a mechanism because I can't imagine any which humans couldn't trivially control or override.


👤 oldandtired
You ask a very pointed question and after reading the various responses given, I think it is quite obvious that many have forgotten a very simple rule from decades ago: Garbage In -> Garbage Out.

Two things to consider:

1). No computer system will ever go beyond its programming, and for an artificial intelligence of any sort, going beyond its programming is exactly what would be required.

I have worked on many systems over the last 40 years, and all too often, when you are forced by circumstance or direction to analyse them, those systems have been shown to be essentially garbage, presenting what appear to be reasonable results when those results are not at all reasonable.

2). No matter how much data we "feed" these systems, the data will contain rubbish in terms of what we are trying to do with it and this leads only to further rubbish being created.

What we need to be cognisant of in regards to these technologies is that they are or will be used by people who do not understand the limitations of these technologies and will believe they are far more capable than they really are.

I don't disagree that these tools can be or will be useful as an adjunct to our capabilities. But unless we are actually stupid, we should not be relying on these systems for critical purposes. Of course, this will happen as it has already happened many times in the past and we have suffered from the inherent stupidity involved - we (as humanity) are just too lazy in so many ways.

I have the privilege of watching my 5 month old granddaughter grow, and when you see the intelligence inherent here and then compare it with so-called machine intelligence, you quickly realise that all of our efforts in machine intelligence development are nothing compared to what we see in the development of intelligence in human beings.


👤 ajuc
The fact that it's just brute force is making this worse. Brute force scales.

👤 PaulHoule
Many problems with language that seemed intractable appear to be solved by LLMs but that may be a bit of an illusion.

Note a "chatbot" is fundamentally a manual system that is animated and supervised by a user, when you start building something that works without supervision (say a webcrawler) you start to see these are not ready for prime time. Chatbots already have superhuman performance at seduction (your personal viewpoint, narcissism, "soul", are all impediments to this) and are very good at getting you to think a 70% full glass is almost 100% full. I think the "generate high probability text" bypasses the mechanisms in your mind that perceive incongruities. It will be scary when these are applied to romance scams, "pig butchering" and the like.

There are some NLP tasks (relation extraction) where LLM zero-shot performance is better than the status quo, but in these cases the status quo is preparadigmatic (fancy way to say "it sucks".)

There are two hype trains around superintelligent A.I.: (1) "A.I. Safety" teams in big tech that first made A.I. look important because it was dangerous, and then confirmed their own legitimacy by being sacked ("it is so dangerous they had to fire us to cover up the danger"), and (2) an apocalyptic cult that has been preparing for this moment for almost twenty years.

My take is that the efficiency of these things is going to improve dramatically (you'll be using a specialized model that beats GPT-4 at your task using 1/1000th or less of the resources), but that autonomous operation will still require collecting thousands of examples for training and evaluation. On another level, though, I think the performance will reach an asymptote and adding more data will lead to diminishing returns.

One of the most frightening situations an engineer can get into, and one that people have the hardest time perceiving, is when a project is approaching an asymptote, where you keep working harder and harder to get to 94%, 95%, 95.5% and never quite reach, say, 97% done because of a structural inadequacy in your plan. Livingston's book "Friends in High Places" has the best account of this I've seen.

https://www.abebooks.com/9780937063064/Friends-High-Places-L...


👤 freddealmeida
I think most people are just seeing the speed of (seeming) progress as pointing the way to AGI. But we are far from it. Though maybe we are seeing something interesting in the linear world models that have recently been emerging. But my guess is gradual progress and then, all of a sudden, we will have AGI. And that first part will take 20-30 years, at least.

👤 floxy
Should we be making a distinction between sufficiently-intelligent and super-intelligent? It seems like the near-term concern would be sufficiently-intelligent. We can all agree that marketing, while currently crude, does have an effect. Some people are more swayed than others by bought-and-paid-for messages. But right now mass marketing is pretty lowest-common-denominator. What happens when it gets cheap enough to dedicate one humanish intelligence, 24/7, to learning how to influence you to buy or vote? Monitoring what you look at online, where you go, the people you frequently engage with, etc., and customizing/"fabricating" interesting "news" for you to consume. And then what happens when it is a trivial cost to put 1,000 human-intelligences working against you?

👤 brucethemoose2
> It’s all very clever, yes, but at bottom it’s just a brute force approach.

Eh, this doesn’t really matter. If one "brute forces" a construct out of many simple models that acts like an AGI, then (in terms of the danger) it might as well be an AGI.

And that, btw, is a major point of concern. LLMs and other models are "dumb," frozen and monolithic now, but stringing them together is a relatively simple engineering problem. So is providing some kind of learning mechanism that trains the model as it goes.
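
A hypothetical sketch of what "stringing them together" could look like (call_llm is a made-up placeholder for whatever model API you have, not any real agent framework):

    # A loop that feeds a frozen model's own output back to it, together
    # with an external memory. The accumulated state lives in the loop,
    # not in the model's weights.
    def call_llm(prompt: str) -> str:
        raise NotImplementedError("stand-in for an actual model call")

    def run_agent(goal: str, steps: int = 5) -> list[str]:
        memory: list[str] = []                 # crude persistence between calls
        for _ in range(steps):
            context = "\n".join(memory[-10:])  # keep only recent observations
            action = call_llm(f"Goal: {goal}\nSo far:\n{context}\nNext step?")
            memory.append(action)
        return memory

None of that is hard engineering; the open question is how capable the resulting system gets.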

> But if Geoffrey Hinton is worried surely I’ve gone wrong somewhere. What am I not seeing?

He was very specifically worried about the pace. Hinton has been at the center of this space forever, and he had no idea things would jump so quickly. And if he couldn't see it, how is anyone supposed to see danger right before it comes?


👤 NumberWangMan
> It’s all very clever, yes, but at bottom it’s just a brute force approach.

This is the part I think you should focus on a bit more. I don't think it's the case that you need complicated, clever algorithms and architectures in order to get complicated, clever behavior.

If you start with particle physics, then work your way up to chemistry, and then biology, you can see how we start with very, very simple rules, but at each level there is more and more complexity. The universe "running" physics is the epitome of a brute-force approach. It would be a mistake to say that because the rules of particle physics are so simple, nothing made of those particles could ever think.

Likewise, even though these models are just big arrays of numbers that we stir in the right way to make them spit out something closer to what we want over and over again, I think it's a mistake to say that something much more capable than humans can never arise out of that.

> You can’t just train an LLM with more and more parameters and more and more tokens and expect it to be smarter than the data it was trained on. And such models don’t bring us any real understanding of what it would take to make super-intelligent machines.

Others have addressed the first point here, but as for the second: yes, that's true. I think we'll have a bit of warning before we get to true superintelligence, but even now it seems to me like we have half of an AGI in large LLMs. They don't seem to be conscious, can't really evaluate their own thoughts except by printing them out and reading them in again, and are only superintelligent in terms of knowing lots of facts about lots of things. But I think we are probably going to figure out how to create the other parts, and then we'll be there.

I am worried that humanity is on a bit of a very-high-inertia train of "more and more progress" without enough safeguards. It was OK in the past, but as our world gets more and more connected and new inventions spread far and wide in less time than ever before, it's possible for damage to be done on a very wide scale before we can figure out how to counteract it. It also means that good things can spread in the same way -- but the problem is that it's not just the average that matters, it's the variance. It doesn't matter if 9 out of 10 new technologies you create and disseminate are massively beneficial if the other 1 ends up with humanity gone or completely disempowered, and you can't take advantage of the good stuff.


👤 d--b
ChatGPT can map one situation onto another, like when you ask it to tell a story in the style of x or y.

Basically that means that in training to predict the next token, GPT HAD to come up with some internal way of “modeling” such situations and “rendering” them.

This ability to model and apply the model to something else could very well be the core building block of general thinking.

Additionally, progress in AI has been shockingly faster than expected. That computers would beat Go and pass the Turing test seemed completely unrealistic ten years ago. We can’t rule out that AGI is around the corner.


👤 whateveracct
People on HN have a tendency to state that hyped tech is the next big thing so they can be "ahead of the curve."

👤 rafaelero
I think the intuition is that we have made a lot of progress and there is still plenty of low-hanging fruit to grab. Once we have very large context windows (>1M tokens) and multimodal training, we will unleash unprecedented gains.

👤 meghan_rain
> It’s all very clever, yes, but at bottom it’s just a brute force approach.

which, as history shows, is the only ML approach that actually works, and therefore it's scary