Generally though, deep learning on large data is undeniably a very useful technique. We've seen huge advances in areas like language translation, voice transcription, OCR, visual defect detection, etc. and will doubtlessly continue to see it applied to a wide range of tasks. If the public and investors sour on "AI", it'll just go by another name as it did before.
I'm still amazed by the quality of Sonnet 3.5's outputs and OpenAI's realtime voice model. I expected the skepticism from this site's userbase but not to this degree. Anyone who has used Sonnet 3.5 and still thinks generative AI is just a "party trick" is either not thinking hard about AI's potential impact or simply in denial, imo.
I suspect we'll be in exactly the same place: incremental improvements in LLMs, but the corner case failures will be perceived as significant enough to preclude widespread adoption.
Maybe we'll go back to the "neural network winter" of the 80s and 90s where "anti-hype" seemed to make research into CNNs unfashionable.
Or maybe industry will deploy LLMs widely and just cope with their flaws.
That said, what _powers_ those models (encoders, attention, and transformers) represents a massive leap in how AI models are built. This cannot be overstated!
I don't care about AI generating a token or a pixel. What I care about is being able to throw very complex data at a model, have it "learn" relationships between them in ways I couldn't see before, and then perturb the data to see how that changes its "meaning".
For example, in biotech (my particular field), training models on the gene expression profiles of healthy vs. diseased tissues and then perturbing expression levels in silico to see whether the perturbation shifts the profile toward "healthy" is ground-breaking. As more data profiles are added, this approach could also reveal potential toxicities of gene inhibition (or over-expression) without ever needing lab testing, saving hundreds of millions of dollars.
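The idea can be sketched with a toy model. This is purely illustrative, not any real biotech pipeline: the data is synthetic, the "model" is a nearest-centroid classifier standing in for whatever is actually trained on expression profiles, and the gene indices and effect sizes are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_class, n_genes = 200, 50

# Toy expression matrices: "diseased" tissue over-expresses gene 0.
healthy = rng.normal(0.0, 1.0, size=(n_per_class, n_genes))
diseased = rng.normal(0.0, 1.0, size=(n_per_class, n_genes))
diseased[:, 0] += 6.0  # the (invented) disease signature

# Stand-in "model": label a profile by its nearest class centroid.
centroids = {0: healthy.mean(axis=0), 1: diseased.mean(axis=0)}

def classify(profile):
    # 0 = healthy, 1 = diseased
    dists = {label: np.linalg.norm(profile - c) for label, c in centroids.items()}
    return min(dists, key=dists.get)

# In-silico perturbation: "inhibit" gene 0 in a diseased profile and ask
# whether the model now reads that profile as healthy.
sample = diseased[0].copy()
before = classify(sample)
sample[0] -= 6.0  # simulated knockdown of the disease gene
after = classify(sample)
print(before, after)
```

The perturb-then-reclassify loop is the whole point: the classifier is fixed, and you probe it by editing inputs, which is cheap compared to running the same experiment at the bench.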
The party tricks are neat, and they are what bring in the $$. But it's the building-block use cases, which get very little attention, where all the benefits will be had in the coming years. And they are coming fast!
P.S. My personal prediction is that the next massive leap in AI will be a paradigm shift away from how we train and run networks. The current framework of more and bigger GPUs processing larger and larger models is unsustainable. Someone, somewhere in the next 5-10 years will revolutionize how this is done, and THEN we'll have our true AI "revolution".
Over the next few years, people will figure out how best to impedance-match it with the execution of AI workloads, and we're off to the races... all those high-speed transistors will no longer have to sit around waiting for their turn to execute code, and things will get much, much faster.
But first... I need something more than a Pascal-based emulator for the BitGrid, with no meaningful I/O.[2]
Maybe if I implement it as a blueprint in Factorio?
The only major disruption will be to anything content-related. You will be able to generate blockbuster films/series for a thousand dollars. OF and p0rn basically for a few dollars. Instagram and TikTok profiles of fake people for a few dollars. Artists will be able to put out thousands of good songs a month and see what sticks, perhaps become big with one. You're already able to generate articles/copywriting/books.
Since content will be free, the only value add will be in how much capital you have available to advertise.
AI is to digital content what China was to physical goods. Only advertising money will make or break a product, because anyone can go to an AI and get a great product instantly, just like you can go to China and get a pretty good product instantly.
AI is amazing, but I think it will continue to be a relatively specific product for the near future. So 10 years from now we’ll probably revolutionize the UX of a lot of products on the market right now, but will the world look fundamentally different than it does now? I’m not so sure.
So essentially we're going to have a better, cheaper smart Google alternative and advanced autocomplete for code, pictures, audio, video, and text.
Unless another breakthrough happens.
Which can happen tomorrow, or in 50 years. You never know.
1. Building blocks. Your core is GPT-4 level intelligence, similar to a transistor. Other things can be built around that. RAG. Functions. Agents. Building blocks lead to bigger ones. Other forms of blocks. Multimodal. Try out Suno and Midjourney if you haven't. Building blocks are composed of smaller blocks, and as engineers, this is where your salaries are made.
2. Amdahl's & the subsequent Gustafson's Law. More resources will have diminishing gains, until they hit a point where they unlock new uses. GPUs and cloud are a cool side effect from the computing revolution, and they solve all kinds of problems they were never invented for. You'll see LLMs slow down in problem solving then suddenly someone figures out how to build self-3D printing machines or something.
3. Old, uncool paradigms. Procedural generation is incredibly good at controlling quality. LLMs are like play doh, and proc gen gives it a skeleton. Evolutionary programming goes the other way. You can give an agent a problem to solve, and now it's able to figure out paths or branch agents on the way there.
4. Combos with existing blocks. LLM agents + crypto wallet leads to agent ecosystems. Multimodal vision/hearing can also be combined with all kinds of things. LLMs will be available to all apps, and it'll be cheaper than TTS was. RAG will make search cheaper, not enough to threaten Google, but you can quickly search interview questions or emergency medical treatments.
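The RAG block mentioned above can be sketched minimally. This is a toy, offline version under invented assumptions: a three-document "store", bag-of-words cosine similarity instead of learned embeddings, and prompt-stuffing as the augmentation step.

```python
from collections import Counter
import math

# A toy corpus standing in for your document store.
docs = [
    "RAG retrieves relevant documents and feeds them to the model as context.",
    "Agents call functions in a loop until the task is done.",
    "Multimodal models accept images and audio as well as text.",
]

def bow(text):
    # Bag-of-words vector as a token -> count mapping.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    # Rank documents by similarity to the query; real systems use
    # learned embeddings and a vector index instead.
    q = bow(query)
    ranked = sorted(docs, key=lambda d: cosine(q, bow(d)), reverse=True)
    return ranked[:k]

def build_prompt(query):
    # The "augmentation" step: prepend retrieved context to the question.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("how do agents work?")
print(prompt)
```

Swap the retrieval function for real embeddings and hand the prompt to an LLM, and you have the block; the point is that it composes with the others without touching the model itself.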
I'd say don't spend too much effort predicting. But play with new blocks as they form - several will be unimportant, some will matter. Two blocks together could be amazing. Brush up on the old paradigms that are no longer taught in average universities.
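The proc-gen-as-skeleton idea in point 3 can also be sketched. Everything here is invented for illustration: a tiny grammar fixes the structure, and each slot is where a generative model would be called; a stub draws from fixed word lists so the example runs offline.

```python
import random

random.seed(1)

# Procedural "skeleton": the grammar controls *where* text goes;
# the generator only controls *what* fills each slot.
GRAMMAR = {
    "quest": ["Bring the {item} to the {npc} in the {place}."],
    "item": ["amulet", "ledger", "seed vault key"],
    "npc": ["archivist", "smith"],
    "place": ["undercroft", "observatory"],
}

def fill_slot(name):
    # Stand-in for an LLM call, constrained to this slot only.
    return random.choice(GRAMMAR[name])

def generate():
    template = random.choice(GRAMMAR["quest"])
    return template.format(
        item=fill_slot("item"),
        npc=fill_slot("npc"),
        place=fill_slot("place"),
    )

print(generate())
```

Because the skeleton guarantees the output's shape, quality control becomes checking slot contents rather than validating free-form text.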
- <10 yrs: Google Maps will use photogrammetry + computer vision etc. to make 'street view' continuous, and you will be able to go anywhere and look from any perspective. New layers can start to be added, e.g. infrastructure layers captured by crowd-sourcing from phones.
- <5 yrs: Music will be continuously generated. Within 10, it will adjust to automatic and manual inputs that nudge it in the direction you want.
- <5 yrs - Business role-based email addresses automatically parsed and routed. E.g. most business bills captured and added to accounting platform.
- Proper HUD AR glasses that are on an open-enough OS that allows hacking + easy app dev, so not a DOA product. Auto labelling of objects, live translation etc. will actually be well implemented and used by many people.
- War is going to look even more terrifying, and the Terminator Salvation vision of hunter/patrol drones manning the skies is precisely what front lines will look like (except higher altitude).
- Defensive war: Multi-layer iron domes start to be erected on nation borders, and we start to hand over to AI such that humans are not immediately/directly in charge of decisions. I.e. defence ops handled by AI, but tactics and strategy still directly managed by humans.
- Offensive war: Opposing nations will start to talk re: agreements about a human needing to make final decisions - To what success I have no idea.
- Low-altitude airspace mostly managed by AI; drone swarms + coordination start to be seen as commonplace. >10 yrs: Propeller noise somewhat improves after a massive push from citizens, and we eat some efficiency in the interest of reducing noise pollution.
- 20 yrs Maybe... First AI enemy, i.e. a nation state 'battles' an AI that seems to be acting alone or cannot be pinned on a specific human group actor. Will be possible due to the 'scale' that AI affords, not the 'smarts'.
- A Japanese citizen will be more likely to receive 'care' from a robot than from a human, mostly due to heavy use of robots in eldercare.
- We will hear of edge cases (similar to, but smaller than, the tang ping and incel phenomena) where some people explicitly announce that they will never interact with humans again and opt to interact only with AIs.
- EU legislation that demands open/reviewable algorithms before products can legally be sold to EU citizens.
- 20 yrs: AI 'Oracles' that some people revere (just very insightful, accurate, consistent AIs that appear to exceed human wisdom)
- 10 yrs: MSPs/IT providers will handle ~5x as many endpoints per employee compared to 2024
- Some very very crude prototype of 'inter-animal language translation' that people will want to talk to their dogs with, but it won't really add value for a long time and will mostly just be used for zoology research.