I've found the most leverage with agent-based LLM solutions comes from including feedback from prior iterations.
For example, if you are trying to write a text-to-SQL bot, you should actually attempt to parse any generated query using the target database engine and then pass any errors back into the prompt. When the query parses successfully, you can also incorporate snippets of the result set as another form of feedback. In my experience, this can make an incredible difference in performance. It's a LOT cheaper to just run the generated query against an actual SQL database (and try again as needed) than to get the whole thing right in one shot with fine-tuning & RAG.
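To make that loop concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The names `run_with_feedback` and `fake_generate_sql` are made up for illustration; `fake_generate_sql` stands in for whatever LLM call you would actually make, and the retry limit and preview size are arbitrary.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple


def run_with_feedback(
    conn: sqlite3.Connection,
    question: str,
    generate_sql: Callable[[str, Optional[str]], str],
    max_attempts: int = 3,
    preview_rows: int = 3,
) -> Tuple[str, List[tuple], int]:
    """Ask for a query, run it, and feed any database error back in."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        sql = generate_sql(question, feedback)
        try:
            # Running against the real database is the check.
            rows = conn.execute(sql).fetchmany(preview_rows)
        except sqlite3.Error as exc:
            # The error text becomes part of the next prompt.
            feedback = f"Query failed: {sql!r}\nError: {exc}"
            continue
        # The preview rows could also be fed back as a sanity check.
        return sql, rows, attempt
    raise RuntimeError(
        f"no valid query after {max_attempts} attempts; last feedback: {feedback}"
    )


# Hypothetical stand-in for the LLM: wrong column first, fixed after feedback.
def fake_generate_sql(question: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "SELECT nme FROM users"
    return "SELECT name FROM users"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('ada')")

sql, rows, attempts = run_with_feedback(conn, "List user names", fake_generate_sql)
print(attempts, sql, rows)
```

In a real setup you would point this at a scratch copy or read-only replica rather than production, since the generated SQL is untrusted.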
I do think the right approach is to empathize with the agent. How far could you get with that complex 12-table query if you did not have a way to run your SQL against the actual database and see how it performs? You can extend this thinking to any domain. You can write your own "guard rails provider" that is deterministic and easy to inspect (i.e. you can set breakpoints and view locals). This can even recursively utilize the LLM in a more targeted way.
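As a rough illustration of a deterministic "guard rails provider", here is a sketch where each rule is a plain function you can step through in a debugger. The specific rules (`read_only`, `single_statement`, `has_limit`) and the `run_guard_rails` helper are hypothetical examples for the SQL case, not a complete or safe SQL validator.

```python
from typing import Callable, List, Optional

# A check returns None when the candidate passes, or a message to feed back.
Check = Callable[[str], Optional[str]]


def read_only(sql: str) -> Optional[str]:
    stripped = sql.strip()
    first = stripped.split(None, 1)[0].upper() if stripped else ""
    return None if first == "SELECT" else "only SELECT statements are allowed"


def single_statement(sql: str) -> Optional[str]:
    # Allow one trailing semicolon, reject any others.
    body = sql.strip().rstrip(";")
    return None if ";" not in body else "submit exactly one statement"


def has_limit(sql: str) -> Optional[str]:
    return None if " limit " in f" {sql.lower()} " else "add a LIMIT clause"


def run_guard_rails(candidate: str, checks: List[Check]) -> List[str]:
    """Run every check and collect the failure messages in order."""
    return [msg for check in checks if (msg := check(candidate)) is not None]


CHECKS: List[Check] = [read_only, single_statement, has_limit]

bad = run_guard_rails("DELETE FROM users; DROP TABLE users", CHECKS)
good = run_guard_rails("SELECT name FROM users LIMIT 5;", CHECKS)
print(bad)
print(good)
```

The returned messages are what you would append to the next prompt. A check could itself call the LLM for a narrow yes/no judgment, which is the "recursive" use mentioned above.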
Their implementation so far leaves something to be desired.
Agents enable iterative changes to the environment. In contrast, LLMs are just "one shot" outputs and cannot edit what they have already done.
I think this "iterative" paradigm will ultimately be successful because iterative improvement is also what humans do -- we are agents "storing" most of our "knowledge" in the "environment" and then making small iterative changes and improvements, with an overall goal in mind. This is very much in line with the influential "extended mind" thesis: https://en.wikipedia.org/wiki/Extended_mind_thesis
The bandwidth of working memory and attention for tasks, especially new ones, is extremely limited. So we offload much of our cognition - working memory especially - onto our environment: think about how you write an essay or do a math problem or write a program.
1. RAG your data
2. Magic
3. Agent business logic
4. $$$
where step 2 is very unclear.
Also, what is agent architecture? A basic FSM, which in essence is a bunch of business logic/rules with LLM API calls. How do you make this reliable for transactions?
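For what it's worth, the "FSM plus LLM calls" shape can be sketched like this. Everything here is hypothetical: the states, the events, and `fake_classify`, which stands in for an LLM call that only labels the incoming message. The transition table itself is plain code, so an unexpected label is rejected instead of silently moving the process along. This shows one way to constrain the model; it does not answer the reliability question.

```python
from typing import Callable, Dict, Iterable, Tuple

# Allowed (state, event) -> next state. Anything else is rejected.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("collect_details", "details_missing"): "collect_details",
    ("collect_details", "details_complete"): "confirm",
    ("confirm", "rejected"): "collect_details",
    ("confirm", "approved"): "execute",
}
TERMINAL = {"execute"}


def step(state: str, event: str) -> str:
    """Apply one transition, refusing anything not in the table."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition from {state!r} on {event!r}") from None


def run(
    classify_event: Callable[[str, str], str],
    messages: Iterable[str],
    start: str = "collect_details",
) -> str:
    state = start
    for message in messages:
        if state in TERMINAL:
            break
        # The model only proposes an event; the table decides what happens.
        state = step(state, classify_event(state, message))
    return state


# Hypothetical stand-in for the LLM classifier.
def fake_classify(state: str, message: str) -> str:
    if state == "collect_details":
        return "details_complete" if "order" in message else "details_missing"
    return "approved" if message == "yes" else "rejected"


final_state = run(fake_classify, ["refund please", "order 123", "yes"])
print(final_state)
```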
I've yet to see a decent example of a replaced business process that isn't a question-and-answer scenario, i.e. a call-centre type role.
I don't think they are overhyped. I think that it's easy to hook up an LLM to some functions and get impressive results, but I think to make a really good agent system there are some core pieces that need to exist around the 'agent' to enable sophisticated workloads that many are not actually building out.
Do you think AI agents are overhyped?
For me, the biggest issue is that the things I'd really like AI to do for me are things I would never trust a third party with for privacy reasons. As local/offline AI chatbots become better I'll use them for more things.
However, I'm limiting my tech news and social media intake and using the technologies solely for real-world problems[1]. In the near future, the new ChatGPT voice conversations will be stunningly good, and conversations will soon become duplex, not simplex.
1. You get out of LLMs what you put into them. I'll stand on my soapbox and state that "prompt engineering" is critical to getting useful information out of LLMs. "Garbage in, garbage out," as they say. One should also have a healthy dose of "trust, but verify" with LLMs, as with any other technology (e.g. don't follow your turn-by-turn instructions into a harbor, don't take a nap while your Tesla self-drives, etc.).
I also don't particularly like the technology (morally/ethically) because of what it says about what our country, and tech in particular, prioritize.
So huge disclaimer aside that I'm biased and find it disgusting...
Technically... we will see I suppose, I just haven't seen anything very useful personally yet. There are some cool automations I imagine you could come up with and it might (tiny possibility) have a market with the kinds of people that already find stuff like Siri/Hands free useful. Not sure you'd ever turn a profit, but building it into your platform as a large tech co. you might find some adopters.
On the other hand...I can totally see it ruining the internet and a bunch of other stuff I enjoy.
Maybe as something like a fancy Playwright or testing platform it could be useful, but I'm not a web programmer so I'm really not the right person to speculate on that.
Are they making incredible breakthroughs that will change civilization? Yes.
Are VCs throwing money at absolutely incompetent nonsense? Yes.
It is both over-hyped and under-hyped at the same time.