HACKER Q&A
📣 bruturis

Some tips for evaluating intelligent agents?


A friend of mine is a solo developer who is building a large intelligent-actors platform on top of LLMs. I think his platform is overly abstract and makes a lot of LLM calls. How can one measure the improvement in intelligent behavior of this platform over vanilla GPT-4? I am thinking of some use case that would let him show the strength of his idea without incurring a huge cost.

Edit: while googling I found this paper [1], but I don't know what it would cost to test the platform with it.

[1] https://openreview.net/pdf?id=zAdUB0aCTQ


  👤 willd13 Accepted Answer ✓
What do you mean by an "intelligent actors platform"?

👤 aristofun
How do you expect to measure something abstract that is not even defined yet?