But we're giving agents terminal access and API keys now. The attack vector is becoming natural language. An agent gets "socially engineered" by a prompt; another hallucinates fake data and passes it down the chain.
Trying to secure these systems feels like trying to write a regex that catches every possible lie. We've shifted the foundation of security from numbers to words, and I don't think we've figured out what that means yet.
Is anyone thinking about actual architectural solutions to this? Not just "use another LLM to guard the LLM" — that feels like circular logic. Something fundamentally different.
(Not a native English speaker, used AI to clean up the grammar.)
If I were ever to run Claude against a production AWS account, for instance, you'd best believe the role it assumed would use temporary access keys scoped to the bare minimum of permissions.
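To make that concrete, here's a rough sketch of how you can scope temporary credentials down with an STS session policy. The bucket name, prefix, role ARN, and account ID below are all made-up placeholders; the real permissions set would depend entirely on what the agent actually needs.

```python
import json

# Hypothetical least-privilege session policy: the agent can only read
# objects under one S3 prefix. Anything not listed is implicitly denied.
def scoped_session_policy(bucket: str, prefix: str) -> str:
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
        }],
    })

# Passed as the Policy parameter to sts:AssumeRole, this further narrows
# the temporary credentials to the *intersection* of the role's own
# permissions and this policy, and they expire after DurationSeconds:
#
#   aws sts assume-role \
#       --role-arn arn:aws:iam::123456789012:role/agent-role \
#       --role-session-name claude-agent \
#       --duration-seconds 900 \
#       --policy "$(python policy.py)"

print(scoped_session_policy("my-agent-bucket", "inbox"))
```

Even if the agent gets prompt-injected, the blast radius is capped at what those short-lived, scoped-down credentials can do.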
This is akin to saying "we're fully committed to concatenating SQL queries directly from request data, but I wonder if that's risky?"
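Worth noting why the analogy cuts both ways: SQL injection got a structural fix, because parameterized queries keep attacker data out of the code channel entirely. A minimal illustration with Python's built-in sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

malicious = "alice' OR '1'='1"

# String concatenation: the attacker's input is parsed as SQL,
# so the WHERE clause matches every row.
unsafe = conn.execute(
    "SELECT name FROM users WHERE name = '" + malicious + "'"
).fetchall()

# Parameterized query: the driver treats the input as pure data,
# so it's matched literally and finds nothing.
safe = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(unsafe)  # [('alice',)]
print(safe)    # []
```

With prompts there's no equivalent separation between instructions and data, which is exactly the problem: every token goes through the same channel.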
Part of security awareness is knowing when something is simply not worth the risks.