OpenClaw is supposedly a security nightmare, but is it?

Question

Two types of knowledge: induction (from experience) and deduction (from logic).
Deduction: Can OpenClaw get prompt-injected, delete your filesystem, and send your money to a hacker? Yes, it has all the tools to do so.
Induction: Has this ever happened? Not yet.
Induction is probabilistically true. Deduction is either true or false.
Someone prove me wrong, but a normie isn't getting a multi-million-dollar 0-day spent on them. And in the wild, OpenClaw seems to be doing fine.
I'd argue there is some 99% chance that OpenClaw is going to be fine for me. (And that number is probably low.)

aytuakarlar · Accepted Answer

For a weekend project or local use case, your inductive reasoning (probabilistically, no one is spending a 0-day on me) is totally fine. But the moment you move to enterprise, fintech, or any system handling real data, I truly believe that relying on induction is a non-starter.
The deductive risk (the fact that the agent can execute rm -rf or transfer funds if prompted maliciously) is actually very common. I am working with one of the top universities in my country to write a paper about that issue. We benchmarked 118 test scenarios and 1,062 API calls across GPT-4o, Claude Sonnet, and Gemini Flash. They all fail to consistently follow their own guardrails. The results will be published; in the meantime, if you are interested, here are the charts:
https://github.com/akarlaraytu/llm-agent-policy-enforcement
We shouldn't have to choose between crippling the agent's capabilities and just hoping we don't get targeted. And I really believe that the solution is putting a deterministic governance layer between the agent and the execution environment.
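As a rough illustration of that idea (a minimal sketch, not the implementation of any particular product), the layer can be a plain function that inspects each tool call before it reaches the execution environment. Everything here is an assumption made for the example: the tool names `shell` and `transfer_funds`, the command allowlist, and the spending cap are hypothetical. The point is that the decision depends only on the call itself, so a prompt injection cannot talk its way past it.

```python
import shlex
from dataclasses import dataclass

# Hypothetical policy values, chosen only for illustration.
ALLOWED_COMMANDS = {"ls", "cat", "grep"}   # assumed read-only allowlist
SHELL_METACHARS = set(";|&$`><\n")         # reject chaining and redirection
MAX_TRANSFER = 50.0                        # assumed per-call spending cap


@dataclass(frozen=True)
class ToolCall:
    tool: str   # hypothetical tool name, e.g. "shell" or "transfer_funds"
    args: dict


def check(call: ToolCall) -> tuple[bool, str]:
    """Return (allowed, reason). Pure function of the call; no model involved."""
    if call.tool == "shell":
        command = call.args.get("command", "")
        if not isinstance(command, str):
            return False, "command must be a string"
        if any(ch in SHELL_METACHARS for ch in command):
            return False, "shell metacharacters are not allowed"
        try:
            tokens = shlex.split(command)
        except ValueError:
            return False, "unparseable command"
        if not tokens or tokens[0] not in ALLOWED_COMMANDS:
            return False, "command is not on the allowlist"
        return True, "ok"

    if call.tool == "transfer_funds":
        amount = call.args.get("amount")
        is_number = isinstance(amount, (int, float)) and not isinstance(amount, bool)
        if not is_number or amount <= 0:
            return False, "invalid amount"
        if amount > MAX_TRANSFER:
            return False, "amount exceeds the per-call cap"
        return True, "ok"

    # Default deny: any tool the policy does not know about is refused.
    return False, "unknown tool"


if __name__ == "__main__":
    for call in (
        ToolCall("shell", {"command": "ls -la"}),
        ToolCall("shell", {"command": "rm -rf /"}),
        ToolCall("transfer_funds", {"amount": 5000}),
    ):
        print(call.tool, call.args, "->", check(call))
```

The sketch uses an allowlist with default deny rather than a blocklist, since a blocklist has to anticipate every dangerous command while an allowlist only has to enumerate the safe ones. The agent keeps its tools; the gate just decides which calls go through.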
This is actually why I started building a product around that idea, and I just published a Show HN for it here. You can check it out, and if you are interested I can give you credit on my platform, which is dedicated to restricting unsafe behaviors and decisions of AI agents.
https://news.ycombinator.com/item?id=47501849