HACKER Q&A
📣 shekkizh

What is your comfort level with code generated by LLM agents?


With the rise of AI agents that execute code, I'm curious about the community's perspective on the idea of running code generated by these agents locally (Disclosure: I work on such agents). Specifically, what would make you comfortable (or uncomfortable) doing so?

Some potential scenarios:

- Agents from reputable big tech companies (e.g., Google, Microsoft, etc.): Would you trust code produced or executed by AI from such sources? Do these brands inspire confidence?

- Fully open agents: How do you feel about running code generated by fully open agents, where the agent shows the code and the user executes it? Does the ability to inspect and verify the code give you peace of mind?

- Audits and transparency: Does knowing that the AI agents and their outputs are audited by third-party experts change your comfort level? How important is the transparency of how the model operates?

What threshold do you set for yourself? How often do you break this threshold? Would love to hear everyone's thoughts and experiences!


  👤 anonzzzies Accepted Answer ✓
I use LLMs for 80%+ of our code, but of course it is always human reviewed and tested. We use all types: llama, gpt, claude, gemini, replit, grok and others, because we want to know which is the most productive (for what we do it's sonnet 3.5, by a large margin; gemini is terrible so far).

We are fine with generated code being executed immediately (it has to be, to be productive really), but once we are satisfied, we won't let the PR get approved without reviews.


👤 gregjor
I don’t use LLMs to write code. I don’t use them at all.

As a freelancer specializing in fixing and maintaining software I welcome LLM-generated code. I look forward to a prosperous future with an even larger pile of bad code to work on. I especially like the subtle security errors that seem to plague LLMs — those mean premium emergency rates for me.


👤 voidUpdate
I have never used LLM-generated code so far, but I would be extremely sceptical about letting an LLM generate code and using it without full review. IMO LLM-generated stuff should only be used as a starting point, not a final product.