What I usually do is have Claude Code build a plan and Codex find flaws in it, iterating until I get something that looks good. I give direction and make sure it follows my overall idea.
Implementation is working well on its own.
But this takes a lot of focus for me to get right; I can’t see myself doing it for multiple features on the same project.
Am I missing something?
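For what it's worth, the plan-then-critique loop described above could be sketched roughly like this. It is only a sketch: `planner` and `critic` are hypothetical callables standing in for whatever wraps Claude Code and Codex (neither is a real API of those tools), and the cap on rounds is an assumption of mine so the loop cannot run forever.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures: placeholders for whatever wraps each tool.
Planner = Callable[[str, List[str]], str]  # (goal, flaws to address) -> plan
Critic = Callable[[str], List[str]]        # plan -> list of flaws found


def refine_plan(goal: str, planner: Planner, critic: Critic,
                max_rounds: int = 5) -> Tuple[str, List[str]]:
    """Alternate planning and critique until the critic finds nothing
    or the round budget runs out. Returns the latest plan together
    with whatever flaws are still open for a human to look at."""
    plan = planner(goal, [])
    for _ in range(max_rounds):
        flaws = critic(plan)
        if not flaws:
            return plan, []
        # Feed the critic's findings back into the next planning pass.
        plan = planner(goal, flaws)
    # Budget exhausted: critique the final plan once more so the
    # returned flaws describe the plan that is actually returned.
    return plan, critic(plan)
```

The human still sits outside this loop: the returned plan and any open flaws are what you would read before letting implementation start.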
(I ended up just using the Claude web interface and making it use a checklist; it took 8 hours.)
These agents work best when you know specifically what you want done. On the implementation side, if you know what you are doing, some code (such as frontend) can be one-shotted >90% of the time with minimal checks.
Anything lower than that must be checked over by a human + agent, otherwise you risk introducing a critical leak, a bug, or a new security issue.
As an average dev, what can I even do with 10 agents? It's like managing 10 toddlers who can code: it looks good, but it becomes hard to manage because you have limited context in your brain.
Two is the best setup if you can afford it. One can write the tests and the other can write the code. This is better because if you use the same agent instance for both, it's not going to write good tests; it will just write tests that its own code passes. It may be different for everyone else, but for me, two is the best setup for TDD.
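A toy illustration of that point, with everything in it made up for the example: `median_impl` is a hypothetical buggy implementation, `same_context_tests` stands in for tests written alongside the code, and `spec_tests` stands in for tests written from the spec alone ("for an even-length list, the median is the mean of the two middle values").

```python
def median_impl(values):
    """Hypothetical implementation under test.
    Bug: for even-length input it returns the upper middle value
    instead of the mean of the two middle values."""
    ordered = sorted(values)
    return ordered[len(ordered) // 2]


def same_context_tests(median):
    # Tests written next to the code tend to exercise only the cases
    # the code already handles (odd-length lists here).
    return [
        median([3, 1, 2]) == 2,
        median([5]) == 5,
    ]


def spec_tests(median):
    # Tests written from the spec alone also cover the even-length case.
    return [
        median([3, 1, 2]) == 2,
        median([1, 2, 3, 4]) == 2.5,
    ]


if __name__ == "__main__":
    print("same-context tests pass:", all(same_context_tests(median_impl)))
    print("spec-only tests pass:", all(spec_tests(median_impl)))
```

The first suite passes and the second fails on the even-length case, which is the gap a separate test-writing agent is meant to close.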
Apart from that, you can just go ahead and do it your own way. I have found that many senior engineers think they are special when they can make Claude Code do something. They think it's down to their setup, but I am usually able to replicate it without any setup or Agents.md/Claude.md; the models are good enough without anything complex.
I use them only for early prototypes that we discard quickly, but can’t use them with legacy codebases because reasons.
For personal use, VS Code + GitHub Copilot Pro+ works great (highest limits available for code generation, for $40) and includes over 10 models.
It's been updated even in just the past couple of weeks, so everything is there: agents, Codex, Claude.
I only have 16 GB of RAM and I'm coding like 4 projects at once if I'm crazy enough.
I’ll occasionally have it write a little regex for me, which it does a decent job with; that’s its main use.
It’s like a fire extinguisher that helps engineers manage the problems they created to begin with.