Right now, virtually every standard take-home or HackerRank/LeetCode test is easily solved by LLMs. As a result, companies are accidentally hiring what we call "vibe coders": candidates who are phenomenal at prompting AI to generate boilerplate, but who completely freeze when the architecture gets complex, when things break, or when the AI subtly hallucinates.
We are working on a new approach, and I want to validate the engineering logic with the people who actually conduct these interviews.
Instead of trying to ban AI (which is a losing battle), we want to test for "AI Steering".
The idea:
1. Drop the candidate into a realistic, somewhat messy sandbox codebase.
2. Let them use whatever AI they want.
3. Inject a subtle architectural shift, a breaking dependency, or an AI hallucination.
4. Measure purely through telemetry (Git diffs, CI/CD runs, debugging paths) how they recover and fix the chaos. (A rough sketch of what that telemetry could look like follows this list.)
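For context on point 4, the telemetry capture can be prototyped without a heavy backend. Here's a minimal sketch in Python, assuming the sandbox is just a local Git repo; the function names (git, snapshot_telemetry) and the JSON field names are made up for illustration:

    import json
    import subprocess
    import time

    def git(*args, repo="."):
        # Run a git command against the sandbox repo and return its stdout.
        return subprocess.run(
            ["git", "-C", repo, *args],
            capture_output=True, text=True, check=True,
        ).stdout

    def snapshot_telemetry(repo="."):
        # One sample: per-file churn in the working tree plus the recent
        # commit trail. `git diff --numstat` prints "added<TAB>deleted<TAB>path".
        churn = []
        for line in git("diff", "--numstat", repo=repo).splitlines():
            added, deleted, path = line.split("\t")
            churn.append({"path": path, "added": added, "deleted": deleted})
        commits = git("log", "--oneline", "-n", "10", repo=repo).splitlines()
        return {"ts": time.time(), "churn": churn, "recent_commits": commits}

    # Poll every few seconds during the session and append to an event log.
    print(json.dumps(snapshot_telemetry(), indent=2))

The signal we care about is the trajectory after the injected break (thrash, reverts, test runs), not just the final diff.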
Basically: stop testing syntax; start testing architecture and debugging skills in the age of AI.
Before we spend months building out the backend for this simulation, I need a reality check from experienced leads:
1. Does testing a candidate's ability to "steer" and debug AI-generated code make more sense to you than traditional algorithm questions?
2. How are you currently preventing these "prompt-only" developers from slipping through your own interview loops?
(Not linking anything here because there's nothing to sell yet, just looking for brutal feedback on the methodology.)
Testing a candidate's ability to "steer" agents seems to me like testing their ability to know the Java API or to recite SOLID by heart.
> 2. How are you currently preventing these "prompt-only" developers from slipping through your own interview loops?
We don't ask LeetCode questions anymore. We keep the usual systems design interview, in which AI isn't needed (or at least we don't allow it, because in that kind of interview we're more interested in seeing how the candidate thinks).
We have a new stage in our interview loop, though: generic Q&A about the fundamentals of software engineering and computer science. Again, we don't care anymore how candidates produce code. We care about what they know and what they don't know: what the scope of their knowledge is, and when they need to rely on AI to come up with an answer. Silly (non-real) example: "Can you write a program that detects whether another program halts?" The people we want are the ones who would say something about the Halting Problem, but also be practical and ask more questions about the program's requirements.
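That halting question has a nice property: the impossibility argument itself fits in a few lines. Here's a sketch of Turing's diagonal argument in Python, where halts() is a hypothetical oracle that by construction cannot actually be implemented:

    def halts(prog, arg):
        # Hypothetical oracle: returns True iff prog(arg) halts.
        raise NotImplementedError("no such total function can exist")

    def diagonal(prog):
        # Do the opposite of whatever the oracle predicts for prog(prog).
        if halts(prog, prog):
            while True:
                pass  # predicted to halt, so loop forever
        # predicted to loop forever, so halt immediately

    # diagonal(diagonal) halts iff the oracle says it doesn't: contradiction.

Candidates who can reach for this argument, and then pivot to the practical version (timeouts, static analysis on restricted inputs), are exactly who we're after.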
You get the point: we look for people with a good breadth of knowledge, who can communicate well and know their shit. Whether they can use tool X or Y (including LLMs) is taken for granted with such people.
So, as happened last week, if I'm interviewing for an Elixir dev, I'm going to be interested in your knowledge of the BEAM and how its features can be used to solve common architectural problems.
“Tell me about the project that you are most proud of.” Then dig in, asking them about their challenges and decision-making processes, and gauge the level of scope, impact, and ambiguity they know how to work at.
“I see you have been working for $x years. Knowing what you know now, what would you do differently?”
“Say you are in a meeting with me, the CEO, and other senior developers who have been at the company for a while, and we all agree on an idea that you know from experience is a bad one. What would you do?”
Follow-up question: “What would you do if, after we listened to you, we decided to go in another direction?”
“Tell me about a time when you had unclear requirements. How did you handle it?” This gets back to ambiguity; there is a lot of that with startups.