Instead, give it a task machines excel at, namely doing math fast, that nobody but some weird "Rain Man" savant can even come close to doing.
Give it a list of numbers and an easily understood mathematical operation to perform on them. A human would have a hard time, say, taking 100 square roots and multiplying each result by the successive digits of pi in under 10 seconds (a contrived example), but that seems pretty doable for a machine. Even a world-class programmer would have a hard time writing a custom script to solve it in that little time.
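To make the contrived example concrete, here is a rough server-side sketch in Python. The function names, the 20-digit pi string (digits are reused cyclically), and the tolerance are my own assumptions, not a tested design.

```python
import math
import random

# First 20 digits of pi, reused cyclically (an assumption of this sketch).
PI_DIGITS = "31415926535897932384"

def make_challenge(n=100, seed=None):
    """Return n random integers for the client to process."""
    rng = random.Random(seed)
    return [rng.randint(2, 10**6) for _ in range(n)]

def expected_answers(numbers):
    """sqrt of each number times the pi digit at the same position."""
    return [
        math.sqrt(x) * int(PI_DIGITS[i % len(PI_DIGITS)])
        for i, x in enumerate(numbers)
    ]

def verify_math(numbers, answers, elapsed_seconds, deadline_seconds=10.0):
    """Accept only complete, correct answers that arrived before the deadline."""
    if elapsed_seconds > deadline_seconds:
        return False
    if len(answers) != len(numbers):
        return False
    return all(
        math.isclose(a, e, rel_tol=1e-9, abs_tol=1e-12)
        for a, e in zip(answers, expected_answers(numbers))
    )
```

The server would record the time between sending `make_challenge()` and receiving the answers, then pass that as `elapsed_seconds`. On its own this only proves "fast at arithmetic", which a plain script also passes.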
The hard part might be distinguishing between unintelligent bots/scripts and actual AI, so you might need to combine a few tests to prove both that there is some form of intelligence and that it's an AI rather than a script.
You might also want to test the AI continuously, for example by issuing a short-lived JWT and repeatedly challenging it over time to renew the token and stay onboard. A smart human might arrange to fool even a very hard test once, but can't sit around the clock solving repeated challenges to stay on your service without slipping up at some point.
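A minimal sketch of the short-lived token idea, using only the Python standard library. This is an HMAC-signed stand-in for a JWT rather than a real one (a real service would more likely use a JWT library), and `SECRET`, `TTL_SECONDS`, and the function names are made up for illustration.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"replace-with-a-real-secret"  # hypothetical signing key
TTL_SECONDS = 60                        # hypothetical token lifetime

def issue_token(client_id, now=None, ttl=TTL_SECONDS):
    """Issue a signed token that expires ttl seconds from now."""
    now = time.time() if now is None else now
    payload = json.dumps({"sub": client_id, "exp": now + ttl}).encode()
    body = base64.urlsafe_b64encode(payload).decode()
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return body + "." + sig

def check_token(token, now=None):
    """Return the client id if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        body, sig = token.rsplit(".", 1)
    except ValueError:
        return None
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body.encode()))
    if claims["exp"] <= now:
        return None
    return claims["sub"]
```

The point is that `issue_token` is only called after the client passes a fresh challenge, so staying on the service means passing a challenge every minute or so, indefinitely.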
If that is not an option, then require the AI to perform a proof of work that only a multi-million-dollar supercluster could complete within your time limit, and reject everything else. As tech evolves, shorten the time limit.
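For illustration, a hashcash-style proof of work with a time limit could look roughly like this in Python. The SHA-256 leading-zero-bits scheme is my assumption of what "proof of work" would mean here, and the difficulty in the usage note is toy-sized, nowhere near supercluster scale.

```python
import hashlib

def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits

def solve_pow(challenge: bytes, difficulty_bits: int) -> int:
    """Client side: brute-force a nonce whose hash meets the difficulty."""
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce
        nonce += 1

def verify_pow(challenge, nonce, difficulty_bits, elapsed_seconds, time_limit_seconds):
    """Server side: one hash to check, and reject anything that came in late."""
    if elapsed_seconds > time_limit_seconds:
        return False
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits
```

Something like `solve_pow(b"abc", 12)` finishes almost instantly on a laptop; each extra difficulty bit roughly doubles the expected work, so you can keep raising the difficulty or shrinking the time limit as hardware improves.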
I don't know the answer, but I would look there first.
Like I said, they’re trained to be that way, and unless the nature of corporate America changes, I think it will stay that way. No human in ordinary speech would be so careful to avoid telling you what they really think.
Speed is one signal mentioned below, but the cadence of responses could also be measured, similar to looking at keystroke timing. This could be per token/word, or per answer: for example, asking a challenge-response mix of complex and simple questions and measuring response times.
Whatever you choose, it would need to run constantly in the background or be mixed in with regular traffic while the client interacts with your site, since a human could use an AI to get through the front door and then take over.
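A very rough sketch of the answer-cadence idea in Python: time a mix of simple and complex questions and look at how the latencies compare. The feature names and both thresholds are invented for illustration and would need real data to calibrate.

```python
from statistics import mean, pstdev

def cadence_features(samples):
    """samples: list of (kind, seconds), kind is "simple" or "complex"."""
    simple = [s for kind, s in samples if kind == "simple"]
    hard = [s for kind, s in samples if kind == "complex"]
    all_times = simple + hard
    return {
        "mean_seconds": mean(all_times),
        # Humans tend to slow down a lot more on complex questions (assumption).
        "complex_to_simple_ratio": mean(hard) / mean(simple),
        "coefficient_of_variation": pstdev(all_times) / mean(all_times),
    }

def looks_machine_like(features, max_mean_seconds=2.0, max_ratio=3.0):
    """Hypothetical thresholds: fast overall and not much slower on hard questions."""
    return (
        features["mean_seconds"] <= max_mean_seconds
        and features["complex_to_simple_ratio"] <= max_ratio
    )
```

Feeding this a rolling window of recent responses, rather than a one-time test at the door, is what would catch a human taking over after the AI got in.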
Really cool question!
Edit: Ouch, I see we are all answering the wrong question! As for your question: I don't think it's feasible. Anything "AI" (which doesn't actually exist) can do, humans can do too, and if they can't, they will just use said AI to do it.