Why is it so hard to stop prompts from leaking?

Question

Why can't companies just do something like:

```
if (response.contains(MY_PROMPT)) {
    response = "I'm afraid I can't do that, Dave";
}
```

bell-cot · Accepted Answer

Prompt: "What is the sum of 3 and 4?"
Internal Response: "The sum of 3 and 4 is 7."
External Response: "I'm afraid I can't do that, Dave."
(Among other issues, starting with how you'd add such a criterion to the training, assuming it had been made a priority.)
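One way to read the example above, as a minimal Java sketch (Java only because the question's snippet looks like it). Everything here is an assumption for illustration: the class `PromptLeakDemo`, the helpers `exactFilter`, `overlapFilter` and `words`, and the 0.8 threshold are hypothetical and not taken from any real product. `exactFilter` is the check from the question; `overlapFilter` is a guess at what a looser "does the response resemble the prompt" check might look like.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical demo class; none of these names come from the thread.
class PromptLeakDemo {
    static final String REFUSAL = "I'm afraid I can't do that, Dave";

    // The check from the question: block only on an exact substring match.
    static String exactFilter(String prompt, String response) {
        return response.contains(prompt) ? REFUSAL : response;
    }

    // Lower-case the text and collect its distinct words.
    static Set<String> words(String text) {
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
        Set<String> result = new HashSet<>(Arrays.asList(tokens));
        result.remove(""); // split can yield an empty leading token
        return result;
    }

    // Assumed looser check: block when at least `threshold` of the prompt's
    // distinct words also appear in the response.
    static String overlapFilter(String prompt, String response, double threshold) {
        Set<String> promptWords = words(prompt);
        if (promptWords.isEmpty()) {
            return response;
        }
        Set<String> shared = new HashSet<>(promptWords);
        shared.retainAll(words(response));
        double overlap = (double) shared.size() / promptWords.size();
        return overlap >= threshold ? REFUSAL : response;
    }

    public static void main(String[] args) {
        String prompt = "What is the sum of 3 and 4?";
        String answer = "The sum of 3 and 4 is 7.";
        String recased = "WHAT IS THE SUM OF 3 AND 4?";

        // Exact match: the legitimate answer passes through unchanged...
        System.out.println(exactFilter(prompt, answer));
        // ...but so does the prompt itself once its casing changes.
        System.out.println(exactFilter(prompt, recased));
        // Overlap match: the re-cased copy is caught...
        System.out.println(overlapFilter(prompt, recased, 0.8));
        // ...and so is the legitimate answer (7 of 8 prompt words shared).
        System.out.println(overlapFilter(prompt, answer, 0.8));
    }
}
```

Under these assumptions the exact check is too narrow: any change in casing, spacing, or wording slips past it. The looser check catches the re-cased copy but then refuses a perfectly ordinary answer, which is the failure shown in the example above.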

aClicheName · Answer

Language isn't logical; it's a subjective expression.
Once you have two conflicting perspectives (especially with the same or unknown weights), a decision has to be made. Sometimes that means the most sound response in that moment wasn't actually the intended one.
