It made me think that the obvious way to learn prompt engineering is… to not learn it, but to use another LLM to do that for you.
Any experience with this? Happy? Unhappy?
I have spent hours and hours and hours and hours trying to get ChatGPT to be a little less apologetic, long-winded, to stop reiterating, and to not interpret questions about its responses as challenges (i.e when I say “what does this line do?” ChatGPT responds “you’re right, there’s another way to do it…”).
Nothing and I mean NOTHING will get ChatGPT with GPT-4 to behave consistently. And it gets worse every day. It’s like a twisted version of a genie misinterpreting a wish. I don’t know if I’ve poisoned my ChatGPT or if I’m being A/B tested to death but every time I use ChatGPT I very seriously consider unsubscribing. The only reasons I don’t are 1) I had an insanely impressive experience with GPT-3, and 2) Google is similarly rapidly decreasing in usefulness.
There's the obvious "create a stable diffusion prompt with all the line noise of 'unreal engine 4K high quality award winning photorealistic'" stuff which is pretty obvious.
Less obvious is using it to refine system prompts for the "create your own GPTs" thing. I used this approach for my "Chat with Marcus Aurelius, Emperor of Rome and Stoic philosopher"[1] and "New Testament Bible chat"[2]
I'm particularly happy with how well the Marcus Aurelius one works, eg: https://chat.openai.com/share/27323fe8-56e2-4620-8e4a-3ebf69...
For both of these I started with a rough prompt and then asked GPT4 to refine it.
I found the key was to make sure to read the generated prompt very carefully to make sure it is actually asking for what you want.
More recently I've been using the same technique for some more complicated use-cases: creating a prompt for GPT-4 to rank answers and creating prompts for Mistral-7B. The same basic approach works well for both of these.
[1] https://chat.openai.com/g/g-qAICXF1nN-marcus-aurelius-empero...
[2] https://chat.openai.com/g/g-CBLrOOGjA-official-new-testament...
It requires a bit of back forth but you can get great results. It lets you iterate at a higher level instead of word for word.
I also find that the prompts work better. Prompt engineering is often about finding magic words and sentences that are dense keywords from the training data and another LLM is going to be good at finding those phrases because it knows those phrases the best.
Here’s an example dialogue I was using recently to iterate on a set of prompts for generating synthetic training data for LLM training. (Inspired by phi-2)
https://chat.openai.com/share/51dd634b-7743-4b5e-9c3f-3d57c6...
[1] https://github.com/HumanSignal/Adala
[2] https://github.com/HumanSignal/Adala/blob/master/examples/gs...
[3] https://labelstud.io/blog/mastering-math-reasoning-with-adal...
The prompt gave much much better results than than the one I wrote.
In it, they have GPT-4 generate solutions to coding problems, but instruct it to insert backdoors into the solutions some fraction of the time. Then, they explore different ways to use a weaker model (GPT-3.5) to detect these backdoors. Pretty interesting.