Does it make sense to have something akin to `robots.txt` for GPT trainers? Or is it better to let the model infer the product for itself?
You're probably thinking of JSON-LD, which already exists but is not some golden truth. Search engines have long used machine learning to work out the content and topics of sites on their own, rather than relying only on what the markup declares.
GPT is token prediction, so you still want to do all the same things for your website to aid discovery and understanding.
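
For anyone who hasn't seen JSON-LD, here is a rough sketch of what that markup looks like for a product page. It assumes the schema.org `Product` type; the `product_jsonld` helper, the product name, and the URL are made up for illustration, so treat it as a starting point rather than a recommended template.

```python
import json


def product_jsonld(name: str, description: str, url: str) -> str:
    """Return a <script> tag holding schema.org Product markup as JSON-LD.

    Hypothetical helper for illustration; the field choice is an assumption.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "url": url,
    }
    # Escape "</" so a field value cannot close the script tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values, not a real product.
print(
    product_jsonld(
        "Example Widget",
        "A hypothetical product used for illustration.",
        "https://example.com/widget",
    )
)
```

The output goes in the page's HTML, usually in the `<head>`. It is a hint to crawlers, not a guarantee of how they will describe the page.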