Is there any research on this topic that anyone is familiar with and can share?
Personally, I find the proliferation of ChatGPT-derived writing concerning, and it will no doubt continue to increase. Are there any known safeguards in the training pipeline that try to filter AI-generated content out of the dataset to prevent this?
In other words, artifacts the model tends to generate (and humans tend to miss) will become more pronounced if the model's output is fed back into its training data.
I think that is a significant concern, especially if the "GPT jank" patterns are similar across models.
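To make the feedback loop concrete, here is a toy simulation of the amplification effect. All the numbers (`p0`, `amplification`) are made-up illustrative parameters, not measured values; the only point is that any consistent per-generation over-representation of a quirk compounds.

```python
def artifact_fraction(generations, p0=0.01, amplification=1.2):
    """Fraction of training documents carrying a model 'artifact' after
    N rounds of retraining on a corpus that includes the previous
    model's output. Each round the quirk is assumed to be slightly
    over-represented in the model's own output (amplification > 1)."""
    p = p0
    for _ in range(generations):
        p = min(1.0, p * amplification)  # compounding, capped at 100%
    return p

# A quirk in 1% of documents dominates the corpus within a few dozen
# generations under this (assumed) 20% per-generation amplification.
for gen in (0, 10, 25):
    print(gen, round(artifact_fraction(gen), 3))
```

Even modest per-generation amplification is exponential, which is why filtering at ingestion time matters more than the starting contamination rate.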