Several things about GPT-3 and related models like ChatGPT bother me.
First, OpenAI is not merely consuming content I generated (however minuscule my contribution is) but deriving value from it.
Second, it exposes people to analysis and scrutiny that can be used outside the public domain. For example, if I used a throwaway account to talk about bad practices at an employer or government agency, this technology could potentially identify me from my writing style far more accurately than existing stylometry methods.
Third, threat actors can use how a person writes to convincingly emulate them when social-engineering other users (a threat compounded by deepfake video and audio).
Lastly, the rules and norms that governed authoring a public post never contemplated this kind of black-box analysis being turned to hostile ends, so I don't agree that ML analysis of public posts should be covered by existing public-domain rules and laws.
My question is: is the EFF or some other organization working with lawmakers to address this?
There are consumer-protection, fair-trade, and security concerns here. I have not been hearing this side of the discussion around ChatGPT, for example.
Before ML scraping, public posts were the equivalent of standing in public fully clothed. With ML scraping, it feels like someone in public is using an x-ray to look a layer deeper, exposing us in a way we did not consent to.
It would be great if social media and sites like HN had a way to tell Google, OpenAI, etc., "don't use this for ML training," similar to robots.txt. I feel like laws and standards are 5-10 years behind.
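As a sketch of what such an opt-out could look like if it reused robots.txt syntax: CCBot is Common Crawl's crawler (Common Crawl was a major GPT-3 training source) and GPTBot is OpenAI's crawler token, but there is no agreed "no ML training" directive, and nothing today obliges a scraper to honor robots.txt for training purposes specifically.

```
# robots.txt — hypothetical ML-training opt-out.
# These user-agent tokens are real crawlers, but blocking them
# is only a convention, not an enforceable "no training" signal.
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
```

A dedicated directive (something like `Disallow-Training: /`) would need standardization and, more importantly, legal teeth before crawlers had any reason to respect it.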
I'm just not sure if it would make a difference even if it was outlawed.
The algorithms are out there, and it's only a matter of time until more organizations adopt them to build their own programs. The tech-savvy will eventually be able to get and use them. The internet is worldwide, and the models would always be available somewhere for people to obtain.
It will cause problems and disruptions of some kind in various industries. Chatbot-generated targeted email will be a problem, creating a need to verify a message's source. Eventually realistic generated video will be too. I don't know what the solution is; it might be that stronger levels of identity verification are eventually needed for something like email. We're also in a time when people don't trust centralized sources.
I just don't think outlawing it would sufficiently work.