HACKER Q&A
📣 b1n

Which open source license can I use to forbid GPT-3 model training?


Products like GitHub Copilot (https://news.ycombinator.com/item?id=27676266) use open source code to train their "intelligent" code completion suggestions.

I want to release some open source code, but I don't want to make the mistake of training my replacement.

Are there any open source licenses that explicitly forbid use when training an AI?


  👤 quantumofalpha Accepted Answer ✓
> Are there any open source licenses that explicitly forbid use when training an AI?

GitHub/OpenAI's defense is "training ML systems on public data is fair use" (https://news.ycombinator.com/item?id=27678354). Unless this assertion gets invalidated in court, I think they mostly don't care about the wording of your license.

> I want to release some open source code, but I don't want to make the mistake of training my replacement.

Most of a software engineer's value is not in the code they write. By and large, employers care instead that you solve their problems. Code is just a by-product, a means to that end.


👤 dragonwriter
> Are there any open source licenses that explicitly forbid use when training an AI?

No, though plenty don’t grant a license for that purpose and/or would require any resulting work that needs a license to use the same license (e.g., all copyleft licenses).

But since Copilot and others use publicly available code (independent of license, not specifically open-source) on the basis that such use does not require a license, the license isn't going to stop them. Are you prepared to sue Microsoft to prevent them from using your code? If not, given their position on “Fair Use”, you aren't going to stop them.


👤 kilodeca
Then that would no longer be an open source license.

👤 __d
It's just another tool.

In the past, programmers didn't have compilers, or linters, or debuggers, or dependency managers, or ... etc. All the tooling a modern programmer uses could be viewed as having taken work away from humans.

Just like these tools, AI will help you write more, better code. There's a long way to go until we can generate software from an end-user's description of what they want done.


👤 speedgoose
IMHO, you are not training your replacement, you are training some kind of assistant.

👤 admissionsguy
> but I don't want to make the mistake of training my replacement.

If that were possible, would you really want to spend much of your life doing work that could be automated away?