Ethics of fine-tuning LLMs on books and papers
I want to get started on fine-tuning LLMs to emulate certain styles of writing (e.g. online posts, blogs, etc.) - but I don't quite understand the ethical and/or legal implications.
What are some ethical ways of sourcing such data?
Thanks!
I've talked to people who think there could be a problem even with my smart RSS reader, which retains the content of RSS feeds and trains a classification model to predict whether I'll like an article. It doesn't even generate anything.
I think the ethical or legal implications might depend on what you do with the LLMs.