I am part of a larger community that organizes itself through loads of e-mails, PDFs, etc. Many questions about the current state of affairs could, in my opinion, be answered through a ChatGPT-like interface.
How would one go about training a model based on local files? Is it possible? What would I have to do?
👤 brucethemoose2 Accepted Answer ✓
For non-commercial use? To answer your question directly: finetune a LLaMA-based instruction model, perhaps using the lit-llama repo. For this you will need to rent a fairly beefy cloud instance, and you will need to resume the finetuning (or train a LoRA adapter) whenever you want to put new data in. Then host it on a cheaper server with a llama.cpp frontend.
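A rough sketch of what the LoRA route can look like, using Hugging Face transformers + peft rather than the lit-llama repo mentioned above (model name, file name and hyperparameters are placeholders, not recommendations):

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "openlm-research/open_llama_7b"  # placeholder LLaMA-style checkpoint

tokenizer = AutoTokenizer.from_pretrained(base)
# In practice you would load the base model quantized (e.g. QLoRA) so it fits on one GPU.
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

# Train small LoRA adapters instead of updating all 7B weights.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Your local e-mails/PDFs, already converted to plain text, one record per line.
dataset = load_dataset("text", data_files={"train": "community_docs.txt"})["train"]
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                      remove_columns=["text"])

Trainer(
    model=model,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=1,
                           learning_rate=2e-4, logging_steps=10),
).train()

model.save_pretrained("lora-out")  # saves adapter weights only; merge into the base
                                   # model before converting it for llama.cpp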
But what you really might want is vector search (retrieval over your documents), which seems like a better fit for this use case.
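For comparison, a minimal sketch of the vector-search route, using sentence-transformers and FAISS as an illustrative (assumed) stack; the sample chunks are placeholders and the e-mail/PDF text extraction is not shown:

```python
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose embedder

# Chunks of text extracted from your e-mails/PDFs.
chunks = [
    "Minutes of the March meeting: budget approved ...",
    "The next community event is scheduled for June 12 ...",
    "Membership fees are due at the end of the quarter ...",
]
embeddings = model.encode(chunks, normalize_embeddings=True)

index = faiss.IndexFlatIP(embeddings.shape[1])  # inner product == cosine on normalized vectors
index.add(embeddings)

def search(question, k=3):
    k = min(k, index.ntotal)
    q = model.encode([question], normalize_embeddings=True)
    scores, ids = index.search(q, k)
    return [(chunks[i], float(s)) for i, s in zip(ids[0], scores[0])]

print(search("When is the next event?"))
# The top-ranked chunks are then pasted into the prompt of whatever chat model you use
# ("answer using this context"), which is how most chat-with-your-docs products work.
```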
👤 tikkun
There are some "drag and drop" solutions, like https://www.chatbase.co/. There are plenty more - search for "custom ChatGPT" on Product Hunt and you'll find a lot.