I'll start the recommendations with Karpathy's nanoGPT: https://github.com/karpathy/nanoGPT
What else do we have?
"Please note that ChatGPT is not able to access past conversations to inform its responses."
https://help.openai.com/en/articles/6787051-does-chatgpt-rem...
Some interesting techniques I've seen involve essentially a ring buffer: after each turn, a call is made to summarize the conversation up to that point, and that summary is used as context for subsequent prompts.
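A minimal sketch of that rolling-summary idea, assuming a generic `call_llm` helper standing in for whatever completion endpoint or local model you use; the buffer size and prompt wording are arbitrary choices, not a known implementation:

```python
from collections import deque

def call_llm(prompt: str) -> str:
    """Placeholder: swap in a call to your completion API or local model."""
    raise NotImplementedError

class SummarizingMemory:
    def __init__(self, max_recent_turns: int = 6):
        self.recent = deque(maxlen=max_recent_turns)  # ring buffer of recent turns
        self.summary = ""                             # compressed older history

    def add_turn(self, user: str, assistant: str) -> None:
        # When the buffer is full, fold the oldest turn into the running summary
        # before it gets evicted by the next append.
        if len(self.recent) == self.recent.maxlen:
            oldest = self.recent[0]
            self.summary = call_llm(
                f"Summary so far:\n{self.summary}\n\nNew exchange:\n{oldest}\n\n"
                "Rewrite the summary to include the new exchange, briefly."
            )
        self.recent.append(f"User: {user}\nAssistant: {assistant}")

    def build_context(self, next_user_message: str) -> str:
        # Prepend the summary plus the recent turns to the next prompt.
        return (f"Conversation summary:\n{self.summary}\n\n"
                + "\n".join(self.recent)
                + f"\nUser: {next_user_message}\nAssistant:")
```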
Its bigger brother, the 1.3B version, uses ~5.5GB of memory but yields slightly more GPT-like answers. Both take ~5-20 seconds to generate a response, though, so take that into account when building with it (a rough example of loading and prompting it follows).
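For reference, loading and prompting a model of that size locally tends to look something like this with the transformers library; the model name (EleutherAI/gpt-neo-1.3B) is just an example of a ~1.3B checkpoint, not necessarily the one meant above, and the timings will vary with your hardware:

```python
import time
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neo-1.3B"  # assumption: any ~1.3B causal LM behaves similarly
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)  # expect several GB of RAM

prompt = "Explain what a ring buffer is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")

start = time.time()
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
print(f"Generation took {time.time() - start:.1f}s")  # typically several seconds on CPU
```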
However, you can fine-tune it, and I'm sure that with lots of fine-tuning and some jiggling of the parameters you can get a half-decent custom-purpose solution.
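A rough sketch of what that fine-tuning could look like with the Hugging Face Trainer API, assuming your data is plain text; the corpus file name, model choice, and hyperparameters below are placeholders, not anything from the original comment:

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "EleutherAI/gpt-neo-1.3B"  # assumption: same example model as above
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-style tokenizers have no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Load a plain-text corpus and tokenize it.
dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})  # placeholder file
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)
tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned", num_train_epochs=1,
                           per_device_train_batch_size=1, gradient_accumulation_steps=8),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal LM objective
)
trainer.train()
```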
To run this on your own laptop, and cheaply, we still need much better fine-tuned training sets and much better algorithms. Right now, the most capable models need over 120GB of VRAM just for inference.
We need the same for ChatGPT and GPT-4.
Any pointers as to where someone like me, with software engineering experience but literally no AI knowledge, can learn to train their own GPT on their own data sets?
In my case I have downloaded some public domain databases (1-7GB each) and I would like to get some additional insights out of them. I have been querying them and using them to build my company, but I'm curious to know whether GPT can help me in that regard.
In general, a lot of people download models from Hugging Face; I think that package automates the task.
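Assuming the package meant here is huggingface_hub / transformers, the download is usually a single call; the model name is just an example:

```python
from huggingface_hub import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

# Either grab the raw repository files...
local_dir = snapshot_download("gpt2")

# ...or let transformers download and cache the weights on first use.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
```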