I personally had the best experience with GPT4-x-vicuna: https://huggingface.co/NousResearch/gpt4-x-vicuna-13b.
There are more variants, and you can find information on them at https://www.reddit.com/r/LocalLLaMA.
The best way for you to run the model is probably through https://github.com/ggerganov/llama.cpp. It's a plain C/C++ implementation that can run LLMs pretty efficiently, and it runs the LLaMA 13B variants at a pretty quick pace (~100 ms/token) on my M1 Pro MacBook.
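If you'd rather drive it from Python than use the C/C++ binaries directly, the llama-cpp-python bindings wrap llama.cpp with a simple API. A minimal sketch (the model path and quantization filename below are placeholders, substitute whatever variant you downloaded):

```python
# Minimal llama-cpp-python example; `pip install llama-cpp-python` first.
from llama_cpp import Llama

# The model path is a placeholder -- point it at your quantized 13B file.
llm = Llama(model_path="./models/gpt4-x-vicuna-13b.q4_0.bin")

# Generate a short completion; max_tokens caps the output length and
# `stop` cuts generation off before the model starts a new "Q:" turn.
output = llm("Q: What is llama.cpp? A:", max_tokens=64, stop=["Q:"])
print(output["choices"][0]["text"])
```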
I'd be happy to answer more questions.
I hope it will be useful.
https://huggingface.co/TheBloke/Manticore-13B-GGML
You can download it directly from the oobabooga (text-generation-webui) interface.
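If you'd rather script the download than click through the UI, the huggingface_hub library can fetch a single file from that repo. A sketch (the exact filename is a guess, check the repo's file list for the quantization you want):

```python
# Download one quantized file from the Hugging Face Hub;
# `pip install huggingface_hub` first.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="TheBloke/Manticore-13B-GGML",
    # Filename is an assumption -- browse the repo's file list to confirm.
    filename="Manticore-13B.ggmlv3.q4_0.bin",
)
print(path)  # local cache path you can point llama.cpp at
```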
Overall I have found GPT4-x-Alpaca to be pretty good.
None of them are comparable to ChatGPT, of course, because they are smaller, but they are still great.
https://chat.lmsys.org/?arena (then click Leaderboard)
Perhaps LLaMA, considering it can be self-hosted?