HACKER Q&A
📣 homar

Share Embedded GPT Model Without Going Broke?


Hey,

I'm looking to share an embedded GPT model I've been working on, after being inspired by the Paul Graham GPT (https://paul-graham-gpt.vercel.app/) by @mckaywrigley. The problem is that every API call costs money. @mckaywrigley solved it by asking people for their own OpenAI API key to talk to the model, which a lot of non-technical people don't get. I know you can put a spending limit on your OpenAI account so your bill isn't astronomical the next day, but that kind of defeats the purpose of sharing it online. Hopefully there's a solution; if not, I'll just ask for API keys too.


  👤 MacsHeadroom Accepted Answer ✓
Use GPT-NeoX-20B instead, at roughly 100x lower cost, via https://nlpcloud.com/ or another provider. It's $0.00004 per token. They even let you fine-tune your models, then charge $0.00007 per token for using a fine-tuned model.
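A quick back-of-the-envelope to see what those per-token rates mean in practice. All the traffic numbers below (tokens per call, calls per month) are illustrative assumptions; only the $0.00004/token rate comes from the pricing above:

```python
def estimate_monthly_cost(tokens_per_request, requests_per_month, price_per_token):
    """Rough estimate: total tokens used in a month times the per-token price."""
    return tokens_per_request * requests_per_month * price_per_token

# Hypothetical traffic: 500 tokens per call, 10,000 calls per month,
# at the quoted GPT-NeoX-20B rate of $0.00004 per token.
neox_cost = estimate_monthly_cost(500, 10_000, 0.00004)  # ≈ $200/month
```

So even at the "cheap" rate, the bill scales linearly with traffic; run your own expected numbers before putting anything public.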

Basically anything is cheaper than using OpenAI, since their models are extremely inefficient. The latest Meta LLaMA-13B outperforms GPT-3 175B and runs on a single 24GB video card. LLaMA-7B outperforms all other open-source models and runs on a single 10GB video card. https://rentry.org/llama-tard

For more options check out the Chatbot General thread on /g/ https://boards.4channel.org/g/thread/91897528


👤 brucethemoose2
Maybe ask for keys and fall back to your own key when the visitor doesn't provide one?
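That fallback can be sketched in a few lines of Python. The function name, the daily spend tracking, and the cap are all hypothetical; the point is just that your own key only gets used while you're under a budget you set:

```python
def choose_key(user_key, server_key, server_spend_today, daily_cap_usd):
    """Prefer the visitor's own API key; fall back to the site's key
    only while today's server-side spend is under a hard cap."""
    if user_key:
        return user_key
    if server_spend_today < daily_cap_usd:
        return server_key
    return None  # over budget: ask the visitor for a key or show a notice
```

Non-technical visitors get a working demo for free until the cap is hit, and power users can keep going on their own key.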

The community really needs to move to a Stable Horde analogue for LLMs: Facebook's leaked LLaMA weights running on a network of volunteer community GPUs. OpenAI's stranglehold is already becoming a problem, and the image-generation community has more or less dumped them.