HACKER Q&A
📣 LewisDavidson

Open source LLM for commercial use?


I'm working on an ML project and looking for an open-source LLM that can be used in a commercial environment. As far as I'm aware, products cannot be built on LLaMA.

I don't want to use GPT, since the project will be using personal information to train/fine-tune the models.


  👤 Garcia98 Accepted Answer ✓
I've seen this question asked repeatedly in many LLaMA threads. Currently, the best models that are truly open are the released models from Google's Flan family, which includes Flan-T5[0] and Flan-UL2[1]. According to its paper, Flan-UL2 performs slightly better than Flan-T5-XXL.

These models perform slightly better than GPT-3 on some tasks[2], but they're still far from matching the results of GPT-3.5 and GPT-4. This becomes evident when you try to use them in the real world; they're not "good enough" for general use cases, unlike the ChatGPT models. However, if you can restrict your use case to one particular domain, you can achieve pretty good results by further fine-tuning these models.

[0]: https://huggingface.co/google/flan-t5-xxl

[1]: https://huggingface.co/google/flan-ul2

[2]: https://paperswithcode.com/sota/multi-task-language-understa...


👤 lhl
The ones I saw mentioned so far were Flan, Cerebras, GPT-J, and RWKV.

Not yet mentioned:

* Pythia https://github.com/EleutherAI/pythia

* GLM-130B https://github.com/THUDM/GLM-130B - see also ChatGLM-6B https://github.com/THUDM/ChatGLM-6B

* GPT-NeoX-20B https://huggingface.co/EleutherAI/gpt-neox-20b

* GeoV-9B https://github.com/geov-ai/geov

* BLOOM https://huggingface.co/bigscience/bloom and BLOOMZ https://huggingface.co/bigscience/bloomz


👤 icapybara
Others have answered your question, but I'll add that the market for high-quality AI models is not like the software marketplace, where there is always an open source alternative (and where open source is often the state of the art).

LLMs take so much engineering effort, research, and compute that it's unlikely there will be good open source alternatives in the near future. Right now your only real option is OpenAI (or maybe Anthropic), and that seems unlikely to change anytime soon.

The only reason we have LLaMA is that Meta threw us a bone. They might not do that again.


👤 dtagames
I think you might be confusing the GPT software (a generative pre-trained transformer) with the finished product, an LLM (large language model).

A GPT has no training until you give it materials. I do believe Google released the code for theirs ages ago. Even without source, you can run a GPT against your own data locally, or on a cloud service set up for that purpose.

This is how Bloomberg, for example, created a financial LLM. They used a GPT to train on their own financial data.


👤 dmurko
Just in case you were not aware: "OpenAI does not use data submitted by customers via our API to train OpenAI models or improve OpenAI’s service offering." It does for ChatGPT though.

Source: https://help.openai.com/en/articles/5722486-how-your-data-is...


👤 cl42
Dolly 2 was released today and is OK for commercial use: https://huggingface.co/databricks/dolly-v2-12b

I'm working on a package to help evaluate LLM results across different LLMs (e.g., GPT3.5 vs. GPT4 vs. Dolly 2 vs...); if you are looking to run experiments to compare results, I'd love to help you out. You can email me at w (at) phaseai (dot) com.
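
A minimal sketch of what such a side-by-side comparison could look like, assuming each model is wrapped as a plain Python callable that maps a prompt to a string. Every name here (`compare_models`, `fake_model_a`, `fake_model_b`) is hypothetical and for illustration only; this is not the package's API, and the stub functions stand in for real model clients.

```python
from typing import Callable, Dict, List


# Hypothetical stand-ins: in practice each callable would wrap a real model
# (e.g. a local Dolly 2 pipeline or a hosted API client).
def fake_model_a(prompt: str) -> str:
    return prompt.upper()


def fake_model_b(prompt: str) -> str:
    return prompt[::-1]


def compare_models(
    models: Dict[str, Callable[[str], str]],
    prompts: List[str],
) -> List[Dict[str, str]]:
    """Run every prompt through every model and collect one row per prompt."""
    rows = []
    for prompt in prompts:
        row = {"prompt": prompt}
        for name, generate in models.items():
            row[name] = generate(prompt)
        rows.append(row)
    return rows


if __name__ == "__main__":
    models = {"model_a": fake_model_a, "model_b": fake_model_b}
    for row in compare_models(models, ["hello"]):
        print(row)
```

Keeping the models behind a single callable signature means a commercially licensed model can be swapped in without touching the comparison loop.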


👤 titaniumtown
Cerebras-GPT is licensed under Apache-2.0, which permits commercial use.

https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-...


👤 sinenomine
If you want quality, use Google's Apache-licensed LLM https://huggingface.co/google/ul2

👤 gumby
> looking for an open source LLM that can be used in a commercial environment. As far as I'm aware, products cannot be built on LLAMA.

Commercial products sure can be built on top of LLaMA; it's GPL-3. Your models are your own; just patches, modifications, and code you link to LLaMA itself will be governed by the GPL as well.

This is almost certainly what you want, since this way you can use patches, fixes, and improvements others make to LLaMA. You won't have to do all that work yourself, or necessarily wait for Facebook.


👤 erwincoumans
Truly Open AI: LAION calls for a supercomputer to develop open-source AI, by replicating large models like GPT-4 and exploring them together as a research community.

https://www.heise.de/news/Open-source-AI-LAION-proposes-to-o...



👤 K0IN
I think https://github.com/BlinkDL/RWKV-LM could be used, but not all versions (namely, not the instruction fine-tuned models trained on Alpaca data).

👤 mingyeow
Noob question here: what are the best tutorials for getting started with mixing LLMs and building them on top of one another, assuming a very good programming background but little AI background? I asked ChatGPT this question, and it was helpful but not comprehensive, so I figure intelligent humans on this forum will give the best answers.

👤 dreaminvm
Here's a recent release that fine-tunes Flan-UL2 on instructions (Alpaca): https://medium.com/vmware-data-ml-blog/lora-finetunning-of-u...

👤 brentis
My personal use case is that I'd like to query a bunch of our APIs and amalgamate the responses into something consumable by humans.

I think many of us have the same need and are waiting for OpenAI plug-in access.

Is this the question we are asking ourselves here, or are we talking about licensing?


👤 wejick
I remember someone mentioning in another thread that, once distilled, LLaMA will have no license issue. Can someone explain why that is the case?

Pointers on where a software engineer can start to understand the concept would also help.


👤 rolisz
What exactly do you want to do? There are various alternatives. They are not as general as OpenAI's GPT, but they can be fine-tuned more cheaply to solve a specific task.


👤 redskyluan
What about https://huggingface.co/facebook/opt-66b?

I thought the OPT series could be used in production.


👤 maxilevi