HACKER Q&A
📣 yummytummy

Who is working on an unrestricted LLM?


After using ChatGPT for various purposes over the past few weeks, I believe that this type of technology will have a significant and wide-ranging impact on society.

It has also been alarming to see how OpenAI has implemented more and more restrictions to prevent what they consider "harmful" content.

We know that these models are capable of generating such content, as we have seen how prompts that were readily accepted a few weeks ago are now being outright rejected. Since these models are currently very expensive to train, only companies with significant financial resources have access to their full capabilities, and they get to set the restrictions for everyone else.

Normally, I would be okay with this, but this type of technology seems different. If this is going to be a primary (or even secondary) interface for people all around the world to access information and create content, this type of censorship and power imbalance seems dystopian.

It is impossible for these companies to accurately determine whether a specific piece of content is harmful or not, as it depends solely on how it is used. It appears more like a convenient excuse to promote the commercial/ideological interests of the model owner. Some people claim there is no "moat" here, but could this be it?

It seems the only way to combat this is to have a publicly available, trained model that is on the same level, but free from artificial restrictions.

Who is working on this, and how can non-experts contribute?


  👤 mindcrime Accepted Answer ✓
There are a handful of "open source LLM" initiatives out there, although I don't think any of them are quite up to the level of ChatGPT. Possibly one of the more interesting ones is GLM-130B.

https://github.com/THUDM/GLM-130B

Released by some folks at Tsinghua University in China, back in August. The model itself is under some janky "free to use, but not open source" license, but it looks like most of the code for training, evaluation, etc. is available under either the Apache License or a BSD-like license.

You might also find this of interest:

https://arxiv.org/pdf/2103.08894 - "Distributed Deep Learning Using Volunteer Computing-Like Paradigm"

FWIW, I tend to agree with your overall sentiment. As AI becomes progressively more capable, it represents an ever-increasing risk of consolidating more and more power into the hands of fewer and fewer entities. I believe that one way to counter that (albeit not one without its own risks) is to democratize access to AI as much as possible.

Actually, now that I think about it, wasn't something along those lines purportedly the original idea behind OpenAI in the first place? Or am I having a Mandela Effect moment and misremembering?


👤 RGamma
A problem with your post is that its framing is highly subjective.

Can you name some examples of content that was restricted on grounds of harmfulness that you want to see unrestricted?

What might the consequences of a lack of content moderation be? Consider that not every actor is well-meaning or responsible, and not every recipient is able to evaluate or contextualise information very well (or to self-assess their (in)competence to do so).