Can you crowdfund the compute for GPT?
I'm curious whether it's possible to crowdfund the compute costs for GPT-style models. This stuff seems to be getting beyond what any one individual can run, meaning it's in the hands of corporations or people with deep pockets. Can groups of people pool money to run shared models? Because the alternative is that the companies run away with the technology and leave the rest of us to wait for APIs or use whatever they give us.
I've often wondered why a service doesn't exist that lets you rent out your graphics card for the heavy data processing needed to train models. Like mining Bitcoin, except you're doing something actually useful and getting paid actual money for it. Example:
- Company Alpha needs $40,000,000 worth of cloud computing to train their model
- Company Beta provides them said cloud computing for $30,000,000 from their pool of connected graphics cards
- Individuals can connect their computers to the Company Beta network and receive compensation for doing so. In total $20,000,000 is distributed.
Company Alpha gets their cloud computing done for cheap, Company Beta pockets the $10,000,000 difference for running a network, the individuals make money with their graphics cards, except this time it's actual United States Dollars. What am I missing here that would make this type of business unfeasible?
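To spell out the arithmetic (a toy sketch; the figures are just the illustrative ones above):

```python
# Toy sketch of the marketplace split described above; figures are illustrative only.
market_rate = 40_000_000         # what Company Alpha would pay a traditional cloud
network_price = 30_000_000       # what Company Beta charges Alpha
contributor_payout = 20_000_000  # total distributed to GPU owners

alpha_savings = market_rate - network_price        # $10,000,000 saved by Alpha
beta_margin = network_price - contributor_payout   # $10,000,000 kept by Beta

print(f"Alpha saves ${alpha_savings:,}; Beta keeps ${beta_margin:,} for running the network")
```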
It's not that easy. Access to enough compute is one thing, but you also need a proper dataset (beyond Common Crawl and Wikipedia), excellent research expertise, and engineering capability. So even if you throw money or free cloud-compute credits out there, it will not be enough. We've seen this happen with EleutherAI, who were not able to reach their initial target of "replicating" GPT-3 and could only deliver the GPT-NeoX-20B model despite all the free compute.
I’d donate if the model and code became open source afterwards like stable diffusion.
Funny, I was asking myself the same question this morning. I was wondering if there was an equivalent of SETI@home, but for pooling resources for model training.
Star Citizen raised half a billion dollars for a game that is never going to be released and all they did was promise a game in space with spaceships, sooo ...
Yeah, I think we could crowdfund a billion dollars for this, but we'd need some really competent people making sure it gets used optimally.
It's no different from any other type of server, I think? Crowdfunded compute comes with a lot of security and privacy concerns, and if you're building a product, it's often preferable to put that part of it in the hands of someone reliable.
It depends on what part of what you're building is your core intellectual property and such.
I'll ask back: what products would you build on top of a crowdfunded-compute GPT like model?
You might be interested in the LEAM.AI initiative, which is basically the EU planning to fund the creation of an Open Source GPT-3 competitor.
Their planning documents contain pages upon pages on the related challenges, such as generating and storing the dataset, keeping the cluster running, and tolerating node failures. In short: compute time alone won't make you succeed.
But what about something for individuals?
You contribute n GPU minutes to the community's ad hoc pool; you get to use n minutes in parallel from said pool.
Some proof-of-work blockchain underneath.
Just for individuals, no promises of deep data/model privacy.
Has anyone implemented something like this?
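The accounting half, at least, is trivial; a minimal sketch of the contribute-n-use-n ledger (everything here is hypothetical, and it ignores scheduling and verification entirely):

```python
from collections import defaultdict

class ComputePool:
    """Hypothetical ledger: 1 GPU-minute contributed = 1 GPU-minute of credit."""

    def __init__(self):
        self.credits = defaultdict(float)  # user -> GPU-minutes available

    def contribute(self, user: str, gpu_minutes: float) -> None:
        self.credits[user] += gpu_minutes

    def spend(self, user: str, gpu_minutes: float) -> bool:
        if self.credits[user] < gpu_minutes:
            return False  # not enough credit: no freeloading
        self.credits[user] -= gpu_minutes
        return True

pool = ComputePool()
pool.contribute("alice", 120)      # Alice donates two GPU-hours overnight
assert pool.spend("alice", 60)     # later she uses 60 minutes in parallel from the pool
assert not pool.spend("bob", 10)   # Bob contributed nothing, so he gets nothing
```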
Maybe someone can make a cryptocurrency where mining and "proof of work" is accomplished by submitting an adjustment to weights and biases such that it improves the test score on a public dataset.
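The validation step might look something like this; a toy sketch assuming a shared public test set and a simple linear scorer (all names hypothetical), ignoring the hard parts like determinism, Sybil attacks, and miners overfitting the public set:

```python
import numpy as np

def evaluate(weights: np.ndarray, X: np.ndarray, y: np.ndarray) -> float:
    """Hypothetical scorer: accuracy of a linear classifier on the public test set."""
    preds = (X @ weights > 0).astype(int)
    return float((preds == y).mean())

def validate_work(current_weights, proposed_delta, X_test, y_test) -> bool:
    """Accept a submitted weight adjustment as 'work' only if it improves the score."""
    old_score = evaluate(current_weights, X_test, y_test)
    new_score = evaluate(current_weights + proposed_delta, X_test, y_test)
    return new_score > old_score
```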
What is your desired objective and what are you going to train on? There are plenty of publicly available model checkpoints so you don’t have to start from scratch with large GPU clusters. There’s no point in repeating the pretraining that has already been done.
I’m not sure that compute is still the bottleneck right now; it’s fairly cheap to train LLMs. Many optimizations like DeepSpeed dramatically reduce computation requirements / increase throughput (e.g., training a 13B model on a single GPU).
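For reference, the DeepSpeed side is mostly configuration; a minimal ZeRO-3 + CPU-offload sketch, assuming a PyTorch model (the model and config values here are illustrative, not a recipe for a 13B run):

```python
import deepspeed
import torch.nn as nn

# Stand-in model; in practice this would be a multi-billion-parameter transformer.
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024))

ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,                              # partition params, grads, optimizer state
        "offload_optimizer": {"device": "cpu"},  # keep optimizer state in CPU RAM
        "offload_param": {"device": "cpu"},      # page parameters out when not in use
    },
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# ZeRO-3 plus offload is what lets models far larger than GPU memory train on a
# single card, at the cost of PCIe traffic.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```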
If we’ve learned anything from massive LLMs like PaLM, it’s that scaling autoregressive models toward infinity has diminishing returns and is not feasible to serve in inference infrastructure. Google themselves acknowledge this resource limitation in the Med-PaLM paper when they discuss fine-tuning a 540B-parameter model.
We’re really in more of a dataset-and-training-task era of AI/NLP gains. Scaling masked the issues of poor-quality training data and conventional language modelling up to a point, but we’re starting to see the problems with that (hallucination) in all of the big models (Galactica is an example of these limitations).
OpenAI’s main advantage is that they paid humans to build a large labelled dataset for their RLHF objective. They’re offering ChatGPT for free to collect more training data.
Because GPT models will be made smaller eventually.
Like how people mirror the initial Stable Diffusion model, only to find better, faster and smaller new versions later.
Side hot take:
1. “AI” requires compute time for training (GPT, etc.)
2. If you use any “AI” service in the future, you could have to share idle computing power to improve that “AI.”
3. This may be tokenized via crypto. More contributions = more “AI” usage available for you.
Could tokens or cryptocurrency incentivize participants to contribute their idle computing power, opening the process up beyond those with deep pockets?
Distributed training could be done, but how would data validation work? There would need to be some kind of voting per item, since you have to assume adversarial input (honestly, a crypto token could work here for governance). EleutherAI could be used to bootstrap, but the hardware requirements for GPT-NeoX are quite steep (around 20 GB of CUDA memory per card, so you could only admit people who have a 3090 or better and are skilled enough to install CUDA 11 and do devops). I'm happy to contribute though!
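On the adversarial-input point: one standard alternative to per-item voting is robust aggregation of the updates themselves; a sketch using a coordinate-wise median (numpy, everything illustrative):

```python
import numpy as np

def aggregate_updates(updates: list[np.ndarray]) -> np.ndarray:
    """Coordinate-wise median: a few malicious workers can't drag the result
    arbitrarily far, unlike a plain mean."""
    return np.median(np.stack(updates), axis=0)

honest = [np.full(4, 0.1) + np.random.normal(0, 0.01, 4) for _ in range(5)]
malicious = [np.full(4, 1e6)]                 # an adversarial node submits garbage
print(aggregate_updates(honest + malicious))  # result stays near 0.1
```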
I’d be more interested in closed co-ops where individuals provide computing power in exchange for access to the model. I think freeloaders will eat up compute time.
I guess I may be behind the times on some of these AI efforts, but I'm guessing the model involves a big black-box blob of data.
I'd be interested to know how models evolve with additional training, and wonder if there are additive identities that would allow two separately trained models to be combined.
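There's no exact additive identity as far as I know, but plain weight interpolation between two checkpoints fine-tuned from the same base (the "model soups" idea) sometimes works; a minimal PyTorch-style sketch, assuming identical architectures:

```python
import torch

def average_state_dicts(sd_a: dict, sd_b: dict, alpha: float = 0.5) -> dict:
    """Interpolate two checkpoints key by key. Only meaningful when both models
    share an architecture and were trained from the same initialization."""
    assert sd_a.keys() == sd_b.keys()
    return {k: alpha * sd_a[k] + (1 - alpha) * sd_b[k] for k in sd_a}

# Hypothetical usage:
# merged = average_state_dicts(model_a.state_dict(), model_b.state_dict())
# model_c.load_state_dict(merged)
```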
It's no good for training, only inference, but there's salad.com, which has tens of thousands of daily active users with GPUs. They've built a managed container service, with an inference API coming soon: https://salad.com/salad-inference-endpoints
The easy way to find out is to try...
You start a Kickstarter, and when we hit our target of $10M, we can make a start on the computation!
I believe distributed compute *could* be one of the few applications where crypto might make sense? You could earn tokens by supplying compute and pay to get it, and the token price is a natural market price for compute.
Yes, you can. It's not rocket science: lambdalabs and preemptible instances on various public clouds are possible choices.
And you probably should.
If only crypto had started after GPT, then we would have had a clear goal of what to compute and how to incentivize the economics.
> Can groups of people pool together money to run shared models?
Then share the profit among the group of people? It’s called a company.
Cerebras will be the cheapest way, I think. Let's do the math.
What type of hardware is needed as the bare bones for this?
Salad.com is years ahead and ready for just this. :)
There were crypto projects like that, but I don't think any survived to today.
Maybe a model like SETI@home?
I will fund it. Worst case, if it's a closed crowdfund, I will invest there too.
You can, Microsoft is a publicly traded company