The primary function of this database would be to act as a sort of "global cache." When a member computer is about to perform a computation, it would first check this database. If the computation has already been done, the computer would simply fetch the pre-computed result instead of redoing the computation. The underlying goal is to save on compute resources globally.
N.B. this does not necessarily mean we precompute anything, but we do store everything we have computed so far. The hit rate on the cache might be very low for a while, but one would think it'd eventually go up. The way we're going about this now (throwing more GPUs at the problem) just seems awfully wasteful to me.
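A minimal sketch of the lookup I have in mind, assuming JSON-serializable inputs and using a plain dict (`global_cache`) as a stand-in for whatever shared store would actually back this:

    import hashlib
    import json

    # Hypothetical shared store; in a real system this would be a
    # networked service, not an in-process dict.
    global_cache = {}

    def cache_key(func_name, args):
        # Content-address the computation: hash the function's identity
        # together with a canonical encoding of its inputs.
        payload = json.dumps({"fn": func_name, "args": args}, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def cached_compute(func, *args):
        key = cache_key(func.__name__, args)
        if key in global_cache:      # hit: fetch the pre-computed result
            return global_cache[key]
        result = func(*args)         # miss: do the work and publish it
        global_cache[key] = result
        return result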
Has anyone thought about/done any research on this?
This idea glosses over the engineering complexity of "searching" a cache that would grow to include every computation ever performed.
The reason it's not feasible is the same reason computers can't just have a huge L1 cache instead of a hard drive: there are physical limits on how quickly a large store can be searched and read from, and lookup latency grows with the size of the store. Past a certain point, just performing the computation is often quicker.
However… your suggestion would be a good fit for functional programming. Pure functions always return the same result for the same inputs, so caching the results of CPU-intensive pure functions makes a lot of sense… which is what [1] Bazel's remote cache does. But most software is not written in terms of pure functions…
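For pure functions this is just memoization taken global; Python's standard library already does the single-process version:

    from functools import lru_cache

    @lru_cache(maxsize=None)   # safe only because fib is pure:
    def fib(n):                # the same input always gives the same output
        return n if n < 2 else fib(n - 1) + fib(n - 2)

Bazel's remote cache is the cross-machine version of the same idea: build actions are keyed on hashes of their inputs, so any machine can reuse a result another machine already produced.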
Also, another interesting question comes to mind: what if quantum computing could let us do "branch prediction" of computations at an incredible scale?
One immediate problem is that you have to map an effectively infinite space of inputs and outputs into a compact signature before you can even store the computation meaningfully. And in most cases the input is more complex than the output, which makes "searching" for a solution almost pointless right out of the gate.
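To make that concrete: even building the cache key requires a full pass over the input, so for any computation that is itself roughly linear in its input, the lookup costs as much as just doing the work. A toy illustration (not anyone's actual scheme):

    import hashlib

    def the_computation(data: bytes) -> int:
        return sum(data)           # one O(n) pass over the input

    def lookup_key(data: bytes) -> str:
        # Hashing is also an O(n) pass over the same input, so the
        # "cheap" cache probe is no cheaper than the computation itself.
        return hashlib.sha256(data).hexdigest()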