We’ve trained a GPT-2 1.5B model on chess PGN notation. Surprisingly, it’s not bad after only a day of training: https://lichess.org/UMyang4z
(Or rather, it’s not bad up until the midgame, at which point it usually blunders. We think this is because it’s “playing blindfolded”: it’s trained solely on PGN notation, rather than on an encoding of the full board state at each move.)
We’d love to release a Colab demo similar to AI Dungeon. But as with AI Dungeon, our model is 5.6GB. Downloading it from a GCS bucket would cost about $0.056 per click, if I understand the egress pricing correctly.
Our options seem to be:
1. Download the model via BitTorrent inside the Colab notebook
2. Set up a server to power the demo, rather than distributing the model to every client
3. Find a host with low bandwidth fees, and have the notebook download from there
All three have tradeoffs, but #3 seems simplest. Does anyone know of a way to distribute 5.6GB to ~500k people for less than a few hundred dollars? BitTorrent might be fine if it can deliver the entire model in under a couple of minutes (otherwise people will get bored and leave).
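For a sense of scale, here’s the back-of-envelope math behind the $0.056 figure. The $0.01/GB egress price is an assumption inferred from that figure; check your provider’s actual pricing page:

```python
# Back-of-envelope cost of serving the model straight from object storage.
# PRICE_PER_GB is an assumed egress rate (it backs out the post's $0.056/click);
# substitute your provider's real pricing before relying on this.

MODEL_GB = 5.6
PRICE_PER_GB = 0.01          # assumed egress price, USD/GB
DOWNLOADS = 500_000          # target audience from the post

cost_per_download = MODEL_GB * PRICE_PER_GB
total_cost = cost_per_download * DOWNLOADS

print(f"per download: ${cost_per_download:.3f}")
print(f"total for {DOWNLOADS:,} downloads: ${total_cost:,.0f}")
```

At that rate, serving everyone directly lands around $28k, which is why the low-bandwidth-host or BitTorrent options look attractive.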
How many playing sessions can a server with a single 2080 Ti support? Is inference compute-bound or memory-bound? I'd plot num_sessions vs. latency (time to compute a move), and estimate the costs for the target scale/performance.
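A minimal sketch of that measurement. `compute_move` is a placeholder standing in for the real model call on the GPU (here it just sleeps for ~50 ms); swap in actual inference before trusting any numbers:

```python
# Sketch: measure mean per-move latency as concurrent sessions increase.
# `compute_move` is a hypothetical stand-in for the real 2080 Ti inference
# call -- replace it with the actual model invocation.
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

def compute_move(pgn_so_far):
    time.sleep(0.05)  # placeholder: pretend a move takes ~50 ms
    return "e4"

def timed_move(pgn_so_far):
    t0 = time.perf_counter()
    compute_move(pgn_so_far)
    return time.perf_counter() - t0

def mean_latency(num_sessions, moves_per_session=10):
    """Mean per-move latency with num_sessions concurrent 'players'."""
    with ThreadPoolExecutor(max_workers=num_sessions) as pool:
        latencies = list(pool.map(
            timed_move,
            ["1. e4 e5"] * (num_sessions * moves_per_session)))
    return mean(latencies)

if __name__ == "__main__":
    for n in (1, 4, 16):
        print(f"{n:3d} sessions: {mean_latency(n) * 1000:.1f} ms/move")
```

With real inference, the point where latency starts climbing as you add sessions tells you whether the GPU is the bottleneck, and dividing target traffic by sessions-per-GPU gives the server cost estimate.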