HACKER Q&A
📣 DrNuke

Reinforcement learning for single, lower end graphic cards?


On one side, more and more hardware is being thrown in parallel to ingest and process the astonishing amounts of data generated by realistic 3D simulators, especially for robotics, with big names like OpenAI now giving up on the field altogether (https://news.ycombinator.com/item?id=27861201). On the other side, newer simulators like Google's Brax (https://ai.googleblog.com/2021/07/speeding-up-reinforcement-learning-with.html) aim at "matching the performance of a large compute cluster with just a single TPU or GPU". Where do we stand on the latter side of the equation, then? What is the state of the art with a single, lower-end GPU like the GTX 1070 8GB in my 2016 gaming laptop? What do we lower-end users need to read, learn and test these days? Thanks.
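For reference, Brax's pitch as I understand it is that the whole physics engine is written in JAX, so you can jit and vmap thousands of environment copies onto a single device. A rough sketch of what that looks like (assuming brax and jax are installed; the exact API may differ between versions, so treat this as an illustration rather than working code):

    import jax
    import jax.numpy as jnp
    from brax import envs   # assumes `pip install brax`; API details vary between versions

    # Create one functional env, then vmap/jit it into 1024 parallel copies on one GPU.
    env = envs.create(env_name="ant")
    reset_fn = jax.jit(jax.vmap(env.reset))
    step_fn = jax.jit(jax.vmap(env.step))

    rngs = jax.random.split(jax.random.PRNGKey(0), 1024)
    state = reset_fn(rngs)
    actions = jnp.zeros((1024, env.action_size))   # placeholder policy: all-zero actions
    state = step_fn(state, actions)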


  👤 pepemysurprised Accepted Answer ✓
For many RL problems you don't really need a GPU, because the networks are relatively simple compared to supervised learning and most workloads are constrained by data: running the simulation (on the CPU) is the bottleneck, not the network.
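To make that concrete, here's a rough REINFORCE sketch on CartPole (assuming torch and the classic gym step/reset API). The policy is a tiny two-layer MLP, and almost all the wall-clock time goes into the CPU-side environment loop, not the network:

    import gym
    import torch
    import torch.nn as nn

    env = gym.make("CartPole-v1")
    policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))   # tiny by SL standards
    optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)

    for episode in range(500):
        obs, log_probs, rewards, done = env.reset(), [], [], False
        while not done:
            logits = policy(torch.as_tensor(obs, dtype=torch.float32))
            dist = torch.distributions.Categorical(logits=logits)
            action = dist.sample()
            obs, reward, done, _ = env.step(action.item())
            log_probs.append(dist.log_prob(action))
            rewards.append(reward)
        # Plain REINFORCE update with the undiscounted episode return (no baseline, for brevity).
        loss = -torch.stack(log_probs).sum() * sum(rewards)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()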

👤 sxp
Instead of using your low-end GPU, you could get a TPU like https://coral.ai/docs/edgetpu/benchmarks/. Or rent a single GPU in the cloud for less than a dollar an hour, or even free in some cases.

In terms of APIs, you can try WebGPU, which is nominally meant for JavaScript in the browser, but there are native implementations such as Rust's wgpu: https://github.com/gfx-rs/wgpu


👤 bArray
This is mostly in the realm of computer vision, but I would recommend checking out AlexeyAB's fork of Darknet: https://github.com/AlexeyAB/darknet It's got decent CUDA acceleration; I personally train on a GTX 960M.

👤 artifact_44
Check out Andrej Karpathy's ConvNetJS and its deep Q-learning web demos.

👤 mate543
Not answering the question directly, but you could use a free GPU from Colab: https://colab.research.google.com/github/tensorflow/docs/blo... Note that you need to back up your checkpoints if you intend to run for more than a couple of hours.
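Something like this works, assuming a PyTorch loop; the Drive path, save interval, and the stand-in model are just placeholders:

    import os
    import torch
    import torch.nn as nn
    from google.colab import drive   # Colab-only

    drive.mount("/content/drive")
    ckpt_path = "/content/drive/MyDrive/rl_checkpoints/agent.pt"   # hypothetical location
    os.makedirs(os.path.dirname(ckpt_path), exist_ok=True)

    agent = nn.Linear(4, 2)                       # stand-in for your actual policy/value net
    optimizer = torch.optim.Adam(agent.parameters())

    for step in range(100_000):
        # ... environment interaction and gradient update go here ...
        if step % 1_000 == 0:
            torch.save({"step": step,
                        "model": agent.state_dict(),
                        "optim": optimizer.state_dict()}, ckpt_path)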

👤 bionhoward
GPU memory is a key limit here, and RL makes it worse because it needs some eligibility trace or memory system. One option is to store past (S, A, R, S') tuples on disk using something like DeepMind's Reverb, and keep the batch size small and the model simple. Or, as mentioned in other comments, you can just use the CPU and RAM, which often have far more capacity depending on your rig.
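As a rough illustration of the on-disk idea (this is not Reverb, just a plain numpy memmap buffer; the shapes and capacity are made up):

    import numpy as np

    capacity, obs_dim = 200_000, 84 * 84          # illustrative numbers
    obs      = np.memmap("replay_obs.dat",  dtype=np.uint8,   mode="w+", shape=(capacity, obs_dim))
    actions  = np.memmap("replay_act.dat",  dtype=np.int64,   mode="w+", shape=(capacity,))
    rewards  = np.memmap("replay_rew.dat",  dtype=np.float32, mode="w+", shape=(capacity,))
    next_obs = np.memmap("replay_next.dat", dtype=np.uint8,   mode="w+", shape=(capacity, obs_dim))
    size, ptr = 0, 0

    def add(s, a, r, s_next):
        """Write one (S, A, R, S') tuple to disk, overwriting the oldest entry when full."""
        global size, ptr
        obs[ptr], actions[ptr], rewards[ptr], next_obs[ptr] = s, a, r, s_next
        ptr = (ptr + 1) % capacity
        size = min(size + 1, capacity)

    def sample(batch_size=32):
        """Read a small random batch back into RAM for one training step."""
        idx = np.random.randint(0, size, batch_size)
        return obs[idx], actions[idx], rewards[idx], next_obs[idx]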

👤 vpj
Depends on what sort of RL you are doing. If you are training agents to play small games from vision, the agent will need a small CNN to process the images. That does want a GPU, and what you have should be enough.

I was training on Atari for a while with a 1080 Ti. The games run on the CPU, so you need a decent CPU as well.
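For scale, the usual Atari CNN is tiny by modern standards; something like the sketch below fits comfortably in 8GB while the emulator steps on the CPU (assumes torch; the 6 outputs are just the Pong example):

    import torch
    import torch.nn as nn

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    q_net = nn.Sequential(                                      # classic DQN-style convnet
        nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),   # 4 stacked 84x84 grayscale frames
        nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
        nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        nn.Flatten(),
        nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
        nn.Linear(512, 6),                                      # e.g. 6 actions in Pong
    ).to(device)

    frames = torch.zeros(32, 4, 84, 84)          # batch gathered from the CPU-side emulator
    q_values = q_net(frames.to(device))          # only this forward/backward touches the GPU
    print(q_values.shape)                        # torch.Size([32, 6])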