Two things I'm considering:
- Would the main problem with this ad-hoc GPU approach be cold boot? Loading everything could take a long time, though at data-center network speeds that may not be a big issue, since the fine-tuning itself would likely dwarf boot time.
- Is it possible to launch remote GPU instances ad-hoc from code? Is there a service that offers this? Every time a call comes in, we'd spin up a GPU.
Maybe the best approach for a V1 is to use the AWS SDK or something similar to just launch instances as calls come in.
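As a hedged sketch of that V1 idea using boto3 (the AWS SDK for Python): one GPU instance per incoming call via EC2 `run_instances`. The AMI id, instance type, and user-data bootstrap are placeholders I'm assuming, not anything prescribed; baking the training image into the AMI is one way to keep cold boot down to instance start time.

```python
def build_launch_params(ami_id: str, instance_type: str = "g4dn.xlarge",
                        user_data: str = "") -> dict:
    """Build an EC2 RunInstances request for a single ad-hoc GPU worker.

    ami_id, instance_type, and user_data are placeholders; pick an AMI with
    the training environment pre-baked so boot time stays small.
    """
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "UserData": user_data,  # bootstrap: pull weights/data, start the job
        # Have the job call shutdown when done so the instance terminates itself
        "InstanceInitiatedShutdownBehavior": "terminate",
    }


def launch_gpu_worker(ami_id: str) -> str:
    """Spin up one GPU instance for an incoming call; returns the instance id."""
    import boto3  # imported here so the param builder stays dependency-free

    ec2 = boto3.client("ec2")
    resp = ec2.run_instances(**build_launch_params(ami_id))
    return resp["Instances"][0]["InstanceId"]
```

The self-terminate behavior matters for cost: without it, every call leaves a GPU instance running after the fine-tune finishes.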
Appreciate the help!
1. Kubeflow pipelines
2. Cloud Run using GPU instances
3. Knative training
4. Banana.dev for launching GPU-bound stuff without much cruft