I am very much a beginner in the space of machine learning and have been overwhelmed by the choices available. Eventually I do want to build my own rig and just train models on that, but I don't have that kind of money right now, nor is it easy to find GPUs even if I could afford them.
So I am basically stuck with cloud solutions for now, which is why I want to hear personal experiences from HN folks who have used any of the available ML platforms: their benefits, shortcomings, which are more beginner-friendly, cost-effective, etc.
I am also not opposed to configuring environments myself rather than using managed solutions (such as Gradient) if doing so is more cost effective, or affords better reliability or better-than-average resource availability. I ask because I have read complaints that Colab has poor GPU availability since GPUs are shared among subscribers, and that the more you use it, the less time is allocated to you; I'm not sure how big a problem that actually is, though.
I am very motivated to delve into this space (it's been on my mind a while) and I want to do it right, which is why I am asking for personal experiences on this forum: HN has a very healthy mix of technology hobbyists and professionals, and both perspectives are equally valuable to me, for different reasons.
Also, please feel free to include any unsolicited advice such as learning resources, anecdotes, etc.
Thanks for reading until the end.
While the (precious and useful) advice here seems to cover mostly the bigger infrastructures, please note that you can effectively do an important slice of machine learning work (study, personal research) with just a power-efficient CPU (no GPU), on the order of minutes, on battery power. That comes before going to "Big Data".
And there are lightweight tools: I am currently enamoured with Genann («minimal, well-tested open-source library implementing feedforward artificial neural networks (ANN) in C», by Lewis Van Winkle), a single C file of 400 lines compiling to a 40 kB object, yet quite sufficient to solve a number of the problems you may meet.
https://codeplea.com/genann // https://github.com/codeplea/genann
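To make the "CPU-scale is enough" point concrete, here is a minimal sketch — plain Python, not Genann's C API, and every name is my own illustration — of the kind of tiny feedforward network such a library implements:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyNet:
    """A 2-2-1 feedforward network with sigmoid activations, stdlib only."""

    def __init__(self, rng):
        self.w1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
        self.b1 = [0.0, 0.0]
        self.w2 = [rng.uniform(-1, 1) for _ in range(2)]
        self.b2 = 0.0

    def forward(self, x):
        # Hidden layer, then a single sigmoid output in (0, 1).
        self.h = [sigmoid(w[0] * x[0] + w[1] * x[1] + b)
                  for w, b in zip(self.w1, self.b1)]
        self.y = sigmoid(self.w2[0] * self.h[0] + self.w2[1] * self.h[1] + self.b2)
        return self.y

    def train_step(self, x, target, lr=0.1):
        # One gradient-descent step on the squared error (y - target)^2.
        y = self.forward(x)
        d2 = 2.0 * (y - target) * y * (1.0 - y)                 # output delta
        d1 = [d2 * w * h * (1.0 - h) for w, h in zip(self.w2, self.h)]
        self.w2 = [w - lr * d2 * h for w, h in zip(self.w2, self.h)]
        self.b2 -= lr * d2
        for i in range(2):
            for j in range(2):
                self.w1[i][j] -= lr * d1[i] * x[j]
            self.b1[i] -= lr * d1[i]
        return (y - target) ** 2
```

Looping `train_step` over a handful of (input, target) pairs fits toy problems in seconds on any laptop CPU, and you get to see every moving part of backpropagation.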
After all, is it a good idea to use tools that automate process optimization while you are still learning the trade? Only partially. You should build - in general, and even metaphorically - the legitimacy of your Python operations on a solid C foundation.
And note that you can also build ANNs in R (and other math or stats environments), if that's what you need or are more comfortable with.
Also note - as a reminder - that Prof. Patrick Winston's MIT lectures for the Artificial Intelligence course (classical AI with a few lectures on ANNs) are freely available. They cover the ground before a climb into the newer techniques.
My advice is to go with Colab Pro ($50/mo) and TensorFlow/Keras. You can go with PyTorch too if you prefer.
I made the mistake of buying a 2080 Ti for my desktop thinking it would be better, but no. Consumer-grade hardware is nowhere near as good/fast as the server-grade hardware you get in Colab. Plus you have the option to use TPUs in Colab if you want to scale up quickly.
You really don't need to get fancy with this setup. The best part of using Colab is you can work on your laptop from anywhere, and never worry about your ML model hogging all your RAM (and swap) or compute and slowing your local machine down. Trust me, this sucks when it happens, and you have to restart!
As for your data, you can host it in a GCS bucket. For small data (<1 TB), even better is Google Drive (I know, crazy). Colab can mount your Google Drive and loads from it extremely quickly. It's like having a remote filesystem, except with a handy UI, collaboration options, and an easy way to inspect and edit your data.
I use a Paperspace VM + Parsec for personal ML projects. Whenever I've done the math, the hourly rate on a standard VM with a GPU beats purchasing a local machine, and the complexity of a workflow-management tool for ML just isn't worth it unless you are collaborating across many researchers. As an added bonus, you can reuse these VMs for any hobby gaming you might do.
The majority of ML methods train quickly on a single large modern GPU for typical academic datasets. Scaling beyond one GPU or one host leads into big-model research; while big models are a hot field, that is where you would need large institutional support to do anything interesting. A model isn't "big" these days unless it's > 30 GB :)
Even in a typical industrial setting, you'll find the majority of scientists using various Python scripts to train and preprocess data on a single server. Data wrangling is the main component that requires large compute clusters.
As for software, I do everything with JAX, plus TensorBoard for viewing experiments. JAX is a phenomenal library for personal ML learning, as it's extremely flexible and has relatively low-level, composable abstractions.
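To give a taste of those composable abstractions, here is a minimal sketch using the standard `jax.grad` and `jax.jit` transforms (the linear-regression loss is just an illustrative toy):

```python
import jax
import jax.numpy as jnp

def loss(w, x, y):
    """Mean squared error of a linear model x @ w against targets y."""
    pred = jnp.dot(x, w)
    return jnp.mean((pred - y) ** 2)

grad_fn = jax.grad(loss)      # transform: d(loss)/dw, by default w.r.t. arg 0
fast_grad = jax.jit(grad_fn)  # transform: JIT-compile the gradient function

w = jnp.array([1.0, -1.0])
x = jnp.array([[1.0, 2.0],
               [3.0, 4.0]])
y = jnp.array([0.0, 1.0])
g = fast_grad(w, x, y)        # gradient, computed by the compiled function
```

The point is that `grad`, `jit`, `vmap`, etc. are orthogonal function transforms you stack as needed, rather than a monolithic training API.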
I am biased towards using Keras and I suggest you bookmark these curated examples https://keras.io/examples/
I bought an at-home GPU rig 3 years ago and I regret that decision. As many other people here have mentioned, Google Colab is a great resource and will save you a lot of time because you will not be setting up your own infrastructure. Start with the free version and, when you really need to, switch to Pro or Pro+.
For more flexibility, set up a GPU VPS instance that you can stop when not in use to save money. I like GCP and AWS, but I used to use Azure and that is also a great service. When a VPS is in a stopped state, you only pay a little money for storage. I will sometimes go weeks without starting up my GPU VPS to run an experiment. Stick with Colab when it is good enough for what you are doing.
Now for a little off topic tangent: be aware that most knowledge work is in the process of being automated. Don’t be disappointed if things you spend time learning get automated away. Look at the value of studying new tech as being very transitory, and you will always be in the mode you are in right now: a good desire to learn new things. Also, think of deep learning in the context of using it for paid work to solve real problems. As soon as you feel ready, start interviewing for an entry level deep learning or machine learning job.
Learn Machine Learning first. Do not spend time on managing infra for ML while you are learning ML. Focus on learning ML first.
You can make decent cutting edge models and SOTA classic models just with free options. I am saying this because I have done this.
I suggest that you get Colab Pro after that.
AWS burns a hole in your pocket, and you should not spend money on it right now. That said, AWS SageMaker is a pretty stress-free experience.
I personally use GCP; I find its tooling the most convenient.
I suggest you learn the basics first. Learn classic ML, CNNs, RNNs, LSTM, Transformers, learn the necessary Maths, and even GANs if you are inclined.
If done in the right way, it will take you anywhere from 5-6 months to 18-20 months, depending on your time commitment and your current grasp of programming and math.
Do not rush or hurry.
When you reach that point, you can think of spending serious money for Deep Learning projects.
A few months back, I got into TPUs, and they are fantastic. GCP is my only option for those. I have only used TPUs for learning and personal projects, never for work, and I intend to keep it that way for a while.
This is my default go-to as a poor man's ML setup, with the environment and dependencies set up automatically via a bash script on startup.
In terms of frameworks, PyTorch seems to be better documented than TensorFlow and, in my opinion, supports a more intuitive model for GPU/TPU compute. It also natively supports complex number types when backpropagating, so there is no need to implement your own. TensorFlow also seems to have issues converting Python code to its graph, where PyTorch basically never does; it can take me a third less time to write a program in PyTorch because of this. If you are using high-level interfaces, this shouldn't be an issue, though.
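As an illustration of that compute model, device placement in PyTorch is a couple of explicit lines (a minimal sketch with the standard `torch` API; the model and shapes are arbitrary):

```python
import torch

# Device-agnostic setup: fall back to CPU when no GPU is present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(4, 2).to(device)  # toy model, arbitrary sizes
x = torch.randn(8, 4, device=device)      # batch of 8 fake samples
out = model(x)                            # runs on GPU if one is available
```

The same script then works unchanged on a laptop CPU, a local GPU, or a Colab instance.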
Colab (and, I believe, SageMaker) has free instances with high-powered GPUs/TPUs. However, I prefer having access to a good graphical debugger, so I develop on my local computer, then run large models on Colab. If you can afford it, I'd recommend a cheap, low-power CUDA-capable GPU for your local computer to develop the network, then an IPython-based cloud solution when memory/compute becomes limiting. Those are also a fine place to start out; it's just that having a graphical debugger can make you more productive.
We use Redshift to do a lot of the heavy lifting and initial data preparation, then SageMaker for hosting and scoring models, and Tableau for dashboards.
While you can do training within SageMaker, we have a cluster of EC2 instances using H2O libraries (XGBoost) to train; we then wrap the resulting model as a Docker image, deploy it to ECR, and link it to a SageMaker endpoint.
Clunky and very much human-in-the-loop for training and deployment, but you can't run before you can crawl in this space.
A lot of end-to-end platforms are available nowadays that try to cover the entire lifecycle of a model, from data prep and ETL to training, serving, monitoring, and operations. However, I found none of them robust enough to cover all these cases well, so I resorted to combining pieces from different vendors with my own tooling to make the platform suit my needs. This is still not perfect, though, and I think there's a lot of room for improvement in the space to enable truly easy-to-use and scalable MLOps.
Still, some of the tools I found to be OK: TensorFlow TFX, Kubeflow (to some extent - ops are a nightmare), Feast, and MLflow. GCP Vertex and AWS SageMaker can get some work done, too.
Colab is great for diving into examples that are already premade for it.
Kaggle is better, in my opinion, at dataset handling: you can import a public dataset or upload your own with ease. They give you 30+ GPU hours per week, with the ability to train your models in the background. This can't be done in Colab.
The Azure ML platform is next-level when you can pay for it. I got credits from school. You can start experiments from the Python SDK with your own configurations, set up Python environments, upload datasets, etc.
Look at some kind of AutoML framework like AutoGluon, then dive deeper into the components it uses once you've gotten through the initial setup. AutoGluon will let you train some basic models with all the data cleaning and normalisation steps handled for you.
vast.ai has pretty low prices and gives you remote ssh into a GPU instance that you then have root on (albeit containerized).
Having a local GPU is effectively a requirement for doing "development" work (e.g. getting an architecture and/or codebase to the point where you would even be able to start training). Unfortunately, getting your own GPU is just absurdly expensive these days and probably not worth it. In the meantime, colab/kaggle/paperspace can be _okay_ as dev environments. Unfortunately, renting compute on vast.ai all day just to do occasional dev work gets expensive pretty quickly.
For something in-between vast.ai and AWS, datacrunch.io has slightly higher prices, with remote SSH into a server and a few more "niceties" that you get with a traditional cloud such as CPU instances and the ability to use those to pre-load data onto disk.
If and when you are able to get a GPU - just make sure to get Nvidia, as they have a stranglehold on the industry. The RTX cards are great - I've been doing tons of multimodal work on an RTX 2070 I bought pre-pandemic for around $350. It only has 8 GiB of VRAM but is otherwise actually quite similar to a server-style V100. I assume it probably costs something like $2000 these days.
If you're interested in the realm of running inference/training on giant models (say GPT-J, or 20B-scale models like GPT-NeoX), you may find yourself short on VRAM. Using libraries like DeepSpeed, you can split the work across multiple GPUs. I highly recommend investing time in learning multi-GPU libraries or framework-provided features like PyTorch's DistributedDataParallel, as model size becomes a limiting factor very quickly in the case of transformers. A sibling comment mentions that you will need institutional support to train such models. That may be true, unfortunately. All I will say is that if you are even mildly competent, demand for that type of work has been increasing a lot lately.
Oh, and yes - there is a newish site called Replicate (https://replicate.com/) that I have been using to let people run inference on models I've trained without needing to be coders. A lot of people use Colab for this, but that platform is annoying to support in practice.
For smaller projects, I generally find a Towhee pipeline (https://towhee.io/pipelines) that I then fine-tune on my 3080.
For general advice focused on beginners, and ESPECIALLY for practical, cheap, and efficient methods and hacks for doing DL, I recommend searching https://www.fast.ai/ and their forums https://forums.fast.ai/
I'll try to search inside fast.ai for a more specific link to give. I know that one of their chief pieces of advice has been to use Colab, and to take advantage of the $300 free credit you get (per credit card) when signing up for Google Cloud, which you can use for DL.
Disclaimer - I'm one of the creators of DagsHub, we created the platform especially to help people like you with the difficulties of managing things like data and model versioning, experiment tracking, labeling, etc. we'd love to have you onboard, and thanks for reading until the end :)
Anecdote: When I was taking the 'Computing For Data Science' class, we had a task to learn to use AWS tools like SageMaker, the NLP bot, or DeepRacer and present them in class. The professor was also new to the whole AWS ecosystem. He opened many instances and left them running for a week, which ended up taking $1000 from his bank account. (Moral of the story: don't use AWS with the card that holds all your money.)
You can give it a try as well: https://deploif.ai (It says paid on the website, but just get on our Discord and message me). The platform now supports GCP and Azure as well. I am happy to guide you through as well. It's not complete, but in case you choose to go ahead with cloud, this could help you out :)
We'd also be happy to have someone try the tool!
The problem with Colab, IMO, is that if it's your main platform, you'll be pushed to use notebooks for everything, which is not really good practice. Whatever you use, I'd suggest focusing on building a real train.py script (I'm assuming you'll be using Python) that takes command-line arguments for the hyperparameters. Don't get sloppy and just run things as a bunch of cells.
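A minimal sketch of what such a train.py skeleton could look like, using only the standard library's `argparse` (all flag names and defaults are illustrative):

```python
import argparse

def build_parser():
    # Hyperparameters become reproducible CLI flags instead of edited cells.
    p = argparse.ArgumentParser(description="training entry point (sketch)")
    p.add_argument("--lr", type=float, default=3e-4, help="learning rate")
    p.add_argument("--batch-size", type=int, default=32)
    p.add_argument("--epochs", type=int, default=10)
    p.add_argument("--data-dir", default="data/")
    return p

def main(argv=None):
    args = build_parser().parse_args(argv)
    # ... build the model and data loaders, then loop over args.epochs ...
    return args

if __name__ == "__main__":
    main()
```

Run as `python train.py --lr 0.01 --epochs 3`; every experiment is then reproducible from its command line alone.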
If you are learning, my unsolicited advice is: don't use built-in datasets. Make sure you can write datasets/dataloaders yourself so you understand what is going on and can adapt them to your own work. All the stock examples using built-in MNIST or whatever gloss over the most important part: setting up the data.
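For example, a dataset plus a batching loader can be written in a few lines of plain Python. This sketch mirrors the `__len__`/`__getitem__` protocol that PyTorch's map-style `Dataset` uses, but with no framework dependency (the names are my own):

```python
import random

class ToyDataset:
    """Index-based dataset: maps an integer index to a (features, label) pair."""

    def __init__(self, samples):
        self.samples = samples            # list of (features, label) pairs

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, i):
        return self.samples[i]

def batches(dataset, batch_size, shuffle=True, rng=None):
    """Minimal dataloader: yields lists of samples, optionally shuffled."""
    idx = list(range(len(dataset)))
    if shuffle:
        (rng or random).shuffle(idx)
    for start in range(0, len(idx), batch_size):
        yield [dataset[i] for i in idx[start:start + batch_size]]
```

Once you can write this yourself, swapping in a real framework's Dataset/DataLoader (or adapting to your own files) is straightforward.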
As an example: one of the reasons I don't use Kubeflow is that it requires having a Kubernetes cluster up and running, which is overkill in many cases.
Check out the project I'm working on: https://github.com/ploomber/ploomber
Finances aside, it's really nice being able to iterate locally on things like training/inference pipelines and model serving. My work is more toward the ML engineering space than it is research, so I don't spend much time in Colab.
I personally find most cloud providers annoying to use for personal projects. You have to ask for permission to get access to a GPU that's no better than what you get for free with Colab. Then there's all sorts of configuration you have to do. Colab is much easier, with basically zero wait time between logging in and starting to run code.
At work we use Databricks, which is too expensive for personal use.
I have no affiliation with them whatsoever :) Just a fan of what they’re doing.
I think you are getting sidetracked by a bunch of people at a car show with their hoods popped, checking out each other's custom chrome engines. It is a bit pointless to worry about that stuff if you don't even know how to drive yet.
For actual deployment in production, the only thing that's really affordable is sending your own GPU workstations to a colocation hosting company. But that's a lot of work.
Both of them work great for scratch projects.
https://elbo.ai - Train more. Pay less
We want to make ML tasks as cheap and as easy as possible. We can provision GPU nodes from multiple cloud providers (today we have 4: TensorDock, AWS, Linode, and FluidStack). You don't have to sign up with them or manage keys, passwords, AMI images, VPCs, subnets, firewall rules, or EBS volumes, or worry about Colab closing your session, network-transfer bills, GPU-usage approvals, opening ports, or billing surprises. We take care of all that and let you focus on learning ML.
I faced the same problem when I started learning ML and tried different cloud providers, Colab, Paperspace, and a custom PC with an RTX 30-series GPU. Most of the solutions were either very expensive or very complicated. I started building a tool for myself to deploy GPU nodes with a single command, and thought it would be a nice product for other ML learners like me.
1. Sign up at https://elbo.ai for the free tier.
2. `pip3 install elbo`
3. `elbo login` with your token (from signup)
4. Start a Jupyter notebook with a single command, typically in under 4 minutes: `elbo notebook`
5. Set up a GPU node to work on remotely over SSH using `elbo create`
6. Submit ML tasks defined in a YAML file using `elbo run --config `
Quick start guide - https://docs.elbo.ai/quick-start
CLI reference - https://docs.elbo.ai/reference/cli-reference
Looking at our inventory today, you can get a decent Quadro 4000 GPU with 16 CPU and 32 GB memory for about $0.61 an hour.
PRICE GPU CPU MEM GPU-MEM PROVIDER
$ 0.2700/h Tesla K80 4 61Gb 12Gb AWS (spot)
$ 0.6100/h Quadro 4000 16 32Gb 8Gb TensorDock
$ 0.9000/h Tesla K80 4 61Gb 12Gb AWS
$ 0.9180/h V100 8 61Gb 16Gb AWS (spot)
$ 0.9200/h Quadro 5000 2 4Gb 16Gb FluidStack
$ 0.9600/h A5000 2 16Gb 24Gb TensorDock
$ 1.4900/h A4000 12 64Gb 16Gb FluidStack
$ 1.4940/h A40 2 12Gb 48Gb TensorDock
$ 1.5000/h Quadro 6000 8 32Gb 0Gb Linode
$ 1.5140/h A6000 2 16Gb 48Gb TensorDock
$ 2.1600/h 8x Tesla K80 32 488Gb 12Gb AWS (spot)
$ 3.0000/h 2x Quadro 6000 16 64Gb 0Gb Linode
$ 3.0600/h V100 8 61Gb 16Gb AWS
$ 3.6720/h 4x V100 32 244Gb 16Gb AWS (spot)
$ 3.7460/h 7x V100 6 8Gb 16Gb TensorDock
$ 4.3200/h 16x Tesla K80 64 732Gb 12Gb AWS (spot)
$ 4.5000/h 3x Quadro 6000 20 96Gb 0Gb Linode
$ 6.0000/h 4x Quadro 6000 24 128Gb 0Gb Linode
$ 7.3440/h 8x V100 64 488Gb 16Gb AWS (spot)
$ 7.9200/h 8x Tesla K80 32 488Gb 12Gb AWS
$ 9.8318/h 8x A100 96 1152Gb 80Gb AWS (spot)
$13.0360/h 4x V100 32 244Gb 16Gb AWS
$14.4000/h 16x Tesla K80 64 732Gb 12Gb AWS
$24.4800/h 8x V100 64 488Gb 16Gb AWS
$32.7726/h 8x A100 96 1152Gb 80Gb AWS
If you just need a dedicated machine in the cloud, then I would highly recommend our provider TensorDock (https://tensordock.com/). They have a good range of ML-capable GPUs and are cheaper than many other cloud providers. We are just getting started, so if you hit any glitches or bugs, please email us at hi@elbo.ai
Thanks for reading till here and for your time!
EDIT: Updated formatting.