HACKER Q&A
📣 tayo42

Is making a self taught transition to AI/ML related fields possible?


Graduate school looks like way too much of a time and money commitment right now. A ton of this academic content seems to be free online anyway. I got into software with free content and classes online.

I'm wondering if anyone has had success moving into this field as a generalist engineer? I'd imagine advanced degrees aren't required for everything - ML infra, perf/optimization work, etc. Maybe learning materials, resume, and interview advice? Thanks in advance if you have an interesting answer!


  👤 benreesman Accepted Answer ✓
It’s very tough to do research-level ML without the whole track. There are exceptions, but if you want to publish papers at DeepMind, you probably want the graduate education.

But if you, like me, are happy to be an “understand and implement the paper” person instead of a “co-author the paper” person, that is eminently achievable via self-study and/or adjacent industrial experience. In fact, it’s never been easier as world-class education is more available on YouTube every day.

3Blue1Brown covers all the linear algebra, calculus, information theory, and basic analysis you need to get started. Ng and Karpathy have fantastic stuff. Hotz is writing a credible threat to PyTorch on stream/YT for the nuts and bolts accelerated computing stuff. Kilcher does accessible reviews of modern stuff. The fast.ai stuff is great.

This is all a lot easier if you can get a generalist/infrastructure role that’s ML-adjacent at a serious shop (that’s how I got exposed), but there’s enough momentum in open source now that even this isn’t a prerequisite.

I say fucking go for it, and if you want any recommendations on books or need someone to ping when you get stuck, feel free to email.


👤 srvmshr
I have an ML PhD - maybe slightly qualified to answer this question.

Yes, absolutely doable. Immerse yourself into learning things well. Learn the basics. Don't do course hopping & book hopping - pick a rock solid book or lecture series, devour it & get building. The last part is the most important part.

People with an ML MS/PhD only have two things extra: (1) the degree and (2) the network. If you invest time, you can overcome (2) by asking good questions on Twitter/Reddit and making connections. I still do it after finishing my degree. Twitter is the LinkedIn for ML.

As for (1), YSK that most advisees are being advised by professors who made it big before deep learning took off. So everyone is still on a learning curve of sorts - advisors, advisees, and your peers. Sometimes a student's intuition is better than the professor's. Don't sweat over it. Focus on building.


👤 Simon_O_Rourke
As a hiring manager for AI/ML: if you can get in the weeds and talk the talk with data structures and pipelines, as well as advanced SQL, then you're in with a good chance of making the final list. If you're a humanities major who's just done a few Coursera courses in AI at the weekends, then sorry, no.

Software engineers, by and large, can make great AI/ML practitioners - the specialization is a smaller leap than, say, from business analysts or other folks who're less likely to be able to install their own OS or automate tasks with scripting.


👤 coderintherye
Assuming you are good with a non-research role, then yes.

Easiest way to get into that work, in my opinion, would be to take a data engineering job on a team that has an AI/ML capacity and then start learning from that team and taking on some of the AI/ML tasks directly. Alternatively, you could take a role at a smaller business that needs a generalist but also wants to invest in AI/ML (though in this case you will be more on your own to self-learn, and it won't work quite as well as a stepping stone into a more pure AI/ML role).


👤 moandcompany
I'd argue that the majority of people in the AI/ML field professionally are self-taught or learned on the job.

👤 jvans
Yes, definitely. I am a generalist SWE, learned ML through Jeremy Howard's course, and transitioned to the ML team at my company. 98% of ML in the real world is building a robust software system to support the models. The other 2% is the actual modeling, and simple models go a long way.
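To make that 98/2 split concrete, here's a minimal sketch assuming a text-classification task and a scikit-learn + FastAPI stack (not necessarily what jvans used): the "modeling" is a handful of lines, and everything else - serving, validation, monitoring, retraining - is ordinary software engineering.

```python
import joblib
from fastapi import FastAPI
from pydantic import BaseModel
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def train(texts, labels, path="model.joblib"):
    """The '2% modeling' part: a plain TF-IDF + logistic regression baseline."""
    model = Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=2)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    model.fit(texts, labels)
    joblib.dump(model, path)
    return model


# The '98% software' part only starts here: serving, input validation, logging,
# monitoring, data-quality checks, retraining pipelines, and so on.
app = FastAPI()
model = joblib.load("model.joblib")  # assumes train() has already been run


class PredictRequest(BaseModel):
    text: str


@app.post("/predict")
def predict(req: PredictRequest):
    label = str(model.predict([req.text])[0])
    confidence = float(model.predict_proba([req.text]).max())
    return {"label": label, "confidence": confidence}
```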

👤 keeptrying
Yes definitely possible. I did this 6 years ago.

1. Find the best one or two courses for AI/ML. At the time, the Udacity self-driving course was great: it took you from a basic Udacity ML intro all the way to building a full system.

2. Allocate 2-4 hours a day for this. It's a heavy course and a lot to learn, so you have to work really hard to get it done.

3. The final project should be impressive to people in the field. For example, I implemented a YOLO alternative from the paper. You'll have to do something similar and show results.

Then getting a job is a completely different skill, and you'll be looking at a job on the margins. Keep backup options - lower-paying jobs at startups or non-tech companies - if your dream jobs don't pan out.

It could take 6-9 months to understand the content and then another 6-9 months to get a job. If you are okay with that, then do it.

Having an ML + software engineering background is a really good spot to be in.


👤 jmartin2683
I was self-taught the entire way and am currently working on the application/inference architecture side of things. I started in 2017 or so, just messing with TensorFlow for proof-of-concept stuff; years later, we're now using that inference architecture to deploy and build apps around models other teams have built.

It’s all very cool and you definitely don’t need anything at all to do any of it.


👤 araes
A suggestion that's slightly different from the others. Because of all the emerging fields opening up around the new capability for near-instant "quality" text/art production, there are a lot of opportunities in areas that effectively did not exist a year ago. We rapidly went from "AI/ML will be really cool someday, yet it's only good for chess" to "AI/ML is better than most non-specialists at really hard tasks."

Arguably, it's one of the best times to get involved. You may not want to fight over building the 99.9++% accurate system, yet a bunch of non-specialists were the first to extend some of these models and actually apply them to non-toy problems.

There are also a lot of sites/guides/walkthroughs that did not exist 6 months ago where you can rapidly get a feel for "what can this actually do?" [1][2][3][4][5][6]

[1] https://huggingface.co/stabilityai/stable-diffusion-2-1?text...

[2] https://rerenderai.com/

[3] https://open-assistant.io/chat

[4] https://hacks.mozilla.org/2023/07/so-you-want-to-build-your-...

[5] https://writings.stephenwolfram.com/2023/07/generative-ai-sp...

[6] https://www.convex.dev/ai-town


👤 bakuninsbart
You can get in via data engineering, but in a sense you are relegated to a support position. In my experience, data science has been uncharacteristically strict on degrees compared to other CS-like fields. Unless you are very passionate about it, I'd avoid that part of the industry if you can.

👤 buildbot
Going against the grain here - you will have a much better time finding an actual research job, if that is your goal, with a research degree. A thesis-based master's or PhD shows you can organize and present even incremental work in a useful way, moving the state of the art slightly vs. engineering around existing tools. It's 100% a subtle sliding scale - some engineering becomes research and much of research is engineering.

ML infra is just infra, with a different set of needs and issues than, say, Spring Boot infrastructure. Self-taught here is fine, and honestly I'd trust a software engineer with basically no ML experience to handle ML infra better than an ML researcher.


👤 madrox
I made this transition in my late 20s after ~10 years of generalist engineering. I don't believe I could have done it without going back to school. After a decade writing code, I found stochastic thinking to be non-intuitive, and school helped me practice a lot of concepts. However, that was before a lot of online ML material became available. YMMV

I think of it like the gym. You could get in there and start something tomorrow, but unless you've been taught good form there's a good chance you'll injure yourself.


👤 barefeg
The answers so far tend to be either "yes, it's very easy - just follow a boot camp or books, transition at work by working close to a DS/ML team, etc." or "no, you need lots of background knowledge, a track record of publications, etc."

I think the problem is that there's not a clear need for engineers with lower levels of expertise in some of these fields - that is, people without an MS in that specific ML subfield.

If the field stays relevant for long enough and demand becomes mainstream enough, then I imagine hiring managers will have to invest more resources in hiring people with less experience and training them on the job, similar to how developers are hired fresh out of short boot camps these days.

Therefore, the easiest way to transition would be to acquire practical and theoretical knowledge using the strategies given in the other answers, and then apply to ML teams where there's enough demand that they're willing to train you on the job. Of course, that's easier said than done. It might be interesting to hear some thoughts on whether this is already happening in certain fields.


👤 the_scoop
Yes, it's possible - I've spent the last 4ish years of my career making this transition and have only recently "made it" into a role that affords me the opportunity to spend most of my time as an AI/ML practitioner. Unfortunately there is still a ton of gatekeeping in this industry, where hiring managers often want to see PhDs and work published in journals when the positions they are hiring for have little to do with those experiences. The upside is that industry seems to be gradually changing as companies begin to understand that fundamental engineering skills are far more important than deep knowledge of the theory and math.

👤 heyitsguay
I talk to a bunch of engineers with this question at an SF AI meetup, and I'm really curious to hear what the community consensus is, but from my perspective, it's tough and getting tougher.

The issue isn't that it can't be done -- in fact, the greatest need right now is for engineers who can come in and build rock solid real world applications on top of commodified neural network architectures and weights, not PhD scientists. Your business might not even use its own ML model! You might just be calling an API.

The challenge is that for a few reasons, it's a very crowded market right now. A lot of people want to make a move into AI, yet for all the hype, the space of viable commercial applications that will survive without indefinite VC funding remains kinda small. Look how AVs are doing after billions and billions in funding chasing one of the most lucrative commercial possibilities imaginable. There's really cool stuff happening industry-wide, and commercial potential is growing, but nowhere near as fast as the cultural hype that has infected certain parts of tech space these past 12 months or so. Plus, many experienced ML engineers and scientists have been dumped back into the job market due to layoffs. So from the hiring side right now, for every AI posting there are tons of applications that have the cool portfolio, and then also a relevant degree and/or prior experience.

That's what you're competing against, so if you're going on portfolio alone it's got to be really outstanding. Way beyond doing the homework for a free course. Learning how to build an ML service that solves an actual problem in the real world reliably enough that you can actually use it should be the goal.

If you happen to be employed at a company where there is a need for an ML engineer in some capacity but no plans to hire one (hiring is expensive!), you can try stepping up to help out. Hiring challenges aside, it is absolutely possible to learn on the job the engineering skills needed to, say, build ML infra or work as part of an MLOps team. I recognize that's sort of just up to circumstance, though. If you look for a new job, be a little wary of any that want to hire you for a more mundane task (like data entry/cleaning/labeling) with a promise of getting to do the ML engineering stuff, too, "eventually". Such roles do exist, but it's also a common bait-and-switch tactic.

Anyway, that's what I've got as someone who has been thinking about how to help people looking to do what you're doing, but I hope this thread turns up more ideas too.


👤 amayne
I started working at a prominent AI company in 2020 with no formal background in AI/ML. I was able to bring an outside understanding of language and theory of mind to bear on large language models and create a role as a prompt engineer. My résumé was basically doing interesting things and helping make their models more useful.

Even though I'm not a formal researcher, I've been able to contribute to research projects and be included in papers because the field is so new.

The most important criterion I look for when I interview applicants is what they have built. GitHub repos, papers, even cool Product Hunt projects can have impact.


👤 visarga
It's even easier now than 12 years ago when I started: better tooling, more efficient pre-trained models, and the whole prompting paradigm that makes a month-long task doable in a day. You don't have to label as much as before - sometimes not at all.

When you train small models on small datasets you get very bad out-of-distribution results, but these LLMs have already seen everything on the web, so they are not OOD nearly as often.
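To make the "prompting instead of labeling" point concrete, here's a rough sketch assuming the OpenAI Python client (v1.x) and a made-up support-ticket label set; any chat-style LLM endpoint, hosted or local, would work the same way. A task that used to mean collecting a labeled dataset and training a small classifier can often be handled by a single zero-shot prompt.

```python
# Zero-shot classification via prompting (no labeled training data).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

LABELS = ["bug report", "feature request", "question"]  # hypothetical label set


def classify(ticket_text: str) -> str:
    """Ask the model to pick exactly one label for a support ticket."""
    prompt = (
        "Classify the following support ticket as exactly one of: "
        f"{', '.join(LABELS)}.\n\n"
        f"Ticket: {ticket_text}\n\n"
        "Answer with the label only."
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # any capable chat model works here
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    answer = resp.choices[0].message.content.strip().lower()
    # Fall back to the first label if the model answers off-script.
    return next((label for label in LABELS if label in answer), LABELS[0])


print(classify("The app crashes whenever I upload a file larger than 10 MB."))
```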


👤 bmitc
A side but related question: what are some interesting industrial applications of machine learning that are not related to advertising, consumption, back office efficiency, and other such things? In other words, what makes machine learning an interesting field to work in?

👤 neodypsis
There are some bootcamps available online (EdX), for example:

https://www.edx.org/course/columbia-engineering-machine-lear...


👤 Zetice
Just don’t get stuck thinking about it, that’s going to be your biggest barrier, realistically.

👤 didip
In ML Ops and Infrastructure? Very much so.

Actually coming up with a new model? I am not sure, maybe not.


👤 kebsup
I only had AI at my bachelor's, but we learned all of it from YouTube/Coursera anyway, because the learning materials there are simply better than the lectures 99% of professors give.

👤 flimsypremise
Making a self taught transition to any field is possible. You just have to be interested enough and motivated/disciplined enough to plow through the tough parts of the learning curve.

👤 thom
Worst case is you learn lots of interesting stuff that makes it easier to at least integrate other people’s work, so what’s the risk of diving in?

👤 _pdp_
Don't accept advice from anyone who has not made this transition themselves, no matter their credentials.

That being said, let's not fool ourselves into thinking that going from knowing nothing to being pretty good at this stuff requires formal, structured education. Programming is much more of a craft than it is either an art or a science, and the field of AI is no exception.


👤 bjornsing
If you can reliably publish at NIPS all will be forgiven.

👤 BasedAnon
yeah i did this, just work as a software engineer at an AI start up and you'll inevitably have to get your hands dirty with AI stuff

👤 calebm
Yes.