HACKER Q&A
📣 newsoul

In 2022, what is the proper way to get into machine/deep learning?


By getting into machine or deep learning I mean building up to a stage where I can do ML/DL research, whether applied research or core ML/DL theory. Of course, the paths to the two will be quite different.

Standing in 2022, what are the best resources for a CS student/decent programmer to get into the field of ML and DL on their own? Resources can be books or public courses.

The target ability:

1. To understand the theory behind the algorithms

2. To implement an algorithm on a dataset of choice. (Data cleaning and management should also be learned)

3. Read research publications and try to implement them.


  👤 alanfranz Accepted Answer ✓
My 2c (not exhaustive for what you want to do, probably):

1) Get some statistics/probability basics. The field is full of people (you can see a lot of such analyses on Kaggle) who "do machine learning" but make very silly mistakes (e.g. turning categorical data into floats and using it as a continuous variable when training a model).
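The categorical-as-float mistake is easy to demonstrate; a minimal sketch with pandas (the toy "city" column is made up):

```python
import pandas as pd

# Toy data: "city" is categorical; its values have no numeric order.
df = pd.DataFrame({"city": ["paris", "tokyo", "paris", "lima"]})

# The silly mistake: casting category codes to floats invents a fake
# ordering (lima < paris < tokyo) that a model will happily exploit.
bad = df["city"].astype("category").cat.codes.astype(float)
print(bad.tolist())  # [1.0, 2.0, 1.0, 0.0]

# The usual fix: one-hot encode, one indicator column per category.
good = pd.get_dummies(df["city"], prefix="city")
print(good.columns.tolist())  # ['city_lima', 'city_paris', 'city_tokyo']
```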

2) Take a look at traditional machine learning approaches. Nowadays you're swamped by DL (there are a lot of good suggestions in this thread, so I won't chime in), and you miss the fact that, sometimes, a simple decision tree or a dimensionality reduction approach (e.g. PCA or ICA) can yield incredible value in a very short time on huge datasets.
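To see how quickly the traditional tools pay off, a sketch using scikit-learn's bundled iris data (exact numbers may vary slightly by scikit-learn version):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# A depth-3 decision tree, trained in milliseconds, is already a strong baseline.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
print(cross_val_score(tree, X, y, cv=5).mean())  # well above 0.9

# PCA: 2 of the original 4 dimensions keep roughly 98% of the variance.
pca = PCA(n_components=2).fit(X)
print(pca.explained_variance_ratio_.sum())
```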

I wrote a fairly short post about it when I finished my Georgia Tech path: https://www.franzoni.eu/machine-learning-a-sound-primer/

3) It can take a lot of time to become effective in ML, effective as in: what you _manually create_ works as well as picking an existing trained model, fine-tuning it, and using it. This can be frustrating: the low-hanging fruit is pretty powerful, and you don't need to understand much about ML algorithms to pick it up.

4) Consider MOOCs or online classes. I took Georgia Tech OMSCS and can vouch for it; some classes force you to be a data scientist and read papers as well, and you get "real world" recognition and can discuss with your peers, which is useful!


👤 ford
I work in ML - I might make 3 buckets for ML careers right now:

1. ML/DL Researcher

2. Data scientist - 20/80 engineering vs modelling

3. ML Eng - 50/50 (or 70/30) engineering vs modelling

People suggesting working in engineering to support ML are right that there's a lot of demand, but it's not what you're asking for.

Becoming an ML/DL researcher working on novel techniques or new models will be hard without academic research experience. Few companies are big enough to support true research, and the ones that are have a very high bar even for people with PhDs.

What I call "data scientists" apply math/ML to real problems. The people I see here have a quantitative background like physics/math/CS. Often they have more general quantitative skills that go beyond ML. People like this might work on things like fraud, where an eng pipeline exists and small improvements in the model are valuable.

There are more of these roles than "true research" roles, and they exist at small companies because the work is applied. You can get into this with demonstrated evidence in side projects plus a convincing background, but professional education might be the surest way.

Finally - there's a lot of demand for engineers who can do both modeling and the requisite engineering. A model is a small part of what goes into a production ML feature - you need a data pipeline, automated retraining/prediction, a place to deploy the model, monitoring on eng stats + data stats, and the usual application backend/frontend to do something with the results.

You might be able to get into this with some demonstrated experience in side projects assuming you're a SWE already, and depending on your standards for where you want to work.


👤 santiagobasulto
Can I suggest a longer, but (I think) better route?

Try the Data/ML Engineer route. Instead of going directly into ML, try to work as a “supporter” of those doing ML. There’s a HUGE gap there, especially if you’re a good programmer.

There are a lot of people in the “pure” ML space: people with science backgrounds, with PhDs, etc.

But there aren’t enough people to support them: taking their models to production, building their pipelines, etc.

If you get into Data/ML engineering, you’ll be working with these people and learning from them.

It’s a longer route for sure, but I think it can yield the highest success rate.


👤 bluelightning2k
FastAI. Specifically "Deep Learning for Coders" which was recently updated. https://course.fast.ai/

Do what the instructor recommends: watch each lesson once in its entirety and then re-watch it while playing along. But don't just type their commands verbatim. Try and do something slightly different.


👤 urthor
1. Clarify your goal.

Do you want to:

a) Become an academic in mathematics/statistics.

b) Become an academic in computer science with a focus on artificial intelligence.

c) Become an MLE in "regular" statistical applications, i.e. Bayesian classification and "core" statistical principles.

d) Become a specialized computer vision/natural language processing focused MLE.

e) Become a generalist software engineer who can whip out the above if needed.

In no way is e) the inferior option.

Generalists who can write code fast with 100% test coverage and pristine logging are by far the segment the industry has the shortest supply of.

There are TONS of math guys. Vanishingly few Principal Engineers who can write a design document and lead a project.

(Machine learning customers are OBSESSED with test coverage and verifiability. Believe it or not, multinational corporations generally don't want to unleash a {your_adjective_here}ist algorithm on the world.)

2. Study the above, properly.

To study the math: Elements of Statistical Learning (Hastie et al.) or Deep Learning by Goodfellow.

Start on page 1, do every second exercise. Publish a summary of every chapter you finish with your answers to GitHub.

3. Pursue your goal in a publicly verifiable manner.

See:

https://news.ycombinator.com/item?id=32071137


👤 sitkack
I am going to give you some meta commentary.

> ML/DL research

I think you should apply ML deeply to a domain you care about, but see if you can find a domain that works both for generation and for understanding. If you are heavy into the math and don't need a grounding basis, maybe you don't need a domain to apply the ML research to, but the best scientists had a problem they were trying to solve, not just "doing research". Basically: research in a strong direction, for a strong purpose, solving a problem.

I'd guess you asked a low-level mechanical question: how do I get from A to B? You might already have the domain.

So to answer the actual question: I'd pick something like MNIST (the digit recognition problem) and master it by hand from scratch using multiple techniques, as many as I could find. By applying each algorithm to one fixed problem, the algorithm, and later the papers the algorithm appears in, get embedded in my mind.

Use only cleaned datasets; spend zero energy on cleaning at the beginning. Cleaning is a separate job, and you don't need to learn two different things at once here. In fact, stick with industry benchmark data so you can compare your results to more papers.


👤 jamesdhutton
Andrew Ng's machine learning course on Coursera is a good introduction to the theory. https://www.coursera.org/learn/machine-learning

👤 wanderingmind
If you are looking for machine learning outside of Deep Learning, there are just 2 books

1. Elements of Statistical Learning by Hastie et al. (a very frequentist treatment) [1]

2. Pattern Recognition and Machine Learning by Bishop (for a Bayesian treatment) [2]

Both are freely available online. Reading one will put you among the top 5% of practitioners, and reading both among the top 1%.

[1] https://hastie.su.domains/Papers/ESLII.pdf

[2] https://www.microsoft.com/en-us/research/uploads/prod/2006/0...


👤 jstx1
Have you taken undergraduate-level linear algebra, multivariable calculus and probability? Those are the prerequisites if you want to approach things a bit more rigorously. If you've covered them, get something specific to ML.

I like Hands-On ML... by Geron as a decent intro ML book. FastAI seems a bit overrated to me; I didn't like that it uses its own helper library, or the teaching style, but it obviously works for other people.

Then there are more exhaustive books on theory: Elements of Statistical Learning, Pattern Recognition and Machine Learning, Bayesian Reasoning and Machine Learning, Murphy's books on probabilistic ML, etc. But the theory books have a lot of overlap with each other, so there will be lots of material to skip after you've read one or two of them.


👤 lexandstuff
Fast.ai [1] would still be my course of choice, especially for a strong programmer.

After completing that, I think Kaggle competitions are a great way to hone your skills.

[1] https://course.fast.ai


👤 phonebucket
> Read research publications and try to implement them.

In my opinion, jump straight into this! Learn prerequisites as you need them.

I found Goodfellow's book [1] to be helpful to learn some basics.

But don't think you need to read the whole book before you start reading and implementing research papers.

If you try to build up all the fundamentals thoroughly, you run the risk of going down a very deep rabbit hole, e.g. learning real analysis so you can learn measure theory so you can learn measure-theoretic probability theory so you can learn stats properly, etc.

You can be a productive researcher and patch up the fundamentals over time.

[1] https://www.deeplearningbook.org/


👤 fxtentacle
Follow the HuggingFace Colab notebooks. They are well-written and language-related AIs are a great way to get started because you'll naturally have a feeling for what it should produce.

Afterwards, do a statistics class. Most algorithms these days are trained with a softmax and a cross-entropy loss, i.e. the cross-entropy between two discrete/continuous probability distributions. There's a lot of choice in which distribution to use to model what, and it will have strong effects on your gradients and, hence, your training trajectory.

Concepts like Shannon information and entropy are also very helpful for monitoring training progress. Typical loss values decay roughly exponentially, and at some point it becomes difficult to see further progress. But if you are still reducing the bits of entropy in your classifier, learning is still going well. So you need to understand what to visualize and how to calculate it.
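To make the "bits of entropy" idea concrete, here is a small sketch of how one might monitor the average prediction entropy of a classifier (NumPy; the prediction batches are made up):

```python
import numpy as np

def mean_entropy_bits(probs):
    """Average Shannon entropy (base 2) of a batch of predicted distributions."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return float(-(p * np.log2(p)).sum(axis=1).mean())

# Early in training: near-uniform predictions over 4 classes -> 2 bits.
early = np.full((8, 4), 0.25)
# Later: confident predictions -> far fewer bits, even if the loss curve
# already looks flat.
late = np.tile([0.97, 0.01, 0.01, 0.01], (8, 1))

print(mean_entropy_bits(early))  # 2.0
print(mean_entropy_bits(late))   # ~0.24
```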

As for implementing research publications, maybe start with easy mode and go to paperswithcode.com. There you will find papers AND their source code, so you can look at how others implemented their papers.

As for FastAI and Kaggle, my personal impression is that they're mostly for toy problems. No real AI researcher would be willing to disclose their full source code to an international megacorp like H&M for a measly $15k in prize money, yet similar terms appear to be the default on Kaggle:

https://www.kaggle.com/competitions/h-and-m-personalized-fas...

https://www.kaggle.com/competitions/dfl-bundesliga-data-shoo...

https://www.kaggle.com/competitions/feedback-prize-effective...

EDIT: Also, I strongly disagree with course.fast.ai on these points: "Myth (don’t need): Lots of math, Lots of data, Lots of expensive computers" To train a state of the art ASR AI, you need roughly 100x A100 for a month, 100,000+ hours of audio recordings, and math knowledge to find a maximum likelihood path through a logit matrix. Unless, of course, you're only working on toy problems.


👤 nonrandomstring
You could do worse than going back and making sure your foundation maths is solid. Revise some discrete maths books and understand identification, classification, sets, equivalence... Make sure you have solid ground on concepts like dimensions, functions, differentiation, integration, extrapolation, and interpolation; then toughen up your linear algebra, optimisation, solving, and regression before getting into approximation and gradient descent.
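As a taste of where that list is heading, linear regression fit by plain gradient descent takes only a few lines (NumPy; the synthetic data and step size are arbitrary choices):

```python
import numpy as np

# Synthetic regression data: y = X @ true_w + small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
true_w = np.array([2.0, -3.0])
y = X @ true_w + 0.01 * rng.normal(size=100)

w = np.zeros(2)  # parameters to learn
lr = 0.1         # step size
for _ in range(500):
    residual = X @ w - y
    grad = 2.0 / len(y) * X.T @ residual  # gradient of the mean squared error
    w -= lr * grad

print(w)  # close to [2.0, -3.0]
```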

Sure, most of this sounds as dull as a broken clock, but in my observation it makes the difference between students who can merely use machine learning tools by copying textbook cases and adopting a lot of fancy new terminology, and those who understand what they're doing.

That difference really kicks in once you get off the beaten track of popular use-cases, into applying ML to new, unproven applications. Then you need a deeper understanding of why some algorithms may be useful and others are inappropriate.


👤 samuell
There are already good recommendations here for getting into the "standard practice" of ML.

To really understand what is going on, though, the path I'm having some early success with (as a long-time developer / data pipeline guy, but new to the standard Python / ML practice) is to run through Kochenderfer's "Algorithms for Optimization" from 2019 (MIT Press), including implementing the exercises, as optimization is the cornerstone of the majority of ML methods. Some of the most fun I've had in a long time.

Freely available here:

https://algorithmsbook.com/optimization/

From there on, I'm less sure, but expect I might experiment with implementing my own deep learning methods just for fun, or similar.


👤 brw95
As always, it depends.

I started out without *any* background knowledge a few years ago. Found the Data Scientist career track of Datacamp pretty helpful, since it goes beyond programming and includes the mathematical and statistical theories as well. (https://www.datacamp.com/tracks/data-scientist-with-python)

It's basic, but a solid foundation to build upon.

If you're already familiar with most of these topics, fast.ai is the way to go!


👤 NalNezumi
I can only talk from my own experience, but there's another way than just going straight into ML/DL and cranking out new models. That way is to come from the application domain side. [0]

For me that was robotics, with the motivation that traditional methods felt like they wouldn't scale outside static environments, so I started to look into ML/DL (Deep Reinforcement Learning, really), and from the looks of it I'm not alone. [0]

Now I do research in it, without a PhD and without taking any courses in it during my master's (except one DL course where we had to code everything, including gradient flow, from scratch; no framework).

Frankly, going pure DL at the research level today seems like a steep uphill climb; the top labs and research institutes (including in industry) are the ones producing (and, notably, training) most of the SOTA models. Getting into those circles is your best bet; a PhD at a top university under a top professor is the surest way in, and the competition to get into those programs is insane.

[0] https://www.natolambert.com/writing/path-into-ai


👤 elisbce
The dilemma is that although there are recommendations like fast.ai for getting your hands dirty with ML/DL quickly, none of the good AI researchers and practitioners got there via these quick tutorials. They got there via rigorous linear algebra, traditional ML, statistics, and related computer science knowledge.

I would say try fast.ai for a quick taste of what ML/DL is like, and then go back to linear algebra, deep learning, and stats courses from top schools while picking a personal project goal to achieve (e.g. reproducing a popular CVPR/ICML paper's results or building your own XXX). Once you go through a full lifecycle of building something from scratch, you will have a much better understanding of where you are and where you want to go from there.


👤 teruakohatu
I suggest first understanding machine learning in general before jumping into deep learning. The book ISLR2 is very accessible; it starts with linear regression and works through many other methods, including neural networks. There is an edX course based on the book.

https://www.statlearning.com/

https://www.edx.org/course/statistical-learning


👤 obviyus
It might not be exactly what you're looking for but I stumbled upon this video by Sebastian Lague the other day: https://youtu.be/hfMk-kjRv4c

As someone who's only ever dabbled in minimaxir's GPT-2 packages, I found this an extremely approachable exploration (and explanation) of how a neural network works. I can't recommend it enough.


👤 machinekob
+1 for fast.ai. It is, and has been for years, probably the best course for beginners (I remember the 2017/18 courses being a big help with DL, even though I had about a year of mostly classical ML experience before that :P)

👤 dmc-au
1. Watch the 3blue1brown neural network series for a gentle refresher on the underlying maths and the big picture of neural networks (and to be inspired): https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_6700...

2. Run through the catalogue of StatQuest videos for topics of interest in machine learning etc. This includes step-by-step maths and code explainers: https://www.youtube.com/c/joshstarmer/playlists

3. Watch more 3blue1brown videos if you need to step further back to refresh on calculus and linear algebra (that's most of the maths you'll need).

If you're hooked and can't get enough of the above content, then congratulations and welcome to the Matrix.


👤 data_maan
1. The theory by now fills bookshelves, so forget learning "the theory". Learn the foundations, to see where your niche fits in (e.g. optimization for fully connected NNs; last I heard it should be done with second-order methods if you have the computing power), and then learn the niche's theory.

This will require at least upper undergraduate level math BTW.

2. You could get by knowing the theory in a handwavy way. Not ideal, but I've seen people do it. For implementation, that is enough in many cases.

3. Again, "research" is too general. While you might understand some experimental ICML papers, it's very unlikely you will understand a single COLT paper if you don't know a lot of math.


👤 ravish0007
Off-topic, but Norvig's talk (As We May Program) covers the intersection of hacking skills, statistical thinking, and domain knowledge (https://vimeo.com/215418110)

👤 gravelc
Fastai. Updated series of lectures and notebooks for 2022. High-level material as well as building neural nets from scratch. Doing it at the moment and enjoying it. Good as a starting point for more in-depth studies.

👤 jll29
> what are the best resources for a CS student/decent programmer to get into the field of ML and DL on their own

It would be helpful to know more about your background and motivations.

Are you currently enrolled in a full-time Bachelor of Science (B.Sc.) study program at a university, with the goal of becoming a research scientist (staff scientist, professor, or research fellow) in the area of machine learning?

If this is true, does your university offer a Master's program in Machine Learning, or are your grades such that you could apply for such a program elsewhere after completing your first degree? You could then enter a Ph.D. program in machine learning itself, or in computer science with an applied ML topic such as ML for NLP (Natural Language Processing), ML for IR (Information Retrieval = search engines), ML for robotics, etc. The choice of doctoral advisor and Ph.D. topic will steer you towards a particular direction, in which you can then find employment to conduct research under the direction of others and, potentially, become a research group leader yourself after gaining the necessary experience. (Time: M.Sc.: 1-2 years; Ph.D.: 3-8 years; postdoctoral/pre-tenure time: e.g. 2-k years, depending on ability and luck/timing.) It's a lot of fun to get paid for doing science, so I chose that path (but with multiple deviations due to startups and industry jobs along the way).

The more people know about your situation, the easier it is to recommend useful materials.


👤 mitchellgoffpc
https://www.deeplearningbook.org/ and http://incompleteideas.net/book/the-book-2nd.html are excellent resources for supervised and reinforcement learning, respectively, and some knowledge of statistics and probability goes a long way. But I think by far the most important thing is to just start training models, even very small ones, and develop an intuition for what works and what the failure modes are.

- Get really comfortable with matplotlib or your graphing library of choice. Plot your data in every way you can think of. Plot your models' outputs, find which samples they do best and worst on.

- Play around with different hyperparameters and data augmentation strategies and see how they affect training.

- Try implementing backprop by hand -- understanding the backward pass of the different layers is extremely helpful when debugging. I found Karpathy's CS231n lectures to be a great starting point for this.

- Eventually, you'll want to start reading papers. The seminal papers (alexnet, resnet, attention is all you need, etc) are a good place to start. I found https://www.youtube.com/c/YannicKilcher (especially the early videos) to be a very useful companion resource for this.

- Once you've read some papers and feel comfortable with the format, you'll want to try implementing something. Important tricks are often hidden away in the appendices, read them carefully!

- And above all, remember that machine learning is a dark art -- when your dataloader has a bug in its shuffling logic, or when your tensor shapes get broadcast incorrectly, your code often won't throw an error, your model will just be slightly worse and you'll never notice. Because of this, 90% of being a good ML researcher/engineer is writing tests and knowing how to track down bugs. http://karpathy.github.io/2019/04/25/recipe/ perfectly summarizes my feelings on this.
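The silent-broadcasting failure mode is worth seeing once. A hypothetical sketch in NumPy, where a (4, 1) prediction against a (4,) target quietly produces a 4x4 "loss" instead of an error:

```python
import numpy as np

preds = np.arange(4.0).reshape(4, 1)  # shape (4, 1), e.g. model output
targets = np.arange(4.0)              # shape (4,),   e.g. labels

# Bug: (4, 1) - (4,) broadcasts to (4, 4). No error is thrown; the "loss"
# is just quietly wrong, and the model simply trains slightly worse.
bad_loss = ((preds - targets) ** 2).mean()

# Fix: make the shapes agree before doing arithmetic.
good_loss = ((preds.squeeze(-1) - targets) ** 2).mean()

print(bad_loss, good_loss)  # 2.5 0.0 -- here the predictions actually match exactly
```

A one-line shape assertion in a test would have caught this immediately, which is exactly the kind of test the recipe above advocates.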


👤 maurits
Stanford CS231n: Convolutional Neural Networks for Visual Recognition [1]. The assignments are excellent and will have you implement a deepish network practically from scratch, before diving into modern frameworks and applications.

This is not an instant-gratification with fancy results kind of course. But put in the work, and you will learn some very cool stuff.

[1]: http://cs231n.stanford.edu/index.html


👤 throwaway81523
Don't know about 2022, but in 2020 the fast.ai video series was a good way to get started. It has been revised a few times since then, so chances are it is still good.

If you want something more theoretical there is a book by Hopcroft et al. that was released in draft form a number of years ago. It appears to be out for real now: Foundations of Data Science, by Avrim Blum, John Hopcroft, and Ravindran Kannan. Blurb and video lectures: https://www.microsoft.com/en-us/research/publication/foundat... I just found these so haven't looked at them yet. The book draft (2014, wow) is here: https://www.cs.cornell.edu/jeh/book11April2014.pdf I didn't stick with it long enough to make much progress, unfortunately.

Kaggle problems are a good set of practical projects even if you're not aiming to be competitive at them (which takes a lot of effort and resources). The Fast.ai vids are ok as preparation for them.



👤 dr-neptune
> By getting into machine or deep learning I mean building upto a stage to do ML/DL research.

> The target ability:

> 1. To understand the theory behind the algorithms

> 2. To implement an algorithm on a dataset of choice. (Data cleaning and management should also be learned)

> 3. Read research publications and try to implement them.

There are many different ways that people do ML/DL research these days. Some people do more theory-work which will necessarily be more focused on mathematics, and others do more of an applied approach which will be more focused on coding and iterating.

For theory-driven work, I think Michael I. Jordan's list is still pretty solid:

> https://news.ycombinator.com/item?id=1055389

I would focus on the fundamentals first though:

1. get a solid background in mathematics

  - analysis (a suggestion is Baby Rudin)

  - probability (Grimmet and Stirzaker, maybe something with measure theory after)

  - statistics (Casella and Berger or Wasserman's book is a good start)

2. get a solid foundation in statistical machine learning

  - Introduction to Statistical Learning is a fantastic start

  - Then choose 1 or both of the following:

    - Elements of Statistical Learning for a Frequentist Approach

    - Pattern Recognition & Machine Learning for a Bayesian Approach

3. get a baseline understanding of deep learning

  - the deep learning book by Goodfellow is decent

  - start reading papers here and trying to implement them

If you get through to this last step, you are probably solid enough to get a job building models. If that's the route you want, then begin iterating on learning about new approaches in papers (look for papers with code / data) and implementing them.

If you want to go the academic route, you have enough of a view of the field to begin specializing further. Choose a sub-domain and dig deep if you want to do more deep learning work. Maybe revisit Michael I Jordan's list if you're still confused about where to go. A lot of those books will feel a lot more familiar.

Best of luck!


👤 dejv
A lot of people here recommend fast.ai which is solid, but not really that useful if you want to do research.

I would start with math foundations: basic linear algebra, stats, probability and some analysis. CS undergrad level is plenty of math for start.

Then I would try to understand backprop at an intimate level: learn how to calculate gradients, and maybe take a look at how autograd works as well.
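A concrete way to check that you really understand the gradient calculation is a finite-difference gradient check: compare a hand-derived gradient against a numerical estimate. A sketch for a single sigmoid neuron (NumPy; the weights and inputs are made-up values):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, x, y):
    # Squared error of a single sigmoid neuron.
    return 0.5 * (sigmoid(w @ x) - y) ** 2

def grad(w, x, y):
    # Hand-derived gradient via the chain rule.
    a = sigmoid(w @ x)
    return (a - y) * a * (1 - a) * x

w = np.array([0.5, -0.3])
x = np.array([1.0, 2.0])
y = 1.0

# Numerical gradient: central differences, one coordinate at a time.
eps = 1e-6
num = np.array([
    (loss(w + eps * e, x, y) - loss(w - eps * e, x, y)) / (2 * eps)
    for e in np.eye(2)
])
print(np.abs(num - grad(w, x, y)).max())  # tiny: the two gradients agree
```

The same check scales to whole layers, and frameworks' autograd is doing the analytic half of this comparison for you.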

Then you should know a bit to pick your next steps by yourself.


👤 aqsalose
To piggyback on the OPs question, I for one think the part in parenthesis is actually most important:

>(Data cleaning and management should also be learned)

There are many students and graduates who either didn't want to do research in the first place or didn't get that research grant or position, and are looking to get employed in the private sector with their degree. Many universities and colleges have now also retooled some of their statistics degrees into dedicated "data science" curricula, producing graduates who either know the basics of ML/DL or have the prerequisite background to learn quickly.

However, in my experience (I am extrapolating from my own past job searches), while "understanding the theory behind the algorithms" still counts for something, it counts for much less than one would think. Familiarity with the software technologies and practical implementation counts for much more. This includes not only "data management", a phrase which makes it sound like the data simply exists somewhere and only needs to be managed (not unlike a Kaggle competition), but also managing the data pipeline from generation/collection to analysis and communication of the results, deploying the software that implements it all, and so on. I suppose (I've never been on that end of the interview table) that given any two candidates, it is very difficult to evaluate how deeply one understands the theory of some algorithm compared to the other if they both demonstrate some basic understanding (and what is the practical use of any such difference in insight, anyway?). Likewise, I assume it is somewhat easier to gauge whether someone will be able to start delivering results or contributing to ongoing work quickly if they have the relevant technical skills and/or domain knowledge.


👤 mlubbad
Entering the world of machine learning is quite the experience. And as any explorer is aware, a compass can occasionally be useful for determining whether you are traveling in the proper direction.

You should use this video as a compass, even though its title says "machine learning roadmap". Investigate it, follow your interest, pick up a new skill, and then put what you've learned to use when determining your next moves.

Video: https://www.youtube.com/watch?v=pHiMN_gy9mk. Interactive Machine Learning Roadmap: https://dbourke.link/mlmap


👤 kingcai
My best advice is to find people who do what you want to do and try to learn as much as possible from them. If you're interested in doing ML/DL research I think the best way to get into the field is to reach out to professors. I studied ML/DL (books, projects, classes, reimplementing papers) for several years in undergrad, but discussing and debating ideas is the one thing that took my understanding to a much deeper level. A good professor will also point out gaps in your knowledge that you might be missing.

A second bit of advice: Programming (and execution) skills are IMO heavily undervalued by people looking to get into ML. The faster you can write code, debug, and implement new things, the easier it is to produce good research.

Some books I liked: Pattern Recognition and Machine Learning (Bishop), Deep Learning Book (Goodfellow), AI: A Modern Approach (Norvig), Elements of Statistical Learning (Hastie et al.)


👤 bo1024
> I mean building upto a stage to do ML/DL research. Applied research or core theory of ML/DL research.

The vast majority of people who do this have graduate degrees. I'm biased, but I think getting a graduate degree in the subject would be the default suggestion. Are you considering it?


👤 codethief
There is the "high bias low-variance introduction to Machine Learning [for physicists]" -> https://news.ycombinator.com/item?id=17772211

Quotes from the comments:

> For all my hacker news peeps that wants to learn ML and/or DL, you need to drop everything right now, go print this on the office printer, and sit outside with coffee for the next two weeks and read through this entire thing. Turn off the computer and phone. Stop checking HN for two weeks. Trust me, nothing better than this will come around on HN anytime soon.

> The authors are wrong to label this book as useful only to people with a physics background, and in fact it will be useful for everyone who wants to learn modern ML.


👤 f1shy
Read “AI: A Modern Approach” and do the MIT and Stanford courses that are available on YouTube. Then you can go deeper into the branches presented in the book and courses. The problem I’m seeing now is that everybody seems to think AI/ML is NNs. Nothing could be further from the truth!

👤 pfarrell
There was a thread a few days ago which has some recent comments on this topic: ”How do you break into a career in machine learning? (2020)” [0]

0: https://news.ycombinator.com/item?id=32342925



👤 adamsmith143
If you really want to do research then, no question, you need to go to grad school for at minimum an MS. If you are still a student, that means getting in touch with ML professors at your school and trying to get published before you graduate. Top ML programs in the US are extremely competitive, and you likely don't have a shot at getting in without a few NeurIPS/ICML/CVPR papers.

If you just want to work as an ML Engineer then take as many courses as you can on the subject before you graduate and get internships/apply to jobs. Nothing special here.


👤 tunesmith
I've thought about going in this direction multiple times, but I have the impression that it's pretty much unavoidable that you'd be working with annoying ETL processes and "cleaning data" (parsing, transforming from one set of columns to another, etc.) over and over again. Is this true? I hate that stuff, but I'm particularly interested in graphical data structures, and I even like probability and statistics and linear algebra, so I've been torn.

👤 m3at
There are so many resources that it's hard to pick… If you're already studying CS and like to learn by building, I'd recommend doing that for ML too!

Pick a real problem, try to build an ML solution for it, and while doing so keep a list of things you'd like to dig deeper into. Then go back to that list, pick one item to study, and iterate.

Happy to have a chat and give you specific pointers if you'd like (email in profile); I got my master's in ML in 2016 and have applied it in industry since.


👤 schiptsovgithub
There is an early recording of Andrew Ng's Stanford course, from before he became the star of the field.

It is mostly the math behind ML.

There are also his lecture notes, from 2014 I suppose.

It is really all one needs to know. Machine learning or deep learning is just pattern recognition that uses NNs as the representation for the "matcher".

The real problem of ML is finding someone who will pay you for this, because we are in a bubble.

It is the math and algorithms that matter, and there are just a few of them. All the tooling could be mastered in a month.


👤 asellke
Lots of good advice here, but I'd put Google's free course, "AI for all humans: A course to delight and inspire!" out there as a wonderful entree: https://cloud.google.com/blog/topics/developers-practitioner...

👤 k_sze
If you want a good theoretical foundation based on maths, Caltech's Learning From Data course is good: https://work.caltech.edu/telecourse

You need to not be afraid of doing proofs of theorems (most of them have to do with stats, because machine learning is basically stats on steroids).
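A small illustration of that "stats on steroids" point: the ML workhorse of linear regression is exactly the statistician's ordinary least squares. The sketch below (my own example, using only NumPy) recovers known coefficients from noisy data via the normal equation w = (XᵀX)⁻¹Xᵀy.

```python
# Hedged sketch: linear regression as classical OLS, not a library call.
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
# Simulated data: y = Xw + intercept + small Gaussian noise.
y = X @ true_w + 3.0 + rng.normal(scale=0.1, size=100)

# Normal equation with an explicit intercept column of ones.
Xb = np.column_stack([np.ones(len(X)), X])
w_stats = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

print(np.round(w_stats, 2))  # close to [3.0, 2.0, -1.0, 0.5]
```

The point: understanding this estimator (and its assumptions) is the statistics foundation; the ML framing mostly changes the vocabulary.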


👤 bjourne
That's a multi-year process. Being proficient with TensorFlow and PyTorch is not sufficient to do useful research. I suggest you begin by implementing a neural network from scratch in your favorite programming language. Writing up the code for matrix multiplication, dot product, back propagation, etc. will teach you a lot.
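To make the "from scratch" suggestion concrete, here is a minimal sketch (my own toy example, assuming only NumPy) of a 2-layer network trained on XOR with manually derived backpropagation: the forward pass is two matrix multiplications, and the backward pass is the chain rule applied layer by layer.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Weights for a 2 -> 8 -> 1 network.
W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for _ in range(10000):
    # Forward pass: matmul + nonlinearity, twice.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: chain rule on squared-error loss,
    # using the sigmoid derivative s(z) * (1 - s(z)).
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Gradient-descent updates.
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(np.round(out.ravel(), 2))  # predictions approach [0, 1, 1, 0]
```

Deriving `d_out` and `d_h` by hand is exactly the exercise that makes autograd frameworks stop feeling like magic.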


👤 logicchains
I'd highly recommend reading this book, which is available free online: https://www.deeplearningbook.org/ . It covers the foundations up to 2016 very well, with useful references for understanding the underlying math/theory.

👤 thrown_22
Learning how to do machine learning and learning how to clean data is a bit like asking what the best way to become an award-winning author with very neat penmanship is.

One has nothing to do with the other.

That said data engineering at scale pays a lot more than deep learning but is also a lot less fun. Figure out which you'd rather do.


👤 jokoon
I recently asked Reddit how to label images by downloading a pre-trained network that used ImageNet; I got no answers.

I don't know how long it would take to train such a network on a cheap laptop.

There are tutorials, but I don't see anything cookie-cutter.

I thought there would be demos for this, since image labeling is an old problem.


👤 yodsanklai
If you want to do research, I'd recommend an academic path. Get an MS from a reputable university. If you can't do that, try to follow their syllabus on your own (textbooks, articles).

MOOCs are a starting point; they often offer classes with few prerequisites, but that will only get you so far.


👤 meltyness
Describe the new calculus that leads to the design of multi-head attention units, identify 3-4 other useful novel units using your pattern, and propose hardware that can perform those operations with greater speed or efficiency.

👤 alex-gdv
imo the best way to learn is to jump into a project. if you want some basic intuition on neural networks, 3blue1brown has some great videos. for everything else, just google things and you'll find lots of resources. medium and towardsdatascience articles have saved my life so many times while working on ML/DL uni projects. if you're looking to play around with current research, use https://paperswithcode.com.

👤 peterhung
If you want to quickly review fundamental stats, you can read the Think Stats book.

After that, I'd recommend Statistical Models: Theory and Practice by Freedman.


👤 mhb
A nice mix of application and background: https://pyimagesearch.com/

👤 qwertyuiop12
The Machine Learning course by Andrew Ng on edX.

A basic (and not at all exhaustive!) approach to find out whether it is fun for you.


👤 sabr
On a related note, to get into ML/DL you'll have to read a lot of arXiv papers. I built Smort.io to make it easy to annotate and collaborate on arXiv papers.

Just add smort.io/ before any arXiv URL to read it in Smort.

Demo: https://smort.io/demo/home


👤 pemaled
Try to join competitions; they will keep you busy.

👤 frontman1988
It's probably a bit too late to get into ML; the field is oversaturated with wannabe "machine learning enthusiasts". If you still want to get into it, a master's/PhD is a much safer way to get a proper ML job and then prosper in it.

👤 roschdal
Why would you possibly want to do such a thing to yourself?

👤 jack_riminton
Fast.ai