What in your opinion is the best way to invest the time/energy?
1. Build small projects (build what?)
2. Read blogs/newsletters (which ones?)
3. Take courses (which courses?)
4. Read textbooks (which books?)
5. Kaggle competitions
6. Participate in AI/ML forums/communities
7. A combination of the above (if possible share time % allocation/weightage)
Asking this in general to help good SE people build up capabilities in ML.
The single thing I learned the most from was implementing a paper. Lectures and textbooks to me are just words. I understand them in the abstract, but learning by doing gets you far deeper knowledge.
Others might suggest a more varied curriculum, but to me nothing beats a one-hour chunk of uninterrupted problem solving.
Here are a few suggested projects.
- Train a baby neural network to learn a simple function like ax^2 + bx + c (see the sketch after this list).
- Build an MNIST digit classifier, basically the “hello world” of ML at this point.
- Fine-tune GPT-2 on a specialized corpus like Shakespeare.
- Train a Siamese neural network with triplet loss to measure visual similarity, e.g. to find out which celebrity you look most like.
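For the first project, here is a minimal sketch in PyTorch; the framework choice, network size, and coefficients are my assumptions, not part of the original suggestion:

```python
# Fit a tiny MLP to y = a*x^2 + b*x + c on synthetic data.
import torch
import torch.nn as nn

a, b, c = 2.0, -3.0, 1.0                      # arbitrary target coefficients
x = torch.linspace(-3, 3, 256).unsqueeze(1)   # 256 points, shape (256, 1)
y = a * x**2 + b * x + c

model = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for step in range(2000):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
    if step % 500 == 0:
        print(f"step {step}: loss {loss.item():.4f}")
```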
My $0.02: don’t waste your time writing your own neural net and backprop. It’s a biased opinion, but that would be like implementing your own HashMap: no company will ask you to do this. Instead, learn how to use profiling and debugging tools like TensorBoard and the TF profiler.
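If you haven’t used TensorBoard before, the core workflow is just logging scalars during training and pointing the UI at the log directory. A minimal PyTorch sketch, where the run name and loss values are placeholders:

```python
# Log training scalars to TensorBoard from PyTorch.
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/experiment_1")  # hypothetical run name
for step in range(100):
    loss = 1.0 / (step + 1)                # stand-in for a real training loss
    writer.add_scalar("loss/train", loss, step)
writer.close()
# Inspect with: tensorboard --logdir runs
```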
I assume you’re talking about the latest advances and not just regression and PAC learning fundamentals. I don’t recommend following a linear path - there are too many rabbit holes. Do two things: a course and a small course project. Keep it time-bound and aim to finish no matter what. Do not dabble outside of this for a few weeks :)
Then find an interesting area of research, find its GitHub repo, and run that code. Find a way to improve it and/or use it in an app.
Some ideas.
- do the fast.ai course (https://www.fast.ai/)
- read Karpathy’s blog posts about how transformers/LLMs work (and https://lilianweng.github.io/posts/2023-01-27-the-transforme... for an update)
- Stanford CS231n for vision basics (https://cs231n.github.io/)
- Stanford CS324 for language models (https://stanford-cs324.github.io/winter2022/)
Now, find a project you’d like to do.
eg: https://dangeng.github.io/visual_anagrams/
or any of the ones that are posted to hn every day.
(posted on phone in transit, excuse typos/formatting)
Leveraging existing LLM technologies and putting them in software where regular people can use them and have a great experience is important, necessary work. When I studied CS in college the data structure kids were the “cool kids”, but I don’t think that’s the case in ML.
The daily practice is to sketch applications, configure prompts and function calls, learn to market what you create, and try to create zero-to-one type tools. Here are two examples I made: one where I took the commonplace book technique from the era of Aristotle and brought it into our modern embeddings era [1], and one where I really pushed to understand the pure Markdown spec and integrate streaming generative models into it [2]
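The embeddings take on a commonplace book boils down to: embed your saved passages, embed a query, and retrieve by similarity. A minimal sketch, where the sentence-transformers library and model name are my assumptions, not necessarily what [1] uses:

```python
# Store passages, embed them, and retrieve the closest match for a query.
from sentence_transformers import SentenceTransformer, util  # assumed library

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model; any encoder works
passages = [
    "The unexamined life is not worth living.",
    "We are what we repeatedly do.",
]
corpus_emb = model.encode(passages, convert_to_tensor=True)

query_emb = model.encode("habits shape character", convert_to_tensor=True)
scores = util.cos_sim(query_emb, corpus_emb)[0]  # cosine similarity per passage
print(passages[int(scores.argmax())])            # best-matching passage
```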
If you don't have a solid enough footing to get a job in the field yet, the next best thing in my opinion: find a passion project and keep cooking up new ways to tackle it. On the way to solving your problem, you'll undoubtedly begin absorbing the tools of the trade.
Lastly, consider going back to school (a Bachelor's or Master's, perhaps?). It'll take far more than 1 hour/day, but I promise you, you'll see results far faster and far more concretely than any other learning strategy.
Good luck!
Context: I've been a Researcher/Engineer at Google DeepMind (formerly Google Brain) for the last ~7 years. I studied AI/ML in my BS and MS, but burnt out of a PhD before publishing my first paper. Now I do AI/ML research as a day job.
0) Learn the prerequisites of math, CS, etc. That usually means calc 1-3, linear algebra, probability and statistics, and fundamental CS topics like programming, OOP, and data structures and algorithms.
1) Elementary machine learning course, which covers all the classic methods.
2) Deep Learning, which covers the fundamental parts of DL. Note, though, this one changes fast.
From there, you kind of split between ML engineering and ML research.
For ML engineering, you study more technical things that relate to the whole ML-pipeline. Big data, distributed computing, way more software engineering topics.
For ML research, you focus more on the science itself - which usually involves reading papers, learning topics which are relevant to your research. This usually means having enough technical skills to translate research papers into code, but not necessarily at a level that makes the code good enough to ship.
I'll echo what others have said, though: use the tools at hand to implement stuff. It is fun and helpful to implement things from scratch, for the learning, but it is easy to get extremely bogged down trying to implement every model out there.
When I tried to learn "practical" ML, I took some model, and tried to implement it in such a way that I could input data via some API, and get back the results. That came with some challenges:
- Data processing (typical ETL problem)
- Developing and hosting software (core software engineering problems)
- API development (see the sketch below)
And then there is the model itself; a lot of work goes into that alone.
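To make the API piece concrete, here is a minimal serving sketch. FastAPI, the checkpoint path, and the feature schema are all my assumptions for illustration:

```python
# Serve a trained PyTorch model over HTTP.
import torch
import torch.nn as nn
from fastapi import FastAPI
from pydantic import BaseModel

model = nn.Sequential(nn.Linear(4, 1))         # must match the trained architecture
model.load_state_dict(torch.load("model.pt"))  # hypothetical checkpoint path
model.eval()

app = FastAPI()

class Features(BaseModel):
    values: list[float]                        # e.g. 4 input features

@app.post("/predict")
def predict(features: Features):
    with torch.no_grad():
        x = torch.tensor(features.values).unsqueeze(0)  # batch of one
        y = model(x)
    return {"prediction": y.squeeze(0).tolist()}
# Run with: uvicorn main:app --reload
```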
I worked for a major company on an ML project for 2 years. By the time I left, I realized that:
1: The project I was working on offered no improvement over ordinary statistical methods; and because people could actually understand the statistics, while the ML was a black box, the project delivered no tangible improvement over the processes we were trying to replace.
2: A lot of the ML I was working on was a solution in search of a problem.
I personally found the ML system I was working on fascinating, but the overconfidence about what it could infer, and the way non-developers assumed ML could make magical inferences, were frustrating.
---
One other thing: make sure you understand how to use databases, both SQL and NoSQL. To use ML effectively, you will need to be excellent at programming against large volumes of data in a performant manner.
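One concrete habit that helps: stream big tables in bounded chunks instead of loading everything into memory. A sketch with pandas and SQLite, where the database file and schema are hypothetical:

```python
# Process a large table in fixed-size chunks to keep memory bounded.
import sqlite3                            # stands in for any SQL database
import pandas as pd

conn = sqlite3.connect("events.db")       # hypothetical database file
for chunk in pd.read_sql_query(
    "SELECT user_id, value FROM events",  # hypothetical schema
    conn,
    chunksize=100_000,
):
    # aggregate/transform each chunk here instead of the full table
    print(chunk["value"].mean())
conn.close()
```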
Things like MLOps, applying DevOps practices, testing and CI/CD in the ML space, how to train across multiple GPUs, and how to actually host an LLM, especially at scale and affordably.
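On the multi-GPU point, the usual starting pattern in PyTorch is DistributedDataParallel. A bare-bones sketch, where the model and batch are stand-ins:

```python
# Minimal multi-GPU training step with PyTorch DDP.
# Launch with: torchrun --nproc_per_node=4 train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group("nccl")                  # one process per GPU
local_rank = int(os.environ["LOCAL_RANK"])       # set by torchrun
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(10, 1).cuda(local_rank)  # stand-in model
model = DDP(model, device_ids=[local_rank])
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(32, 10).cuda(local_rank)         # stand-in batch
loss = model(x).pow(2).mean()
loss.backward()                                  # DDP syncs gradients here
opt.step()
dist.destroy_process_group()
```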
In my experience there are hundreds of candidates coming from academia with strong academic backgrounds in ML. There are very few experienced engineers available to help them realise their ambitions!
2. Keep a couple of mathematics/statistics books handy while you are going through the above. When the above book talks about some maths technique you don't know or understand, immediately consult these books (and/or watch some short YouTube videos) to grasp the concept and its usage. This way you learn the necessary mathematics inline without being overwhelmed.
This is the simplest and most direct route to studying and understanding AI/ML. Everything else mentioned in this thread should only come after this.
Personally I think the best return per unit of time is probably to re-implement some of the big papers in the field. There's a clear goal, there are clear signs of success, and there are many implementations out there for you to check your work against, compare with, and learn from.
Good luck!
https://docs.google.com/forms/d/e/1FAIpQLScbWN3qwqeIc0b1cCRq...
Note this is for absolute LLM beginners, not if you’re already working with LLMs -- but even some of these folks have found it useful!
Hope you find this useful.
We work together to build AI models in our favorite programming languages.
- Following along with Karpathy's videos, which have been mentioned: https://karpathy.ai/zero-to-hero.html
- About to follow along with CS 231n, also mentioned: https://www.youtube.com/watch?v=NfnWJUyUJYU&list=PLkt2uSq6rB...
- Trying ideas and theories in a Jupyter notebook
- Reading papers
I would agree with other commenters that recommend learning how to implement a paper. As someone who barely managed to get their undergraduate degree, papers are intimidating. I don't know half the terms and the equations, while short, look complex. Often it will take me several reads to understand the gist, and I've yet to successfully implement a paper by myself without checking other sources. But I also know that this is where the tech is ultimately coming from and that any hope of staying current outside of academia is dependent on how well I can follow papers.
I've been doing this for about a month now. I feel I definitely understand more of the theory of how most of this stuff works, and I can train a simple attention-based model on a small-ish amount of data. I don't feel I could charge someone money for my skills yet, but I expect to feel ready after about 6 months to a year of doing this.
I can do my job but I always wanted to learn and understand more. Family circumstances mean I can't afford to quit my job or go to school.
Learning how to effectively prompt an LLM is an enormous space in its own right - and there's no shortcut for it, you have to actively play with the things.
I've been using them constantly for over a year at this point and I'm still figuring out new tricks and strategies all the time.
Weirdly, knowledge of Machine Learning isn't actually that relevant to getting good at using LLMs to solve problems and build software.
Knowing how to train your own neural network will do little for your ability to build astonishingly cool software on top of existing LLMs.
Knowledge of how LLMs work is useful, because it can help you prompt them more effectively if you understand their limitations, have an idea of their training data etc.
I've seen people (who I respect) argue that deep knowledge of ML can be a disadvantage when exploring LLMs, because it can limit the way you think about and interact with them. Weird but possibly true!
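For a concrete flavor of "actively playing with the things", here is one basic strategy: few-shot examples baked into the conversation. The model name and client library (openai>=1.0) are my assumptions; the pattern is the point:

```python
# Few-shot prompting: show the model worked examples before the real input.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # hypothetical model choice
    messages=[
        {"role": "system", "content": "Classify sentiment as positive or negative."},
        {"role": "user", "content": "The food was amazing."},        # example input
        {"role": "assistant", "content": "positive"},                # example output
        {"role": "user", "content": "The service was painfully slow."},
    ],
)
print(resp.choices[0].message.content)
```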
One is to reproduce recent papers for which the data is available, especially if the source code is available too. Don't look at their source code initially, but use it if you get stuck, as a debugging method (my model isn't converging; do they get the same gradients given the same data?).
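That gradient check is easy to script once you have copied the reference weights into your reimplementation. A sketch, where the function and names are mine, for illustration:

```python
# Compare gradients between a reimplementation and the reference model,
# assuming identical weights and parameter ordering in both.
import torch

def compare_grads(my_model, ref_model, x, y, loss_fn):
    for m in (my_model, ref_model):
        m.zero_grad()
        loss_fn(m(x), y).backward()
    for (name, p), (_, q) in zip(my_model.named_parameters(),
                                 ref_model.named_parameters()):
        if not torch.allclose(p.grad, q.grad, atol=1e-6):
            print(f"gradient mismatch in {name}")
```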
Another fun idea to play with: sports data sets. Of course you have to like at least one sport, but there's lots of sports data out there that is easy to download in convenient formats (especially for baseball, where professional statisticians have been employed to do analysis since at least the 1950s, though afaik all the major sports have good records these days), and you can go a long way with simple models. I've wasted a lot of weekend time coming up with fun baseball analyses.
For courses, Andrew Ng’s classes have always been good, starting with his Stanford ML class, Coursera deep learning classes, and now his short mini-classes on being an effective LLM practitioner.
Textbooks on LLMs are likely to quickly be out of date, at least I struggle to keep my LangChain/LlamaIndex book current.
My advice to you is to try to get into a paid AI job as your highest priority, and that is a lot of work: identifying possible employers, preparing for interviews, and having persistence. Some of the interesting AI work you might find will not be with “tech” companies, but rather small or medium profitable businesses that need to use ML, DL, LLMs lightly - just a small part of their successful businesses.
I would not recommend doing many things at once.
I am a Python developer who has never worked on ML/data science before; I am mostly into data engineering.
- The deeplearning.ai math basics for deep learning course seems self-contained.
- The MiniTorch repo (implement your own tiny torch) also seems helpful for understanding what goes on during training.
- The minGPT repo (to understand a basic version of the GPT model structure).
- Dive into Deep Learning (textbook available online, more focused on practical DL).
Since you mention SE, I'd choose a mini project in an area you love. The tooling you will learn along the way.
An hour a day is paradoxically not nearly enough, yet also a serious time investment of your day.
Maybe start by asking what exactly you want to learn. Applying ML to a practical problem in a user-facing app? The math? The ideas?
Original transformer paper: https://arxiv.org/abs/1706.03762 (see the attention sketch after these links)
Illustrated transformer: http://jalammar.github.io/illustrated-transformer/
Transformer visualization: https://bbycroft.net/llm
minGPT (Karpathy): https://github.com/karpathy/minGPT
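All of these resources circle the same core operation, scaled dot-product attention. A minimal single-head sketch in PyTorch:

```python
# The core computation of the transformer: softmax(Q K^T / sqrt(d_k)) V.
import math
import torch

def scaled_dot_product_attention(q, k, v):
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # pairwise similarities
    weights = torch.softmax(scores, dim=-1)            # each row sums to 1
    return weights @ v                                 # weighted average of values

q = k = v = torch.randn(8, 64)   # 8 tokens, 64-dim head (self-attention)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)                 # torch.Size([8, 64])
```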
---
Next, some foundational textbooks for general ML and deep learning:
Elements of Statistical Learning (aka the bible): https://hastie.su.domains/ElemStatLearn/
Probabilistic ML: https://probml.github.io/pml-book/book2.html
Deep Learning Book (Goodfellow/Bengio): https://www.deeplearningbook.org/
Understanding Deep Learning: https://udlbook.github.io/udlbook/
---
Finally, assorted tutorials/resources/intro courses:
Beyond the Illustrated Transformer: https://news.ycombinator.com/item?id=35712334
AI Zero to Hero: https://karpathy.ai/zero-to-hero.html
AI Canon: https://a16z.com/2023/05/25/ai-canon/
LLM University by Cohere: https://llm.university/
Practical Guide to LLMs: https://github.com/Mooler0410/LLMsPracticalGuide
Practical Deep Learning for Coders: https://course.fast.ai/Lessons/part2.html
---
Hope that helps!
1. Build small projects in an area you're passionate about; examples: try to beat a benchmark, classify news and track stories you care about, build an automatic manga generator.
2. Kaggle competitions: not sure if employers are looking at this though
3. Write blog about your journey.
Essentially, don't get paralysed designing the perfect path for investing your time/energy; just focus on putting in the hours every day.
Depending on your motivation, doing basic-level courses first (as others have shared) and then tackling your own application of the concepts might be the way to go.
I also observe a need for strong IT skills when implementing end-to-end ML systems, so you could play to your strengths and also consider working on MLOps (online self-paced course: https://github.com/GokuMohandas/mlops-course).
I went back to school to get structured learning. Whether you find it directly useful or not, I found it more effective than motivating myself to self-learn dry theory. Down the line, if you want to go all-in, this might be a good option for you too.
- AI/ML is a diverse field, and data scientists tend to have specific focus areas; I know AI experts who still haven't delved into LLMs because they are busy with their own specialties. Continuous exploration and reading are crucial, and resources like paperswithcode.com are valuable for discovering new research areas and domains.
- While time-consuming, Kaggle offers exposure to robust modeling and validation skills. These skills are critical, though they are only a fraction of what's needed for real-world projects, so it's beneficial to expand beyond them. That being said, it does give bragging rights: I've seen company founders, like those at H2O.ai, highlight their Kaggle Grandmasters.
- My current role is at Pathway.com. Over 80% of my colleagues hold PhDs, and our CTO has co-authored with folks like Geoff Hinton and Yoshua Bengio (which I actually find cool :)). This environment may reflect my bias towards academic research, but that being said, I believe a strong foundational understanding is essential and valued, especially when tackling complex challenges.
- Active participation in forums and communities related to the frameworks you use, like TensorFlow User Groups, is highly recommended. At Pathway.com, we welcome anyone interested in stream data processing to our community. Engaging in these forums offers the chance to receive support from the original creators and leading community members. Other notable communities include DataTalks.Club and MLOps.Community.
For the least technical, I suggest MattVidPro AI on YouTube.
For the slightly more technical, I suggest 1littlecoder also on YouTube.
Just acquire skills for the sake of it?
1. Don't waste your time on courses (at least not after you know the basics)
2. Kaggle competitions (the Featured ones) worked for me
3. Read blogs/newsletters: TLDR AI comes with new research and many open-source projects (I have personally starred a ton of repos from it, and it's totally amazing). There are also Bizarro Devs, Data Elixir, and the Hacker News newsletter, which combines top links. You can read Lilian Weng if you have strong fundamentals, and Jay Alammar.
4. Additionally, I took Udacity's Nanodegrees; they were nice and you can try them, at least for RL and self-driving cars.
Best, Jay