HACKER Q&A
📣 quibono

How to get back into AI?


I was involved in machine learning and AI a few years ago, mainly before the onset of the new diffusion models, large transformers (GPT*), Graph NNs and Neural ODE stuff.

I am comfortable with autograd/computation graphs, PyTorch, "classic" neural nets and ones used for vision-type applications, as well as the basics of Transformer networks (I've trained a few smaller ones myself) and RNNs.

Do you know of any good resources to slowly get back into the loop?

So far I plan on reading through the original diffusion/GPT papers and going from there, but I'd love to see what you think are some good sources. I would especially love some Jupyter notebooks to fiddle with, as I find I learn best when I get to play around with the code.

Thank you


  👤 jdeaton Accepted Answer ✓
I am an ML researcher working in industry: by far the most effective way to maintain/advance my understanding of ML methods is to implement the core of an interesting paper and reproduce some of their results. Completing a working implementation forces your understanding onto another level than if you just read the paper and think "I get it". It can be easy to read (for example) a diffusion/neural ODE paper and come away thinking that you "get it" while still having a wildly inadequate understanding of how to actually make it work yourself.

You can view this approach in the same way that a beginner learns to program. The best way to learn is by attempting to implement (as much on your own as possible) something that solves a problem you're interested in. This has been my approach from the start (for both programming and ML), and is also what I would recommend for a beginner. I've found that continuing this practice, even while working on AI systems professionally, has been critical to maintaining a robust understanding of the evolving field of ML.

The key is finding a good method/paper that meets all of the following:

0) is inherently very interesting to you

1) you don't already have a robust understanding of the method

2) isn't so far above your head that you can't begin to grasp it

3) doesn't require access to datasets/compute resources you don't have

Of course, finding such a method isn't always easy and often takes some searching.

I want to contrast this with other approaches to learning AI, which include:

- downloading and running other people's ML code (in a jupyter notebook or otherwise)

- watching lecture series / talks giving overviews of AI methods

- reading (without putting into action) the latest ML papers

all of which I have found to be significantly less impactful on my learning.
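To make this concrete: for a diffusion paper, the "core" worth writing yourself is surprisingly small. Here's a rough sketch of the DDPM training objective (noise prediction) in PyTorch. This is my own summary, not code from the paper, and "model" is a placeholder for whatever denoising network you plug in:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # DDPM-style loss (Ho et al. 2020): corrupt x0 with Gaussian noise at a
    # random timestep t, then train the network to predict that noise.
    T = 1000
    betas = torch.linspace(1e-4, 0.02, T)               # linear noise schedule
    alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)  # cumulative alpha_bar_t

    def ddpm_loss(model: nn.Module, x0: torch.Tensor) -> torch.Tensor:
        t = torch.randint(0, T, (x0.shape[0],))         # one timestep per sample
        noise = torch.randn_like(x0)
        a_bar = alphas_cumprod[t].view(-1, *([1] * (x0.dim() - 1)))
        x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise  # closed-form forward process
        return F.mse_loss(model(x_t, t), noise)

Getting even this much (plus the sampling loop) to actually train on something like MNIST teaches you more about why the schedule and parameterization matter than any number of read-throughs.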


👤 macrolime
I'm in the same boat kinda, but even more outdated on some parts. I had some AI specialization back in college, but that was before deep learning was even a thing, so we did stuff like self-organizing maps and evolutionary algorithms, which weren't really all that useful for much back then. I've been following deep learning from the sidelines, but my work in AI has been restricted to GOFAI until recently.

Some of the stuff I'm currently reading/watching or have recently finished:

Practical Deep Learning, though it sounds like you may know this stuff already (https://course.fast.ai/)

Practical Deep Learning part 2, more about diffusion models. Full course coming early next year (https://www.fast.ai/posts/part2-2022-preview.html)

Hugging Face course (https://huggingface.co/course/chapter1/1)

Diffusion models from Hugging Face (https://huggingface.co/blog/annotated-diffusion, https://huggingface.co/docs/diffusers/index)

Andrej Karpathy's Neural Networks: Zero to Hero. He goes from the basics up to GPT-style models, so you can start wherever suits you (https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThs...)

3blue1brown's videos. I've found all his videos on neural networks and math worth watching; even for stuff I already know, he sometimes has new perspectives and nice animations.

brilliant.org. Nice math refresher and the courses there are almost like fun little games.


👤 quibono
OP here.

For some context, something I should have mentioned in the original post but failed to do: I was not intending to do a professional pivot to an AI role; it is more of a personal interest. I used to be really excited about this stuff and am looking forward to getting involved in it again just because I find it interesting.

Thank you, I really appreciate everyone's responses.


👤 anonreeeeplor
I think you should be careful about dropping whatever you are doing and running back to this new iteration of AI.

Quite honestly, the opportunity all seems to be on the front end. The idea that you are going to airdrop yourself as a hands-on AI programmer into this market doesn't make a huge amount of sense to me from a career perspective.

The opportunity is with the tools and how they are applied. Building front end experiences on ChatGPT and integrations and applied scenarios.

Doing the AI yourself means competing with PhDs and elite academics immersed in the field.

I think knowledge of AI is far less valuable than knowledge of the emerging landscape combined with a broad understanding of different tools and how they are applied.

The new trend here is very strongly Large Language Models (LLM). You should be far more specific with what your goal is and where to spend your time.

A lot of the "AI" you are referring to seems to be no longer relevant or interesting to the market.

If you are spending time with Jupyter notebooks, I would say you are probably completely wasting your time and heading in the wrong direction.

LLMs are the major trend. Focus entirely on that and the tools landscape, and on how to integrate and apply it. It feels like you are navigating with an out-of-date map.


👤 turkeygizzard
Similar situation as you. I stopped keeping up around 2015ish.

I just read Francois Chollet's Deep Learning with Python and found it to be a fantastic high level overview of all the recent progress. There's some code, but not a lot. I mostly just appreciated it as a very straightforward plain-language treatment of RNNs, CNNs, and transformers.

Now I'm going through Stanford's CS224N lectures.

I'm sort of planning to read papers, but as some other comments have pointed out, I'm less sure of the ROI on that since I'm not sure how feasible a future in AI is for me.


👤 axpy906
Following. I got off the train about two years ago to work more in engineering. The way I see it, if you're not a research scientist, this field is best approached as an ML engineer, as there are more challenges in systems. Would love to be proved wrong.

👤 ineptech
Possibly too basic for you, but I was hugely impressed with https://www.nlpdemystified.org/course (found on HN a week or two back). Each chapter has a large jupyter notebook with lots of annotated sample code.

Also there's no cost, no book to buy, no email signup, it's just a guy sharing knowledge like the old days. Great course.


👤 jstx1
> I am comfortable with autograd/computation graphs, PyTorch, "classic" neural nets and ones used for vision-type applications, as well as the basics of Transformer networks (I've trained a few smaller ones myself) and RNNs.

It sounds like you're into it already. And you already know which new papers are interesting to you.


👤 bobleeswagger
Maybe it's just me, but it sounds like you're only interested in getting back into AI because it is currently in the limelight. "Working in AI" a few years ago doesn't mean much if you have nothing to show for it. What I'm getting at is that your motivations don't seem genuine; there's nothing that tells me you care about the technology advancing more than a paycheck.

As someone with 10 years of professional experience in software, I find every AI "trend" that has come up in that time to be incredibly odd. It is certainly remarkable what ChatGPT, Stable Diffusion, and other examples are doing today... Ultimately, people are giving waaaaaaaay too much credit without understanding the technical details. These are pigeonholed examples that still aren't solving any real problems.

AI is still just statistics with marketing.


👤 sigmoid10
To be honest, for transformers just go to huggingface.co and see what interests you. They have tons of examples to run and they also link to all the papers in the documentation. It doesn't get much easier to get into it. Even for the more recent stuff like vision transformers and diffusion models.
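If it helps, the barrier to entry really is a few lines. For example, with the transformers pipeline API (the model choice here is just an illustration):

    # Minimal Hugging Face pipeline example; downloads the model on first run.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    out = generator("Getting back into AI means", max_new_tokens=20)
    print(out[0]["generated_text"])

Swap the task string and model name for other tasks; the diffusers library has a similarly compact interface for diffusion models.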

👤 mythhouse
How are your software engineering skills? That's the biggest gap I currently see at my employer. Way too many data scientists are unable to make an impact because they can't turn their notebooks into a product and run it in production.

👤 f0e4c2f7
This should jog your memory and catch you up with the latest and greatest.

https://course.fast.ai


👤 mathgladiator
I'm curious what you discover, as I did some AI decades ago and now I have a new AI problem. I'm trying to research how to build a generic self-learning board game agent for my platform ( https://www.adama-platform.com/ ), as I've reduced the game flow to a decision problem. My plan is to start by experimenting with simple stuff at a low level, and then use that experience to figure out what to buy.
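Concretely, the "simple stuff at a low level" I have in mind is something like tabular Q-learning. A sketch (the state/action encodings are placeholders for whatever the game exposes):

    import random
    from collections import defaultdict

    # Tabular Q-learning: the simplest self-learning baseline for a
    # decision problem with discrete states and actions.
    Q = defaultdict(float)                  # (state, action) -> value estimate
    alpha, gamma, epsilon = 0.1, 0.99, 0.1  # learning rate, discount, exploration

    def choose_action(state, actions):
        if random.random() < epsilon:                     # explore
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(state, a)])  # exploit

    def update(state, action, reward, next_state, next_actions):
        best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])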

👤 ttul
I signed up for Jeremy Howard’s second AI course at the University of Queensland. The course lectures were streamed live to participants around the world in October and November. An online forum organizes everything.

When this course becomes public next year, I think it will be a great way to get caught up. In the meantime, you might still be able to pay the AU $500 fee and watch the course content, which was all recorded, if you are anxious to get going.


👤 sjkoelle
Personally I think that contributing to an open source community is the move. Join the Eleuther Discord. Futz around on Hugging Face. Play with notebooks on Uberduck. Have fun!!! Gatekeeping is dumb.

👤 whiplash451
The latest developments in AI are very cool, but a big lure in my opinion.

You have two options:

1. Work full-time for companies doing this state-of-the-art stuff (OpenAI, Meta, etc.)

2. Work full-time for a (good) AI company that is doing interesting AI work, but most likely not based on GPT/SD/etc.

In both cases, you will learn a lot. Anything else seems like a costly and dangerous distraction to me.


👤 stephc_int13
Read all the leading papers, many times, to get a deep understanding. The writing quality is usually pretty low, but the information density can be very high, and you'll probably miss the important details the first time.

Most medium and low-quality papers are full of errors and noise, but you can still learn from them.

Get your hands dirty with real code.

I would take a look at those:

https://github.com/geohot/tinygrad

https://github.com/ggerganov/whisper.cpp


👤 Simon_O_Rourke
If you're something of a snake-oil salesman with some, but not deep, technical knowledge, then AI Ethics is for you. There are (or were) companies out there who would pay big bucks for folks to tell them some models could potentially be biased or otherwise discriminatory because of poorly selected data. Wash, rinse, repeat and see your paychecks roll in.

👤 marban
Related question, what's the current state of paraphrasing text with off-the-shelf Python libs — PyTorch/Transformers?

👤 optbuild
> I am comfortable with autograd/computation graphs, PyTorch, "classic" neural nets and ones used for vision-type applications, as well as the basics of Transformer networks (I've trained a few smaller ones myself) and RNNs.

Maybe I am a bit off track, but how does someone reach this state?


👤 matchagaucho
Similarly, my linear regression and decision tree ML skills are feeling really outdated :)

But understanding AI fundamentals gives me a fresh perspective on how to build applications that leverage ChatGPT (for example).

Crafting the inputs to achieve desired outputs. Training the models with a corpus of data relevant to a niche industry, etc...
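For example, the "crafting the inputs" part can be as simple as this (a sketch against the 2022-era OpenAI completion API; the prompt, model name, and claim text are made-up placeholders):

    import openai  # assumes OPENAI_API_KEY is set in the environment

    claim_text = "Vessel grounded near harbor entrance; hull breach, no injuries."
    prompt = (
        "You are an assistant for a niche marine-insurance workflow.\n"
        "List the key risk factors in this claim:\n\n" + claim_text
    )
    response = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=150, temperature=0.2
    )
    print(response.choices[0].text.strip())

The niche-industry part then becomes fine-tuning the same kind of model on your own corpus.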


👤 CoastalCoder
What's your goal?

For example, do you want to develop models as a hobby? Make models or software for a living? Use AI in some particular problem domain?


👤 yonz
In the same spirit, what are the three algorithms most worth implementing in order to learn? My to-do list has:

1) Refresher: RNNs, deep net classifier

2) LSTMs

3) Self-attention anything (see the sketch below)

Any other suggestions for getting back in the loop?
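For 3), the kernel is small enough to write from scratch. A sketch of single-head scaled dot-product self-attention in PyTorch (no masking, batching, or multi-head plumbing):

    import torch

    def self_attention(x, W_q, W_k, W_v):
        # x: (seq_len, d_model); W_*: (d_model, d_k) projection matrices
        q, k, v = x @ W_q, x @ W_k, x @ W_v
        scores = (q @ k.T) / (k.shape[-1] ** 0.5)  # scaled dot products
        return torch.softmax(scores, dim=-1) @ v   # attention-weighted values

    x = torch.randn(5, 16)
    W_q, W_k, W_v = (torch.randn(16, 8) for _ in range(3))
    print(self_attention(x, W_q, W_k, W_v).shape)  # torch.Size([5, 8])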


👤 usgroup
Do you guys expect DL to have longevity generally speaking, or is the usurper already on the horizon?

There are a few non-equivalent universal approximation approaches. I'm not sure I fully understand why this one will end up being the winner, even on a 10-year horizon.


👤 alexmolas
For diffusion models I recommend you this blog post https://eugeneyan.com/writing/text-to-image/

👤 29athrowaway
I recommend this video from Andrew Ng: https://www.youtube.com/watch?v=avoijDORAlc

👤 greggarious
Find the paper that got you into that (or any other paper that hooked you), then see who's cited it since and catch up that way.

👤 zone411
Start with survey papers. Starting with GPT is jumping in the middle. It's better to know the big picture first.

👤 uoaei
AI is so much more than the eye-catching generative art experiments.

👤 nektro
don't