Thanks!
- Learn the statistical language you'd pick up in a basic college-level Stat 101 course. Be able to translate plain-English sentences into statistical notation, and to read that notation fluently. Also know basic statistics.
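As a hypothetical example of the kind of translation I mean, the sentence "the average of n independent measurements of X, and the chance that average exceeds a threshold t" reads in notation as:

```latex
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
P(\bar{X} > t),
\quad \text{where } X_1, \dots, X_n \overset{\text{iid}}{\sim} F
```

You should be comfortable going in both directions without slowing down.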
- I assume you already know programming. If you don't know Python yet, learn it; it's really easy.
- There are a number of paths you can go from there. Here's what I did.
-- IBM Data Science Professional Certificate (not deep at all, but lays out the landscape well; did it in a week)
-- Machine Learning for Absolute Beginners by Oliver Theobald, which you can finish in an evening.
-- Machine Learning Specialization by Andrew Ng on Coursera.
-- Deep Learning Specialization by Andrew Ng on Coursera.
-- fast.ai course.
- Learn PyTorch really well. I suggest Sebastian Raschka's book.
Now from here, you can chart your own path. You can choose NLProc, Vision, RL, or something else.
I went towards Vision, and I do Edge AI as a hobby.
I was in my last year of college as a Physics undergrad when I was hired to do Vision modelling/research for a non-flashy company in 2021. I'm finishing my CS Master's next month and starting to look for a PhD. I worked at the same company for ~2.5 years.
EDIT: If you want a job in big tech, grind Leetcode, learn about system design, and study Machine Learning systems so you can design them. Chip Huyen has a good book, from what I hear. 6-7 rounds of interviews are common at Meta/Google. DL hackathon awards and open-source contributions help significantly.
There are some AI Engineers with a strong scientific/mathematical background, but that's rare. Usually you're paired with ML people who actually develop and evaluate the models.
So my advice is to start with Data Engineering and then find a specialization in AI. You should have a VERY solid foundation in scripting and programming, especially Python, plus a lot of "data wrangling" concepts: understanding how data flows from point A to point B, how intermediate storage and streaming engines work, etc. Functional programming is key here.
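As a tiny sketch of what I mean by functional-style data wrangling in Python (the records and field names here are made up for illustration): data flows through a pipeline of pure transformations instead of being mutated in place.

```python
from functools import reduce

# Hypothetical raw records arriving from "point A"
records = [
    {"user": "a", "amount": "10.5"},
    {"user": "b", "amount": "not-a-number"},
    {"user": "a", "amount": "4.5"},
]

def parse(rec):
    """Parse the amount field; return None for malformed rows."""
    try:
        return {"user": rec["user"], "amount": float(rec["amount"])}
    except ValueError:
        return None

# Functional pipeline: map -> filter -> reduce, no in-place mutation
parsed = filter(None, map(parse, records))
total_per_user = reduce(
    lambda acc, r: {**acc, r["user"]: acc.get(r["user"], 0.0) + r["amount"]},
    parsed,
    {},
)
print(total_per_user)  # {'a': 15.0} -- the bad row is dropped
```

The same shape (parse, validate, aggregate) scales up to Spark or a streaming engine; the pure-function habit is what transfers.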
Hugging Face has a bunch of courses, but that one is a good place to start. You can do the exercises on your own computer, or on a cloud server if you want access to a more powerful GPU. If you go through these courses and pay attention, you'll be in a really good position.
Instead of targeting AI engineering, I would focus on building a solid mathematics background (calculus, linear algebra, discrete mathematics) and a solid computer science background (algorithms, data structures, distributed systems, databases/data storage/data retrieval).
Then with those skills you can easily become a "SW Engineer who leverages AI", which I'd guess will be a much better and more stable job than "AI Engineer".
It was previously popular on HN though only inside a comment thread, and I haven't submitted it as a link post yet.
Covers a lot of ground
Start by reading the AI Canon and setting up projects like PrivateGPT and AutoGPT locally, then work with LocalAI to serve Hugging Face models in place of OpenAI models.
For applied ML, my tips are:
- Make sure you learn the dark side of BatchNorm and Dropout.
- Start with simple, elegant baselines instead of complex SOTA algorithms.
- Spend more time understanding your data than trying algorithms.
- Be aware that SOTA on a related task will often suck at your task.
- Be data driven.
Also, most of your ideas will not work, but you have to try them and conduct experiments carefully.
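One concrete instance of the Dropout "dark side", as a toy pure-Python sketch (this is not the PyTorch API, just the idea behind it): inverted dropout zeroes activations and rescales the survivors during training, but must be a no-op at inference, and forgetting to switch modes silently corrupts predictions.

```python
import random

def dropout(xs, p, training, rng=random.Random(0)):
    """Inverted dropout: during training, zero each value with
    probability p and rescale survivors by 1/(1-p) so the expected
    activation is unchanged; at eval time, do nothing."""
    if not training:
        return list(xs)  # eval mode: identity
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in xs]

acts = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(acts, p=0.5, training=True)   # noisy, rescaled
eval_out = dropout(acts, p=0.5, training=False)
print(eval_out)  # [1.0, 2.0, 3.0, 4.0] -- identical to the input
```

This is exactly why PyTorch makes you call `model.eval()` before validation and `model.train()` before training; BatchNorm has an analogous mode switch (batch statistics vs. running statistics).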
There was a brief moment in time wherein data engineers were computer scientists specialised in distributed systems and data processing algorithms on commodity hardware. You had to know a lot on average.
Then came commoditisation via the big vendors and now you really don’t need to know very much. As a result it is not uncommon to meet “senior” data engineers who mostly script Python, do SQL and configure Airflow.
I think ML (now rebranded as AI) has already gone that way, and many vendors are strongly promoting developer participation with courses and plug-and-play resources.
So what do you need? A vendor certificate you took over a weekend, and an employer to say “yes”…
2. Prompt ChatGPT with "how do I make a pytorch program that learns to do …"
3. Ask ChatGPT for code and for it to explain the code.
4. Run the code on Google Colab; if it doesn't work, ask ChatGPT why and keep rerunning it until it does.
5. If you find some API that's too new for ChatGPT to know about, just paste in the API documentation and then ask ChatGPT to propose some code using it.
ChatGPT is wrong a lot, but if you keep badgering it, you will eventually get a solution that works. It's like having a tutor standing next to you that you can ask questions; I can't think of a better way to learn, even if it's wrong on occasion.
It's definitely a plus to know a bit of the mathematics, but I doubt anything short of a Master of Science in Math with an AI specialization is going to close the gap. How many data engineers can do that?
Wouldn't it make a lot of sense to hire someone with zero AI exposure but tons of experience in sysadmin, data engineering, and ops? It's going to be tough to find someone who is both an engineer and a math wizard, I think.
1. MLE/DE/MLOps - this is more like typical software engineering. You're responsible for building data platforms, tools, monitoring, and more around the model development lifecycle. This can include: data ingestion, data architecture, data transformation and storage, automating and productionizing workflows like training, evaluation, and deployment, monitoring deployed models, data monitoring (and building that monitoring), and tooling like feature stores (plus libraries for R&D teams to interact with them) or internal deep learning frameworks. You'll basically work as part of (or adjunct to) the research team that is testing new model architectures, different approaches towards some goal, etc. These are largely taken from my own experiences and projects I've built. Skills: software engineering, Python, knowledge of the model development lifecycle, data architecture/engineering, some knowledge of the frameworks used, cloud platforms, etc. Designing Machine Learning Systems by Chip Huyen is a great overview of this kind of work.
2. Research. This is actually building models, implementing papers, very occasionally (especially in big companies) doing publishable research. This is more akin to academic work (my educational background is in hard science academia), and requires a lot of paper reading, experimentation, etc. It will require knowledge of your niche (I mostly work with CV teams, for instance), strong math fundamentals, and very often a PhD.
I can tell you how I, a self-taught software engineer with a bio education, got here. My first job was a generic enterprise desktop application development role. Shortly after, I randomly joined a data engineering team, not even knowing what DE was, but knowing I liked it. We worked on a massive distributed ETL system. I then joined my first startup, also in a DE role, but we were a small group in a larger research team, where I got my initial exposure to ML workflows and especially to moving them to the cloud. We did some simple model training, data management, and building products around the models we built, while also supporting the research efforts of the larger team.
I then went to another startup, where I had sole responsibility for our research infrastructure (largely on the strength of my knowledge of AWS and Python). I was the sole engineer on a team of CV researchers, and did things like automate their entire evaluation workflow and move it to the cloud, work on the internal deep learning framework, and build a team to evaluate the current AI development lifecycle and design a platform to harden and optimize the process. Covid put the kibosh on that. I moved to another, earlier-stage startup, doing similar but more foundational work - almost everything was built from scratch.