Collaborating with nontechnical people is oddly my favorite part of doing MLE work right now. It wasn't the case when I did basic web/db stuff. They see me as a magician. I see them as voodoo priests and priestesses. When we get something trained up and forecasting that we both like, it's super fulfilling. I think for both sides.
Most of my modeling is healthcare related. I tease insights out of a monstrous data lake of claims, Rx, doctor notes, vital signs, diagnostic imagery, etc. What is also monstrous is how accessible this information is. HIPAA my left foot.
Since you seemed to be asking about the temporal realities: it's about 3 hours of meetings a week, plus probably another 3 doing task grooming/preparatory stuff, fixing some ETL problem, or running a one-off query for the business. The rest is swimming around in the data, trying to find a slight edge to forecast something that surprised us to the tune of a million dollars or two, using our historical snapshots. It's like playing Where's Waldo with math, except the Waldo scene ends up being about 50TB in size. :D
The other half is spent doing tech support for the bunch of recently hired "AI scientists" who can barely code, and who spend their days copy/pasting stuff into various chatbot services. Stuff like telling them how to install python packages and use git. They have no plan for how their work is going to fit into any sort of project we're doing, but assert that transformer models will solve all our data handling problems.
I'm considering quitting with nothing new lined up until this hype cycle blows over.
95% of the job is data cleaning, joining datasets together and feature engineering. 5% is fitting and testing models.
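A hedged sketch of what that 95% tends to look like in practice; the tables, columns, and derived features below are hypothetical, just to show the shape of the work:

```python
# Minimal sketch of the "95%": cleaning, joining two hypothetical tables, and
# deriving features. Table names, columns, and the semantics are illustrative.
import pandas as pd

# Hypothetical inputs: one row per order, one row per customer.
orders = pd.read_csv("orders.csv", parse_dates=["order_ts"])
customers = pd.read_csv("customers.csv", parse_dates=["signup_ts"])

# Cleaning: drop exact duplicates, normalise a messy categorical column.
orders = orders.drop_duplicates()
orders["channel"] = orders["channel"].str.strip().str.lower().fillna("unknown")

# Joining: attach customer attributes to every order.
df = orders.merge(customers, on="customer_id", how="left", validate="many_to_one")

# Feature engineering: simple derived columns a model can actually use.
df["days_since_signup"] = (df["order_ts"] - df["signup_ts"]).dt.days
df["order_dow"] = df["order_ts"].dt.dayofweek
per_customer = (
    df.groupby("customer_id")["order_value"]
      .agg(order_count="count", avg_order_value="mean")
      .reset_index()
)
features = df.merge(per_customer, on="customer_id", how="left")
```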
Environment broken
Spend 4 hours fixing python environment
pip install Pillow
Something something incorrect cpu architecture for your Macbook
Spend another 4 hours reinstalling everything from scratch after nuking every single mention of python
pip install … oh time to go home!
I wonder how “real” ML people deal with the stochastic/gradient results and people’s expectations.
If I do ordinary software work the thing either works or it doesn’t, and if it doesn’t I can explain why and hopefully fix it.
Now with ML I get asked “why did this text classifier not classify this text correctly?” and all I can say is “it was 0.004 points away from meeting the threshold” and “it didn’t meet it because of the particular choice of words or even their order”, which seems to leave everyone dissatisfied.
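For anyone who hasn't seen it from the inside, here's a toy sketch (not the commenter's actual model; the texts and labels are made up) of why the answer really does bottom out in a score and a cutoff:

```python
# Toy illustration: the classifier only emits a score, and a hard threshold turns
# it into the yes/no label that users see. A text scoring 0.496 against a 0.5
# cutoff is "wrong" by 0.004, and that's the whole explanation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["refund my order", "where is my package", "great product, thanks",
         "love it", "broken on arrival, want my money back", "fast shipping"]
labels = [1, 1, 0, 0, 1, 0]  # 1 = complaint, 0 = not (made-up data)

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)

query = "package never arrived"
score = clf.predict_proba([query])[0, 1]   # probability of the "complaint" class
label = int(score >= 0.5)                  # the hard cutoff stakeholders see
print(f"score={score:.3f}, label={label}")
# Swapping a single word changes the features, hence the score, hence possibly the label.
```

The model never says why in human terms; it lands on one side of a number, and everything downstream is that number.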
I build the systems to support ML systems in production. As others have mentioned, this includes mostly data transformation, model training, and model serving.
Our job is also to support scientists to do their job, either by building tools or modifying existing systems.
However, looking outside, I think my company is an outlier. It seems that in the industry the expectations for an ML Engineer are more aligned with what a data/applied scientist does (e.g. building and testing models). That introduces a lot of ambiguity into the expectations for each role at each company.
Highly paid motherboard troubleshooter, because all those H100s really get hot, even with watercooling, and we have no dedicated HW guy.
Fighting misbehaving third-party deps, like everyone else.
- Collaboration with stakeholders & TPMs and analyzing data to develop hypotheses to solve business problems with high priority
- Framing business problems as ML problems and creating suitable metrics for ML models and business problems
- Building PoCs and prototypes to validate the technical feasibility of the new features and ideas
- Creating design docs for architecture and technical decisions
- Collaborating with the platform teams to set up and maintain the data pipelines based on the needs of new and existing ML projects
- Building, deploying, and maintaining ML microservices for inference
- Writing design docs for running A/B tests and performing post-test analyses
- Setting up pipelines for retraining of ML models
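For that last item, a minimal sketch of what one retraining step can look like, assuming a hypothetical labelled snapshot in Parquet and a simple challenger-vs-champion promotion rule; the orchestration around it (cron, Airflow, whatever you run) is left out:

```python
# Hedged retraining-step sketch. Paths, the model family, and the metric are
# placeholders; the point is "retrain on fresh data, promote only if better".
import joblib
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def retrain(data_path: str, current_model_path: str, candidate_model_path: str) -> bool:
    """Retrain on a fresh snapshot and promote only if the candidate beats the current model."""
    df = pd.read_parquet(data_path)                      # fresh labelled snapshot
    X, y = df.drop(columns=["label"]), df["label"]
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

    candidate = GradientBoostingClassifier().fit(X_train, y_train)
    cand_auc = roc_auc_score(y_val, candidate.predict_proba(X_val)[:, 1])

    current = joblib.load(current_model_path)            # today's production model
    curr_auc = roc_auc_score(y_val, current.predict_proba(X_val)[:, 1])

    if cand_auc > curr_auc:
        joblib.dump(candidate, candidate_model_path)      # a deploy job picks this up
        return True
    return False
```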
Not my main work, but I spend a lot of time gluing things together. Tweaking existing open source. Figuring out how to optimize resources, retraining models on different data sets. Trying to run poorly-put-together Python code. Adding missing requirements files. Cleaning up data. Wondering what would actually be useful to solve with ML that wasn't already done years ago. Browsing the prices of the newest GPUs and calculating whether it would be worth it to buy one rather than renting overpriced hours from hosting providers. Reading papers until my head hurts, one by one; it usually hurts by the time I've finished the abstract and glanced over a few diagrams in the middle.
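That buy-vs-rent question is just a break-even calculation; here is the back-of-the-envelope version with placeholder numbers (swap in real quotes for the card, your electricity tariff, and the hosted rate):

```python
# Back-of-the-envelope buy-vs-rent check. All numbers are placeholders.
purchase_price = 2_000.0     # one-off cost of the card (USD)
power_draw_kw = 0.35         # rough draw under load (kW)
electricity_per_kwh = 0.30   # local tariff (USD/kWh)
rental_rate = 0.80           # hosted GPU price (USD/hour)

# Owning costs purchase + electricity; renting costs the hourly rate.
hourly_saving = rental_rate - power_draw_kw * electricity_per_kwh
break_even_hours = purchase_price / hourly_saving
print(f"Buying pays off after ~{break_even_hours:,.0f} GPU-hours "
      f"(~{break_even_hours / 24:,.0f} days of continuous use)")
```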
Sometimes it also involves building internal tooling for our team (we are a mixed team of researchers/MLEs) to visualize the data and the inferences, because again, it's a pretty niche sector and that means having to build that ourselves. That has let me have a lot of impact in my org, as we basically have complete freedom w.r.t. tooling and internal software design, and one of the tools that I built basically on a whim is now on its way to being shipped in our main products too.
I take responsibility for the end-to-end experience of said API, so I will do whatever gives the best value per time spent. This often has nothing to do with the ML models.
I'd argue that if you are not spending >50% of your time in model development and research then it is not a machine learning role.
I'd also say that nothing necessitates the vast majority of an ML role being about data cleaning, etc. I'd suggest that indicates that the role is de facto not a machine learning role, although it may say so on paper.
In my case, there are established processes and designated teams for cleaning & collecting data, but you still do a part of it yourself to provide guidelines. So, even though data is a perpetual problem, I can shed most of that boring stuff.
Ah, and of course you're not a real engineer if you don't spend at least 1-2% of your time explaining to other people (surprisingly often technical staff, just not ML-oriented) why doing X is a really bad idea. Or just explaining how ML systems work, with ill-fitting metaphors.
* 15% of my time in technical discussion meetings or 1:1's. Usually discussing ideas around a model, planning, or ML product support
* 40% ML development. In the early phase of the project, I'm understanding product requirements. I discuss an ML model or algorithm that might be helpful to achieve product/business goals with my team. Then I gather existing datasets from analysts and data scientists. I use those datasets to create a pipeline that results in a training and validation dataset. While I wait for the train/validation datasets to populate (could take several days or up to two weeks), I'm concurrently working on another project that's earlier or further along in its development. I'm also working on the new model (written in PyTorch), testing it out with small amounts of data to gauge its offline performance, to assess whether or not it does what I expect it to do. I sanity check it by running some manual tests using the model to populate product information. This part is more art than science because without a large scale experiment, I can only really go by the gut feel of myself and my teammates. Once the train/valid datasets have been populated, I train a model on large amounts of data, check the offline results, and tune the model or change the architecture if something doesn't look right. After offline results look decent or good, I then deploy the model to production for an experiment. Concurrently, I may be making changes to the product/infra code to prepare for the test of the new model I've built. I run the experiment and ramp up traffic slowly, and once it's at 1-5% allocation, I let it run for weeks or a month. Meanwhile, I'm observing the results and have put in alerts to monitor all relevant pipelines to ensure that the model is being trained appropriately so that my experiment results aren't altered by unexpected infra/bug/product factors that should be within my control. If the results look as expected and match my initial hypothesis, I then discuss with my team whether or not we should roll it out and if so, we launch! (Note: model development includes feature authoring, dataset preparation, analysis, creating the ML model itself, implementing product/infra code changes)
* 20% maintenance – Just because I'm developing new models doesn't mean I'm ignoring existing ones. I check in on those daily to make sure they haven't degraded or started behaving unexpectedly (a minimal version of that check is sketched after this list). I'm also fixing pipelines and making them more efficient.
* 15% research papers and skills – With the world of AI/ML moving so fast, I'm continually reading new research papers and testing out new technologies at home to keep up to date. It's fun for me so I don't mind it. I don't view it as a chore to keep me up-to-date.
* 10% internal research – I use this time to learn more about other products within the team or the company to see how my team can help or what technology/techniques we can borrow from them. I also use this time to write down the insights I've gained as I look back on my past 6 months/1 year of work.
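The degradation check mentioned in the maintenance bullet, as a hedged sketch; the metrics source, metric name, and thresholds here are placeholders for whatever your monitoring store actually holds:

```python
# Hedged sketch of the daily "has anything degraded?" check: compare the latest
# online metric against a trailing baseline and flag big drops.
import pandas as pd

def check_degradation(metrics_csv: str, metric: str = "auc",
                      window_days: int = 14, max_drop: float = 0.02) -> bool:
    """Return True (and print an alert) if the latest value fell well below the trailing mean."""
    df = pd.read_csv(metrics_csv, parse_dates=["date"]).sort_values("date")
    baseline = df[metric].iloc[-(window_days + 1):-1].mean()   # trailing window, excluding today
    latest = df[metric].iloc[-1]
    degraded = latest < baseline - max_drop
    if degraded:
        print(f"ALERT: {metric} dropped to {latest:.3f} (baseline {baseline:.3f})")
    return degraded
```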
the users are researchers and have deep technical knowledge of their use case. it is still a challenge to map their needs into design decisions of what they want in the end. thanks to open-source efforts, the model creation is rather straightforward. but everything around making that happen and shaping it like a tool is a ride.
especially love the ever-changing technical stack of "AI" services by major cloud providers rn. it makes mlops nothing more than a demo imho.
I worked on other ML projects as well. One system analyzed the syslogs of dozens of computers to look for anomalies. I wrote the SQL (256 fields in the query! Most complex SQL I've ever written) that prefiltered the log data to present it to the ML algorithm, and built a server that sniffed encrypted log data we broadcast on the local network in order to gather the data in one place continuously.

Another system used heart rate variability to infer stress. I helped design a smartwatch and implemented the drivers that took in HRV data from a Bluetooth chest strap. We tested the system on ourselves.

None of our ML projects involved writing new ML algorithms; we just used already-implemented ones off the shelf. The main work was getting the data, cleaning the data, and fine-tuning or implementing new feature extractors. The CS people weren't familiar with the biological aspects (digging into Gray's Anatomy), sensors, wireless, or electronics, so I handled a lot of that. I could have done all the work; it's not hard to run ML algorithms (we often had to run them on computing clusters to get results in a reasonable amount of time, which we automated) or to figure out which features are important, but then the students wouldn't have graduated :-)

Getting labeled data for supervised learning is the most time-consuming part. If you're doing unsupervised learning, you're at the mercy of what is in the data, and you hope what you want is in there with the amount of detail you need. Interfacing with domain experts is important. Depending on the project, a wide variety of different skills can be required, many not related to coding at all. You may not be responsible for them all, but it will help if you at least understand them.
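To make the HRV part concrete, a small sketch of the kind of time-domain features that typically get fed to an off-the-shelf model; this isn't the project's actual code, and the RR intervals are made up:

```python
# Hedged sketch: the chest strap streams RR intervals (ms between beats), and the
# features are simple time-domain statistics over a window of them.
import numpy as np

def hrv_features(rr_ms: np.ndarray) -> dict:
    """Standard time-domain HRV features from a window of RR intervals."""
    diffs = np.diff(rr_ms)
    return {
        "mean_rr": float(np.mean(rr_ms)),
        "sdnn": float(np.std(rr_ms, ddof=1)),           # overall variability
        "rmssd": float(np.sqrt(np.mean(diffs ** 2))),    # short-term variability; tends to drop under stress
        "pnn50": float(np.mean(np.abs(diffs) > 50)),     # fraction of successive diffs > 50 ms
    }

# Example: a calm-ish stretch of beats around 800 ms (~75 bpm), values made up.
rr = np.array([812, 795, 830, 805, 790, 845, 810, 798, 820, 803], dtype=float)
print(hrv_features(rr))
```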