HACKER Q&A
📣 flownoon2

Does your work involve using data to make falsifiable predictions?


I've been a data scientist in some form for nearly 15 years, first as a PhD student and then in various industry and NGO jobs. My post-PhD work has been with various DaaS startups that try to sell some sort of data-driven "insight" to customers, which honestly involves a lot of iterating until the results "look right".

Recently I read a great analysis [1] of the climate analytics industry that cast much of the work I've done in a new light. There are lots of startups selling forecasts of the future risk of hurricanes, floods, wildfires, etc., for the year 2050 or even 2100. The issue is that there is no way for a customer, or even a data producer, to know the accuracy of those forecasts, at least not for another 30-80 years. Other industries I've worked in have similar problems of selling predictions that can't be readily tested. In such a context, the business has little incentive to invest in good data science, and a data scientist has no way to _know_ whether they're doing good data science.

Realizing that for much of my career I've been pretty much a glorified used car salesman has really changed the way I see my work, and I'm considering a career switch.

Do any of you work in jobs that involve making falsifiable forecasts, and how do you like your jobs? Is there anything out there besides finance or optimizing for clicks on ads?

[1] https://www.crucialab.net/post/why-no-climate-warranties/


  👤 mhrmsn Accepted Answer ✓
I work in data science for a pharma/chemical company. Broadly speaking, our team is applying machine learning to chemistry and biology-related problems. Those are usually falsifiable and can and will be validated through lab experiments.

The main problem here is that experiments tend to be expensive: depending on the problem, a single data point can easily cost anywhere from $100+ (sample preparation and measurements) to $100k+ (e.g. synthesis of a new compound). So our datasets are often small, and there is some barrier to getting lab colleagues to trust or try out a new ML model instead of their status quo.

But it is quite rewarding when it works and one also gets to interact with people from different disciplines on all sorts of interesting problems :)


👤 nequo
I have no personal experience with this in industry. I assume that any job where forecasts can be tested in a relatively tight feedback loop could be a good candidate. Examples: supply chain management, insurance, pricing on online platforms.
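
To make "tested in a relatively tight feedback loop" concrete, here is a minimal sketch of how one might score forecasts once outcomes arrive. It assumes probabilistic forecasts of yes/no events and uses the Brier score (mean squared error between predicted probability and outcome), compared against a naive base-rate baseline. The function name and the numbers are hypothetical, made up purely for illustration.

```python
def brier_score(forecast_probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    Lower is better; 0.0 would be a perfect forecaster.
    """
    if len(forecast_probs) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecast_probs:
        raise ValueError("need at least one forecast")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecast_probs, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical data: predicted probability that an event happens,
# and whether it actually happened (1) or not (0).
forecasts = [0.9, 0.2, 0.7, 0.1]
outcomes = [1, 0, 0, 0]

model_score = brier_score(forecasts, outcomes)

# Naive baseline: always predict the observed base rate.
# (Computed in-sample here for brevity; a real check would use past data.)
base_rate = sum(outcomes) / len(outcomes)
baseline_score = brier_score([base_rate] * len(outcomes), outcomes)

print(f"model: {model_score:.4f}, baseline: {baseline_score:.4f}")
```

If the model cannot beat the baseline on outcomes that have actually been observed, that is exactly the kind of falsification a 2050 forecast never has to face.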