[1] https://arxiv.org/abs/1907.10597
[2] https://arxiv.org/abs/1910.09700
[3] https://github.com/mlco2/impact
Anything else (other sources of emission) is simply a distraction from this simple & obvious solution.
I think there is a real blind spot in terms of the carbon emissions of ML, somewhat on the research side, but particularly on the production side. Let's take YouTube for example (note: I don't work for YouTube/Google, but I am speculating based on experience with similar complex systems).
A typical YouTube user will interact with a variety of machine learning powered systems:
- Newly uploaded videos to watch (recommendations)
- Recommendations defined by category
- Auto-play next video
- Search
- Ads
- Automated video captions
Each of these is likely owned by a separate team, or by multiple teams. Each will be powered by one or more machine learning models as part of a complex pipeline. These ML models may analyze video content (computationally expensive), audio, user history, etc. Some of them will run in real time; others will recompute huge datasets on a daily basis.
What's my point? Very few people have any insight into the carbon impact of all this compute. Perhaps it's negligible, perhaps it's huge; we just don't know. In the last two years, there has been a huge focus on the carbon emissions of PoW cryptocurrencies (e.g. Bitcoin). The impact is real, but I also believe crypto is an easy target. The data is public: the hash rate is known and total power use is easy to estimate. The same can't be said for ML. There are a lot of hidden variables, and few people are aware of the vast amount of compute that goes into serving their favourite apps.
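To show why the crypto side is "easy to estimate", here is a rough back-of-the-envelope sketch in Python. It assumes network power is roughly hash rate times average miner efficiency, and that emissions are energy times grid carbon intensity. The function names and all three input numbers are hypothetical placeholders I made up for illustration, not measured values.

```python
HOURS_PER_YEAR = 24 * 365


def network_power_watts(hash_rate_th_per_s: float, joules_per_th: float) -> float:
    """Power draw in watts: (TH/s) * (J/TH) = J/s = W."""
    return hash_rate_th_per_s * joules_per_th


def annual_energy_twh(power_watts: float) -> float:
    """Energy over one year in TWh (1 TWh = 1e12 Wh)."""
    return power_watts * HOURS_PER_YEAR / 1e12


def annual_emissions_mt_co2(energy_twh: float, kg_co2_per_kwh: float) -> float:
    """Emissions in megatonnes of CO2.

    1 TWh = 1e9 kWh and 1 Mt = 1e9 kg, so the two factors cancel.
    """
    return energy_twh * kg_co2_per_kwh


if __name__ == "__main__":
    # Placeholder inputs (assumptions, not real measurements):
    hash_rate = 200e6   # TH/s, i.e. 200 EH/s
    efficiency = 50.0   # J/TH, assumed fleet-wide average
    intensity = 0.5     # kg CO2 per kWh, assumed grid mix

    power = network_power_watts(hash_rate, efficiency)
    energy = annual_energy_twh(power)
    emissions = annual_emissions_mt_co2(energy, intensity)
    print(f"power: {power / 1e9:.1f} GW")
    print(f"energy: {energy:.1f} TWh/year")
    print(f"emissions: {emissions:.1f} Mt CO2/year")
```

For PoW, the first input is public and the other two can be bounded reasonably well. For a production ML pipeline there is no public equivalent of the hash rate, which is the whole problem.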
Final note: I'm only using YouTube as a common relatable example here. Almost every complex modern service will look similar in terms of number of ML models and downstream carbon impact.
Training is infrequent, and everyone involved cares about making training more efficient and faster.
If you care about emissions, then tax, penalize, and regulate emissions. Everyone doing ML will then look at what's costly, and adjust how they use ML models to fit their budget.
Frankly, ML is the best and most efficient way to solve many important problems. For some problems, it's the only technique that works. Trying to limit ML as though it's a special source of emissions is one of the most counterproductive lines of thinking I've ever come across.
So, no, machine learning is like at the bottom of the list in terms of concerns.