If, for example, you run a freemium web app where each user request takes 100ms of inference time, you may have trouble turning a profit. Even something as basic as a translation tool (à la Google Translate) seems tricky for a startup to run profitably. Larger companies presumably have the flexibility to treat some of these services as loss leaders, whereas a startup with ML as its core product likely needs to be more cost-conscious.
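To make the 100ms figure concrete, here's a rough back-of-the-envelope sketch. The hourly instance price, utilization, and requests-per-user numbers are hypothetical placeholders I made up for illustration, not real quotes from any provider, and the function names are my own. It also assumes requests are served one at a time (no batching).

```python
def cost_per_request(instance_cost_per_hour, inference_seconds, utilization=1.0):
    """Estimate the compute cost of a single request, in the same currency
    as instance_cost_per_hour.

    utilization is the fraction of paid-for time the instance actually spends
    serving requests (idle time still costs money), so lower utilization
    raises the effective cost per request.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    cost_per_second = instance_cost_per_hour / 3600
    return cost_per_second * inference_seconds / utilization


def monthly_cost_per_user(requests_per_month, instance_cost_per_hour,
                          inference_seconds, utilization=1.0):
    """Estimate the monthly compute cost attributable to one user."""
    per_request = cost_per_request(
        instance_cost_per_hour, inference_seconds, utilization
    )
    return requests_per_month * per_request


if __name__ == "__main__":
    # All inputs below are hypothetical placeholders.
    per_user = monthly_cost_per_user(
        requests_per_month=1000,      # assumed usage of one free-tier user
        instance_cost_per_hour=1.00,  # assumed hourly price of one instance
        inference_seconds=0.1,        # the 100ms per request from above
        utilization=0.5,              # assumed: instance idle half the time
    )
    print(f"Estimated compute cost per user per month: {per_user:.4f}")
```

Whatever the real inputs turn out to be, the per-user number has to come in under the average revenue per user (including the non-paying ones) for the freemium model to work, which is why I'm asking.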
What are some ways that you've seen companies cut down on compute costs for machine learning services?