HACKER Q&A
📣 version_five

Do You Trust HuggingFace?


They are a for-profit company with a lot of VC money that is cornering the market as a repository for open source (or freely available) ML models. They also now have an arXiv-like repository of ML papers, plus their own Transformers, Diffusers, Accelerate, and other frameworks... it's too good to be true and it doesn't sit well with me. Do they have a clear way of making money that doesn't end with them screwing us? Until I understand what they're up to I don't feel very comfortable relying on them.


  👤 armchairhacker Accepted Answer ✓
They make money from their cloud services: storage, GPUs, dataset hosting, model training, model running, code sharing, model distribution. They're a one-stop shop for nearly any ML cloud solution you'd need.

The transformers library and other products boost their image, making more people know and trust them, making more people pay to use their cloud services.

That's not to say they won't screw us if we start trusting them too much and relying on them with no alternatives, or that they won't ruin a good thing in order to make even more money.

But I don't really see how they can screw us right now. All the code they release is open-source (transformers under Apache 2.0), and the data they provide isn't theirs. They're the de facto standard in every academic and open-source setting I'm familiar with; that doesn't mean these people need to use HuggingFace, it means they want to. And they have a way to make money (cloud services) which not only should be profitable on its own, but encourages them to be charitable and keep a good reputation to get and retain customers.


👤 brucethemoose2
What even is the enshittification potential?

Put models behind subscriptions? Eh, doesn't seem that lucrative.

More inference/training UIs and APIs? That's not a bad thing at all.

The HF libraries are Apache 2.0, so whatever they do to those can be forked away.

Maybe a "business" tier?


👤 PaulHoule
The notable thing missing from their platform is a sane way to privately manage models that you create. In fact, when you look at the docs it is all cart-before-the-horse: as soon as you train a model they “nudge” you to upload it to a public repository, but details on how to do inference (simple things like truncating the tokens before doing inference in a pipeline) are not in the docs.
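
For what it's worth, here is a minimal sketch of the truncation step, not an official recipe. It assumes the `transformers` text-classification pipeline forwards extra keyword arguments such as `truncation` and `max_length` to its tokenizer, and that `model_name` (a hypothetical parameter name) points at a checkpoint you already have, for example a local directory. `truncate_ids` is only an illustrative helper showing what truncation does to a list of token ids.

```python
from typing import List


def truncate_ids(token_ids: List[int], max_length: int) -> List[int]:
    """Keep the first max_length token ids and drop the rest."""
    if max_length < 0:
        raise ValueError("max_length must be non-negative")
    return token_ids[:max_length]


def classify_truncated(text: str, model_name: str, max_length: int = 512):
    """Run a text-classification pipeline with tokenizer-level truncation.

    model_name may be a local directory holding your own checkpoint;
    nothing in this function uploads the model anywhere.
    """
    # Imported lazily so this file loads even without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_name)
    # Assumption: the pipeline passes these keyword arguments through to the
    # tokenizer, so over-long inputs are cut before they reach the model.
    return classifier(text, truncation=True, max_length=max_length)


if __name__ == "__main__":
    # Pure-Python illustration of the truncation itself.
    print(truncate_ids(list(range(10)), 4))
```

Something like `classify_truncated(long_text, "./my-model")` would then run entirely against a local directory, which is about as close to "private" as the library gets without a paid feature.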

I am all for uploading a model to them, but boy am I going to want to kick the tires using it for real first. It’s clear a paid product for them would involve quite a few features that aren’t there now but that you’ll need once you are using models to make decisions.


👤 rektide
I don't understand HuggingFace's model at all. I went to try out StarCoder LLM & happily downloaded the git repo. Installed some pre-reqs. But the next step was to create a Huggingface account, to use this locally? I noped the heck out.

I came back later to try to understand what was going on and why I'd need an account. I could find nothing helpful to tell me what was going on. I have never seen a more nebulous relationship with a company. I could be missing lots of hints, but I tried real hard to understand what's happening here & all of it felt extremely suspicious.


👤 garbagecoder
Probably will enshittify at some point. Enjoy it while it lasts.