HACKER Q&A
📣 devstein

Best Embedding Models?


Hey HN, which embedding models are people using? There has been so much development around foundational LLMs, but I haven't seen much news about embedding models.


  👤 rapatel0 Accepted Answer ✓
I've liked Qwen and EmbeddingGemma for local search: Qwen because its 32K context window is enough to fit basically a whole page, and EmbeddingGemma because it's crazy efficient.

👤 jayshah5696
Embeddings are easy to fine-tune. Try ModernBERT.

👤 PhilippGille
Benchmarks only paint part of the picture, but it's still a decent place to start looking into recent models:

https://huggingface.co/spaces/mteb/leaderboard


👤 didgeoridoo
I’m partial to jina.ai — they have open models for code and prose, all easily runnable locally.

👤 LogicCraft678
Feels like embeddings are underrated compared to the LLM hype, but they're doing great.

👤 halvorbuilds
gemma4

👤 frederickabrah
who knows a tool for rug check in crypto

👤 emschwartz
I’ve been using MixedBread, which is a pretty old model at this point. Recently, I tried comparing it to some newer models and was disappointed that the results weren’t dramatically and uniformly better.

You probably can’t go wrong if you pick a recent one that scores decently well on benchmarks and is at the right price point (or memory requirement) for whatever you’re trying to do.
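If you want to run that kind of comparison on your own data rather than rely only on benchmarks, a small recall@k check is usually enough to see whether a newer model actually helps. Below is a minimal sketch, not anyone's actual setup from this thread. The names `make_toy_embedder`, `constant_embedder`, and `recall_at_k` are hypothetical, and the toy embedder is a bag-of-words stand-in, not a real model. To compare real models, you would replace it with a function that wraps the model's own encode call and returns a list of floats.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Assumed interface: any model under test is wrapped as text -> list of floats.
EmbedFn = Callable[[str], List[float]]


def make_toy_embedder(corpus: Sequence[str]) -> EmbedFn:
    """Hypothetical stand-in for a real model: word counts over the corpus vocabulary."""
    tokens = sorted({tok for text in corpus for tok in text.lower().split()})
    vocab = {tok: i for i, tok in enumerate(tokens)}

    def embed(text: str) -> List[float]:
        vec = [0.0] * len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # words outside the vocabulary are ignored
                vec[vocab[tok]] += 1.0
        return vec

    return embed


def constant_embedder(text: str) -> List[float]:
    """Deliberately useless baseline: every text maps to the same vector."""
    return [1.0]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall_at_k(
    embed: EmbedFn,
    docs: Sequence[str],
    labelled_queries: Sequence[Tuple[str, int]],
    k: int = 1,
) -> float:
    """Fraction of queries whose known-relevant doc index appears in the top k results."""
    doc_vecs = [embed(d) for d in docs]
    hits = 0
    for query, relevant_idx in labelled_queries:
        q = embed(query)
        ranked = sorted(
            range(len(docs)),
            key=lambda i: cosine(q, doc_vecs[i]),
            reverse=True,
        )
        hits += relevant_idx in ranked[:k]
    return hits / len(labelled_queries)


# Made-up example data: each query is paired with the index of the doc it should find.
docs = [
    "how to fine tune an embedding model",
    "local search over markdown notes",
    "benchmark leaderboard for text retrieval",
]
labelled = [
    ("fine tune embedding", 0),
    ("search my notes", 1),
    ("retrieval benchmark", 2),
]

candidates = {
    "toy-bag-of-words": make_toy_embedder(docs),
    "constant-baseline": constant_embedder,
}
for name, embed in candidates.items():
    print(name, "recall@1 =", round(recall_at_k(embed, docs, labelled, k=1), 3))
```

With a few dozen labelled queries from your own use case, the same loop shows whether a model is worth the extra cost or memory for what you're trying to do.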