Best Embedding Models?

Question

Hey HN, which embedding models are people using? There has been so much development around foundational LLMs, but I haven't seen much news about embedding models.

rapatel0 · Accepted Answer

I've liked Qwen and embeddinggemma for local search: Qwen because its 32K context window is enough to fit basically a whole page, and embeddinggemma because it's crazy efficient.
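For anyone who hasn't wired this up before, local search boils down to: embed the documents, embed the query, rank by cosine similarity. Here is a rough sketch in plain Python. Note that `toy_embed` is a made-up stand-in (a hashed bag of words), not Qwen or embeddinggemma, and `search` is just an illustrative helper name; you would pass your real model's encode function as `embed`.

```python
import hashlib
import math


def toy_embed(text, dim=256):
    """Hypothetical stand-in for a real embedding model: hashed bag of words."""
    vec = [0.0] * dim
    for word in text.lower().split():
        # md5 gives a stable bucket across runs (the built-in hash() is salted).
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, docs, embed=toy_embed, k=3):
    """Return the top-k (score, doc) pairs, highest cosine similarity first."""
    q = embed(query)
    scored = [(cosine(q, embed(d)), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


docs = [
    "how to fine tune an embedding model",
    "best pizza places in town",
    "embedding model benchmarks and leaderboards",
]
results = search("embedding model benchmarks", docs, k=2)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

In a real setup you would embed the documents once and cache the vectors instead of re-embedding on every query; the sketch skips that to stay short.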

jayshah5696 · Answer

Embeddings are easy to fine-tune. Try ModernBERT.

PhilippGille · Answer

Benchmarks only paint part of the picture, but it's still a decent place to start looking into recent models: https://huggingface.co/spaces/mteb/leaderboard

didgeoridoo · Answer

I'm partial to jina.ai: they have open models for code and prose, all easily runnable locally.

LogicCraft678 · Answer

Feels like embeddings are underrated compared to the LLM hype, but they're doing great.

halvorbuilds · Answer

gemma4

frederickabrah · Answer

Does anyone know a tool for rug checks in crypto?

emschwartz · Answer

I've been using MixedBread, which is a pretty old model at this point. Recently, I tried comparing it to some newer models and was disappointed that the results weren't dramatically and uniformly better. You probably can't go wrong if you pick a recent one that scores decently well on benchmarks and is at the right price point (or memory requirement) for whatever you're trying to do.
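If you want to run that kind of comparison on your own data, one simple way (an assumption on my part, not necessarily how the comparison above was done) is recall@k over a handful of hand-labeled query/document pairs. A rough sketch in plain Python follows. The two "models" are made-up toy functions so the script runs on its own, and `recall_at_k` is an illustrative helper name; swap in the encode functions of the models you are actually evaluating.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall_at_k(embed, docs, labeled_queries, k=1):
    """Fraction of queries whose known-relevant doc index lands in the top k."""
    doc_vecs = [embed(d) for d in docs]
    hits = 0
    for query, relevant_idx in labeled_queries:
        q = embed(query)
        ranked = sorted(
            range(len(docs)),
            key=lambda i: cosine(q, doc_vecs[i]),
            reverse=True,
        )
        hits += relevant_idx in ranked[:k]
    return hits / len(labeled_queries)


# Hypothetical toy "models", only here so the example is self-contained.
def letter_count_model(text):
    # One dimension per letter a-z, holding that letter's count.
    return [float(text.lower().count(c)) for c in "abcdefghijklmnopqrstuvwxyz"]


def length_model(text):
    # Deliberately weak: only character count and word count.
    return [float(len(text)), float(len(text.split()))]


docs = [
    "fine tuning embedding models",
    "pizza dough recipe",
    "crypto wallet security",
]
# (query, index of the doc a human judged relevant)
labeled = [
    ("how do I fine tune an embedding model", 0),
    ("recipe for pizza dough", 1),
    ("keeping a crypto wallet secure", 2),
]
models = {"letter_count": letter_count_model, "length": length_model}
scores = {name: recall_at_k(fn, docs, labeled, k=1) for name, fn in models.items()}
print(scores)
```

A few dozen labeled queries from your real workload will usually tell you more than a leaderboard position, and the same loop makes it easy to weigh quality against price or memory.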