Langchain is poorly documented and has all sorts of problems.
LMQL seems to be up-to-date, but it's not guaranteed to work: one of their documented examples uses `from lmql.lib.chat import message`, and Python fails with a module-not-found error for `lmql.lib.chat`.
LLM (CLI tool) is mostly for, well, TUI apps.
AIchat is abandoned.
Lots of projects around Langchain are also abandoned.
Does this all mean that the industry came to its senses and realized LLMs are just API calls all the way down? Or is it because each firm now develops its own LLM stack?
These libraries make too many choices for the user given the variability, and generally make the most interesting use-cases more difficult. It's like the failed attempts to create a single API for all the clouds: it's impossible. In the long run, they will all get in the way.
What is it about LLMs that makes people think we need something like langchain? A function or package for dealing with vector embeddings, ok, but to wrap API calls... I can write that function myself, I don't need their subpar interfaces for messages and all the things...
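To be concrete, the "wrap an API call" part most frameworks sell is a few lines of stdlib Python. A minimal sketch, assuming an OpenAI-style chat-completions endpoint (the URL, model name, and JSON field names here are just what that particular API happens to use, not anything universal):

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"  # assumed OpenAI-style endpoint

def build_payload(messages, model="gpt-4o-mini", temperature=0.0):
    """Assemble the JSON body an OpenAI-style chat endpoint expects."""
    return {"model": model, "messages": messages, "temperature": temperature}

def chat(api_key, messages, **kwargs):
    """One plain HTTP call -- no framework, no wrapper classes."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(messages, **kwargs)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        # OpenAI-style responses put the reply text here
        return json.load(resp)["choices"][0]["message"]["content"]
```

That's the whole "abstraction": a dict, a POST, and a key path into the response. Anything a framework adds on top of this is opinion.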
Developers want building blocks for LLMs, not a bunch of frameworks competing to be the one that wins by pushing out shitty implementations as fast as possible.
What I’m talking about is the need to do
https://scikit-learn.org/stable/model_selection.html
which is not just about training one model, but about evaluating at least one model and usually generating quite a few models with different parameters. The latter is more essential than ever now that we don’t really do early stopping with neural networks anymore. (Back in the day I built products that could reliably build a useful model in one try with early stopping, which is why fine-tuning LLMs feels so nerve-wracking to me.)
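Concretely, the kind of loop that page describes is a few lines in sklearn. A sketch on the iris toy dataset (swap in your own data, estimator, and grid):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# train and cross-validate one model per parameter setting,
# then keep the best -- this is "model selection" in one object
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Evaluation and parameter sweep are one operation here, which is exactly what's missing once your models stop fitting this mold.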
For instance, the scikit-learn tools I mention above work great if I am training models that take a few minutes to train on one machine. I have a set of sk-learn models that I train and use every day.
I’ve tried fine-tuning models from huggingface to do the same task; on a good day they perform about as well but take 30 minutes to train instead of 30 seconds. I really ought to be able to run both kinds of models in the same “model selection” framework and pick the best, but boy is it a hassle, because it’s not just about model selection: it’s about the whole data handling and training process that runs inside the model selection loop. For my problems I don’t really like the way huggingface handles data as much as sk-learn/pandas/numpy, and I don’t like the model selection facilities in huggingface either.
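In principle you can shoehorn an external training loop into sklearn's model selection by writing an adapter estimator. A hypothetical sketch (the `CallableClassifier` name and the `fit_fn`/`predict_fn` hooks are mine; a real Hugging Face fine-tune would go behind those hooks, and the toy functions below just stand in for it so the sketch runs):

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class CallableClassifier(BaseEstimator, ClassifierMixin):
    """Adapter so an external training loop (e.g. a Hugging Face
    fine-tune) can be driven by sklearn's model-selection tools."""

    def __init__(self, fit_fn=None, predict_fn=None, lr=1e-3):
        # hyperparameters must be stored under their __init__ names
        # so GridSearchCV can clone the estimator and re-set them
        self.fit_fn = fit_fn
        self.predict_fn = predict_fn
        self.lr = lr

    def fit(self, X, y):
        self.model_ = self.fit_fn(X, y, lr=self.lr)
        return self

    def predict(self, X):
        return self.predict_fn(self.model_, X)

# toy stand-in for an expensive fine-tuning loop: predict the majority class
def toy_fit(X, y, lr):
    return np.bincount(y).argmax()

def toy_predict(model, X):
    return np.full(len(X), model)

clf = CallableClassifier(fit_fn=toy_fit, predict_fn=toy_predict)
clf.fit(np.zeros((4, 2)), np.array([0, 1, 1, 1]))
preds = clf.predict(np.zeros((2, 2)))
```

This gets you into `GridSearchCV`/`cross_val_score`, but it does nothing about the real friction: the data handling on the huggingface side still doesn't match what sklearn expects to pass in.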
I worked at a place where our model trainer would train foundation models (a few days on a specialized machine; it could just as easily have been a swarm of cloud GPU instances) and also train models that incorporated those foundation models (a few minutes to train). We developed our own model selection facilities that worked in the large, and at the time I didn’t appreciate the facilities in skl. What we built wasn’t that good, and I take 100% of whatever blame can be attributed in our company for that, but the fact is that every other framework in the industry sucks (ok, I like SKL for small problems, but for large problems it just doesn’t show up).
I’ve built my own code that approximates some of what langchain does. I wouldn’t say it “sucks” the way langchain does, but it is very simple and very targeted at one problem; if I tried to grow it up like langchain I’d be swearing at it too.
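For what it’s worth, the core of that kind of code is little more than prompt templates plus function composition. A hypothetical sketch (names are mine, and the `fake_llm` stands in for a real API call so it runs offline):

```python
def template(tmpl):
    """Turn a format string into a step that fills in a dict of fields."""
    return lambda fields: tmpl.format(**fields)

def chain(*steps):
    """Compose steps left to right; each step's output feeds the next."""
    def run(x):
        for step in steps:
            x = step(x)
        return x
    return run

# stand-in for a real model call, so the sketch runs without an API key
fake_llm = lambda prompt: prompt.upper()

pipeline = chain(template("Summarize: {text}"), fake_llm)
result = pipeline({"text": "hello"})
```

Twenty lines, no framework, and it stays legible precisely because it refuses to be general.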
This discussion really caught my eye because it shows how LLM training is still pre-paradigmatic. I’ve seen this problem and thought I was alone
https://news.ycombinator.com/item?id=37399873
but now I know I am not. Situations like that leave me, as an engineer of practical solutions, feeling really out of control, but academics don’t seem to mind so much because they just need numbers for their conference paper. What gets me is that so many people were seeing this problem and not writing it up, even though it’s pretty deeply disturbing: it makes it look like we are training, training, and training our models, and most of it is wasted work.
My money would be on this. Custom LLM frameworks are relatively easy for experienced devs to build, and probably less work than adapting an off-the-shelf one, so rolling your own is a no-brainer.
Anyway, I think LangChain and similar projects have been mainly useful as proofs-of-concept and by creating awareness.