Is there a secret sauce for Chat with Structured Data applications?
In the past few months, we've seen the rise of LLM-powered Chat with Data companies, most of which are only a thin wrapper around ChatGPT/LangChain. However, I've realized that the recipe for building Chat with Unstructured Data applications is very well known and widely publicized, to the point of drawing renewed attention to the dense retrieval problem and fostering the creation of multi-million dollar vector database companies such as Pinecone and Chroma. Chatting with Structured Data has not received the same level of attention, even though a few small companies are launching applications (on top of LangChain's CSV agent or the LlamaIndex query engine, I presume) that allow chatting with CSVs and SQL databases. I was wondering if anyone here at HN has worked on similar applications and would be willing to share what they learned. What is the secret sauce for Chat with Structured Data applications? Do the companies building this kind of application have any kind of moat besides good prompts and the fact that this niche has been less explored than chatting with unstructured data? Are there any recommendations or good practices for making the most of chatting with structured data, or even for adapting LLMs to make them better at answering questions about a slice of data?
I think the reason this kind of application isn’t popular is that it does not exist; it is that simple! It is impossible to create a reliable data analytics tool, because it is really unpredictable how best to handle the data. Microsoft promised Office 365 with Copilot, and look where we are right now.