HACKER Q&A
📣 andreyk

Have you fine-tuned LLMs to know the contents of a specific code base?


I am interested in making an LLM aware of the contents of my project, so that it knows what classes, functions, and variables exist outside the current file or prompt. The first idea for "adding" knowledge of the code base (assuming it is too large to fit into the prompt) would be to fine-tune the LLM on the code. Has anyone tried this, or does anyone know of any work on it?


  👤 TroyZ Accepted Answer ✓
Fine-tuning is probably not the way to do it.

Instead, try retrieval: embed the code base, use semantic search to find the parts relevant to the task, and plug those parts into the prompt.
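
To make the retrieval idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words counter standing in for a real embedding model, `CHUNKS` is a made-up three-snippet code base, and `top_k` and `build_prompt` are hypothetical helper names. A real setup would swap in model embeddings and a proper index, but the flow would stay the same: embed, rank by similarity, paste the winners into the prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase word counts.
    A real pipeline would call an embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question, chunks, k=2):
    """Put the retrieved chunks in front of the question."""
    context = "\n\n".join(top_k(question, chunks, k))
    return f"Relevant code from the project:\n{context}\n\nTask: {question}"


# Hypothetical code base, already split into one chunk per definition.
CHUNKS = [
    "def load_config(path): return json.load(open(path))",
    "class UserRepository: def find_by_email(self, email): ...",
    "def send_email(to, subject, body): smtp.sendmail(...)",
]

if __name__ == "__main__":
    print(build_prompt("how do I load the config file", CHUNKS, k=1))
```

Splitting on non-alphanumeric characters means an identifier like `load_config` matches the plain words "load" and "config" in a question, which is roughly the behaviour you would hope to get, more robustly, from a real embedding model.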

You may need:

- a summarizer prompt to summarize your project structure, main functions, and methods
- a vector store/database to store and retrieve the relevant code from your code base
- a coder prompt to write code based on the retrieved parts
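
As a rough sketch of what the summarizer and coder prompts could look like, the Python below builds both as plain strings. The names `outline`, `summarizer_prompt`, `coder_prompt`, and `SAMPLE` are hypothetical, and the prompt wording is only an assumption. The outline is extracted with the standard-library `ast` module so the summarizer sees the project structure rather than the full source. The vector store is left out here; retrieved snippets are simply passed in as a list.

```python
import ast


def outline(source):
    """List top-level functions and classes (with method names) in a Python file."""
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            lines.append(f"def {node.name}({args})")
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}")
            for item in node.body:
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    lines.append(f"  .{item.name}()")
    return "\n".join(lines)


def summarizer_prompt(path, source):
    """Prompt asking the model to summarize one file (wording is an assumption)."""
    return (
        f"Summarize the purpose of {path} in two sentences.\n"
        f"Outline:\n{outline(source)}"
    )


def coder_prompt(task, project_summary, retrieved):
    """Prompt for the coding step: project summary, retrieved code, then the task."""
    snippets = "\n\n".join(retrieved)
    return (
        f"Project summary:\n{project_summary}\n\n"
        f"Relevant code:\n{snippets}\n\n"
        f"Task: {task}"
    )


# Hypothetical file contents used only for the demonstration.
SAMPLE = '''
class Cart:
    def add(self, item): ...
    def total(self): ...

def checkout(cart, user): ...
'''

if __name__ == "__main__":
    print(summarizer_prompt("shop/cart.py", SAMPLE))
```

The model's answer to each summarizer prompt would be stored and later passed as `project_summary`, so the coder prompt carries both the big picture and the specific snippets retrieved for the task.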

Check out langchain: https://langchain.readthedocs.io/