HACKER Q&A
📣 Bostonian

Do you document LLM-generated code as such?


My Python source files are now a mix of functions I wrote and functions written by LLMs. I do review and test the functions written by LLMs.

When code was written by an LLM, do you note that in the code? How specific is the citation -- do you note the date the code was generated? LLM capabilities can vary over time. Do you list the prompt? (I don't, but I typically ask the LLM to provide a docstring.)
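
To make the question concrete, here is a minimal sketch of what such a note could look like if it were attached to the function itself. The decorator name `llm_generated`, the `Provenance` fields, and the model name and date are all hypothetical placeholders, not an established convention.

```python
from dataclasses import dataclass
from typing import Callable, Optional, TypeVar

F = TypeVar("F", bound=Callable)


@dataclass(frozen=True)
class Provenance:
    """Hypothetical record of how a function was produced."""
    model: str                         # placeholder for whatever model was used
    generated_on: str                  # ISO date string, placeholder value below
    prompt: Optional[str] = None       # optional; the asker does not record it
    reviewed_by: Optional[str] = None  # who reviewed and tested the result


def llm_generated(model: str, generated_on: str,
                  prompt: Optional[str] = None,
                  reviewed_by: Optional[str] = None) -> Callable[[F], F]:
    """Attach a Provenance record to a function without changing its behavior."""
    def decorate(func: F) -> F:
        func.__provenance__ = Provenance(model, generated_on, prompt, reviewed_by)
        return func
    return decorate


@llm_generated(model="example-model", generated_on="2024-01-15",
               reviewed_by="your-name")
def clamp(x: float, lo: float, hi: float) -> float:
    """Return x limited to the closed interval [lo, hi]."""
    return max(lo, min(x, hi))
```

A plain comment above the function would carry the same information; the decorator only makes it queryable at runtime via `clamp.__provenance__`.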


  👤 codingdave Accepted Answer ✓
No, but I rarely find that LLM-generated code is just pasted in as-is. It gives me some code that speeds me up, but I still tweak the details and change things to integrate it into the bigger picture. It certainly doesn't bring so much value that I'd be tracking timestamps and prompts... it is just quick boilerplate for tedious stuff that I don't want to burn time on.

👤 ilaksh
I am anticipating that within 2-5 years questions like this will be moot for many projects.

Because there will be development platforms and tools built around AI-generated code. The LLMs will have integrated code execution and libraries/APIs that they are familiar with. If a programmer or user stays within this platform, which will be fairly general purpose, the AI can handle 98% of requests with no help writing code.

These types of platforms will completely normalize AI-written code. People won't think twice about using it or feel they need to make a note of it.

What you might see, say, five years out is kind of the opposite. For some code bases, if a human writes the code and it hasn't been checked by an AI, then they have to make a note in a comment with their name and why they are not using AI verification. But they will probably choose to have an AI verify the code just to avoid the extra procedure.


👤 sk11001
No, I use LLMs to help with code all the time and I have zero functions written entirely by an LLM. It’s more about figuring out small sections of code, or helping with design decisions, not about generating fully working functions.

I don’t document things as “I learned this from Stack Overflow” either; LLMs are very similar.


👤 halosghost
I'm involved with a few projects that leverage the DCO [1]. AFAIK, it's not been legally tested yet, but given all the open questions around copyright with LLMs, I assume a DCO sign-off on LLM-generated code would be a misrepresentation, and any project having received such commits would either need to rewrite those portions from scratch (as done in other cases of infringing code) or simply remove them wholesale.

Put another way, until the IP implications have been figured out, I assume all LLM-generated code is radioactive for FLOSS projects.

[edit to clarify]: so documenting that a commit you're pushing includes code generated with an LLM will make maintainers' lives dramatically simpler. Please, and thank you!

All the best,

-HG

[1] https://developercertificate.org/
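
As a rough sketch of how a maintainer could check for that kind of disclosure: DCO sign-offs are recorded as a `Signed-off-by:` trailer at the end of the commit message, so an LLM disclosure could be recorded the same way. The `Assisted-by:` trailer name below is an assumption for illustration, not something the DCO defines, and the parsing is far simpler than what git itself does.

```python
import re
from typing import Dict, List

# Matches simple "Key: value" trailer lines such as "Signed-off-by: Name <email>".
TRAILER_RE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*):\s*(.+)$")


def parse_trailers(message: str) -> Dict[str, List[str]]:
    """Collect 'Key: value' lines from the last paragraph of a commit message.

    Simplification: a message with only a subject line is treated as having
    no trailers, so a subject like "Fix: typo" is not misread as one.
    """
    paragraphs = [p for p in re.split(r"\n\s*\n", message.strip()) if p.strip()]
    trailers: Dict[str, List[str]] = {}
    if len(paragraphs) < 2:
        return trailers
    for line in paragraphs[-1].splitlines():
        match = TRAILER_RE.match(line.strip())
        if match:
            key = match.group(1).lower()
            trailers.setdefault(key, []).append(match.group(2).strip())
    return trailers


def discloses_llm_use(message: str) -> bool:
    """True if the commit message carries the hypothetical Assisted-by trailer."""
    return "assisted-by" in parse_trailers(message)


disclosed = "Add CSV parser\n\nHandles quoted fields.\n\nAssisted-by: some LLM\n"
undisclosed = "Add CSV parser\n\nHandles quoted fields.\n"
```

Here `discloses_llm_use(disclosed)` is true and `discloses_llm_use(undisclosed)` is false; what a project then does with flagged commits is a policy question, not something this check answers.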


👤 philomath_mn
What would be the purpose of such citations? If you understand the function and it does what it needs to do, does it matter where it came from (within legal limits ofc)?

When I read your code I want to know what the function does and why it does it. LLM generation details would just distract from that.


👤 LabMechanic
Usually not. For me, an LLM is an “autonomous search engine on steroids” over its huge dataset, i.e., just another tool you use.

Before LLMs, you would cobble together a bunch of disjoint information via a search engine like Google. Now, LLMs do this for you, and it certainly helps me get up to speed a lot quicker with libraries or APIs I am not familiar with (e.g., PyGame, Flask, Django). However, you might find that code from the LLM needs some fixing (subtle bugs or redundancies) or a better use of resources.

The other issue is the LLM's dataset bias towards the most used technologies or concepts. So you might have a hard time with an LLM trying to make Clojure/Racket code or telling the LLM to specifically do the point-in-triangle test with the wedge product only.
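
For reference, the wedge-product version of the point-in-triangle test is short. This is a minimal sketch, assuming 2D points as `(x, y)` tuples and a non-degenerate triangle; the function names are illustrative. The idea is that the 2D wedge product of an edge vector with the vector to the point tells you which side of that edge the point is on, so the point is inside when all three signs agree.

```python
from typing import Tuple

Point = Tuple[float, float]


def wedge(u: Point, v: Point) -> float:
    """2D wedge product of u and v: u_x * v_y - u_y * v_x."""
    return u[0] * v[1] - u[1] * v[0]


def sub(p: Point, q: Point) -> Point:
    """Vector from q to p."""
    return (p[0] - q[0], p[1] - q[1])


def point_in_triangle(p: Point, a: Point, b: Point, c: Point) -> bool:
    """True if p lies inside or on the boundary of triangle abc.

    Works for either winding order. Assumes a, b, c are not collinear.
    """
    w1 = wedge(sub(b, a), sub(p, a))  # side of edge a->b
    w2 = wedge(sub(c, b), sub(p, b))  # side of edge b->c
    w3 = wedge(sub(a, c), sub(p, c))  # side of edge c->a
    has_neg = w1 < 0 or w2 < 0 or w3 < 0
    has_pos = w1 > 0 or w2 > 0 or w3 > 0
    # Mixed signs mean p is on the outer side of at least one edge.
    return not (has_neg and has_pos)
```

With the triangle `(0, 0), (4, 0), (0, 4)`, the point `(1, 1)` is inside and `(5, 5)` is outside.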

Hence, there is still some leeway or reason to use your thing between the ears.

You might as well ask: Are you referencing Stack Overflow or the Microsoft Developer Reference (e.g., in your developer notes/comments)?

My answer: usually, yes.


👤 dankwizard
No, I take all of the credit. Job security.