HACKER Q&A
📣 gautamsomani

Should even small scripts have documentation?


I recently joined the biggest FinTech in my country as a Lead SRE. We have 80+ SREs on our team, and the company is 6 years old. It has been 3 weeks, and as I go through the setup, infra and workflows, I keep coming across a lot of legacy Perl scripts. (I personally hate Perl because 1. people do similar things in very different ways and 2. I don't want to learn or even comprehend all of those ways.)

The point of this post is that these scripts, some of them written more than 3 years ago and still running even on modern infra (since they do their job well), have no documentation. To understand how they work, I have to spend a lot of time reading them and correlating them with other scripts, since many scripts work in tandem to accomplish a major task.

Now note, some of these scripts are just 100 lines. But very cryptic 100 lines. That made me think: when they were written by my (former/current) peers who are good at Perl, it was all okay for them, but since these scripts have survived and nobody knows how much longer they will survive, should there be documentation for each of them? Something explaining what they do, with a high-level overview of how they do it? Or would that be too much?

This question doesn't just apply to Perl scripts, but to any script a SysAdmin/SRE/DevOps/Dev may write, no matter how small or trivial. What would be the best thing to do here? It may sound a bit like spoon-feeding for the future generation, even for senior folks who join later and are expected to do things the hard way (maybe?), but it will save time and help build a good understanding of systems and workflows quite fast.

If there is someone who does that, or an org where this culture is practiced, please tell me about it. If someone is against it, please share your views too. Please share all the best practices, as I intend to implement them in my org.

NOTE: In the coming few months I will be writing a small but major system which will replace a few of these scripts with new code but will retain some of the same functionality. I do intend to write comments in each function block and also a few README files, along with an architecture overview and diagrams. But I want to know to what extent I should go to make my work kinda future-proof.


  👤 prirun Accepted Answer ✓
I'm the sole author and developer of HashBackup, written in Python. I just checked, out of curiosity, and 18% of HashBackup's non-blank lines are comment lines. Considering Perl is usually more dense than Python, I'd say your "very cryptic 100 lines" of Perl should have at least 20 comment lines, and probably more, for me to feel comfortable with them.
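
For anyone who wants to run a similar check on their own scripts, here is a rough sketch. How the 18% above was actually measured isn't stated, so the counting rule here is an assumption: only whole-line comments starting with `#` count, a `#!` shebang line does not, and trailing comments after code are ignored. The function name `comment_ratio` and the sample script are made up for illustration.

```python
def comment_ratio(source: str, marker: str = "#") -> float:
    """Return the fraction of non-blank lines that are whole-line comments.

    Assumption: a shebang line ("#!...") is counted as a non-blank line
    but not as a comment, and trailing comments after code are ignored.
    """
    non_blank = [line.strip() for line in source.splitlines() if line.strip()]
    if not non_blank:
        return 0.0
    comments = [
        line for line in non_blank
        if line.startswith(marker) and not line.startswith("#!")
    ]
    return len(comments) / len(non_blank)


# Hypothetical four-line Perl script: one whole-line comment out of four
# non-blank lines.
sample = (
    "#!/usr/bin/perl\n"
    "# Rotate logs older than 7 days.\n"
    "use strict;\n"
    "\n"
    "my $x = 1;  # trailing comments are not counted\n"
)

print(f"{comment_ratio(sample):.0%}")  # prints 25%
```

A number like this says nothing about whether the comments are any good, so treat it as a smoke test for "is there anything here at all", not as a target.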

I know some people have a philosophy of "ignore the comments, just read/trust the code", but it is such a waste of time to revisit code 5 years later and have to remember or re-learn all of the circumstances and edge cases that led to the decisions in this concrete implementation. This is true just for myself, having written the code (!), and would be especially true on a team of 80 SREs: it's more likely you didn't write the code you're looking at, or multiple people have mucked with it, and everyone who has to modify it in the future is going to have to go through this re-learning process to some degree before changing it. Do them a favor and comment everything that is committed to a repo.


👤 geocrasher
I write bash scripts almost solely for web hosting automation. I don't write a lot, but when I need something, I spend a few hours making it. And when I'm done, I document it, because it's quite likely that in several months (or years?) I won't remember where I found that snippet of code, or what that function does, or how many bats' ears and tree knots I had to boil to get that regex to work, or even what the regex does. So, I put comments throughout. It also saves my coworkers time when they're trying to figure out the script.

So yeah, leave comments! They don't have to be long. I leave comments like

  # at this point we know $daemon is dead, so remove socket and kill any stray processes, and restart 

  #IP is validated as being a proper IP, now lets make sure it's not on the server itself. You're welcome, Scott.
So simple, but explains what was going on in my head when I wrote it.

👤 janosdebugs
Yes, especially small scripts should have documentation. The requirement to put some thought into them prevents their proliferation. Utilities like shellcheck in the CI prevent quick and dirty stuff from making it into the repo. Last but not least, the -h option working and outputting something reasonably easy to understand will stop the next person coming after you from hating your guts. :)
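
As a minimal sketch of the `-h` point, assuming a script written in Python: the standard library's `argparse` generates `-h`/`--help` output from the descriptions you give it, so the help text lives next to the options it describes. The script name `rotate-logs`, its arguments, and the paths below are hypothetical, chosen only for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the command-line parser; argparse adds -h/--help automatically."""
    parser = argparse.ArgumentParser(
        prog="rotate-logs",  # hypothetical script name
        description="Compress and delete application logs older than a cutoff.",
        epilog="Example: rotate-logs --days 14 /var/log/myapp",
    )
    parser.add_argument(
        "log_dir",
        help="directory containing the logs to rotate",
    )
    parser.add_argument(
        "--days",
        type=int,
        default=7,
        help="keep logs newer than this many days (default: %(default)s)",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="print what would be removed without deleting anything",
    )
    return parser


# Show the text a user would see when running the script with -h.
print(build_parser().format_help())
```

In a real script you would call `build_parser().parse_args()` at the entry point instead of printing the help. The design choice is that the one-line description and per-option help double as the script's minimum documentation.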

👤 sloaken
IMHO any code that is not documented is "throw away". The time it takes to understand what it does, how it does it, and its limits usually exceeds the time to rewrite it with documentation. Further, a company that has a lot of undocumented code is a large red flag.

👤 h2odragon
The moment when you're looking through this stuff with fresh eyes, as you learn what it's doing, is a great time to write up your appreciation of its function. This may not be "documentation", but when there's nothing else, every little bit helps.

👤 JoeyBananas
It really depends. This is a low-level decision that is best made by someone who understands the goals of the company. That may not be a satisfying answer but it's true.

👤 sys_64738
If you can't describe in a few sentences what a script does that you plan to write then you have no clue what you're trying to do.

👤 themodelplumber
If you do this you will be adding a new layer of expression to the existing pile. A layer that allows you to express your preferred working perspective, but still, the other layers came with working philosophies too.

Expressive and unique code often expresses a set of general workplace-system values. It's like a T-shirt with an obscure slogan on it. You propose to add an explanatory note to the T-shirt, in a way. To some, that's helpful. To others it's a metaphorical marketing disaster. (You explained what, to who?? Man, don't do that for them, they don't deserve it and don't know what they are doing.)

In those shoes, I would ask myself if the current leadership and philosophy of the inner-company has really changed since Perl times and if you have buy-in to take control. Or if you will be packaging control systems for people up the chain who will take that control and use it to wield _you_ more directly. (Perl in such a state as you found it was a great equalizer in those cases. It helped raise the level of technical discourse to where we are at now, in a big way, by allowing this unique prose as code. You want a seat at the table, earn it with technical chops, and this will be future-proofing in that sense)

From a peer perspective, what your documentation indicates to code-expressive personalities (people) is that you intend to focus on broad, systems-level control. This can be really scary to them and maybe also laughably grandiose, and can cause reactions you don't want. So I couldn't tell if you are working with anybody like that, but if so it's something to consider.

You should at the very least take stock of who else documents, and who doesn't.

Overall: If little documentation was done, and you are heavily pro-doc, you need to consider that this is a complete shift in philosophy in some deep ways. If you are the variable here, yikes, look out, you could be giving your peace of mind or maybe your job away in exchange for equity that is not really there waiting for you.

But if there's some other variable like new ownership, maybe not so much. Maybe broader systems-control is needed of you at some more primal level, e.g. it stands between you and the literal destruction of the org.

Future-proofing is also great until you meet the next engineer who is mainly fascinated by details and who literally perceives the future as changing with every new discussion upstairs. So, which perspective does your organization reward and prefer? That's a good cue for a thoughtful or nuanced approach to the future.

There are also different levels of documentation, such as documenting for yourself but not so much for others, and I think this is worth considering in lots of situations.

Just some thoughts and good luck.

P.S. It can be tempting to conflate being someone who practices good documentation with being a good and helpful person. This may be true in some cases, but it's also true that many organizations are long past the point at which this thought model was useful in isolation.