📣 _bxg1

Do you ever truly use your revision history?

Source control gives you a full history of every change that's ever been made to your codebase since its beginning. At my current company they place a huge value on that history, so much so that they haven't transitioned from SVN to git solely because of the logistical challenge of migrating 30 years of commits.
Obviously the append-only paradigm is useful when reconciling changes with others and when reverting recent, broken changes or recovering accidentally-deleted work. But to me, it seems like there's diminishing value the further back you go. I can't imagine getting much value from trawling through two-year-old commits, much less twenty-year-old commits.
So I ask: at your company and in your experience, do you get value from source-control archaeology? And if so, what does that look like in your case?

👤 a1studmuffin Accepted Answer ✓

Hell yes, I use it daily. I'm in AAA gamedev and the codebase I deal with goes back 20+ years. The last 10 years are readily accessible in Perforce and the rest can be found in another version control system. I am forever grateful to past engineers for outlining WHY they made their changes, and not WHAT the changes were per se. With thousands of engineers that have come and gone, this is incredibly useful information in addition to the code itself.
IMHO revision history is just as valuable to a company as the code itself.

👤 dang

Frequently. Just this evening I was looking in the HN repository for the last version of the code that pg wrote, to remind myself how he used to do something.

One of my favorite tricks is to make a file out of all the changes in the history:

  git log -p > bigass

and then grep through the file (edit: which I like to do in Emacs—hence the file) to see every appearance of some construct. There's a lot of knowledge in there. It's particularly useful when you remember that you did something, but forget how you did it.
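A minimal sketch of that dump-and-grep trick, run against a throwaway repository so it works anywhere. The file name, function name, and commit messages are invented for illustration; the pickaxe option (`git log -S`) is shown as an alternative that avoids the intermediate file.

```shell
# Throwaway repo; every name below is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo 'retry_with_backoff() { sleep 1; }' > util.sh
git add util.sh
git commit -q -m "Add retry helper"

echo '# helper removed' > util.sh
git commit -q -am "Remove retry helper, callers no longer need it"

# Dump every patch in the history into one file, then search it.
git log -p > history.dump
hits=$(grep -c 'retry_with_backoff' history.dump)

# Pickaxe alternative: list only the commits that added or removed the string.
touched=$(git log --oneline -S 'retry_with_backoff' | wc -l | tr -d ' ')
echo "matching dump lines: $hits, commits touching the string: $touched"
```

The dump is handy when you want to browse around each hit in an editor; the pickaxe is quicker when you only need the list of commits.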

In fact, I use git proactively this way, to store things in the version history that I might want to remember later. For example, if I write exploratory code to test out a feature or throwaway code to do some analysis—anything I might want to use again, but don't want to commit to the codebase—I'll add it as a commit and then immediately revert the commit (i.e. make a new commit that deletes what I just added). The codebase remains unchanged, but what I just did is now there forever for future me to recover.
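Roughly, the commit-then-revert idea could look like the sketch below. The `Scratch:` message prefix is an assumption added here to make the parked commit easy to find later; it is not something the workflow requires.

```shell
# Throwaway repo; file names and messages are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo 'main code' > app.txt
git add app.txt
git commit -q -m "Initial code"

# Park the exploratory code in history...
echo 'SELECT count(*) FROM events;' > scratch.sql
git add scratch.sql
git commit -q -m "Scratch: one-off event count query"
parked=$(git rev-parse HEAD)

# ...then immediately revert, so the codebase is unchanged.
git revert --no-edit "$parked" > /dev/null

# Later: find the parked commit by its message and read the file from it.
found=$(git log --format=%H --grep='^Scratch:' -1)
recovered=$(git show "$found:scratch.sql")
echo "$recovered"
```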

Such an approach only works if your system is small, but I like to work on small systems and prevent them from becoming large systems. There's a beneficial feedback loop here: as you get comfortable working with history, it gives you more confidence to delete things, helping to keep the system small.

I've also found this technique useful for solving the chronic problem with documentation: that it inevitably fails to get updated. When I write something about the code, I commit it and, as above, immediately revert the commit. Now it's permanently glued to the state of the code when I wrote it. When I read it in the future, I can do so alongside a diff of the code from then to now. This makes it easy to see what has changed in the meantime, in which case I can update the document and commit/revert it again.

👤 afc

All the time!
Two weeks ago I found something in a critical library at work (that ~every single C++ binary we run depends on: our main implementation of our custom threads' executor API) that made no sense. I couldn't understand why a variable was being rounded before being passed down to a lower layer, in a way that introduced an average 0.5 ms of latency to many operations (I estimate that at peak, just one of the binaries that I maintain, a caching system, runs this code at least 200 million times per second), for no gain that I could see. There even was a comment attempting to explain why the rounding logic was added, but it was factually incorrect. As far as I could tell, I could just delete the rounding logic and everything would just work. I was baffled.
... until I looked at the code history! It explained it immediately (well, in like 5 to 10 minutes): the code from 2013, when the rounding was introduced, was calling into some lower level API that received parameters in a way that had limitations that ... Well, let's just say made it very clear to me why the rounding had been added.
Someone cleaned up the lower level library in 2016 or so, but the rounding remained in the upper layer.
This is just one example of many. I do this all the time.
Just two days ago, I was running scripts to extract lines-of-code by author and reviewer over different directories to get a sense of the size of the contributions of different team members, as part of the employee performance evaluation process (obviously, LOC is just one of many many many signals, and has to be taken in context). "Interesting, this person has already contributed 4k LOC to this particular directory, I didn't realize that!" Or "Source code files in the directories of the components that this person is a Tech Lead for had contributions from 131 engineers in 2019; of these, at least 56 engineers contributed more than 100 loc."
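For anyone curious what such a script might look like with git, here is a small sketch that sums added lines per author under one directory. The authors, paths, and the `lines_added` helper are all made up for the example, and it counts authors only, not reviewers.

```shell
# Throwaway repo; authors, paths and helper names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
mkdir src docs

commit_as() {  # commit_as <author> <message>: commit staged changes as that author
  GIT_AUTHOR_NAME="$1" GIT_AUTHOR_EMAIL="$1@example.com" git commit -q -m "$2"
}

printf 'a\nb\nc\n' > src/cache.txt
git add .
commit_as alice "Add cache"
printf 'x\n' > src/pool.txt
git add .
commit_as bob "Add pool"
printf '1\n2\n3\n4\n5\n' > docs/notes.txt
git add .
commit_as bob "Add notes"

# Sum the "added" column of --numstat for one author under one path.
lines_added() {  # lines_added <author> <path>
  git log --author="$1" --numstat --pretty=format: -- "$2" |
    awk '{ sum += $1 } END { print sum + 0 }'
}

alice_src=$(lines_added alice src)
bob_src=$(lines_added bob src)
echo "alice: $alice_src lines in src, bob: $bob_src lines in src"
```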
I guess I'll call out also that when I find a reproducible bug that I can't explain, being able to binary search in the code history until I find the first change that exhibits the bug can be a life saver. I don't do this very often, but I estimate that, when I've done it, it has saved me days, possibly even weeks, of work.
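In git, that binary search can be automated with `git bisect run`. The sketch below fakes a ten-commit history in which commit 7 introduces the "bug"; the `grep` check stands in for whatever build-and-test command a real project would need.

```shell
# Throwaway repo with a synthetic history; all content is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Ten commits; from commit 7 onward status.txt no longer says "ok".
for i in 1 2 3 4 5 6 7 8 9 10; do
  if [ "$i" -ge 7 ]; then echo broken > status.txt; else echo ok > status.txt; fi
  echo "$i" > counter.txt
  git add .
  git commit -q -m "change $i"
done
first_good=$(git rev-list --max-parents=0 HEAD)

# Mark HEAD as bad and the root commit as good, then let git drive the search.
git bisect start HEAD "$first_good" > /dev/null
# The check exits 0 on good commits and non-zero on bad ones.
git bisect run grep -q '^ok$' status.txt > /dev/null
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset > /dev/null 2>&1
echo "first bad commit: $culprit"
```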

👤 hannob

I think there's some indirect psychological value that shouldn't be underestimated.
People have a tendency to comment out unused code "in case they still need it". Or not delete unused stuff, because who knows what.
I have the feeling that I'm much more inclined to just delete a bunch of code lines that "I might still need in some situation" if I know there's version control. Because even if it's unlikely, "I can get it back if I want to" is a good feeling.
I think this leads to less cluttered code overall.
Also something that came to mind: When the shellshock vuln was discovered in bash no one really knew when and how it got introduced, because it was so old (literally decades) and there was no version control at that time. I don't think anyone suspects any malpractice with shellshock, but think about it: If you find a really strange bug that looks like a backdoor, and it's 10 or 20 years old. Wouldn't you want to know who committed that code?

👤 rictic

Yes, but maybe not in the way that you're thinking.
The scenario isn't "I'm gonna go browse the changes that were made in March of 1994", instead it's trying to solve a specific mystery.
You see some code that doesn't make much sense, so you look at git blame to find the commit where it was written. Look at the full change, read the commit message, and now you've got some more context. Often this is enough to understand, but if not, you can check out the code at that time and read the implementation of related systems. Soon things are starting to make sense! Certainly they make much more sense than they did when you started.
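As a rough illustration of that blame, read, and check-out loop, assuming a tiny made-up config file whose second line looks odd:

```shell
# Throwaway repo; the file and commit messages are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

printf 'timeout=30\nretries=3\n' > config.ini
git add config.ini
git commit -q -m "Add default config"
printf 'timeout=30\nretries=0\n' > config.ini
git commit -q -am "Disable retries: upstream API is not idempotent"

# 1. Which commit last touched line 2?
sha=$(git blame -L 2,2 --porcelain config.ini | head -n 1 | cut -d' ' -f1)
# 2. Read why it was changed.
reason=$(git log -1 --format=%s "$sha")
# 3. Look at the file as it was just before that commit.
before=$(git show "$sha^:config.ini")
echo "$reason"
echo "$before"
```

`git show "$sha"` would print the full change, and `git checkout "$sha^"` would put the whole tree back in that state for reading related code.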

👤 inertiatic

Depends on your project/business.
On my current job, I very rarely go back to see when something was changed, because the business requirements are very straightforward. A change needs to happen, and the implications are clear. Also, no one really documents discussions systematically, commit messages are rather short etc. Not much value can be extracted.
On my last job, a system with over 15 years of history, my team was often puzzled by the existing codebase and the seemingly weird things it did. "Who wants this?", "Is there a use case for this?" and "Do any of our customers actually expect this functionality if we remove this?" were frequent questions.
Then we'd check the commit history and get the 3-4 tickets involved in the functionality's history. Long discussions and back and forth with the client, explanations why the functionality was being added etc.
This archaeology was so frequently fruitful that all the team engaged in it.

👤 bacon_waffle

We just switched from Perforce to git at work, and about the first 2/3 of a project I work on got squashed together. It took me less than a week to bump into that "initial commit" when trying to figure out why a bit of code is the way it is.
"git blame" (or the p4 equivalent) is my usual archaeological tool in this context, but "git bisect" has been very helpful in others. For the first, it should be easy to look at your current codebase in SVN and see how far back the history goes in any particular area. I've found that bisection is most useful for relatively recent history, because I usually have wanted to build or run the software to test for a bug or something - beyond some point in history that becomes impractical.
Moving from SVN to git shouldn't require losing history though...

👤 Darkstryder

In 2013-2014, I was tracking a strange bug in a legacy accounting software.
I was reading the business logic that triggered the bug and it made no sense.
I activated the blame view of the code, and I realized most of the code had been written in ~1998 but a couple lines had been updated in ~2007, by someone who probably never even met the original author.
Realizing that made it a lot easier to understand the context of the bug and fixing it.
There is a lot of value in knowing that two lines of code next to each other have been written decades apart by people that did not coordinate with each other. Never erase that history voluntarily.

👤 astura

Yes, yes, absolutely. I've looked at history going back 10+ years (at least). Many times. Two-year-old commits I consider to be fairly recent.
I can't remember a specific reason why off the top of my head, but it was usually something to do with looking at the context around why some piece of code existed. The companies I've worked for also require commit messages to contain bug tracking IDs, which can provide further context.
There's also really not much of a reason to migrate from svn to git if svn is still working for your organization. Whenever the topic has come up previously in my workplaces it ended with "nah, svn is still working fine for us." OTOH I was involved in a migration from CVS to SVN because of limitations/problems with CVS.

👤 JesseAldridge

> they haven't transitioned from SVN to git solely because of the logistical challenge of migrating 30 years of commits
lol, I don't think that's the reason. At the only place I worked that used SVN the real reason was that the old guys didn't want to learn something new.

👤 tsimionescu

Yes, absolutely, and failing to check the history of some changes can easily lead to re-opening closed bugs!
For example, we once had a customer-reported issue in an older version of our product (customers were complaining that an automation script for our product started lasting minutes where it previously took a few seconds). After some investigation, it turned out someone had deleted some code which excepted the scenario in the customer script from a timeout.
The commit removing the exception had a bug attached - the QA team had been complaining about the expected timeout not applying in some cases, and someone found the exception in the code and deleted it. They had no idea why the exception was there in the first place (according to the bug chat logs) and didn't bother to look back in history to see.
Funnily enough, looking back even further in history, we found that the exception had been introduced a few years prior, after a customer had complained that... some automation scripts were taking too long... the same automation scripts that we received in the new complaint, give or take a few years worth of additions.

👤 emiliobumachar

Besides piling on with the others saying I use it a lot, I'll remind you that the old history can still be available, in a separate read-only repository, after you've moved on to a new tool. Not as convenient, but still available.
Some use cases are:
a) "Blame" tool, which produces a file version annotated line-by-line with who and when last changed that line, along with the commit message. "Who the f*&# did that s%$@#? Oh, it was me again..."
b) Searching the history of a file by keyword. Especially useful when something was deleted, and as such no longer exists in the source code, but you can find it by searching for the commit message. (knowing you can later do this gives you more confidence to actually delete things, instead of commenting them out or leaving them there in case they become necessary again)
c) "All I know about that feature is that Jenny implemented it before she left the company." Filter for Jenny's user tag.
d) "All I can find about that change is this old email saying it had just been done." Look at logs around email date.
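Cases b) to d) map fairly directly onto `git log` filters. A sketch with invented people, dates, and messages (SVN and Perforce have their own equivalents, not shown here):

```shell
# Throwaway repo; people, dates and messages are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

commit_as() {  # commit_as <author> <date> <message>
  GIT_AUTHOR_NAME="$1" GIT_AUTHOR_EMAIL="$1@example.com" \
  GIT_AUTHOR_DATE="$2" GIT_COMMITTER_DATE="$2" \
    git commit -q --allow-empty -m "$3"
}

echo 'legacy_export()' > export.sh
git add export.sh
commit_as jenny "2015-03-02T10:00:00" "Add legacy export for partner feed"
git rm -q export.sh
commit_as sam "2018-06-11T10:00:00" "Remove legacy export, partner feed retired"
commit_as jenny "2018-06-12T10:00:00" "Tidy README"

# b) History of a now-deleted file, filtered by a keyword in the message.
by_keyword=$(git log --oneline --grep='legacy export' -- export.sh | wc -l | tr -d ' ')
# c) Everything a given person authored.
by_author=$(git log --oneline --author=jenny | wc -l | tr -d ' ')
# d) Everything committed around a known date.
by_date=$(git log --oneline --since=2018-06-01 --until=2018-06-30 | wc -l | tr -d ' ')
echo "keyword: $by_keyword, author: $by_author, date window: $by_date"
```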

👤 remmargorp64

We switched from SVN to Git about a year and a half ago. I often use the "annotate" feature in my editor to see the history of lines of code (to figure out who to talk to when I have questions), and I routinely run into the "initial commit" wall from when everything was squashed.
I wish that when the team had migrated from SVN to git, they had used a tool that would have preserved the history. It's very easy to do! I don't know why they didn't. They did it right before I joined the company so I never had an opportunity to show them how.

👤 mr_tristan

Git annotate is a fantastic tool, even if the developers aren't great contributors.
A lot of developers do not write good commit messages. They don't even link to a ticket. But you get to know those developers real quick when you're spelunking with annotate. And developers who don't write good commits probably didn't leave any other useful documentation behind.
I've used this to illuminate "technical debt" from a different perspective. If you take a critical code path, find the important commits for critical logic, and then just show the "context" you're left with, you'll often be able to say "this is why your quality sucks" in a real concrete way.
Managers love proof, and showing them what little context you have for critical areas can be a very different way of looking at the quality of their systems. Otherwise, I've often seen a LOT of overconfidence largely because "we have automation in place".

👤 larsrc

Frequently. Whenever I run into something that makes little sense to me, I go back and look at the commit messages, the related bugs, the evolution of the code, and who wrote it.
It's called software archeology. It's not important if you keep exactly the same people working on the same project and they have perfect memory. But if you, say, move people between different teams, or lose people, or hire people, it's a gold mine.

👤 aprdm

Yes! My company has source that is 20-30 years old, sometimes just going through the history to see who has committed on it and checking if that person is still in the company is already a win, other times there's a code review link or a ticket system link that gives possibly more context to why something was made.
(Granted, code review systems and ticket systems change over time.)
Git blame on gitlab is also a good way of getting context of why something is there to begin with.

👤 Too

If migrating all old commits is such a challenge (it shouldn't be), then a workaround could be to keep your SVN online but in read only mode. If your git blame shows "Initial migration commit" you just move over to blame in the old svn.

👤 flohofwoe

Source control history is absolutely essential for me for debugging hard-to-reproduce bugs. The first step is a blame (both git and svn have this) to find out when the last changes to a suspicious piece of code happened and what has changed. Also who changed it, so I know who to talk to (if they are still in the company). If the commit comment gives additional info on why the change happened, that's great.
Sometimes those changes are over a decade old (of course such old changes make it more unlikely that they are still buggy, but new changes may interact with those old changes in unexpected ways).
So yes, the older a code base, the more important a complete change history becomes.

👤 sakoht

The last time I moved a large codebase from svn to git the surprising thing for me was that a git clone was _smaller_, even though it had all of the history, than the svn checkout, which was holding only truncated history. Everything runs faster, and the tooling is better. They will feel like they traded their go-kart for a Ferrari.

👤 NextHendrix

Revision history is essential for traceability in safety critical work. The entire history with names and timestamps can show not only who introduced a problem, but also how a more subtle architectural issue got baked in slowly over time by multiple people. It can then be fixed, and possibly the process can be updated to help avoid such hard to spot issues in the future.

👤 zenexer

Yes, we frequently review revision history, and it’s not unusual for us to go back about a decade. Unfortunately, prior to that, source control wasn’t used, and that has made some tasks fairly difficult.
A great example is data migration. Infrastructure changes over time, even if only gradually. Databases get upgraded and moved around. Recently we realized that some data we migrated nearly a decade ago had significant inconsistencies. We didn’t have full revision history, but what we did have was enough to piece together the puzzle over a period of several weeks. If we had full revision history—which would’ve gone back about two decades—the job would’ve been much easier.

👤 oarsinsync

If you find something wrong in the code, you can fix it with or without the context with which it was written.
If you fix it without context, you may not actually fix anything, and actually create a broken state. This may be a new bug or a regression.
If there's more context, you're less likely to fall into that trap.
Of course, this is all moot if there's decent documentation, but I've never been employed in a place that has it. Everywhere requires reverse engineering / archeological expeditions to understand the mistakes of the past, before accepting them as necessary evils, or fixing them without breaking the side effects of the mistakes.

👤 scioto

One of the more frequent uses of source history where I work is to see when an issue was introduced to help gauge the priority of a fix. When something is discovered and someone wants to make it a stop-the-presses fix-it-now type thing, we look back in history to see when the change was introduced. If it's been there a while, and especially if an external customer has never created a ticket on it, we come back and say, "Well, it's been that way for five years now, so why don't we just put it into the next scheduled release instead of an off-cycle emergency fix."

👤 mceachen

Reason 1: keeping codebases cleaner.
By keeping commits larger-grained (especially if I'm deleting a functional component), it supports deleting with abandon and following the "you aren't going to need it" (YAGNI) principle, rather than having large commented-out sections (or worse, large sections of deadwood in your tree). It also allows you to restore it later if you need it again by only reverting one commit.
Reason 2: finding out WTH went wrong.
By having a master/stable branch and a development branch, if anything goes wrong, I can always diff between the branches to see what/how things broke. Sometimes it's a change to a dependency. Sometimes (ok, most of the time) it's a change I made.
This said, I think it's useful to me because I know what's in the history already. I think looking through commits from someone else with a tree that I'm not familiar with is going to be of very limited use, especially because people don't generally provide the critical answer of "why" a change was done in the commit log.
Related #protip: always try to describe why you're making a change in the commit log.
The how of the change is already there: it's the diff. Why you're either making the change or choosing a specific method over another can be invaluable to the Engineers of Tomorrow, and prevent them from a regression due to context loss/tribal knowledge loss.

👤 tomstuart

Yes, every day. Well-written commit messages are like comments that never go out of sync with the code they’re attached to. Often it’s the stuff from years (vs months) ago that’s the most valuable, because that’s the information that nobody who still works at the company can remember.

👤 LeifCarrotson

Not frequently, as some other commenters have claimed to do. But when I do, it's invaluable.
I build custom automation equipment, which involves individual 100-400 hour projects. They're developed in a continuous scratch-to-complete flurry, with a few days of revision after customer review and installation, a year's warranty that typically involves 1-5 on-site days, and an annual "we never read the manual please remind us how to calibrate it again" for the next decade. Very little maintenance coding, lots of fresh feature development.
I disagree that there's diminishing value to older commits. You're more likely to forget what you did the farther back you go!
I'd estimate that I use revision history maybe 0-2 times in a typical project. But that's an easy way to recover a couple days of work that would otherwise need to be rewritten from memory, or worse, reengineered from scratch! You can write a lot of commit messages in 16 hours, so one incident where you can recover two days of work makes two months of using version control without ever referencing the history worth it. Plus, it's a nice security blanket for me, I don't worry about commenting around old code or making changes to a reference implementation I'm modifying because I know it will be in version control.
I do think it's exceedingly unlikely that you'll suddenly decide to revert to the state of your codebase from 20 years ago. If you transitioned to Git and kept the SVN repository around for the rare occasion when you need to reference it, at least in my projects, you'd be able to do so without much trouble.

👤 nitrogen

> At my current company they place a huge value on that history, so much so that they haven't transitioned from SVN to git solely because of the logistical challenge of migrating 30 years of commits.
You can convert a repo from SVN to Git with history intact!
There's a tool called cvs2svn that I have used to upgrade really old CVS projects to git (it can do git too), and there is also an svn2git. And, I believe there is git-svn that provides a git interface to an svn repo.

👤 james_s_tayler

When you're trying to solve a mystery, one of the first questions you ask about a piece of code you suspect might be involved in the problem is "what does the commit message for this say?"
It often contains valuable clues.
Less often you will want to know "how has this code changed over time?" or "was the code like this originally, or did it use to look different at some point in the past?"
Commit messages often say why something was changed. Well, good ones do.

👤 bartread

Depending on use case, anything from weekly to very occasionally, but in all these cases it's invaluable. E.g., selectively reverting commits that are known to have caused bugs, checking something whilst preparing release notes, working out how we did X in the past, etc.
> At my current company they place a huge value on that history, so much so that they haven't transitioned from SVN to git solely because of the logistical challenge of migrating 30 years of commits.
I assume they've actually tried to do it? I ask because there's a bunch of tooling and at least one reasonably well understood process for achieving this and preserving history so it's pretty low investment to try it out and see if it works.
Here's Atlassian's version, for example:
https://www.atlassian.com/git/tutorials/migrating-overview
(I will grant you that figuring out how to navigate to the next page of the current tutorial at the bottom of the page is unnecessarily complex.)
I suspect with 30 years of history it's going to take a very long time to do the conversion (days to weeks), but you can set it off and leave it running. Once you have your initial migration done you can set up syncing to git, and then you need to pick a time when everyone will stop committing to svn, allow a sync and verification window of a few days, and then everyone starts using git.
It gets more complex with multiple projects ongoing, and scheduling around releases, but making this happen is more a matter of will than battling complexity.

👤 fredrik-j

Yes, regularly. Some archaic code has a surprising longevity. Personally at least once per month I end up with a case where I wonder why some code was implemented or for what purpose. Context that is rarely documented in the source code, but is often exposed at least implicitly through commits, commit messages, date or authors.
I strongly advise against abandoning revision history just because it is easier to start fresh from a single git commit of the current state of the code. Especially so for code that has been in use more than a couple of years, where the developers may have forgotten the purpose or who did what.
Surely you can convert the svn repository to git with history intact? We did that when we migrated from cvs to mercurial. If it is too complicated to do directly from svn to git, maybe it is easier to convert via mercurial, i.e. first from svn to hg, then from hg to git?
https://www.mercurial-scm.org/wiki/ConvertExtension

👤 marcosdumay

> they haven't transitioned from SVN to git solely because of the logistical challenge of migrating 30 years of commits
The choice between migrating the repository and losing the commits is a false dichotomy. One can deal with two repositories without much of a problem; it's only a small slowdown in the rare event that you have to look at the old one.
Anyway, that applies only if you do have a reason to migrate.

👤 kazinator

There is less value in old commits, but you don't know which one will prove valuable, so all of them have to be there.
2019 fix, of a 2011 breakage:
http://www.kylheku.com/cgit/txr/commit/?id=3a91828748385d8d6...
2020 removal of 2009 misfeature:
http://www.kylheku.com/cgit/txr/commit/?id=24bd936a9fa671599...
The TXR project only goes back to 2009.
We can fix these kinds of things without reference to the past, but the process would feel uninformed and impaired.
Not everything is in the code; there are sometimes questions of requirements, which are not always properly captured in documentation.
We need all the historic questions to be able to figure out the whole situation: what happened to the requirements as well as the code, and how it all relates.

👤 Arkanosis

Almost every day, often several times per day.
Changes made more than 10 years ago help to fix bugs still present today even if the codebase has changed a lot (they have commit messages and links to old tickets with more discussion; sometimes just the name of the committer tells a lot about what to expect from a change).
I put a lot of effort into not losing this when we started migrating from SVN to git, knowing the pain of not having the history go far enough (some of our projects were already migrated from CVS to SVN a long time ago, and histories were lost then). The effort has been more human than technical, BTW, since not everyone was aware of the value of the history: usually bugs in the oldest parts of the codebase go through only a handful of people who have been there for a long time, and other people tend to take for granted that we understand why something is the way it is.

👤 yardstick

Our codebase is around 20 years old and was in CVS, then SVN, then git. Then several years after git, a new git repo without any history due to poor use of the first git repo (someone added binaries, bloated the repo to GBs instead of maybe 200-300MB, which made git export horridly slow).
In all the steps we preserved the commit history, except for the final git->git. However also when we moved from SVN to git we kept the old svn server running as a historical archive for several years, as we didn’t carry across all projects (some were already EOL’d years ago).
During that time I looked at it maybe twice, and ultimately we decommissioned it.
Likewise with the new/old git repos, we still have the old git repo if we need the history.
One final thought: git blame was nice, until someone reformatted the entire codebase and committed it back in (we’ve since adopted better git workflow and code review practices!)
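For what it's worth, newer git versions (2.23 and later, as far as I know) can make blame skip such a commit. A sketch, assuming the conventional file name `.git-blame-ignore-revs` and a made-up two-commit history:

```shell
# Throwaway repo; requires a git that supports --ignore-revs-file.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

printf 'if(x){\nrun();\n}\n' > main.c
git add main.c
git commit -q -m "Add run guard for empty queue"
guard=$(git rev-parse HEAD)

# A formatting-only commit that rewrites the interesting line.
printf 'if (x) {\n    run();\n}\n' > main.c
git commit -q -am "Reformat with new style"
reformat=$(git rev-parse HEAD)

# Plain blame points at the reformat commit...
plain=$(git blame --porcelain -L 2,2 main.c | head -n 1 | cut -d' ' -f1)

# ...but with the reformat listed as ignorable, blame looks through it.
echo "$reformat" > .git-blame-ignore-revs
skipped=$(git blame --porcelain -L 2,2 --ignore-revs-file .git-blame-ignore-revs main.c |
  head -n 1 | cut -d' ' -f1)
echo "plain: $plain"
echo "skipped: $skipped"
```

Setting `git config blame.ignoreRevsFile .git-blame-ignore-revs` makes this the default for that clone, so nobody has to remember the flag.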

👤 warpech

GitLens[1] is my favorite VS Code extension. This extension shows relevant revision history directly in the code editor. These features are called current line blame and authorship code lens.
[1] https://gitlens.amod.io/#features

👤 jedberg

Yeah I've used it before.
Once I did a git blame on a file, found that the offending code had been committed nine years previous, and was able to figure out why the code was the way it was by looking at that nine year old commit and all the other code that had changed with that commit.
The nine year old context was super useful.

👤 unsigner

I'm working on a 15-year-old codebase, and I was here from the beginning. I use 'blame' daily to make sense of code - why it was added, for what project, for what feature, to fix what bug, and who added it. It's priceless, and I'm putting off stuff like getting rid of stupid homegrown types instead of standard ones, using clang-format consistently, and migrating to a smaller, newer, probably faster repository format because I don't want to lose history.
I realize the value of this history is smaller for newcomers to the team.
(And Subversion and its bigger, expensive brother Perforce still make sense for game development - when you don't really want to go wild with branches or remote work, and when you need multi-terabyte-sized repositories and multi-gigabyte single commits.)

👤 geofft

Yea, like many people in this thread. I'd just like to add, you don't have to block on migrating SVN history. It's actually enough to make sure that the SVN repo remains accessible indefinitely (if there's nothing secret in it, making a tarball of the SVN repo is a fine way to avoid running a server). I don't go through historic code often enough that it would be annoying to find a different repo when I need it, and half the time people's git conversions don't let me make sense of "This fixes a regression in r1234."
Of course, there are other options too, like migrating and then using replace, migrating and then rebasing, etc. I just want to point out that even the lowest effort option is valuable enough compared to throwing away history.
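To make the "migrate and then use replace" option concrete, here is a sketch with two throwaway repositories: one stands in for the converted old history, the other for a fresh repo that starts from a squashed commit. `git replace --graft` then stitches the old tip in as the parent of the new root, locally and without rewriting any commits. Repository contents and messages are invented.

```shell
# Two throwaway repos; all contents are hypothetical.
set -e
old=$(mktemp -d)
new=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the archived, converted history.
cd "$old"
git init -q
echo v1 > app.txt
git add app.txt
git commit -q -m "r1: first version"
echo v2 > app.txt
git commit -q -am "r2: fix rounding"

# The "fresh start" repo that begins with one squashed commit.
cd "$new"
git init -q
echo v2 > app.txt
git add app.txt
git commit -q -m "Initial migration commit"
root=$(git rev-parse HEAD)
echo v3 > app.txt
git commit -q -am "New work after migration"

before=$(git rev-list --count HEAD)
# Fetch the old commits, then graft the old tip in as the parent of the new root.
git fetch -q "$old" HEAD
git replace --graft "$root" FETCH_HEAD
after=$(git rev-list --count HEAD)
echo "commits visible from HEAD: $before before, $after after grafting"
```

After the graft, `git log` and `git blame` walk straight through the migration commit into the old history; replacement refs live under `refs/replace/` and have to be fetched explicitly by anyone else who wants the same view.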

👤 EdgarVerona

Just a few weeks ago, I had to dig back to try and root cause a very old bug. It ended up being in code from back in early 2012, but the root problem was drift in functionality that it didn't keep up with and eventually broke in an unexpected way. I would never have known the context around why that code existed in the form that it did without being able to go back through the history.
Granted, the context didn't really change what the fix needed to be, but it did provide a useful moment of reflection on the ways in which software can break through subtle changes over time that stack up, and it helped to know that the section of code that broke was indeed originally intended to work the way it did (and not that it was a bug from the very beginning).

👤 MaulingMonkey

Absolutely. Here's some cases where I've used older history:
1) We found a nasty heisenbug that crashed with a useless unrelated stack trace, but only if you didn't have a debugger attached, and only if you held open the windows 8 "charm bar" open for more than 10 seconds, and only on the main menu screen. After wasting a couple weeks trying to root cause it with logic, I eventually resorted to brute force bisecting perforce history by hand - and then the changes within the changelist to blame, as it was a large one. This let me figure out it was a bug in a seemingly completely unrelated, closed source system API, that we were calling to check internet connectivity. I had to write a standalone repro case to prove to myself it was the cause, it seemed so nonsensical. I wrote a workaround. This bug was only a few months old though, because QA was able to catch it early enough. The bug likely would've eventually gone unfixed without perforce history.
2) I went to upgrade a 3rd party dependency that we checked in, that hadn't been upgraded in years - maybe even a decade - for bugfixes and such. Except we'd made changes to said 3rd party dependency, so I needed to separate out and understand our changes to the baseline SDK so I could decide if I should re-apply them to the updated SDK (in some cases yes! I was able to drop others.) We had a web interface to an archived SVN repository containing the commits before our years-old Perforce transition - and before my employment there - which I used to help me grok it all. I might have reached as far back as a decade in this case - very low frequency of commits to that part of the code, however, so "a decade" might have meant "the past 10 or 20 commits", if that. I had to reach out to IT to even get credentials to see said history. Helped turn a nerve-wracking upgrade into a tame one.
3) We decided to port an archived, years-old project to a new platform. Just seeing the last change made to sanity check if the weird logic I'm seeing might be a "new" bug or not means looking at years old history. This has actually happened to me a couple of times.

👤 apaprocki

It is invaluable when working on a long-lived codebase. There have been many occurrences where a current bug involves code (sometimes that I wrote) from a decade ago. You will find high correlation with the value derived from meaningful code comments :)

👤 unnouinceput

At companies where I worked, it was used for internal political games: blame someone for bad code, then use that blame later in performance reviews to deny bonuses and/or salary increases.
After transitioning to freelancing and being the sole user of it, I can finally use it for its true purpose, namely refreshing my memory on some techniques. Sometimes I copy/paste code from older revisions into a different project because that's the code I need for the current project (while the current code of the original project has changed due to changes in client requirements). Sometimes clients also use it to see how the status of the project evolved over time, so it serves as a metric too.

👤 vosper

With the exception of rare uses of git blame (hate that name) to figure out who made a change so I can ask them if they happen to remember anything about it (and if it was more than a year ago, my expectation is “I don’t remember”, which is fair enough)... virtually never. I don’t care if my git history is “clean” or “dirty” (if you never use a tool to make a visual of the branches then you’ll probably never notice or care).
We put Jira ticket IDs in our commits and sometimes that’s useful. But the value tends to be in the content of the tickets as much as the commits.
If we decided to squash everything more than a year or two old into a single commit I doubt it would affect us very much in practice.

👤 brilee

I'm enjoying all the comments here!
Thank you OP for demonstrating the effective use of Cunningham's law: "the best way to get the right answer on the internet is not to ask a question; it's to post the wrong answer."

👤 paublyrne

I do, I think. I often blame individual lines to see if there's information in the commit about the reason for the change. It's often straightforward to understand what's happening when reading code; the information that's often missing is the why.
Sometimes there are clues, sometimes not, but you can often see the line change in the context of changes to other files and that can help.
For this reason I tend to make quite verbose commits with the context of why I'm making the change. A comment in the code would go out of date, and pollute readability, but a well written commit can be very useful.

👤 lmcnearney

I’ve inherited code bases that migrated from SVN to Git without converting and bringing over the old revisions. This has resulted in a number of times where I hit a cliff when trying to identify when a piece of code was changed as everything points back to the initial “SVN import” commit. Couple that with the original SVN repository being unavailable (either lost or just not worth the trouble of finding in 10 years worth of offline backups) and I would say yes, having your source control under one system with all of its history is ideal.

👤 megous

The more public a project is and the more developers it has, the more I use it. The project where I use it the most is the Linux kernel. It's also probably because there's a strong requirement to actually write useful commit descriptions. But if changes come from so many developers, it's very useful to catch up on news, what's coming to the next kernel release, what changed in what driver or subsystem, what might have caused the regression I'm seeing, who to contact, etc.
On personal projects I don't use it that much.

👤 flir

Constantly. Development is a process, and artifacts fall out of it. The obvious artifact is the source code, but others include documentation, revision history, tickets and infrastructure configuration. We should be taking as much care of these as we do source code.
One example I don't think I've seen mentioned in this thread is that sometimes a change touches two widely-separated parts of the code. The commit message may be your only opportunity to comment both parts at the same time - to tie them together.

👤 wruza

Yes, but not for the distant past (which is relative). Sometimes it is a revert-to-revision thing, sometimes I just remember a revision as a base for an ongoing refactoring. It should be done in a branch, but when I work alone on my own thing, I just break trunk and commit the broken tree every evening (or at logical points, whichever comes first). Besides refactoring, nope, write-only style. All the variants of “knowledge base” code I need to refer to are in a separate dir/repo, ready to copy, re-experiment, revert back or commit. VCS is spacetime - you can use time and you can use space.
I must also confess that I have never seen much benefit for myself from commit messages apart from one-liners like “broken”, “savepoint” and “fixes to bar, uploaded foo”. If trunk has a problem, you can just blame and get an exact revision. If you search through history, use a GUI tool / IDE that can fetch it quickly and compare to head, then bisect manually. I don’t make hundred-pagedown commits, so that’s easy enough.
For a future employer: that doesn’t mean I’m against or unable to make branches and write good commit messages. All above are just obvious shortcuts that my own “garage” projects tolerate with no downsides. Personally, I don’t get why some guys freely decide to break project rules when at work - and it was frustrating when they did it to me.

👤 coffeeaddicted

Yes, while maintaining code written by people who are no longer around, I often have to figure out what they might have tried to do. Being able to see how code looked before someone rewrote it can often give you an idea what it's about. Even better if they used good commit messages which explain why they fixed something and why they did it the way they did it.
Sometimes code only makes sense if you can see its evolution.
Also, knowing who wrote it, you can sometimes ask those people about it.

👤 zaccusl

It's not too hard to migrate from SVN to Git.
I did this for a 15+ year old very large code base. I tried various recommended techniques but everything would fail at some point or another (usually after many, many hours) and I'd have to start over.
Finally I wrote a very simple program with logging where the logging also records the current state and on any failure I could start from any point in the log.
The idea was simple.
1. Check out SVN version N
2. Parse the list of files changed between commit N - 1 and N
3. Copy those files to the Git folder
4. Parse the commit log for version N to get the commit date, committer name, and commit message
5. Commit in Git using "[@] " (thus preserving the original commit information)
6. Repeat with N + 1.
If it fails at any time then simply reset both SVN and Git to the last successful commit and restart. I also did a binary compare of the entire directory tree every 100 commits to ensure the copies were identical.
The process took about two weeks running all day and night (since one commit at a time is very slow) but it was very robust and left a perfect version history.
To deal with the fact that the SVN repo was still live, I believe I mirrored (or something like that) the repo and would sync between my local mirror and the live repo every couple of days. When my program caught up with the live repo we just stopped commits for a few hours while I wrapped everything up and then archived the SVN repo.
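For anyone who wants to try the same approach, here is a minimal Python sketch of the resumable loop described above. It is not the original program: `migrate`, `copy_revision`, and the JSON checkpoint file are hypothetical names, and the actual SVN checkout and Git commit are left to whatever `copy_revision` callable you supply.

```python
import json
from pathlib import Path
from typing import Callable


def migrate(first_rev: int, last_rev: int,
            copy_revision: Callable[[int], None],
            checkpoint: Path) -> int:
    """Replay revisions one at a time, recording progress after each success."""
    # Resume right after the last revision recorded as fully migrated.
    if checkpoint.exists():
        done = json.loads(checkpoint.read_text())["last_ok"]
    else:
        done = first_rev - 1
    for rev in range(done + 1, last_rev + 1):
        # Hypothetical step: check out `rev` from SVN, copy the changed
        # files, and commit them to Git. It should raise on any failure.
        copy_revision(rev)
        # Only reached when the step succeeded, so a crash never advances it.
        checkpoint.write_text(json.dumps({"last_ok": rev}))
    return last_rev
```

If `copy_revision` raises, the checkpoint still names the last good revision, so after resetting both working copies you can simply call `migrate` again with the same arguments.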

👤 armaizadenwala

For recent commits, it is useful for figuring out the causes of recently reported bugs by checking out earlier commits until the bug goes away.
As for commits that are over 2 years old, they still serve a purpose. For a legacy app that I worked on, I had `git blame` running on every line (vim and vscode both have support), and I was able to see who worked on a block of code last. Sometimes those developers are still there and available to ask questions, which has helped me greatly.

👤 axegon_

This kind of depends. Many years ago we migrated an SVN project to git (largely because of my constant complaining about SVN, which I truly hate) but we didn't bother with keeping the history. We simply kept the SVN repo for another ~2 years in case we ever wanted to check something. At a certain point when we felt like the project was stable, we got rid of it. I've gone looking back to see where something came from on a number of occasions, so I would say it is valuable.
That applies to any project that a number of people are working on. Especially when there is a bug, git blame is a lifesaver. That potentially has a lot to do with me being annoyingly pedantic about commit messages and branches. I did, however, have the "pleasure" of working with a guy whose branches were commonly called "bugfix102015" and whose commit messages were along the lines of "fix some bug". In such cases there is not much you can do when shit hits the fan.
For my very personal projects - hardly. Much like you, if something has been done 2 years ago, chances are it's working fine as it is, or you are not using it at all. So for personal projects, digging years back is something I don't ever recall doing.

👤 thrownaway954

all.. the... time
if you have never experienced the raw power of "git bisect" when trying to hunt down a bug, you're missing out.
using git bisect can literally save your life in terms of stress. I think it is one of THE most important tools in git that developers can learn. it shows exactly why we should commit small and commit often.
https://www.youtube.com/watch?v=REaowJ8JSfw
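The idea behind bisecting is a plain binary search over history. A rough Python sketch, assuming a linear list of revisions where the first is known good, the last is known bad, and the bug never disappears once introduced (real `git bisect` also copes with branching history). `first_bad` and `is_bad` are made-up names for illustration:

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad revision.

    Assumes revisions[0] is good, revisions[-1] is bad, and every
    revision after the first bad one is also bad.
    """
    good, bad = 0, len(revisions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        # Each test halves the range that can still contain the culprit.
        if is_bad(revisions[mid]):
            bad = mid
        else:
            good = mid
    return revisions[bad]
```

Under those assumptions, a thousand revisions need only about ten build-and-test rounds, which is why small commits pay off: the revision you land on is easy to read.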

👤 c3534l

I did work for a company where they had a few important vendors who would run legacy versions of their software and demand a bugfix for that version. They did this only once or twice a year, but being able to reproduce and patch old versions was enough to get some clients to shell out absurd amounts of money. These companies were swimming in cash and didn't like to learn anything new. Needing old commits is rare, but important.

👤 redis_mlc

As a DBA, I use git blame to see who wrote slow queries. Then I assign a jira to them. :)

👤 dd82

Yes, especially with projects with long history. You want to know who did what at a particular code block, and what it was like before. Since we tag all commits with a ticket name/number, we can go back to Jira and see the ticket that was responsible for that change.
This helps a ton when refactoring code in a project with a lot of history. I know that this bit was done this way for a reason, but that reason could be anything.

👤 perlgeek

> So I ask: at your company and in your experience, do you get value from source-control-arachaeology? And if so, what does that look like in your case?
I work with different code bases, some have 20+ years of history (migrated from RCS to CVS to git).
There's no week where I don't go back to look at some kind of history, usually to find out why or when something was done. Often the issue keys / ticket numbers referenced in the commit messages help me when the commit message itself is too opaque to understand.
I also like to get a sense of how often a file changes. This gives me a sense of whether the code is likely to be fragile and/or touches often-changing requirements.
There is a diminishing return for very old commits, partly because our team was much smaller back then, and communicated less in writing, partly because too much of the context has changed. But two years doesn't qualify as "very old" here, in our case the diminishing returns start more at 5 to 8 years.
That said, if I were working with SVN again, I'd likely look at the history much less, because it's that much slower and more painful.
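The "how often does this file change" check mentioned above is easy to script. A small Python sketch, assuming you have already captured a listing of touched paths per commit, for example from something like `git log --name-only --pretty=format:`; the `churn` name and the sample paths are purely illustrative:

```python
from collections import Counter


def churn(name_only_log: str) -> Counter:
    """Count how many commits touched each path.

    Expects one path per line, with blank lines between commits.
    Each path appears once per commit, so counting lines counts commits.
    """
    return Counter(
        line.strip() for line in name_only_log.splitlines() if line.strip()
    )


sample = "src/a.c\nsrc/b.c\n\nsrc/a.c\n\nREADME\nsrc/a.c\n"
hot_spots = churn(sample).most_common(2)
```

Files at the top of `most_common()` are the ones worth a closer look, though a high count alone does not say whether the cause is fragile code or shifting requirements.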

👤 acwan93

Unequivocally yes.
I switched from a FANG company to my family's software company, which had been using Microsoft Visual SourceSafe. The company was only casually using it: one computer was used to compile customer executables (and fix compile-time linker errors), and those changes were often never checked into VSS. Needless to say, no one on the SWE side knew if any code actually worked or when anyone did anything.
Part of this was lack of management, part of this was inadequate tools. After I joined and learned about the horrors of VSS, I switched our company immediately to git (there was some initial resistance). While there were around 10-20 years of VSS commit history to migrate over, having git blame immediately in Visual Studio and any git client makes a world of difference. While legacy code can’t be cleaned up immediately, the team’s mindset has changed so that there’s no more commented out code (“in case I need it later”), no more new duplicate implementations of the same business logic, and a person to blame for software bugs :)

👤 sime2009

Very rarely, beyond a couple weeks of commits. In those cases it is more a question of "did some particular commit hit branch XYZ?". I wouldn't call that source-control-arachaeology.
I'm at the stage where if someone suggests that we try to keep a linear history in git I push back and argue that it isn't worth the extra effort compared to the gains.

👤 kazagistar

The most valuable part for me has been ticket numbers. We prefix every change with a ticket number, and it provides an easy way to answer "why" for any line of code. I wish I could use the comments as well, but that's harder to enforce quality in.
I use git blame and history at least once a week when bug sleuthing, and value it very highly.
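Pulling the ticket number back out of a commit message is a one-line regex. A hedged Python sketch, assuming Jira-style keys (an uppercase project key, a hyphen, then digits); the pattern and the `ticket_ids` helper are illustrative and would need adjusting for other trackers:

```python
import re
from typing import List

# Assumed key shape: uppercase project key, hyphen, number (e.g. PROJ-1234).
TICKET = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def ticket_ids(message: str) -> List[str]:
    """Return every ticket-looking key in a commit message, in order."""
    return TICKET.findall(message)


ids = ticket_ids("PROJ-1234: fix null deref (see also OPS-7)")
```

The pattern is deliberately loose, so strings such as "UTF-8" also match; anchoring it to the start of the subject line is one way to tighten it if your team always prefixes.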

👤 alkonaut

Yes. I do git blame/annotate every day on a code base that is over 20 years old, with 100k commits. I migrated it myself to svn in 2007 without history, which I regret because a lot of changes now look like they are from the initial svn commit in 2007 when in reality they had an older history. For that reason I took a lot of care when migrating to git to include all history but also trim away some dead parts and mistaken commits from the past. Getting a chance to clean up history is great. Migrating an svn repo to git was definitely worth it. It was not a huge logistical challenge as there are great tools for it. The hardest part was finding which tool to use.
The most common use case for archaeology is to find who made a particular source line change 10 years ago and just ask them something. I often find it’s my own code...
People usually remember at least vaguely why they wrote the code even 10 years ago.

👤 ptsneves

I normally do not comment code unless it is a dangerous hack or a todo. Otherwise I make a small commit with a commit message explaining the why of the change. The git blame of our code base documents the why of almost every line, and reviews are failed if they only describe the what in a one-liner.
When we need to dig into the history of a line, we use `git log -S`.
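For readers who have not used it: `git log -S<string>` lists commits that change the number of occurrences of a string. The Python sketch below only simulates that rule on an in-memory list of file versions to show what gets reported; it does not call git, and `pickaxe` and the sample history are invented for illustration.

```python
from typing import List, Sequence, Tuple


def pickaxe(history: Sequence[Tuple[str, str]], needle: str) -> List[str]:
    """Return revisions where the occurrence count of `needle` changes.

    `history` holds (revision id, full file text) pairs, oldest first.
    """
    hits = []
    previous = 0
    for rev, text in history:
        count = text.count(needle)
        # An edit that keeps the count the same is not reported.
        if count != previous:
            hits.append(rev)
        previous = count
    return hits


history = [
    ("c1", "int x;\n"),
    ("c2", "int x;\nretry(3);\n"),
    ("c3", "int x;\nretry(3); // tuned\n"),
    ("c4", "int x;\n"),
]
found = pickaxe(history, "retry(")
```

Here `c3` touches the line but leaves the count unchanged, so only the commits that introduce and remove the call are listed.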

👤 tsian2

It only took a day (actually a few partial days) to make a script that could migrate projects from CVS to Git. The importer we used kept all the comment history and even converted multiple commits at the same time with the same comment into multi-file commits. So no need to go looking through the old VCS for the old stuff.

👤 time0ut

Yes, all the time. I frequently need to know who made a change, what ticket was associated with that change, when was it made, what did the code do prior to that change, etc. We did make the change to git almost 10 years ago. I wasn't directly involved, but we managed to do it in a way that largely preserved history.

👤 slifin

It's crazy how in agreement we are that keeping our records is important for legal defense, debugging, auditing, finding out how things came to be, etc.
Then for our users it's like: you can have one name and one name only, and if you change it, it's going to be that name forever in the past too.
Tables should have revisioning on as a default.
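One low-tech way to get revisioning on a table is a trigger that copies the old row into a history table before it is lost. A minimal Python sketch using the standard `sqlite3` module with an in-memory database; the `users` and `users_history` tables and the `open_db` helper are hypothetical, and a real schema would likely track more columns and who made the change.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database whose users table keeps old names."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE users_history (
            id INTEGER,
            name TEXT,
            valid_until TEXT DEFAULT CURRENT_TIMESTAMP
        );
        -- Preserve the previous name whenever it is overwritten.
        CREATE TRIGGER users_keep_old AFTER UPDATE OF name ON users
        BEGIN
            INSERT INTO users_history (id, name) VALUES (OLD.id, OLD.name);
        END;
        """
    )
    return db
```

After an `UPDATE users SET name = ...`, the current name lives in `users` and every earlier one stays queryable in `users_history`.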

👤 awwaiid

I have it set up (and many do) so I can highlight a bit of code and see when it was created or touched last, who touched it, and usually what ticket the work was done against -- the context of the change. This gives a whole third dimension of understanding a codebase:
* Dimension 1: Code layout (organization) structure
* Dimension 2: Execution (data-flow) structure
* Dimension 3: Evolution (change over time) structure
All three dimensions try to capture some of the intent of the implementer, and understanding that intent is very important when improving upon the work. Along with that comes a perspective on what assumptions the author had. Code last modified 5 years ago very likely had different assumptions than code written last week -- Being able to see which lines in a function came from which era can illuminate things nicely, and that is only scratching the surface of this evolution-dimension.
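To make the "which era did each line come from" view concrete, here is a small Python sketch that buckets lines by the year they were last touched. It assumes the input is text in the format produced by `git blame --line-porcelain <file>`, where every source line carries an `author-time <epoch>` header and the source line itself is prefixed with a tab; `lines_per_year` is an illustrative name.

```python
from collections import Counter
from datetime import datetime, timezone


def lines_per_year(porcelain: str) -> Counter:
    """Count source lines by the (UTC) year of their last authoring commit."""
    years = Counter()
    for line in porcelain.splitlines():
        # Source lines start with a tab, so they can never match this header.
        if line.startswith("author-time "):
            epoch = int(line.split()[1])
            years[datetime.fromtimestamp(epoch, tz=timezone.utc).year] += 1
    return years
```

A function whose lines split between, say, 2012 and last month is a hint that the older lines may rest on assumptions the newer ones no longer share.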

👤 eveningcoffee

Yes, we get a lot of value out of version history. It is a good tool to evaluate existing code. We also have experience with leaving the history behind when migrating to git.
Our case was a tooling policy issue but it should be possible to migrate from SVN to git and keep the change history. You should investigate this option.

👤 quickthrower2

I very rarely go back more than a month or so. Very occasionally I have gone back 10 years but only for the weirdest problems and to get a sense of what the person was thinking when they added that code. I think I could live with just 6 months of history, and in most cases 30 days and never have a problem.

👤 thih9

I use it to find more detail about a particular line. Often knowing when something was added, who added it and what was the commit message helps me understand the reasoning behind some particularly weird or legacy code. In some projects this can lead you to a pull request link, with even more context.

👤 davidfstr

Yes, responding to the title. In particular in my rich web application if I am in the process of merging & deploying the branch to the production environment, sometimes the automated tests fail and I use a bisect to narrow down the offending commit. Having good commit messages in general helps with debugging in this scenario.
I’m surprised you haven’t found excellent tools to migrate from SVN to Git, considering how popular these VCS systems are.
I once worked in a company whose source tree originally predated version control. There I found an entire module that appeared to be dead code and I wanted to determine how it became dead. I did a bisect and landed on an 8-year-old commit that was the very first commit in the version-controlled tree. Yikes. So I guess I’ll never know how that module became dead.

👤 hughw

I've done it both ways. I've had to partition a years old code base, keeping an old SVN repo up to a certain date, then using my own limited git-svn skill to promote some recent months to git, to establish a git repo for use from there forward. At another company we had a magician who ported maybe 10 years of SVN history to git over a weekend, and we were able to abandon the SVN repo.
In the first case, I don't recall ever needing to go back into the old SVN repo, spelunking for "how we used to do it". But the capability was there, with the minor hassle of not having a single repository to search. The git repo, with some minimal recentish history, soon became the authoritative source, and we never looked back.
[edited to clarify the partitioning of the first codebase]

👤 acdha

It’s not every day but it’s extremely useful when you need it. I’ve used that history to find context for when someone made an otherwise unexplained change - tickets; names of people, projects or departments; the commits immediately before or after; etc. can all be really handy for learning why something works a specific way. (“Why are we pinned on this ancient version? Oh, that server was decommissioned years ago - we can drop it “)
In your specific example, git-svn works really well for maintaining that history including authorship. I have a few projects which predate Git existing and it’s been quite usable for history. You can’t direct link to a commit ID but Git searches are very fast (we’re not on 20 year old hardware) and you shouldn’t be doing this many times a day.

👤 ch33zer

Every day. My team owns a 10+ year old system. None of the original authors are around any more. We're doing a massive migration of the system (several of them actually) and being able to go back and understand why the code was written the way it was is great. Earlier this week I needed to understand a very odd piece of code. It was making an rpc and if that threw an exception it was retrying the same rpc in the exception handler with different parameters. Looking back through the history, it turns out this was because the code used to do something different but during a migration was changed such that the second set of parameters made no sense. Once I understood that context the fix was trivial and I felt safe making it.

👤 valand

Using them all the time:
- Looking at root causes of undocumented weird hacks and technical decisions
- Finding the culprit behind a bug, in order to remind them to do better.
- Finding an underappreciated contributor.
- Reverting changes.
- Cherry-picking changes.
A REMINDER: Revision history is great, but so are flexibility and velocity. You can always cut off history and use another tree (e.g. when moving from SVN to git), document it, keep the SVN history as an archive and use git.
If a decision will boost velocity and flexibility and only sacrifices something less valuable, you should make it, but make sure you have a fallback.
In the end, flexibility is what you will need at every level (code, product, company) because the world around you (and the requirements) always change and you'll need flexibility to adapt.

👤 kennu

There is value in keeping track of who wrote each line of code. Even if it's quite old, you can still figure out the persons that originally worked on a particular project/module and ask them about it. If you reset history, this knowledge is lost.

👤 thewebcount

I use it as a reminder of what I did over the last year for performance reviews. I use a date range on the 'svn log' command to see all of my commit messages and which files I changed. I frequently find smaller things I did that I forgot about and which may have had an outsized impact on our work as a team. I can find refactorings pretty easily with it, too.
I've also used it to do some deep archeology. I had a piece of code I inherited that was always problematic. Eventually, I went through its history to figure out what it was originally intended to do and why it changed over the years. This was invaluable for finally figuring out how to fix the damn thing once and for all.

👤 hyperpape

Constantly. We migrated from SVN to Git at the beginning of 2018, and spent a lot of time getting the revision history migrated. I routinely check the history and field questions from coworkers who saw that I committed code before the transition.

👤 Supermancho

From 2007 forward (1,2,3,4,5,6,7,8,9,10 companies I've gone through) when using git, developers looked at the history infrequently. A couple times in the (#6 company) where we spent 3 years to create a js framework from scratch and a mobile web offering of the HUGE office toolsuite. Most of the time, it was to back out changes or bisect, which requires a revision history. You almost never need it, until you do, mostly for local and your own branches. Features tagged in commits with JIRA support covers the vast majority of needs. Extended comments are useful to explain individual goals for the commit that make up the feature.

👤 alfiedotwtf

I don’t think I could ever work in a place that didn’t value some form of revision history...
`git bisect` to find when and how a bug was introduced, getting details when it’s time to merge multi-month merges, getting stats about previous projects.

👤 vidarh

Very rarely past a few weeks.
But on the rare occasions I need it, I often really need it. Especially because code that has survived sufficiently unchanged for that long often has done so for important reasons.
The amortized value per commit for really old code is likely low, but you get them 'for free' because you want to do them to have them for recent code, and the overall value of having them for older code to the codebase as a whole can be significant.
I'd say the SVN history challenge is an excuse - firstly there are tools that can do it.
Alternatively you can easily enough keep the SVN repo around for those rare occasions people really need to dig.

👤 nicbou

Yes! I performed a major refactor that lasted about 6 months. With the commit history (that linked to tickets with discussions and even mockups), I could understand the process of people who had left the company two or three years ago. It allowed me to tell obscure business logic from leftover code and bugs. I could trace each line of code to its requirement.
I also used it to pinpoint the cause of a bug after updating a docker image, knowing that the bug was introduced in a certain file between certain dates.
Now I try to strictly enforce detailed tickets and ticket numbers in commit messages.

👤 seren

Yes, I use it routinely to do something like
git blame > who changed that line the last time
git log > why was it changed
You can quickly find out if this was some trivial typo fix, or an important feature was introduced.
Implicitly, it means that to get some sort of value from that kind of archaeology you need either very detailed git commit messages, or a really clear bug tracker with all the why, regression tests, etc. that were done at that time.
That being said, I think this is a bad argument for not changing/updating your VCS.
You can absolutely move to git, and keep a dump of the SVN base you can still expose and review at will.

👤 jolmg

I do. Most often I get that value when looking through the repeated blame output of a particular piece of code. I facilitated that through this tool I wrote:
https://github.com/jolmg/git-reblame
Last time I used it was last week to see how a particular piece of code was developed throughout the years. There was a comment that didn't explain some puzzling details, and it helped to make sense of it by seeing how the code changed from the time the comment was written.

👤 Nursie

It varies. There was a mostly complete PKIX path resolver I worked on a year or so back, that turned out at the time to be unnecessary so I ditched it. Fast forward a year and it became useful and saved a week or more of work. But that was only useful because I knew it was there, and it was still relatively current.
What benefit you can get from 30 years of commits I'm not sure.
By the way - it looks to be possible to migrate history from SVN to Git, so if your company needs that, maybe start there, by creating a local git repo with intact history and showing it to them.

👤 WCSTombs

I've certainly gotten some mileage out of `git bisect` over the years.

👤 simonw

All the time. If I'm reading code and think "why on earth is this here?" the first thing I do is hit the git blame page on GitHub.
If a project has a clean commit history, this instantly gives me extra context and hopefully even links me to an issue thread explaining what was being solved.
In older code bases this is invaluable - I often find myself looking at history from five years ago or more.
It's also great for my own projects. Even if I wrote the code six months ago there's still a strong chance I won't fully remember the context for the change.

👤 DecoPerson

Game industry uses source control _a lot_.
It’s an important communication tool. Also game companies tend not to have unit tests, but the culture is very much “don’t break _anything_, and don’t make the game worse,” so devs have to triple-check the intentions & effects of any code/script they touch, to be sure they understand what they’re changing and know it won’t introduce any unexpected changes. Timelapse view (Perforce’s version of git blame) is an essential tool for all departments, especially for anyone trying to figure out a bug.

👤 JoshTriplett

I regularly find myself trawling back through code that's 5+ or 10+ years old, and the best way to understand a detail of code can be to look back to the commit message where it was documented.

👤 lukeschlather

When I worked at a large, decade+ big tech company there were several times I looked at running code that was over a decade old. I was working in a section of the business that hadn't existed for more than a decade, so that somewhat bounded how old code could be in my domain, but somehow my team ended up responsible for one thing that was much older, and understanding it took some reading.
There were other core systems that I also read sometimes that were older, and it was extremely useful to understand their construction and function.

👤 MrTortoise

No not in 20 years except maybe 3-4 times
Shitty codebase and lack of tests defining business needs require it
Also Devs with headphones on not stating decisions ... That way when people leave no one knows why it is the way it is.

👤 JohnFen

Yes, I use it heavily. The further back in time I go, the more valuable the revision history is.
Where I work, we use it primarily as part of maintenance. Looking through what changes have been made to a section of code over time very often gives insight into what is causing a current malfunction -- sometimes it even lets you spot the problem almost immediately.
We also use it as part of development and bug tracking. All code changes are tracked by revision number. Even there, being able to look up even antique history can be very useful.

👤 cpeterso

When Mozilla moved from CVS to Mercurial in 2007, the CVS revision history regretfully wasn't imported. A git repository has since been created with the combined CVS and Mercurial history, but the Mercurial repository is still the official source of truth.
https://gregoryszorc.com/blog/2015/05/18/firefox-mercurial-r...

👤 joantune

Yes! So, from my experience it's invaluable to: See why things were done like that. My best usage so far of it was to add a #number at the end of the commit message where number is a trello card id. Then you can go back to the discussion and card that originated that change. You can see all the context of why something was done like that.
This becomes very valuable when maintaining projects that will be running for years, and prevents you from undoing things or repeating the same mistakes.

👤 jakub_g

At least once a month, when modifying some critical piece of code - git blame is the easiest way to find reasons for non-obvious lines of code.
BTW: if you sometimes move code around between two git repos (from multirepo to monorepo for example), I wrote a script to move a subfolder between the two and keep history:
https://github.com/jakub-g/git-move-folder-between-repos-kee...

👤 boulos

Like everyone here, “Yes, the history is super useful”. So, I’m curious about fixing the actual problem.
What happened in the CVS => SVN migration? You don’t have 30-years of SVN history. Do you have an SVN mirror / backup for which you can try out the git svn to try to import the codebase?
Besides, it may be 30-years worth of commits, but I’d guess it’s smaller than the LLVM SVN repo was at the time of the first git mirroring. How many commits are you talking about? (Including all branches, etc.).

👤 AntonyGarand

I navigate through unfamiliar code almost daily, completely different projects and different authors. I use the blame feature a ton to understand in what context a method was added: by blaming it, I can usually see what other code was added at the same time, and its evolution.
I've used git-svn[0] to use git within svn, it's been working flawlessly in my case.
[0] https://git-scm.com/docs/git-svn

👤 sterlind

Not archaeology, but I use git reflog constantly as my workflow:
I push to master infrequently. I keep a series of topic branches off of master, one per project phase. For changes that affect other developers, I pull those out and PR them to master, then rebase the topic branch chain once the PR completes. When I switch projects I use git reflog to remember where I was working.
Basically I take advantage of git rebase and use it like time travel constantly. Somehow I stay sane..
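The "remember where I was working" part can be sketched roughly like this. The branch name `topic-phase-1` is hypothetical, and the scratch repository only exists to make the script runnable.

```shell
#!/bin/sh
set -eu

# Scratch repository so the example is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

git commit -q --allow-empty -m "Initial commit"
base=$(git symbolic-ref --short HEAD)    # default branch name varies by setup

# Work on a topic branch, then switch away from it.
git checkout -q -b topic-phase-1
git commit -q --allow-empty -m "Phase 1 work"
git checkout -q "$base"

# Every movement of HEAD is recorded locally, newest first.
git --no-pager reflog -n 5

# '@{-1}' resolves, via the reflog, to the branch checked out before this one.
previous=$(git rev-parse --abbrev-ref '@{-1}')
echo "previously on: $previous"
```

Note that the reflog is local to each clone and its entries expire eventually, so this is a working-memory aid rather than permanent history.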

👤 chvid

Who wrote this piece of junk code? ... git blame ... oh ...

👤 sleepychu

Lots of the value for me is in well written commit message bodies.
Especially if I have written them, since when someone asks me how something works or why it's written that way, I can read the commit body and explain it to them or refer them to the message (otherwise my answer is, I don't remember!)
Additionally if you choose your changes well and don't squash commits it can be a good guide to what else touches the thing I'm looking at.

👤 mkgolden

I think you do get value from it. You won't know it till you eventually do have to go digging in commit history, though I don't find myself doing this regularly. I have found it useful when attempting to understand why code was changed or written the way it was some years before. I have also used it to understand when/where a bug was introduced. A tool I like for exploring git history is DeepGit.

👤 stared

Before using git, my code/comment ratio was ~1/2 as there were many things that "may be useful in the future" (usually they weren't).
And yes, once in a blue moon there is a change that breaks something, I need to go back and recover some old code. Much more often - I need to see how it worked before.
Plus, let's second the psychological benefit. I don't need to worry or think twice before changing code.

👤 db48x

Yes, quite frequently. You are correct that it is often only the most recent history that really matters, but sometimes the most recent change happened years ago, so date-based cutoffs don't work.
When you do want to convert that SVN repository, use Reposurgeon (http://www.catb.org/~esr/reposurgeon/).

👤 Forge36

Yes: I used it last week to source a problem we've had since 2012. It helped rephrase our discussion about how it was missed, why it was introduced, and discuss why it sat in waiting for 8 years.
More importantly: we have a clear date in this case for what versions we need to consider releasing patch fixes for.
In some cases this can be useful: even when the functional problem it causes is not easily evident in previous versions.

👤 foreigner

We got sued about an old feature and I used revision history to recreate a snapshot of how our SaaS used to look at the time of the alleged event.

👤 jarofgreen

> they haven't transitioned from SVN to git solely because of the logistical challenge of migrating 30 years of commits.
I did such a migration 3 years ago at a company that had a 10 year history and it was fine using the standard tool. Is there a particular problem your company has with it, or have they just not tried?
(Also, if they are really worried, nothing to stop them keeping a read only SVN server somewhere.)

👤 sagichmal

PR descriptions/discussions, ADRs, and READMEs are often germane and useful, but the revision history itself? No, almost never. Any rationale motivating a piece of code as it exists in the repo is, in my experience, best provided as a comment in-situ with the code. Information that's in the revision history but not in that kind of comment is, in my experience, historical noise.

👤 hinkley

The p4 import tool can do incrementals so I would be surprised if the svn one doesn’t as well.
You fiddle with it until you get it working the way you like, then you do an import in the background or overnight. It takes as long as it takes but you don’t care. When it’s time to make the transition you aren’t importing the whole thing, just the past week. The older stuff has already been transferred over.

👤 rgoulter

"We won't switch from SVN to git due to the logistical challenge of migration".
It would be worth switching to git if the current technical costs outweighed the costs of the migration, yes.
I think there's more to it than just "we don't need all the history, just squash it and starting with git would be better" (or even "setup authors file, git svn fetch"), though.

👤 travolter

I was recently trying to link our source code against a newer version of LLVM that had introduced a few API changes that I couldn't make heads or tails of. Without the git log I would've had to manually compare the changes and figure out what they meant, or just guess. But luckily the commit messages quite clearly explained what changed in most cases.

👤 uk_programmer

Yes. Sometimes you need to know why a particular thing was done.
Recently I had to go back and find out why a particular conditional was added to the code. 10 years prior someone added in a particular conditional for a bug in IE8 (which we no longer support). There was a Jira associated with it. I then knew I could remove this odd logic as it was no longer relevant.

👤 kirstenbirgit

I often use the VCS feature of PHPStorm where I can select a piece of code and then immediately get a nice list of all the commits that changed that code. I can then double click on that code and get a list of other files that were modified. For example, I can then figure out why a certain piece of code does what it does, and why it was added.

👤 matt2000

Heads up to anyone using Jetbrains IDEs, they have some great VCS history features like annotate lines: https://www.jetbrains.com/help/idea/viewing-changes-informat...

👤 bjourne

It's very simple to migrate even really old Subversion repos to git. If they claim it is an insurmountable logistical challenge, then they don't know what they are talking about.
Yes, there is diminishing value in old commits, but they are far from worthless! Never ever destroy the commit history. Doing that is imho, a cardinal sin.

👤 ajnin

I just want to point out that it's possible to migrate an SVN repo to git while preserving history. I used svn2git (https://github.com/nirvdrum/svn2git) to migrate many repositories, although not very large ones.

👤 dmarchand90

It's very nice when your dependencies have a thorough history. I updated one of my dependencies and found one of my tests started failing. Very nice to pinpoint the exact modifications that caused the break, and I got a much speedier patch because of it. Depends how often you update your dependencies though of course

👤 house9-2

> so much so that they haven't transitioned from SVN to git solely because of the logistical challenge of migrating 30 years of commits.
Just bite the bullet and convert the repo from SVN to Git?
Guessing the primary issue is the time it takes to convert the repo, maybe a job to do over the holidays when most people are off?

👤 beardbound

I’m a QA engineer and I go back pretty frequently so that I can see when an issue popped up and what might have caused it. It’s super useful for root cause analysis on bugs. Also, I use it to troubleshoot legacy versions for customers, although I see less of that since I work on a SaaS product now.

👤 city41

I use it quite a bit for various reasons. But I think I get the most value from just knowing it's there. I can plow ahead with anything, try anything, experiment, it doesn't matter. I know that my previous history is there waiting for me if the plunge I just took doesn't pay off.

👤 EamonnMR

Git blame shows you when and why a given line of code was last touched. That alone is worth all of the overhead of revision control. Commit messages can contain more context about changes (what story were they for, what bug were they trying to fix). I probably use it at least once per day.

👤 dharmab

Yes, I often have to go back into commits from a year or two ago to discover the motivation for certain decisions.
https://en.m.wikipedia.org/wiki/Wikipedia:Chesterton%27s_fen...

👤 muzani

Often, but not to the point of more than 3 months back. To me, it's more so I can delete obsolete code instead of commenting it out, without worrying that it'll be irreversible. Or sometimes if a feature is suddenly broken we can trace if any changes were made at the time.

👤 simion314

Yes, sometimes you see some weird code and you have no idea why it was added, searching the history I can see the commit and the ticket related with that piece of code.
Also sometimes you want to find the author of a piece of code to ask more questions about why something was done a certain way.

👤 b15h0p

An underrated feature is “blame” in the IDE. IntelliJ and Eclipse both support showing, in the “gutter” of the code editor, the last commit in which a line was changed.
Makes it easier to figure out how old a line of code is and (if the commit messages are any good) why it was introduced or changed.

👤 htns

A good UI is key. If your editor can navigate blame and commit history with just a key press, it actually speeds up figuring things out, especially in old codebases with bit rot. Benefiting from 10+ years of history might be rare, but two years feel like yesterday.

👤 Waterluvian

I used to get irritated that peers made me jump through hoops to collapse commits into singular meaningful commits. Why bother. Nobody looks at history.
And then I started looking at history and it's invaluable to have, particularly when understanding rationale or debugging issues.

👤 cube2222

There are numerous comments showing why git history is useful, but I think nobody mentioned this.
In Goland/idea you can look on the git history of only the selected code. I use this constantly to see how the code has been previously modified before I make my own changes.

👤 kmbriedis

There's another case of reverting the codebase to an earlier version just to run it
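One way to do this without disturbing the current checkout is a separate worktree. This is only a sketch: the tag `v1` and the `run.sh` script are invented for the example, and the scratch repository is there to keep it runnable.

```shell
#!/bin/sh
set -eu

# Scratch repository with two "releases" of a tiny script.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'echo "version 1"\n' > run.sh
git add run.sh
git commit -q -m "Release 1"
git tag v1

printf 'echo "version 2"\n' > run.sh
git commit -q -am "Release 2"

# Check the old release out into its own directory, leaving the
# main working tree (and any uncommitted work in it) untouched.
old="$(mktemp -d)/v1-checkout"
git worktree add --detach "$old" v1 >/dev/null 2>&1

# Run the old code from the separate checkout.
output=$(sh "$old/run.sh")
echo "old checkout prints: $output"
```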

👤 SergeAx

Of course we do! Using blame/annotate every team member can get the idea who made the changes in question and why they did it. With 9 years old codebase of 400+kLOC (not counting blank lines and comments) this perk is invaluable.

👤 closeparen

I often wonder:
- What is this trying to accomplish?
- Why are you doing it this way?
- Why not this other way?
Ideally these things would be answered in comments, but they often aren't. The commit message hopefully answers #1, and it links to the code review tool which may shed light on the others.

👤 awinter-py

yes, I have a tool called automigrate that uses git history to generate and apply DB migrations
I also use history for git blame to understand when a change was introduced for debugging / intent purposes; this can go back months if not years

👤 devnonymous

Besides the obvious benefits of git annotate and git log, the ability to do a git bisect to isolate the exact commit that caused a regression is invaluable. So, to answer your question, yes we do get immense value out of it.
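For anyone who hasn't tried it, a bisect session can be sketched as below. The five commits and the `grep` check are stand-ins for a real history and a real test command; `git bisect run` treats exit code 0 as "good" and most non-zero codes as "bad".

```shell
#!/bin/sh
set -eu

# Scratch repository so the example is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Five commits; from the third one on, the "regression" is present.
for i in 1 2 3 4 5; do
  if [ "$i" -ge 3 ]; then
    echo "value=0" > app.conf
  else
    echo "value=1" > app.conf
  fi
  echo "$i" > build-number
  git add .
  git commit -q -m "Commit $i"
done

# Mark HEAD as bad and the root commit as known good.
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" >/dev/null

# Let git drive the binary search using the check command.
git bisect run grep -q 'value=1' app.conf >/dev/null

# refs/bisect/bad now points at the first bad commit.
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "regression introduced by: $culprit"
```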

👤 golergka

Yes. Not frequently, but sometimes git bisect saves my day. That's why I don't allow rebase or squash in my repos – code's history is valuable information, and I don't want to lose it.

👤 JaDogg

Git bisect solved a problem I had before. I had exhausted debugging and even println debugging, in the end I had to find out which commit introduced the bug. Once I found the change it was very easy to fix.

👤 phonebanshee

Sure, history has definitely been useful for me. But I've been involved in a few version control system changes, and as long as the old system continues to be available for reading, cutovers are fine.

👤 bootlooped

I most often use VCS history to figure out why some piece of code exists. This probably wouldn't be necessary if it were reasonably readable or properly commented by whoever wrote it.

👤 leni536

Just recently I found a 7 years old regression in our codebase, I could pinpoint the exact commit where it was introduced. We use mercurial, I find hg grep and bisect really useful.

👤 donatj

I have a tool I built that will link any line directly to the pull request it was introduced in. That has helped me so much with "why the hell is this like this?"

👤 vincent-toups

Absolutely. It's shocking to me that any experienced software developer could even pose this question.
The sequence of diffs is much, much, more informative than the current state of the software.

👤 rplst8

Yes. Nearly every day. And the overhead of migration is nearly fixed no matter the size of the history. The expensive part is writing the script to do the export/import.

👤 avip

Practically daily. Every bugfix or feature starts with looking at the already done mistakes engraved in the tree.
Though we stand on shoulders of midgets usually, you still get a better view.

👤 lscharen

Just yesterday I had to provide dates on a proposal for different projects that I had worked on over the past 12 years.
Being able to pop into our SVN history made this a trivial ask.

👤 elcapitan

Yes, absolutely. In VS Code I use the feature that shows git blame on the current line, which can be quite helpful in understanding code history and responsibility.

👤 hoorayimhelping

Heck yes! In fact, just a couple weeks ago, I used the commit history with git bisect to find the commit where a regression was introduced. Felt like wizardry!

👤 hboon

I use `git log -S` very often to read the commit message for figuring why a change happened.
A wonderful reason to create atomic commits with good commit messages.
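A small sketch of what `-S` (the "pickaxe") does: it lists only the commits that change how many times a string occurs in the tracked files. The file, the `legacy_mode` setting, and the messages are invented for the example.

```shell
#!/bin/sh
set -eu

# Scratch repository so the example is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'limit = 10\n' > config.ini
git add config.ini
git commit -q -m "Add limit"

printf 'limit = 10\nlegacy_mode = true\n' > config.ini
git commit -q -am "Enable legacy_mode for the old importer"

printf 'limit = 10\n' > config.ini
git commit -q -am "Drop legacy_mode: old importer retired"

printf 'limit = 20\n' > config.ini
git commit -q -am "Raise limit"

# Only the commits that added or removed the string are listed,
# newest first; the two unrelated "limit" commits are skipped.
hits=$(git log -S'legacy_mode' --format=%s)
printf '%s\n' "$hits"
```

The search runs over file contents, not commit messages, so it still finds the relevant commits when the messages never mention the string.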

👤 zzo38computer

I have never needed it so far, although I have it in case it is useful to me or someone else in future.

👤 donohoe

No. Nothing after 30-60 days. Always possible for very rare exception, but nothing in last 10 years

👤 masto

Don’t ask the lawyers. They’ll want your document retention policy to apply to source code...

👤 ryanthedev

It's like a seat belt. You are only happy to have it when shit hits the fan. Lol.

👤 nottorp

Revision history is something you don't need... until you very badly need it.
Call it insurance.

👤 genezeta

At my current place they use SVN like they would use stones and sticks.
Most commit messages are only the code of the Jira issue and maybe its title, but almost never what they actually did or why. Frequently, they will have half a dozen commits with the same message -sometimes even unrelated commits because they got a bit too lazy-. Most Jira tasks don't have a description. If it's a new development, the documentation is generally elsewhere and the Jira task has no description at all. If it's a bug, it may have some screenshot attached, and it sometimes has an explanation but generally the explanation is given verbally to the developer.
A handful of developers heard The Architect say once that it's preferable to submit one commit for each changed file than to put two unrelated changes in the same commit, and so they do. They change 12 different files for a certain feature and they will make 12 separate commits, one file each. Not always one after the other but sometimes dispersed through the day. One or two developers obsessively commit each single change they do. Meaning they write a couple of lines of code, commit it, and then try it, see it wasn't correct -there was a typo, it wasn't the correct field they needed, whatever-, edit again, commit again, etc.
They have a certain backup process which stores a handful of XML log files from some processes; they store them by committing them to the SVN repo. A commit every hour, in the development branch.
They have a flow with two branches, trunk and development, and a 6 month cycle for releases... It sort of works this way:
Start (theoretical): People develop on development. Two -or two and a half- months before release, they make "the switch". Everybody commits whatever they are doing at the moment and stops for a day. They merge development into trunk, and then they all start working on trunk for the rest of the cycle until release.
In that final period, trunk is mostly "open" -more on this later- and people just commit to it and that's it. development is abandoned and deleted. A new development branch is taken from trunk but is not generally used during this period.
When release time comes, trunk is tagged with the version. Everybody switches back to the new development and development is done there. But this is not what happens because there's another period of maybe one or two months, where trunk -the released version- has a number of a. bugs, b. stuff that was unfinished, c. smaller things which "well, we could do it on trunk because it's just a small thing". So, what happens is they go on working on trunk for that month or two, and only gradually people start working on development.
Also, they don't really tag trunk at release time because it's not "done" yet. When the bug hunting season is over -or when they are just tired of it- then they tag and freeze trunk, with the version, move it into storage. Nothing in this is really planned. They just decide one day and then tell people, who just rush whatever they were doing on trunk and commit it, or abandon it and move to development.
During both pre-release and post-release periods, merges are done about once or twice a week from trunk to development. If you use SVN you'll know that these merges are seen as a single commit in the receiving branch. You can see the full history if you query the merge info, but it's not shown directly in the main "svn log".
All this means they have:
- about 40% automated commits from some backup process.
- Most changes happening in the other branch, so you need to go through mergeinfo several times.
- Main development branches deleted and created new every so often.
- Most people not explaining what they did in commit messages.
- About half of the bugs in Jira not describing the problem and almost all of the tasks not explaining the work to be done.
So... do we ever truly use the revision history?
Yes.
A few people -particularly Karen- use it to drop the blame on whoever they want. They get a bug, they open the svn log for something related, see a name they don't like much and say "Ok, just assign this to X, because they did something on that file 4 months ago".
I am using it, sometimes -with some effort and some success- to try to understand just where some heavily copy-pasted snippets come from, so that I can wipe them out for good. Also, sometimes I use it just to write in my diary and laugh a bit about it so I don't cry so much when I get up in the morning. This is probably the most valuable thing we get out of it, because it keeps me... well, insane, but at least not murderously insane.

👤 ada1981

No, but I find revisionist history to serve me quite well.

👤 insulanian

Of course. It usually goes like this:
- Why the heck is this here?!
- git blame, git log, git show
- Ahh...
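That sequence can be sketched as a script. The file name, line number, and commit messages are hypothetical, and the scratch repository is only there so the commands have something to run against.

```shell
#!/bin/sh
set -eu

# Scratch repository so the example is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'timeout=30\n' > settings.conf
git add settings.conf
git commit -q -m "Add default timeout"

printf 'timeout=30\nretries=5\n' > settings.conf
git commit -q -am "Retry 5 times: upstream drops about 1 in 4 requests"

# 1. "Why the heck is this here?!" Find the commit that last touched line 2.
#    The first field of porcelain output is the commit hash.
sha=$(git blame -L 2,2 --porcelain settings.conf | head -n 1 | cut -d ' ' -f 1)

# 2. Read that commit's message.
why=$(git log -1 --format=%s "$sha")

# 3. "Ahh..." Show the message and the diff together.
git --no-pager show "$sha"
echo "line 2 exists because: $why"
```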

👤 rienbdj

there are tools which migrate history from svn to git

👤 slim

at my previous job (consultancy) we provided detailed bill that included literally the (slightly edited) git log for the month

👤 wdr1