HACKER Q&A
📣 remolacha

Is There a LeetCode for Refactoring?


I’m looking for some samples of realistic messy code (preferably with some sample well-refactored solutions) so I can practice refactoring legacy code, applying functional programming principles, etc. Anyone know of a resource for something like this?


  👤 dave_sid Accepted Answer ✓
Most code bases in most companies.

👤 chrisbrauns
I don't have a full 'LeetCode' style example, but the Gilded Rose Kata teaches me something every time I try it.

Original Kata: https://iamnotmyself.com/2011/02/14/refactor-this-the-gilded...
Sandi Metz's Take: https://www.youtube.com/watch?v=8bZh5LMaSmE


👤 tucif
If having the solutions is not a must, you could dive into trending OSS repos (https://github.com/trending).

I’d look for pieces with high test coverage, so that you can prove your refactor isn't breaking anything.
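One way to lean on tests like that is a characterization check: run the old and new versions over the same inputs and list any disagreement. A rough Python sketch follows. `legacy_total_cents` and `refactored_total_cents` are made-up stand-ins for "before" and "after" code, not taken from any real repo, and the discount rules are invented purely for illustration.

```python
import itertools


def legacy_total_cents(quantity, unit_cents, is_member):
    # Hypothetical "messy" original: nested branches, duplicated arithmetic.
    if is_member:
        if quantity >= 10:
            return quantity * unit_cents - (quantity * unit_cents * 15) // 100
        else:
            return quantity * unit_cents - (quantity * unit_cents * 5) // 100
    else:
        if quantity >= 10:
            return quantity * unit_cents - (quantity * unit_cents * 10) // 100
        else:
            return quantity * unit_cents


# Hypothetical refactor: the branches collapse into a lookup table
# keyed by (is_member, is_bulk_order).
DISCOUNT_PERCENT = {
    (True, True): 15,
    (True, False): 5,
    (False, True): 10,
    (False, False): 0,
}


def refactored_total_cents(quantity, unit_cents, is_member):
    subtotal = quantity * unit_cents
    percent = DISCOUNT_PERCENT[(is_member, quantity >= 10)]
    return subtotal - (subtotal * percent) // 100


def find_mismatches(old, new, cases):
    """Return every argument tuple on which the two versions disagree."""
    return [args for args in cases if old(*args) != new(*args)]


# Inputs chosen around the bulk-order boundary (9, 10, 11) plus a few extremes.
CASES = list(itertools.product([0, 1, 9, 10, 11, 250], [1, 99, 1999], [True, False]))

if __name__ == "__main__":
    print(find_mismatches(legacy_total_cents, refactored_total_cents, CASES))
```

An empty list only says the two versions agree on these cases, so the case list should cover every branch boundary you know about.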

Then ask for code review, maybe in something like https://codereview.stackexchange.com/


👤 typedef_struct
Your own code from 6 months ago usually works.

👤 miguendes
I was thinking about this a couple of weeks ago. The difference would be in the format. I am considering creating a repo with refactoring katas. The most difficult part is to come up with examples that show up in real codebases.

👤 shoo
There isn't a single way to refactor code; it can be refactored in many different directions to pursue many different objectives.

Here are a few arbitrary ideas:

* Scrape_it is an arbitrary Python project that shows up in a Google search for "pypi scrape" [1]. Looking at the code, it doesn't appear to have any automated tests. Goal: write a single unit test for the usage example shown in the readme. The unit test should work without any dependency on network access to remote websites or to other services running locally, the current time, or data files in the filesystem. Refactor by making minimal safe changes to the code as/if required in order to write the unit test.

* scipy's vendored copy of the useful L-BFGS-B optimisation algorithm [2] has a bunch of calls to a timer(ttime) subroutine, but that subroutine was patched so it doesn't do anything. There are still many calls to timer(ttime) throughout the lbfgsb.f code. Goal: simplify the code by identifying whether any remaining variables and code related to CPU timing within lbfgsb.f can be safely removed without changing the behaviour of the algorithm. If they can be safely removed, remove them. Tricky: demonstrate with some degree of confidence that your change is safe and doesn't break behaviour!

* fixCache, featured on HN a few days ago [3], includes a rule to decide if a git commit is a "fix commit" based on user-supplied keywords identified in the commit message. Goal: modify fixCache to make this functionality more flexible, so a user can still configure fixCache to use the current behaviour, but can also optionally configure fixCache to call a custom user-defined JavaScript function that decides whether a commit is a "fix commit" using all information available about the commit (e.g. which files were touched, the contents of the changes to those files, etc.) instead of a list of keywords. Refactor as/if necessary to implement this feature.
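To give a feel for the shape of that last idea, here's a rough sketch of a pluggable "is this a fix commit?" rule. It's written in Python rather than JavaScript just to show the pattern, and nothing here is fixCache's actual API: `Commit`, `keyword_predicate` and `make_fix_predicate` are hypothetical names, and the case-insensitive substring matching is my assumption about what a keyword rule might do.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Optional, Tuple


@dataclass
class Commit:
    # Hypothetical commit record; the real project's data model may differ.
    message: str
    files: Tuple[str, ...] = ()
    diffs: Dict[str, str] = field(default_factory=dict)


# A rule is any function that takes a commit and returns True or False.
FixPredicate = Callable[[Commit], bool]


def keyword_predicate(keywords: Iterable[str]) -> FixPredicate:
    """Default rule: does the message contain any keyword (assumed case-insensitive)?"""
    lowered = [k.lower() for k in keywords]

    def is_fix(commit: Commit) -> bool:
        message = commit.message.lower()
        return any(k in message for k in lowered)

    return is_fix


def make_fix_predicate(
    keywords: Iterable[str] = ("fix", "bug"),
    custom: Optional[FixPredicate] = None,
) -> FixPredicate:
    """Use the user's custom rule if one is configured, else fall back to keywords."""
    return custom if custom is not None else keyword_predicate(keywords)


def touches_source_and_tests(commit: Commit) -> bool:
    # Example custom rule: a commit that changes both tests and non-test files.
    has_test = any(f.startswith("tests/") for f in commit.files)
    has_source = any(not f.startswith("tests/") for f in commit.files)
    return has_test and has_source


if __name__ == "__main__":
    commit = Commit("Tidy parser", files=("src/parser.py", "tests/test_parser.py"))
    print(make_fix_predicate()(commit))
    print(make_fix_predicate(custom=touches_source_and_tests)(commit))
```

The rest of the tool only ever calls the predicate, so the old keyword behaviour and a custom rule are interchangeable from its point of view.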

Further reading: https://testing.googleblog.com/2008/07/how-to-write-3v1l-unt... and Michael Feathers' book "Working Effectively with Legacy Code": https://ptgmedia.pearsoncmg.com/images/9780131177055/samplep...

[1] https://pypi.org/project/scrape-it/
[2] https://github.com/scipy/scipy/tree/f865e9ed01be30c52f2b3841...
[3] https://github.com/aavshr/fixCache