Tool for finding repeated chunks of text across files?
I'm hoping to automatically find potential copypasta across a codebase (not just individual duplicated lines, but repeated sequences of lines). I realize a naive all-pairs comparison is roughly quadratic in the total number of lines rather than exponential, and N won't be crazy-large, so it should be tractable.
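To make it concrete, here is a rough sketch of the kind of thing I mean. It is only an illustration under some assumptions of mine: the function name `find_repeated_chunks` and the `window` parameter are made up, a "chunk" is a fixed number of consecutive non-blank lines, and two lines count as equal if they match after stripping leading and trailing whitespace.

```python
from collections import defaultdict


def find_repeated_chunks(files, window=4):
    """Find chunks of `window` consecutive non-blank lines that occur more than once.

    `files` is a dict mapping a file name to its text (hypothetical input shape).
    Returns a dict mapping each repeated chunk (a tuple of stripped lines) to a
    list of (file name, 1-based line number where the chunk starts).
    """
    seen = defaultdict(list)
    for name, text in files.items():
        # Keep non-blank lines only, remembering their original line numbers.
        lines = [
            (number, line.strip())
            for number, line in enumerate(text.splitlines(), 1)
            if line.strip()
        ]
        # Slide a fixed-size window over the file and index every chunk.
        for start in range(len(lines) - window + 1):
            chunk = tuple(line for _, line in lines[start:start + window])
            seen[chunk].append((name, lines[start][0]))
    # Only chunks seen in more than one place are interesting.
    return {chunk: places for chunk, places in seen.items() if len(places) > 1}
```

Calling it with something like `find_repeated_chunks({"a.py": text_a, "b.py": text_b}, window=6)` would give back each repeated chunk along with where it starts. It does one pass over the lines, but it is crude: overlapping windows get reported separately instead of being merged into one longer match, and renamed variables defeat it entirely, which is why I'd rather use an existing tool.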
Anybody know of something like this?