No such thing as a small change

Re: Searching for duplication in legacy code (refactoring strategy)
by LanX (Cardinal) on Nov 23, 2016 at 11:19 UTC
It depends on the nature of the duplication.
Do identically named subs have identical code? Copy-and-paste programming usually introduces mutations.
General approach for refactoring
a) identify all sub definitions in a file
b) identify their dependencies
c) normalize the sub code
Formatting can differ (indentation, whitespace, comments).
d) diff potentially equal subs to measure similarity
What "potentially equal" means depends on the quality of your code;
changes have probably happened to some of the copies.
e) try to visualize dependencies to decide where best to start
e.g. with Graphviz or a tree structure
f) create a test suite to assure refactoring quality
g) start refactoring incrementally, while constantly testing the outcome
Depending on the quality of your tests, you might first start with only one daemon in production.
h) care about a fallback scenario
Especially use version control!
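Step a) above could be sketched like this. It's a naive line-by-line regex scan; real legacy code (heredocs, strings, POD, `sub` on odd lines) really wants PPI instead, but this shows the idea:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Naive sketch: collect sub names and the line they are defined on.
# A plain regex misses edge cases; PPI would be the robust choice.
sub find_subs {
    my ($source) = @_;
    my ( %subs, $line );
    for ( split /\n/, $source ) {
        $line++;
        $subs{$1} = $line if /^\s*sub\s+(\w+)/;
    }
    return \%subs;
}

my $code = <<'END';
sub foo { 42 }
my $x = 1;
sub bar { foo() + 1 }
END

my $subs = find_subs($code);
print "$_ defined at line $subs->{$_}\n" for sort keys %$subs;
```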
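For step b), a crude heuristic is to check each sub body for calls to the other known subs. Matching barewords like this is only an approximation (method calls, string eval and references will fool it), but it gives a first dependency map:

```perl
use strict;
use warnings;

# Sketch: given a hash of sub name => body text, record which of the
# other known subs each body appears to call. Heuristic only.
sub sub_dependencies {
    my ($bodies) = @_;
    my %deps;
    for my $name ( keys %$bodies ) {
        for my $other ( keys %$bodies ) {
            next if $other eq $name;
            $deps{$name}{$other} = 1
                if $bodies->{$name} =~ /\b\Q$other\E\s*\(/;
        }
    }
    return \%deps;
}

my %bodies = (
    foo => 'return 42;',
    bar => 'return foo() + 1;',
    baz => 'return bar() * foo();',
);
my $deps = sub_dependencies( \%bodies );
for my $s ( sort keys %$deps ) {
    print "$s depends on: ", join( ', ', sort keys %{ $deps->{$s} } ), "\n";
}
```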
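Step c) might look like this: strip comments and squash whitespace so that purely cosmetic differences stop hiding duplication. The comment stripping is deliberately naive (it breaks on `#` inside strings); a serious version would lean on PPI or perltidy:

```perl
use strict;
use warnings;

# Sketch of normalization: remove line comments, collapse whitespace,
# drop spaces around common punctuation. Naive on purpose.
sub normalize {
    my ($code) = @_;
    $code =~ s/#.*$//mg;                 # drop line comments (breaks on '#' in strings!)
    $code =~ s/\s+/ /g;                  # collapse whitespace runs
    $code =~ s/\s*([=,;(){}])\s*/$1/g;   # drop spaces around punctuation
    return $code;
}

my $v1 = q{sub add {
    my ($x, $y) = @_;   # arguments
    return $x + $y;
}};
my $v2 = q{sub add { my ($x,$y) = @_; return $x + $y; }};

print normalize($v1) eq normalize($v2) ? "same\n" : "different\n";   # prints "same"
```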
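And for step d), once the bodies are normalized you need some similarity score to rank candidate pairs. A Jaccard index over word tokens is a very crude stand-in for a real diff, but cheap to compute:

```perl
use strict;
use warnings;

# Sketch: Jaccard similarity over the word tokens of two sub bodies.
# 1 means identical token sets, 0 means nothing in common.
sub similarity {
    my ( $x, $y ) = @_;
    my %A = map { $_ => 1 } $x =~ /\w+/g;
    my %B = map { $_ => 1 } $y =~ /\w+/g;
    my $common = grep { $B{$_} } keys %A;
    my %u      = ( %A, %B );
    my $union  = keys %u;
    return $union ? $common / $union : 1;
}

# 3 shared tokens (my, _, return) out of 7 distinct ones: prints 0.43
printf "%.2f\n", similarity(
    'my ($x, $y) = @_; return $x + $y;',
    'my ($a, $b) = @_; return $a + $b;',
);
```

Renamed variables drag the score down, which is exactly why step c) would ideally also canonicalize identifiers.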
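For step e), a dependency hash can be dumped as Graphviz DOT text and rendered with e.g. `dot -Tpng deps.dot -o deps.png`:

```perl
use strict;
use warnings;

# Sketch: turn { caller => [callees] } into Graphviz DOT source.
sub to_dot {
    my ($deps) = @_;
    my $dot = "digraph deps {\n";
    for my $from ( sort keys %$deps ) {
        $dot .= qq{    "$from" -> "$_";\n} for sort @{ $deps->{$from} };
    }
    return $dot . "}\n";
}

print to_dot( { bar => ['foo'], baz => [ 'bar', 'foo' ] } );
```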
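Step f) usually means "characterization" tests: pin down what the legacy subs do *now*, before touching anything. A minimal Test::More sketch (the sub and its expected values are made-up placeholders):

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for a legacy sub whose current behaviour
# we want to freeze before refactoring.
sub legacy_add { my ( $x, $y ) = @_; $x + $y }

is( legacy_add( 2,  3 ), 5, 'adds small positives' );
is( legacy_add( -1, 1 ), 0, 'cancels to zero' );

done_testing();
```

Run the suite after every incremental change in step g); any failing test flags a behaviour change before it reaches production.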
Sorry for the very general tips, but it really depends on the structure of your legacy code. Probably grep is already enough...
(Think about it: you might also need "nested refactoring", because the new modules may still contain duplicated code and need to use other modules, and so on.)
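The grep remark can be made concrete: a one-liner that lists every sub definition and shows names defined more than once is often enough for a first pass. The sample files here are made up for the demo; point the grep at your real lib/ instead:

```shell
# Demo setup: two example modules sharing a duplicated sub name.
mkdir -p /tmp/dupdemo
printf 'sub foo { 1 }\nsub bar { 2 }\n' > /tmp/dupdemo/A.pm
printf 'sub foo { 1 }\n'                > /tmp/dupdemo/B.pm

# List all sub definitions, keep only the names, print duplicates.
grep -hoE '^[[:space:]]*sub[[:space:]]+[A-Za-z_][A-Za-z0-9_]*' /tmp/dupdemo/*.pm \
    | awk '{print $2}' | sort | uniq -d
# prints: foo
```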
After our conversation I did some googling yesterday for "refactoring" and "duplication", and the term "plagiarism detection" popped up.
like in these discussions:
I couldn't find a general refactoring project for Perl, but I also haven't spent much time on it yet.
I think that to cover all edge cases of a worst-case scenario, one would certainly need PPI (at least), or even a patched B::Deparse to scan the op tree, plus PadWalker to identify variable dependencies and side effects.