http://qs321.pair.com?node_id=1199416

LanX has asked for the wisdom of the Perl Monks concerning the following question:

Hi

A colleague asked me if I knew a method to compare modules where function blocks were moved.

My first idea was to extract functions (probably with PPI or B::Xref ) and to diff them individually.

The next idea was to extract only checksums from the different blocks...
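
To make the two ideas above concrete, here is a minimal sketch in Python that treats the Perl source as plain text. It is only an illustration under loud assumptions: the sub extraction uses naive brace counting rather than a real parser such as PPI, so it will be fooled by unbalanced braces in strings, regexes, comments, heredocs or POD. The names `extract_subs`, `fingerprint` and `compare_modules` are hypothetical, not part of any existing project.

```python
import hashlib
import re

# Matches the start of a named sub such as "sub foo {" at the start of a line.
SUB_START = re.compile(r'^\s*sub\s+([A-Za-z_]\w*)\s*\{', re.MULTILINE)


def extract_subs(source):
    """Return {name: body} for named subs, using naive brace counting.

    Assumption: every brace in the sub body is balanced, including those in
    strings, regexes and comments. A real tool would parse the code instead.
    """
    subs = {}
    for match in SUB_START.finditer(source):
        start = match.end() - 1  # index of the opening brace
        depth = 0
        for pos in range(start, len(source)):
            ch = source[pos]
            if ch == '{':
                depth += 1
            elif ch == '}':
                depth -= 1
                if depth == 0:
                    subs[match.group(1)] = source[start:pos + 1]
                    break
    return subs


def fingerprint(body):
    """Checksum of a sub body with runs of whitespace collapsed."""
    normalized = ' '.join(body.split())
    return hashlib.sha1(normalized.encode('utf-8')).hexdigest()


def compare_modules(old_source, new_source):
    """Classify subs by name and checksum, ignoring where they sit in the file."""
    old = {n: fingerprint(b) for n, b in extract_subs(old_source).items()}
    new = {n: fingerprint(b) for n, b in extract_subs(new_source).items()}
    return {
        'unchanged': sorted(n for n in old if n in new and old[n] == new[n]),
        'changed': sorted(n for n in old if n in new and old[n] != new[n]),
        'removed': sorted(n for n in old if n not in new),
        'added': sorted(n for n in new if n not in old),
    }
```

Because the comparison is keyed on the sub name, a block that was only moved within the file lands in `unchanged`, while a block that was moved and edited lands in `changed`.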

Then it occurred to me that a "semantic diff" could be a nice help for version control comments, and could also provide information about other forms of refactoring. *

Are there any ready-to-use projects available?

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!

*) In the meantime I found this, which explains the idea: http://martinfowler.com/bliki/SemanticDiff.html

Re: semantic diff for Perl code
by talexb (Chancellor) on Sep 14, 2017 at 19:55 UTC

    The solution that comes to mind is to do a git diff on the github repository that houses the module's source code. This assumes that the module's code is stored on github, and that the revision numbers are tagged in the repository.

    Failing that, you may have to download and unpack the versions you want to compare in order to do a recursive diff on the two trees.

    Alex / talexb / Toronto

    Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

      This assumes that the module's code is stored on github,

      git diff will work on any git repository, regardless of where it is hosted; and, more generally, diff will work even if version control is not involved, if you can put the two source trees you want to compare side by side on the filesystem.

      I'm not sure this is really what was asked for, though. If a function block has been moved and possibly changed, diff mostly shows you that it was moved (as a deletion in one place and an addition in another), and you need to actually look at the moved code line by line to see if it also changed.
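
One way to avoid the line-by-line inspection is to diff the old and new text of a single function directly, independent of where it sits in each file. The sketch below is only an illustration using Python's standard `difflib`; the helper name `diff_sub` is hypothetical, and it assumes the two function bodies have already been extracted by some other means.

```python
import difflib


def diff_sub(name, old_body, new_body):
    """Return unified-diff lines for one function, ignoring indentation.

    An empty list means the function was at most moved or re-indented.
    """
    # Strip each line so that a pure re-indent does not count as a change.
    old_lines = [line.strip() for line in old_body.splitlines()]
    new_lines = [line.strip() for line in new_body.splitlines()]
    return list(difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile='old/' + name,
        tofile='new/' + name,
        lineterm='',
    ))
```

An empty result suggests the block was only moved; a non-empty one shows exactly which lines inside the moved block changed.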

        True .. I made the leap that we were talking about modules on CPAN, and that may have been a completely unsupported hypothesis.

        Since I've been using git for about ten years now (it seems that long), I now imagine git whenever someone talks about version control. And there's another logical leap.

        Alex / talexb / Toronto


      >  This assumes that ...

      ... they used version control ... (cough ;)

      Cheers Rolf

Re: Semantic diff for Perl code
by Anonymous Monk on Sep 14, 2017 at 21:05 UTC
Re: Semantic diff for Perl code
by Anonymous Monk on Sep 15, 2017 at 14:57 UTC
    The most clever strategy I have seen was also based on version-control – MS Source Safe – and it worked by analyzing the deltas. It looked for references to a subroutine-name within the blocks of changed text. It also examined the human comments which accompanied each revision ... what the developer said about it ... and the number of times a particular block of source code was changed within a small amount of time. There was some "smarts" about calculating how a particular line-number might have changed due to insertions and deletions that surrounded it. But this code was all being done in-house (by the developers at one cell-phone company that had acquired another one), and I never knew of anything to be released.