PerlMonks  

Cleaning up unused subroutines

by koknat (Sexton)
on Oct 25, 2007 at 23:37 UTC

koknat has asked for the wisdom of the Perl Monks concerning the following question:

PerlMonks,

I would like to clean up a large Perl program by removing any unused subroutines. Has anyone already automated a solution?

Thanks,

- Chris

Replies are listed 'Best First'.
Re: Cleaning up unused subroutines
by dragonchild (Archbishop) on Oct 26, 2007 at 02:23 UTC
    1. Get your test suite up to 95% coverage.
    2. Pick a subroutine you think isn't used anymore and rename it.
    3. If your test suite still passes, it isn't being used. If it fails, you'll see how.

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: Cleaning up unused subroutines
by toolic (Bishop) on Oct 26, 2007 at 03:15 UTC
      Thanks for posting that link, toolic.
      I hadn't even considered anonymous subroutines, or objects, since I've just been calling simple, normal subroutines.

      In my simple case, I think a script could be written to:
      1) Find all subroutines by doing a grep for ^\s*sub\s+\S+
      2) Find all subroutines called in the main program.
      3) Recursively follow every subroutine used, and look for subroutines within subroutines.
      4) At the end, you have a list of every subroutine, and a list of every subroutine used

      The hardest part may be deciding what a simple subroutine call looks like:
      &mysub
      &mysub(args)
      mysub
      mysub(args)
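
The four steps above can be sketched roughly as follows. This is a regex-only approximation, not a parser: it assumes perltidy-style layout (each "sub name {" starts in column 1 and its closing "}" sits alone in column 1), and the names unreached_subs and $demo are made up for the illustration. It counts any bare mention of a name as a call, so comments and strings can make a sub look used, and it sees nothing that is built at runtime.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the names of subs defined in $src that are never reached from the
# main program through plain calls (&name, &name(args), name, name(args)).
# Assumption: "sub name {" starts in column 1 and the closing "}" is alone
# in column 1, so one-line subs are not handled.
sub unreached_subs {
    my ($src) = @_;

    # Step 1: collect every named sub together with its body.
    my %body;
    while ($src =~ /^sub\s+(\w+)[^\n]*\{(.*?)^\}/msg) {
        $body{$1} = $2;
    }

    # Step 2: the main program is what remains once the subs are cut out.
    (my $main = $src) =~ s/^sub\s+\w+[^\n]*\{.*?^\}//msg;

    # Steps 3 and 4: follow calls outward, starting from the main program.
    my %used;
    my @todo = ($main);
    while (defined(my $code = shift @todo)) {
        for my $name (keys %body) {
            next if $used{$name};
            # Skip $name, @name and %name: those are variables, not calls.
            next unless $code =~ /(?<![\$\@%])\b\Q$name\E\b/;
            $used{$name} = 1;
            push @todo, $body{$name};
        }
    }

    my @unused = grep { !$used{$_} } keys %body;
    return sort @unused;
}

my $demo = <<'END';
setup();
sub setup {
    helper(1);
}
sub helper {
    return shift;
}
sub orphan {
    lonely();
}
sub lonely {
    return 1;
}
END

print "unreached: $_\n" for unreached_subs($demo);
```

In the demo, lonely is called, but only from orphan, which nothing calls. A plain count of occurrences would treat lonely as used; following calls from the main program reports both.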
        Sounds good, but what about the generated code + eval?
Re: Cleaning up unused subroutines
by BrowserUk (Patriarch) on Oct 26, 2007 at 02:54 UTC

    perl -ple"s[sub\s+(?=\w)][sub XXX_]g" yourscript.pl >junk.pl

    Run junk.pl and correct each Undefined subroutine &..... called at ... by removing the XXX_ from the subroutine declarations. Once the script runs again, any sub definitions still carrying the XXX_ prefix are redundant and can safely be removed from yourscript.pl
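
Once junk.pl runs cleanly, the leftovers can be listed rather than eyeballed. A minimal sketch, assuming the XXX_ prefix from the one-liner above; still_prefixed and the sample text are invented for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the original names of subs that still carry the marker prefix.
sub still_prefixed {
    my ($src, $prefix) = @_;
    my @names = $src =~ /\bsub\s+\Q$prefix\E(\w+)/g;
    return @names;
}

# Stand-in for the contents of junk.pl after the script runs without errors.
my $junk = <<'END';
sub used_one { return 1 }
sub XXX_never_called { return 2 }
sub XXX_also_idle { return 3 }
used_one();
END

print "candidate for removal: $_\n" for still_prefixed($junk, 'XXX_');
```

These are candidates only: a sub reached solely on a path the run did not exercise will still carry the prefix.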


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      BrowserUK -- as noted earlier, this doesn't produce 100% coverage in non-trivial code. Certainly not for code that gets released to a customer for a production system anyway. Good thinking outside the square.
      the hardest line to type correctly is: stty erase ^H

        Hm. I'm not for one moment going to suggest that you are wrong. If the code uses prototypes on one or more of the subs, then the regex would need modification. Likewise, as I coded it, it would ignore anonymous subs.

        But, for the sake of completeness, could you describe what other situations might not be covered?


Re: Cleaning up unused subroutines
by tuxz0r (Pilgrim) on Oct 26, 2007 at 01:58 UTC
    Judicious use of grep is about as good as it gets. One to grab your subroutine names, another to find occurrences of the "calls" to those subs (or lack of calls).

    As the previous comment noted, you'll have to suit the solution to your particular code. And removing calls to subroutines can sometimes create artifacts, especially if those subroutines returned values or took parameters, so be careful.

    ---
    echo S 1 [ Y V U | perl -ane 'print reverse map { $_ = chr(ord($_)-1) } @F;'

Re: Cleaning up unused subroutines
by roboticus (Chancellor) on Oct 26, 2007 at 03:57 UTC
    My favorite technique is to write a quick script to find all the subroutines and insert a print "\n..subname\n"; statement as the first line. (The double dot at the start of the line makes it easy to grep for, as I don't start any other messages with it.) Then I go and do other things with the code, like fixing a few bugs, or tuning a feature. And each time I run the code for a test, I remove the print statements that I've seen. Not necessarily all of them, because that can be tedious, just the ones that "annoy" me the most. (Usually the most frequent ones!)

    Then after a month or so, I rename all the functions that still have the print in them. (I normally prefix them with a 'z' so they sort to the end of the list. Easy to find 'em if I want 'em, and easy to ignore them when I don't want to see 'em.)

    ...roboticus

Re: Cleaning up unused subroutines
by aquarium (Curate) on Oct 26, 2007 at 01:41 UTC
    I suggest you devise a solution that suits your particular code, as in general there's no way of producing a call hierarchy for Perl programs by script inspection alone, and even particular runs may not call certain functions, depending on program input.
    use strict and similar may help somewhat, but no guarantees.
    As a (very) rough guide, count the occurrences that grep finds for the function name in the code.
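
A few examples of calls that script inspection alone will not connect to their targets. This is only an illustration; every name in it (report_csv, report_html, run_report and so on) is hypothetical:

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub report_csv  { return "csv" }
sub report_html { return "html" }

# 1. Name assembled at runtime: no literal call to report_csv or
#    report_html appears anywhere in this sub.
sub run_report {
    my ($format) = @_;
    my $code = __PACKAGE__->can("report_$format")
        or die "no report for $format";
    return $code->();
}

# 2. Dispatch table: the sub is referenced once, then invoked only
#    through a hash lookup.
my %dispatch = (html => \&report_html);
sub run_dispatch {
    my ($key) = @_;
    return $dispatch{$key}->();
}

# 3. Generated code + eval: the call exists only inside a string that is
#    put together while the program runs.
sub run_generated {
    my ($format) = @_;
    my $result = eval "report_$format()";
    die $@ if $@;
    return $result;
}

print run_report('csv'), "\n";
print run_dispatch('html'), "\n";
print run_generated('csv'), "\n";
```

A grep for "report_csv" finds only its definition here, yet it is called twice, and which subs run at all depends on the input.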
    the hardest line to type correctly is: stty erase ^H
Re: Cleaning up unused subroutines
by dmorgo (Pilgrim) on Oct 27, 2007 at 09:29 UTC
    So glad you asked. I was just thinking about this the other day, and wrote this script called... wait for it...

    "subvirgin"

    subvirgin finds virgin subroutines. Well, more accurately it lets you find virgin subroutines by a process of elimination, because it logs each subroutine as it's used. You still have to do the work of figuring out which subroutines were not used. OK, so there's some room for improvement there, I admit, but still, it's pretty nifty if I do say so myself. One more caveat: it won't do a great job unless you can, through your own ingenuity and luck, get the code you are examining to run through most of its possible execution paths.

    All it does is insert some code at the top of each subroutine, using an appropriately-named subroutine, "insert_instrument()" ... cough. This code calls a small SVGN.pm module that logs the name of each subroutine as it runs.

    In case it's not obvious, the main value of this tool would be in situations where you are faced with a large, unfamiliar Perl code base and have been told that some of the code is old, orphaned code that is no longer used, and you want to determine which code is still used and which is not (though for the negatives, it won't be with 100% reliability, unfortunately).

    You can add your own code to the SVGN module shown at the bottom, so it could do fancier stuff, like, for example, printing the subroutine's caller and arguments. But for now I'll leave that as an exercise for the reader.
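
One possible shape for that exercise, as a sketch only. It assumes the instrumentation line is changed to pass @_ along, i.e. SVGN::doit("name", @_), which the posted script does not do; the $LOG variable and the greet sub are likewise invented for the illustration.

```perl
package SVGN;
use strict;
use warnings;

our $LOG = 'log.txt';    # same log file name as the skeleton module

# Log the sub's name, where it was called from, and its arguments
# (arguments appear only if the instrumentation passes @_ along).
sub doit {
    my ($name, @args) = @_;

    # caller(1) describes the call into the instrumented sub, that is,
    # the sub that called doit. It is empty if doit is called directly
    # from the top level of a script.
    my (undef, $file, $line) = caller(1);
    my $from = defined $file ? "$file:$line" : 'top level';

    my $entry = "$name called from $from";
    if (@args) {
        my @shown = map { defined $_ ? $_ : 'undef' } @args;
        $entry .= ' with (' . join(', ', @shown) . ')';
    }

    open(my $log, '>>', $LOG) or die "error: $!";
    print {$log} "$entry\n";
    close($log);
    return $entry;
}

package main;

# Hypothetical instrumented sub, showing what the inserted line would look like.
sub greet {
    SVGN::doit('main::greet', @_);
    return "hello $_[0]";
}
```

Calling greet('world') from line 12 of a script named app.pl would append a line of the form "main::greet called from app.pl:12 with (world)" to the log.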

    Here's the code. Oh, and it doesn't do anonymous or nested subroutines at the moment... sorry about that. Should be easy to fix.

    **** WARNING: This has not been extensively tested, and more to the point, IT WILL DO A DESTRUCTIVE UPDATE OF ALL PERL FILES IT FINDS IN THE CURRENT DIRECTORY so use it carefully and work on a copy of your code. ***

    #!/usr/bin/perl
    #
    # subvirgin v0.42
    #
    use strict;
    use warnings;

    # TODO: matches, but does not capture, prototypes and attributes
    # determine whether this matters, and, if so, add it

    die "\n\nWARNING: this program has not been extensively tested, and it "
      . "will do a destructive update of any .pl, .pm, and .cgi files it "
      . "finds in the current directory. If you are sure you want to run "
      . "this program, work in a special directory containing an extra copy "
      . "of the perl files you want to instrument, and remove this line "
      . "before running the program.\n\n";

    opendir(DIR, ".") or die "error: $!";
    my @files = grep(!/^(subvirgin\.pl|SVGN\.pm)$/,
                     grep(/\.(?:pl|cgi|pm)$/, readdir(DIR)));
    closedir(DIR);

    foreach my $file (@files) {
        insert_instrument($file);
    }

    sub insert_instrument {
        my $file = shift;
        my ($mode,$atime,$mtime) = (stat($file))[2,8,9];
        open(IN, $file) or die "error: $!";
        open(OUT, ">$file.tmp") or die "error: $!";
        my $sub_name = '';
        my $at_start = 1;
        my $package = '';
        while (my $line=<IN>) {
            chomp($line);
            next if ($line =~ /^use SVGN;$/);
            if ($line =~ m{^package\s+(.*);\s*$}) {
                $package = $1 . '::';
                print OUT "$line\n";
            }
            elsif ($line =~ m{^\s*1;\s*$}) {
                $package = '';
                print OUT "$line\n";
            }
            elsif ($line =~ /^\s*(#.*)?$/) {
                # blank lines and comments pass through unchanged
                print OUT "$line\n";
            }
            elsif ($at_start) {
                # first real line of code: load the logger just before it
                print OUT "use SVGN;\n";
                print OUT "$line\n";
                $at_start = 0;
            }
            elsif ($sub_name ne '') {
                # first line of a sub body: put the logging call ahead of it
                next if ($line =~ /^\s*SVGN::doit/);
                my $indent = "";
                if ($line =~ /^(\s*)/) {
                    $indent = $1;
                }
                print OUT "${indent}SVGN::doit(\"$package$sub_name\");\n";
                print OUT "$line\n";
                $sub_name = '';
            }
            elsif ($line =~ m{^
                    (                     # everything up to open curly
                        (\s*)             # optional leading space
                        sub\s+            # sub declaration
                        (\S+)\s*          # name of subroutine
                        (?:\([^)]*?\)\s*)?   # optional prototype
                        (?:\s*:\s*\S+\s*)?   # optional attributes
                        \{                # open subroutine block
                    )
                    \s*                   # nuke any space before capture
                    (\S.*)?               # catch one-line subroutines
                    $}x ) {
                $sub_name = $3;
                if (defined($4) && $4 ne '') {
                    # for one-line subroutines
                    my $body = $4;
                    $line = "$1 SVGN::doit(\"$package$sub_name\"); $body";
                    print OUT "$line\n";
                    $sub_name = '';
                }
                else {
                    print OUT "$line\n";
                }
            }
            elsif ($line !~ m{^(\s*)SVGN::doit\(\S+\)$}) {
                print OUT "$line\n";
            }
        }
        close(IN);
        close(OUT);
        # keep the original as $file.old, then move the instrumented copy into place
        rename($file, "$file.old");
        rename("$file.tmp", $file);
        utime $atime, $mtime, $file;
        chmod $mode & 07777, $file;
    }

    And here's the skeleton for SVGN.pm
    package SVGN;
    use strict;

    sub doit {
        my $name = shift;
        open(LOG, ">>log.txt") or die "error: $!";
        print LOG "$name\n";
        close(LOG);
    }

    1;
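
The remaining manual step, working out which subs never showed up in log.txt, could be sketched like this. It rebuilds names the same way insert_instrument does (most recent "package" line plus the sub name, reset by a "1;" line), so the same limitations apply. never_logged and the sample data are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Given the logged names (array ref) and the text of one or more source
# files, return the subs that never appear in the log.
sub never_logged {
    my ($logged, @sources) = @_;
    my %seen = map { $_ => 1 } @$logged;
    my @virgin;
    for my $src (@sources) {
        my $package = '';
        for my $line (split /\n/, $src) {
            if ($line =~ /^package\s+(.*);\s*$/) {
                $package = $1 . '::';
            }
            elsif ($line =~ /^\s*1;\s*$/) {
                $package = '';
            }
            elsif ($line =~ /^\s*sub\s+(\w+)/) {
                my $full = $package . $1;
                push @virgin, $full unless $seen{$full};
            }
        }
    }
    return @virgin;
}

# Stand-ins for a module's source and for the chomped lines of log.txt.
my $module = <<'END';
package Shop::Cart;
sub add_item {
    return 1;
}
sub legacy_discount {
    return 0;
}
1;
END
my @log_lines = ('Shop::Cart::add_item', 'Shop::Cart::add_item');

print "never logged: $_\n" for never_logged(\@log_lines, $module);
```

As with the log itself, a name on this list means only that the sub did not run during the instrumented sessions.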
Re: Cleaning up unused subroutines
by burningdog (Scribe) on Oct 27, 2007 at 14:20 UTC
    Just personal preference here, but instead of using grep, regexes, etc., check out PPI: "PPI - Parse, Analyze and Manipulate Perl (without perl)". I have used it before to get an inventory of some code that I wasn't familiar with, and it works well. Plus, if you decide later that you want more information, such as variables and whatnot, it's pretty easy to filter out new items.
Re: Cleaning up unused subroutines
by neszt76 (Novice) on Sep 19, 2018 at 08:02 UTC
    One-liner that counts occurrences of sub names in a file:
    FILE=foo.pl ; for i in `cat $FILE |grep -E "^sub " | sed 's/^sub.\([A-Za-z0-9_]*\).*$/\1/'` ; do c=`grep -c $i $FILE` ; echo $c $i ; done | sort -rn

      Hello neszt76 and welcome to PerlMonks. We use Perl here.

      perl -nE '$f.=$_}{$t{$_}=()=$f=~/\Q$_/g for $f=~/^sub\s+(\w+)/gm;say "$t{$_} $_" for sort {$t{$b} <=> $t{$a}} keys %t;' $FILE

      This uses precisely* the same logic as your shell loop above (and therefore has the same algorithmic flaws). Nonetheless, advantages of the Perl approach include:

      • No multiple forks to grep (one for each sub found)
      • Works on multiple input files at once just by appending more to the argument list
      • Works on systems without sh, sed, grep, et al.
      • Looks like line noise so your cow-orkers will be amazed you can read it let alone write it
      • No UUOCA ;-)

      Enjoy.

      *Not precisely: there's one trivial fix to avoid lines which start with eg. "submarine".
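
For anyone who would rather read it than admire it, here is a spelled-out sketch of the same counting approach. It differs from the one-liner in one respect: names are matched on word boundaries, so "foo" is not counted inside "foobar". That is an addition for the example, not something the one-liner does. The name name_counts and the sample source are invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count how often each named sub's name appears in $src, definition
# included, most frequent first. A count of 1 means that only the
# definition itself was found.
sub name_counts {
    my ($src) = @_;
    my %count;
    for my $name ($src =~ /^sub\s+(\w+)/gm) {
        # "= () =" forces list context, so we get the number of matches
        $count{$name} = () = $src =~ /\b\Q$name\E\b/g;
    }
    my @names = sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;
    return map { join(' ', $count{$_}, $_) } @names;
}

my $sample = <<'END';
sub foo { foobar() }
sub foobar { 1 }
sub bar { foo(); foo() }
END

print "$_\n" for name_counts($sample);
```

It still counts mentions in comments and strings, and it still cannot see calls assembled at runtime, so a low count is a hint rather than proof.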

Re: Cleaning up unused subroutines
by kelpless (Initiate) on Mar 11, 2011 at 00:04 UTC

    I use cloc (cloc.sf.net) in a bash script I call 'apparently_unused'

    #!/bin/bash
    cloc --strip-comments=nc $1 > /dev/null 2>&1
    if [ -e "$1.nc" ]
    then
        subs=`grep -E "^sub " $1.nc | cut -d' ' -f2 | sort -u`
        res=();
        for sub in ${subs}
        do
            grep ${sub} $1.nc | grep -vq "sub ${sub}"
            if [ "$?" == "1" ]
            then
                res=("${res[@]}" "${sub}")
            fi
        done
        if [ ${#res[@]} == 0 ]
        then
            echo All subroutines in $1 appear to be used
        else
            echo $1 apparently unused:
            for sub in ${res[@]}
            do
                echo " $sub"
            done
        fi
        rm $1.nc
    else
        echo "failed to strip comments from $1"
    fi

    The subroutines are only 'apparently' unused because, as others have pointed out, calls can be made in ways that this simple grep will miss. The cloc program cleans up a lot of false negatives by removing comments that may include subroutine names. Also, this assumes you have 'sub' lines that start at the first column - use perltidy.

Node Type: perlquestion [id://647295]
Approved by GrandFather
Front-paged by Old_Gray_Bear