koknat has asked for the wisdom of the Perl Monks concerning the following question:
PerlMonks,
I would like to clean up a large Perl program by removing any unused subroutines. Has anyone already automated a solution?
Thanks,
- Chris
Re: Cleaning up unused subroutines
by dragonchild (Archbishop) on Oct 26, 2007 at 02:23 UTC
|
- Get your test suite up to 95% coverage.
- Pick a subroutine you think isn't used anymore and rename it.
- If your test suite still passes, it isn't being used. If it fails, you'll see how.
My criteria for good software:
- Does it work?
- Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
| [reply] |
Re: Cleaning up unused subroutines
by toolic (Bishop) on Oct 26, 2007 at 03:15 UTC
|
| [reply] |
|
Thanks for posting that link, toolic.
I hadn't even considered anonymous subroutines, or objects, since I've just been calling simple, normal subroutines.
In my simple case, I think a script could be written to:
1) Find all subroutines by doing a grep for ^\s*sub\s+\S+
2) Find all subroutines called in the main program.
3) Recursively follow every subroutine used, and look for subroutines within subroutines.
4) At the end, you have a list of every subroutine, and a list of every subroutine used
The hardest part may be deciding what a simple subroutine call looks like:
&mysub
&mysub(args)
mysub
mysub(args)
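The four steps above can be sketched roughly as follows. This is only an illustration under loud assumptions: the source is laid out perltidy-style (each `sub name {` on its own line, the closing `}` alone at the same indentation), comments are stripped naively, and the helper names `split_subs`, `calls_in` and `unused_subs` are made up for this example. It errs on the side of treating a name as "called", and it cannot see calls made through eval, generated code or symbolic references.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split source text into named sub bodies plus the remaining "main" code.
# Assumes "sub name {" ends its line and the closing "}" sits alone at
# the same indentation; one-line subs are left in the main code.
sub split_subs {
    my ($source) = @_;
    my (%body, $current, $indent);
    my $main = '';
    for my $line (split /\n/, $source) {
        if (defined $current) {
            if ($line =~ /^\Q$indent\E\}\s*$/) { undef $current }
            else                               { $body{$current} .= "$line\n" }
        }
        elsif ($line =~ /^(\s*)sub\s+(\w+)\s*\{\s*$/) {
            ($indent, $current) = ($1, $2);
            $body{$current} = '';
        }
        else { $main .= "$line\n" }
    }
    return (\%body, $main);
}

# Return those @names that appear in $text as mysub, mysub(...), &mysub
# or &mysub(...). Names preceded by $, @, % or a word character are ignored.
sub calls_in {
    my ($text, @names) = @_;
    $text =~ s/#.*$//mg;    # naive comment stripping (breaks on "#" in strings)
    return grep { $text =~ /(?<![\$\@%\w])\Q$_\E\b/ } @names;
}

# Walk the call graph from the main code; whatever is never reached is unused.
sub unused_subs {
    my ($source) = @_;
    my ($body, $main) = split_subs($source);
    my @names = sort keys %$body;
    my %used;
    my @queue = calls_in($main, @names);
    while (defined(my $name = shift @queue)) {
        next if $used{$name}++;
        push @queue, calls_in($body->{$name}, @names);
    }
    return grep { !$used{$_} } @names;
}

my $demo = <<'END_SRC';
alpha();
sub alpha {
    &beta;
}
sub beta {
    return 1;
}
sub gamma {
    delta(2);
}
sub delta {
    return shift;
}
END_SRC

print "unused: $_\n" for unused_subs($demo);
# prints:
# unused: delta
# unused: gamma
```

Note that `delta` is reported even though it is called, because its only caller is itself dead code. That is what the recursive step buys you over a plain grep.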
| [reply] |
|
Sounds good, but what about the generated code + eval?
| [reply] |
Re: Cleaning up unused subroutines
by BrowserUk (Patriarch) on Oct 26, 2007 at 02:54 UTC
|
perl -ple"s[sub\s+(?=\w)][sub XXX_]g" yourscript.pl >junk.pl
Run junk.pl and correct each Undefined subroutine &..... called at ... by removing the XXX_ from the subroutine declarations. Once the script runs again, any sub definitions still carrying the XXX_ prefix are redundant and can safely be removed from yourscript.pl
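Once junk.pl runs cleanly, the leftovers can be listed mechanically. A minimal sketch, assuming the XXX_ prefix from the one-liner above; the helper name `leftover_subs` is made up for this example:

```perl
use strict;
use warnings;

# Return the original names of subs that still carry the XXX_ prefix.
sub leftover_subs {
    my ($source) = @_;
    my @names = $source =~ /^\s*sub\s+XXX_(\w+)/mg;
    return @names;
}

my $junk = <<'END_SRC';
sub XXX_old_report { return 'never called' }
sub main_loop { helper() }
sub helper { return 1 }
END_SRC

print "candidate for removal: $_\n" for leftover_subs($junk);
# prints: candidate for removal: old_report
```

As with the technique itself, this only tells you which subs were not reached by the runs you actually made.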
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] [select] |
|
| [reply] |
|
Hm. I'm not for one moment going to suggest that you are wrong. If the code uses prototypes on one or more of the subs, then the regex would need modification. Likewise, as I coded it, it would ignore anonymous subs.
But, for the sake of completeness, could you describe what other situations might not be covered?
| [reply] |
|
|
|
Re: Cleaning up unused subroutines
by tuxz0r (Pilgrim) on Oct 26, 2007 at 01:58 UTC
|
Judicious use of grep is about as good as it gets: one to grab your subroutine names, another to find occurrences of the "calls" to those subs (or the lack of calls).
As the previous comment noted, you'll have to suit the solution to your particular code. Also, removing calls to subroutines can sometimes create artifacts, especially if those subroutines returned values or took parameters, so be careful.
---
echo S 1 [ Y V U | perl -ane 'print reverse map { $_ = chr(ord($_)-1) } @F;'
| [reply] |
Re: Cleaning up unused subroutines
by roboticus (Chancellor) on Oct 26, 2007 at 03:57 UTC
|
My favorite technique is to write a quick script to find all the subroutines and insert a print "\n..subname\n"; statement as the first line. (The double dot at the start of the line makes it easy to grep for, as I don't start any other messages with it.) Then I go and do other things with the code, like fixing a few bugs, or tuning a feature. And each time I run the code for a test, I remove the print statements that I've seen. Not necessarily all of them, because that can be tedious, just the ones that "annoy" me the most. (Usually the most frequent ones!)
Then after a month or so, I rename all the functions that still have the print in them. (I normally prefix them with a 'z' so they sort to the end of the list. Easy to find 'em if I want 'em, and easy to ignore them when I don't want to see 'em.)
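A rough sketch of the "quick script" part, for illustration only. It assumes every sub starts with `sub name {` at the end of a line; the helper names `add_trace_prints` and `still_traced` are invented here, not taken from the post.

```perl
use strict;
use warnings;

# Insert print "\n..subname\n"; as the first statement of every named sub
# whose opening brace ends the "sub name {" line.
sub add_trace_prints {
    my ($source) = @_;
    my @out;
    for my $line (split /\n/, $source, -1) {
        push @out, $line;
        if ($line =~ /^(\s*)sub\s+(\w+)\s*\{\s*$/) {
            push @out, qq{$1    print "\\n..$2\\n";};
        }
    }
    return join "\n", @out;
}

# Later on: list the subs that still carry their trace print, i.e. the
# ones nobody has seen fire and removed by hand.
sub still_traced {
    my ($source) = @_;
    my @names = $source =~ /^\s*sub\s+(\w+)\s*\{\s*\n\s*print "\\n\.\.\g1\\n";/mg;
    return @names;
}

my $code = "sub used {\n    return 1;\n}\n";
print add_trace_prints($code);
```

The names returned by `still_traced` after a month or so are the candidates for the 'z' prefix.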
...roboticus | [reply] [d/l] |
Re: Cleaning up unused subroutines
by aquarium (Curate) on Oct 26, 2007 at 01:41 UTC
|
| [reply] |
Re: Cleaning up unused subroutines
by dmorgo (Pilgrim) on Oct 27, 2007 at 09:29 UTC
|
So glad you asked. I was just thinking about this the other day, and wrote this script called... wait for it...
"subvirgin"
subvirgin finds virgin subroutines. Well, more accurately, it lets you find virgin subroutines by a process of elimination, because it logs each subroutine as it's used. You still have to do the work of figuring out which subroutines were not used. OK, so there's some room for improvement there, I admit, but still, it's pretty nifty if I do say so myself. One more caveat: it won't do a great job unless you can, through your own ingenuity and luck, get the code you are examining to run through most of its possible execution paths.
All it does is insert some code at the top of each subroutine, using an appropriately-named subroutine, "insert_instrument()" ... cough. This code calls a small SVGN.pm module that logs the name of each subroutine as it runs.
In case it's not obvious, the main value of this tool would be in situations where you are faced with a large, unfamiliar Perl code base and have been told that some of the code is old, orphaned code that is no longer used, and you want to determine which code is still used and which is not (though for the negatives, it won't be with 100% reliability, unfortunately).
You can add your own code to the SVGN module shown at the bottom, so it could do fancier stuff, like, for example, printing the subroutine's caller and arguments. But for now I'll leave that as an exercise for the reader.
Here's the code. Oh, and it doesn't do anonymous or nested subroutines at the moment... sorry about that. Should be easy to fix.
**** WARNING: This has not been extensively tested, and more to the point, IT WILL DO A DESTRUCTIVE UPDATE OF ALL PERL FILES IT FINDS IN THE CURRENT DIRECTORY so use it carefully and work on a copy of your code. ***
#!/usr/bin/perl
#
# subvirgin v0.42
#
use strict;
use warnings;

# TODO: matches, but does not capture, prototypes and attributes
# determine whether this matters, and, if so, add it
die "\n\nWARNING: this program has not been extensively tested, and it "
  . "will do a destructive update of any .pl, .pm, and .cgi files it "
  . "finds in the current directory. If you are sure you want to run "
  . "this program, work in a special directory containing an extra copy "
  . "of the perl files you want to instrument, and remove this line "
  . "before running the program.\n\n";

opendir(DIR, ".") or die "error: $!";
my @files = grep(!/^(subvirgin\.pl|SVGN\.pm)$/,
                 grep(/\.(?:pl|cgi|pm)$/, readdir(DIR)));
closedir(DIR);

foreach my $file (@files) {
    insert_instrument($file);
}

sub insert_instrument {
    my $file = shift;
    my ($mode,$atime,$mtime) = (stat($file))[2,8,9];
    open(IN, $file) or die "error: $!";
    open(OUT, ">$file.tmp") or die "error: $!";
    my $sub_name = '';
    my $at_start = 1;
    my $package = '';
    while (my $line=<IN>) {
        chomp($line);
        next if ($line =~ /^use SVGN;$/);
        if ($line =~ m{^package\s+(.*);\s*$}) {
            $package = $1 . '::';
            print OUT "$line\n";
        }
        elsif ($line =~ m{^\s*1;\s*$}) {
            $package = '';
            print OUT "$line\n";
        }
        elsif ($line =~ /^\s*(#.*)?$/) {
            print OUT "$line\n";
        }
        elsif ($at_start) {
            print OUT "use SVGN;\n";
            print OUT "$line\n";
            $at_start = 0;
        }
        elsif ($sub_name ne '') {
            next if ($line =~ /^\s*SVGN::doit/);
            my $indent = "";
            if ($line =~ /^(\s*)/) {
                $indent = $1;
            }
            print OUT "${indent}SVGN::doit(\"$package$sub_name\");\n";
            print OUT "$line\n";
            $sub_name = '';
        }
        elsif ($line =~
               m{^
                 (                     # everything up to open curly
                   (\s*)               # optional leading space
                   sub\s+              # sub declaration
                   (\S+)\s*            # name of subroutine
                   (?:\([^)]*?\)\s*)?  # optional prototype
                   (?:\s*:\s*\S+\s*)?  # optional attributes
                   \{                  # open subroutine block
                 )
                 \s*                   # nuke any space before capture
                 (\S.*)?               # catch one-line subroutines
               $}x
              ) {
            $sub_name = $3;
            if (defined($4) && $4 ne '') { # for one-line subroutines
                my $body = $4;
                $line = "$1 SVGN::doit(\"$package$sub_name\"); $body";
                print OUT "$line\n";
                $sub_name = '';
            }
            else {
                print OUT "$line\n";
            }
        }
        # drop stale instrumentation lines; the trailing ";" is now allowed,
        # since the generated lines end in ");"
        elsif ($line !~ m{^(\s*)SVGN::doit\(\S+\);?\s*$}) {
            print OUT "$line\n";
        }
    }
    close(IN);
    close(OUT);
    # Keep the original as "$file.old". The earlier version unlinked $file
    # first, so this rename failed and no backup was left behind.
    rename($file, "$file.old") or die "error: $!";
    rename("$file.tmp", $file) or die "error: $!";
    utime $atime, $mtime, $file;
    chmod $mode & 07777, $file;
}
And here's the skeleton for SVGN.pm
package SVGN;
use strict;

sub doit {
    my $name = shift;
    open(LOG, ">>log.txt") or die "error: $!";
    print LOG "$name\n";
    close(LOG);
}

1;
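One possible take on the "exercise for the reader", sketched rather than tested against a real code base: this variant of the skeleton also logs where the instrumented subroutine was called from, using Perl's built-in `caller`. The settable `$SVGN::LOG` variable is an addition for convenience, not part of the original module, and arguments are not logged because the generated `SVGN::doit("name")` call does not pass them along.

```perl
package SVGN;
use strict;
use warnings;

our $LOG = 'log.txt';    # assumption: made settable; same default as the skeleton

sub doit {
    my $name = shift;
    # caller(0) would describe the call to doit() itself. caller(1)
    # describes the call to the instrumented sub: the file and line
    # where that sub was invoked.
    my (undef, $file, $line) = caller(1);
    my $where = defined $file ? "$file line $line" : 'unknown';
    open(my $log, '>>', $LOG) or die "error: $!";
    print {$log} "$name called from $where\n";
    close($log);
}

1;
```

Each log line then looks like "Pkg::name called from script.pl line 42", which makes it easier to see which code path exercised a sub.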
| [reply] [d/l] [select] |
Re: Cleaning up unused subroutines
by burningdog (Scribe) on Oct 27, 2007 at 14:20 UTC
|
Just personal preference here, but instead of using grep, regexes, etc., check out PPI ("PPI - Parse, Analyze and Manipulate Perl (without perl)"). I have used it before to get an inventory of some code that I wasn't familiar with, and it works well. Plus, if you decide later that you want more information, such as variables and whatnot, it's pretty easy to filter for the new items.
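A minimal sketch of what a PPI-based inventory might look like, assuming PPI is installed from CPAN. The helper name `unreferenced_subs` is invented for this example. It counts mentions rather than following calls, so a sub that is only called from dead code, or that calls itself, still counts as referenced, and package-qualified calls such as `Foo::bar()` are not matched.

```perl
use strict;
use warnings;
use PPI;

# Return the names of subs defined in $source that are mentioned nowhere
# except in their own definition. Comments, POD and string contents are
# separate token types in PPI, so they do not count as mentions.
sub unreferenced_subs {
    my ($source) = @_;
    my $doc = PPI::Document->new(\$source)
        or die "could not parse source";

    # find() returns an array ref, or a false value when nothing matches
    my $subs  = $doc->find('PPI::Statement::Sub') || [];
    my $words = $doc->find('PPI::Token::Word')    || [];
    my $syms  = $doc->find('PPI::Token::Symbol')  || [];

    my %mentions;
    $mentions{ $_->content }++ for @$words;      # mysub, mysub(...), ->mysub
    for my $sym (@$syms) {                       # &mysub, &mysub(...)
        (my $bare = $sym->content) =~ s/^&// or next;
        $mentions{$bare}++;
    }

    # BEGIN/END style blocks are also PPI::Statement::Sub objects; skip them
    my @names = map  { $_->name }
                grep { !$_->isa('PPI::Statement::Scheduled') } @$subs;

    # the definition itself contributes one mention
    return grep { ($mentions{$_} || 0) <= 1 } sort @names;
}

my $demo = <<'END_SRC';
alpha();
sub alpha { return &beta(); }
sub beta  { return 1; }
sub gamma { return 2; }
# gamma() is only mentioned in this comment
END_SRC

print "unreferenced: $_\n" for unreferenced_subs($demo);
```

Compared with grep, the main gain is that the mention of `gamma()` in the comment is not mistaken for a call.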
| [reply] |
Re: Cleaning up unused subroutines
by neszt76 (Novice) on Sep 19, 2018 at 08:02 UTC
|
One-liner that counts occurrences of sub names in a file:
FILE=foo.pl ; for i in `cat $FILE |grep -E "^sub " | sed 's/^sub.\([A-Za-z0-9_]*\).*$/\1/'` ; do c=`grep -c $i $FILE` ; echo $c $i ; done | sort -rn
| [reply] [d/l] |
|
perl -nE '$f.=$_}{$t{$_}=()=$f=~/\Q$_/g for $f=~/^sub\s+(\w+)/gm;say "$t{$_} $_" for sort {$t{$b} <=> $t{$a}} keys %t;' $FILE
This uses precisely* the same logic as your shell loop above (and therefore has the same algorithmic flaws). Nonetheless, advantages of the Perl approach include:
- No multiple forks to grep (one for each sub found)
- Works on multiple input files at once just by appending more to the argument list
- Works on systems without sh, sed, grep, et al.
- Looks like line noise so your cow-orkers will be amazed you can read it let alone write it
- No UUOCA ;-)
Enjoy.
*Not precisely: there's one trivial fix to avoid lines which start with e.g. "submarine". | [reply] [d/l] |
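For anyone who would rather not read line noise, here is the same logic unrolled into a script. It is a sketch with made-up helper names (`count_name_occurrences`, `report`), and it keeps the same algorithmic flaw on purpose: names are counted as plain substrings, so a definition, a comment or a longer identifier all count. The only deliberate difference is a tie-break on the name so the output order is deterministic.

```perl
use strict;
use warnings;

# Count how often each "sub NAME" name occurs anywhere in the text,
# including the definition itself, comments and longer identifiers.
sub count_name_occurrences {
    my ($text) = @_;
    my %count;
    for my $name ($text =~ /^sub\s+(\w+)/gm) {
        $count{$name} = () = $text =~ /\Q$name/g;
    }
    return %count;
}

# Format as "count name", highest count first (ties broken by name).
sub report {
    my (%count) = @_;
    return map { "$count{$_} $_" }
           sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;
}

my $text = "sub foo { bar() }\nsub bar { 1 }\nsub foobar { 2 }\n";
print "$_\n" for report(count_name_occurrences($text));
# prints:
# 3 bar
# 2 foo
# 1 foobar
```

A count of 1 means the name appears only in its own definition. Note that `foo` scores 2 here although it is never called, purely because "foobar" contains it.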
Re: Cleaning up unused subroutines
by kelpless (Initiate) on Mar 11, 2011 at 00:04 UTC
|
#!/bin/bash
cloc --strip-comments=nc "$1" > /dev/null 2>&1
if [ -e "$1.nc" ]
then
    subs=`grep -E "^sub " "$1.nc" | cut -d' ' -f2 | sort -u`
    res=()
    for sub in ${subs}
    do
        grep "${sub}" "$1.nc" | grep -vq "sub ${sub}"
        if [ "$?" == "1" ]
        then
            res=("${res[@]}" "${sub}")
        fi
    done
    if [ ${#res[@]} == 0 ]
    then
        echo "All subroutines in $1 appear to be used"
    else
        echo "$1 apparently unused:"
        for sub in "${res[@]}"
        do
            echo "    $sub"
        done
    fi
    rm "$1.nc"
else
    echo "failed to strip comments from $1"
fi
These are only 'apparently' unused subroutines because, as others have pointed out, subs can be called in ways that this simple grep will miss. The cloc program cleans up a lot of false negatives by removing comments that may include subroutine names. Also, this assumes your 'sub' lines start in the first column, so run perltidy first.
| [reply] [d/l] |
|
|