Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

The dreaded if-elsif-else construct (code)

by deprecated (Priest)
on Nov 17, 2001 at 01:05 UTC ( [id://125918]=perlquestion: print w/replies, xml ) Need Help??

deprecated has asked for the wisdom of the Perl Monks concerning the following question:

Scenario:

I have 36 directories. Inside those 36 directories are some ten thousand additional directories. Inside each of those sub directories are a set of files. These files are regularly named. They reflect information stored elsewhere on the disk. It is my task to take the data in the sub directories and reflect them in our database.

I've been doing this with a copy of a script that the original planner of this system had used to convert the data to html.

The reason I'm looking to change it is a) I didnt write it, and its difficult to work with and b) its a 300-line-long if/elsif/else "loop." In other words, its basically three opendirs, with the last one goes over every file, compares it to a regular expression, and then modifies a corresponding variable. that variable is then stored in the database.

an example:

/data/sprocket_logs/spacely.200 /data/sprocket_logs/cogswell.100 /data/cog_logs/cogswell.1000 /data/cog_logs/astro.1
we could interpret this to mean that spacely has 200 sprockets, and that cogswell has 100 sprockets and 1000 cogs, while astro has 1 cog. However, this is not the only kind of information stored in these directories. for example we might see:
/data/cog_logs/maimed_a_robot
indicating that at one point a cog was responsible for a robot maiming. now I know what all these files are going to look like, and have a regular expression for the particular parsing of each of them. however, this if/elsif/else construct is irritating the heck out of me. It is not the right way to do this.

However, when i started coming up with a solution that was more comprehensive, I came up with this:

my %dispatcher => ( \$quux => [ sub { return 0 unless $_[0] =~ ${$_[1]}; ${$_[3]} =~ s/${$_[1]}/${$_[2]}/; return $_[3]; }, qr{bar}, qq{foo}, ], ); $dispatcher{$key} -> [0] -> ($file, $dispatcher{$key} -> [0] -> [1], $dispatcher{$key} -> [0 +] -> [1]);
What this basically does is, (and note the syntax may be off, it's not yet gone past perl -c) is there is a hash with the variables i want to change. each variable has a corresponding listref value which contains the sub that actually changes its value, and the pattern it needs to match for the change to occur, and what to change it to. so, i think its actually even a little elegant. however, it takes up more space and is less maintainable than the "dreaded" if-elsif-else construct.

There has to be a better way than three hundred lines of if statements.

What have you all done when faced with this situation?

thanks,
brother dep.

--
Laziness, Impatience, Hubris, and Generosity.

Replies are listed 'Best First'.
Re: The dreaded if-elsif-else construct (code)
by dws (Chancellor) on Nov 17, 2001 at 01:33 UTC
    This type of data-drive setup can drive a less-skilled maintenance programmer bonkers. I'm moderately skilled at Perl, and it took me several minutes, and most of the fingers on one hand, to puzzle out what you were trying to do.

    Let me suggest an alternate approach. First get rid of the opendir calls by delegating to File::Find.

    use File::Find; find(\&eachfile, "/data/");
    and then structure the callback along the lines of
    sub eachfile { my $path = $File::Find::name; return if -d $path; $path =~ s|^.*/(.+)_logs/|| or return; # or log my $item = $1; if ( $path =~ m|(.+)\.(\d+)$| ) { my($company,$quantity) = ($1,$2); inventory($company, $item, $quantity); return; } # and so on for each remaining case # eventually logging any that you can't handle }
    That's the rough idea, at least -- it needs some cleanup and error handling.

    This would seem to me to be a very easy thing for whoever comes along after you to understand and maintain. The most they would need to understand is File::Find, and that's a low hurdle.

    Granted, this leaves you with the equivalent of a string if/elseif*/else, but it seems easy enough to follow.

      One thing that could extend this approach and possibly eliminate the large quantity of ifs is to templatize the conditional.

      Set it up so that the test and actions are defined in a config file and work via a generic test/action method.

      Granted the performance will probably drop, but the maintainability should increase.

      There also might be modules that support this sort of templating of a function, but i haven't seen them yet. (as i haven't had the need yet)

Re: The dreaded if-elsif-else construct (code)
by Masem (Monsignor) on Nov 17, 2001 at 01:24 UTC
    The Switch module is probably what could help here, though you are basically emulating a giant if-elsif-else statement with that.

    Alternatively, assuming that I understand the problem right, you can create an array of hashes as shown below:

    my @actions = ( { match => sub { $_[0] =~ /^(\w*)\.(\d*)$/; }, type => "inventory", action => &add_inventory; }, { match => sub { $_[0] =~ /^maimed(\w*)$/; }, type => "maimed", action => &handle_maimed; }, { match => sub { $_[0] }, type => "default", action => sub { print "I cannot handle ", $_[0]; } } ); # much later foreach my $file ( @filelist ) { foreach my $action ( @actions ) { if ( my @parts = &{ $action->{ match } }( $file ) ) { print "Am doing $action->{ type } on $file\n"; &{ $action->{ action } }( @parts ); last; } } }
    This avoids large if or switch blocks, and allows you to wrap code a bit better with a better explaination in the code to what is going on with each possible file type.

    -----------------------------------------------------
    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
    "I can see my house from here!"
    It's not what you know, but knowing how to find it if you don't know that's important

      Seemingly minor changes to your code can break code that uses Switch. I can't recommend anyone use it in production code.

              - tye (but my friends call me "Tye")
        Agreed. It was only ever intended to be a proof-of-concept for Perl 6, and isn't at all ruggedized against the exigencies of the production environment.

        However, it will be core in 5.8 and I've been working on toughening it up for that. You may find it is now far more reliable than it used to be.

        In addition, the next release will be layered over Filter::Simple, rather than Filter::Util::Call, and that will definitely make it more robust.

Re: The dreaded if-elsif-else construct (code)
by drinkd (Pilgrim) on Nov 17, 2001 at 17:50 UTC
    This is a very common problem in scientific data processing now, where everybody sells instruments that make thousands and millions data directories with id information in the directory name and files underneath called "data.dat" and "data.log". Go Figure.

    I think dws above is on the right track. I am sure things are a little more complicated than you alude to, so use File::Find to suck the whole directory structure into a hash, and then use Parse::RecDescent to step through with a simple grammar and do different stuff depending on what it finds. That way next year when the structure (or database) gets slightly more complicated (it always does!), it just takes a slight grammar change. Good luck.

    drinkd

Re: The dreaded if-elsif-else construct (code)
by traveler (Parson) on Nov 17, 2001 at 03:14 UTC
    I'm probably not completely clear on what you have to do, but this (not 100% perl) solution may give you a hint or two:
    use diagnostics; use strict; use Data::Dumper; my %sprockets; my %cogs; open FILES, "c:/cygwin/bin/find.exe data -type f -print|" or die "no +data"; while(<FILES>){ m[sprocket_logs/(\w+)\.(\d+)] and $sprockets{$1} = $2; m[cog_logs/(\w+)\.(\d+)] and $cogs{$1} = $2; } print Dumper(%sprockets, %cogs);
    The basic idea here is a two column table with the REs and tasks, separated by "and"s. You could (clearly) make it a 100% Pure Perl solution by writing your own find. It may not solve your problem directly, but it might give you at least a partial idea toward a solution.

    HTH, --traveler

Re: The dreaded if-elsif-else construct (code)
by toma (Vicar) on Nov 18, 2001 at 01:42 UTC
    I recommend converting your directory structure to an XML file. Then you can apply any one of a number of excellent XML perl modules to the problem.

    A hierarchical data structure is a natural for an XML document. For your case it could look like this:

    <data> <sprocket_logs> <spacely>200</spacely> </sprocket_logs> <cog_logs> <cogswell>100</cogswell> <astro>1</astro> <maimed_a_robot /> </cog_logs> </data>
    Producing this file should be relatively easy using File::Find to load up a data structure and one of the XML modules for dumping the data structure.

    Then you have many choices for eliminating your big switch statement. Here are a few ideas that came to mind; there are better ideas, also:

    • Access your XML as if it were a database using DBD::AnyData.

    • Create a multilevel hash from the XML with XML::Simple or one of the SAX modules. Make a hash of subs that perform each action that you would have in your giant switch statement. Dispatch these subs as you loop through the multilevel hash.
    • Use XSLT to replace the perl code that generates the existing HTML view of your directory structure.
    The XML approach will be easier to maintain because the appropriate XML tools are in wide usage and worth learning anyway.

    It should work perfectly the first time! - toma

Re: The dreaded if-elsif-else construct (code)
by atlantageek (Monk) on Nov 18, 2001 at 05:43 UTC
    So in your solution, don't you have to check each entry in the hash to test for pattern match. This seems kind of strange. Does it have to be a full regular expression to match. can you use the first four characters or something as a key into the hash. Just thinking out loud here. The first and second directory, do they have any effect on what variables are changed. can you create a tree of hashes (hashes of hashes) representing categories of changes so you can get rid of looping through all 100 or so patterns. ---- I always wanted to be somebody... I guess I should have been more specific.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://125918]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others surveying the Monastery: (4)
As of 2024-04-19 05:11 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found