Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid

Learning Exercise

by FoncÚ (Scribe)
on Jul 18, 2002 at 12:28 UTC ( [id://182774] : perlquestion . print w/replies, xml ) Need Help??

FoncÚ has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing an address book (simple; just names and email addresses) both for my amusement and to learn more Perl. I've fought with this for a few hours and decided to seek professional help: I'm able to enter new entries, as the code will show, but I can't figure out how to remove entries from the outside file, called addybook.adb. What's the simplest, but reasonably effective way to do this? The code is as follows:

#!/usr/bin/perl -w use strict; # Declare yon variables my $ent; my $line; my @entry; my @lname; my @fname; my @email; my $addfname; my $addlname; my $addemail; my $addstring; my @addtemp; my $t_entry; my $option; my $x; $x = 1; while ($x == 1) { system "cls"; print "\nAddress Book\n------------\n"; print "Options:\nL for list\nA for add entry\nR for remove entry\n +S for search\nQ to quit\n"; chomp ($option = <STDIN>); if ($option eq "l") { &entrylist; } elsif ($option eq "a") { &addentry; } elsif ($option eq "r") { &unavailable; #&removeentry; } elsif ($option eq "s") { &unavailable; #&search; } elsif ($option eq "q") { $x = 2; } else { print "Invalid Entry!"; } } sub entrylist { system "cls"; ### Open the filehandle for the address database, sort it. open ADDYBASE, "addybase.adb" or die "\nRequired source database file does not exist!"; foreach (<ADDYBASE>) { $ent++; chomp ($line = $_); @entry = split /:/, $line; # Now that they're separated, store them into individual arrays $t_entry = pop @entry; push (@email, $t_entry); $t_entry = pop @entry; push (@fname, $t_entry); $t_entry = pop @entry; push (@lname, $t_entry); } close ADDYBASE; ### Ensure you start at 0 $ent -=1; for (0..$ent) { print "\n", $lname[$_],", ", $fname[$_], " -- ", $email[$_]; } &orquit; } sub orquit { print "\n\nPress enter to continue, enter Q to quit\n"; chomp ($option = <STDIN>); $x = 2 if ($option eq "q"); } sub unavailable { print "This function is not yet available. Sorry."; &orquit; } sub addentry { open ADDYBASE, ">>addybase.adb"; @addtemp = undef; system "cls"; print "First name?\n"; chomp ($addfname = <STDIN>); push (@addtemp, $addfname); print "\nLast name?\n"; chomp ($addlname = <STDIN>); push (@addtemp, $addlname); print "\nEmail address?\n"; chomp ($addemail = <STDIN>); push (@addtemp, $addemail); $addstring = (join ":", @addtemp); print ADDYBASE ("\n", $addstring); close ADDYBASE; }
Thanks in advance!

Replies are listed 'Best First'.
Re: Learning Exercise
by Abigail-II (Bishop) on Jul 18, 2002 at 13:18 UTC
    $ /opt/perl/5.8.0-RC3/bin/perldoc -q 'delete a line'
    Found in /opt/perl/5.8.0-RC3/lib/5.8.0-RC3/pod/perlfaq5.pod
           How do I change one line in a file/delete a line in a
           file/insert a line in the middle of a file/append to the
           beginning of a file?
                   Use the Tie::File module, which is included in the
                   standard distribution since Perl 5.8.0.
    The Tie::File module should be available on CPAN as well.

    However, since you have a small database, I would use any of the DBM modules that come with Perl. This is the perfect task for them.


      The Tie::File module looks very handy, and I think it'll work well here.

      I'll look at the DBM modules, too.
Re: Learning Exercise
by atcroft (Abbot) on Jul 18, 2002 at 12:52 UTC

    If this is a single-user program, one way you could handle the delete is to rename the file containing your addresses (addybase.adb) to something else (say, addybase.old), then open it for read, dumping all but those you are wishing to delete back to addybase.adb. (This also saves having to read all of the addresses into memory at the same time.)

    Later on, you might also want to look at tie or some of the Tie::* modules on CPAN, or the perl DBI, for easier ways to handle the address. Some of the examples I have seen for tie seem to treat the file access similar to a hash, making it easy to reach an entry quickly. Likewise, using the perl DBI with a database and SQL statements also makes access fairly easy.

    Hope that helps somewhat.

      I'd really prefer not to make another file, but that does sound feasible. I'll look into that stuff when I learn a little more.

      In the meantime...*scours the Llama for ideas*

      <idea>I'm basically trying to delete a line (which I will be loading into an array, thus, giving it an index with which to remove it by) and then shifting the entries thereafter, but not the ones before the entry I'm removing...

      Any ideas on doing that?
        You might want to read the perldoc for grep() for a way to remove an entry from an array based on known content.
Re: Learning Exercise
by moxliukas (Curate) on Jul 18, 2002 at 12:46 UTC

    I am not a good perl programmer, so take my advice with a grain of salt. The way I would implement the address book entry removal would be to remove the relevant hash entries and write out all hashes to the file, replacing the old address book. For this you will need to open the addybase.adb like this:

    open ADDYBASE, ">addybase.adb";

    There are a few BUTs with this approach:

    • It is not terribly efficient, especially if there are a lot of addressbook entries
    • This approach may be only useful as a quick hack

    I am sure some perlmonks will suggest better alternatives, I just wanted to give my (hopefully useful) thoughts about the subject.

      Yes, that would work, as I've done something very similar in BASIC before, but you're right: it's not efficient.

      My other question, which I forgot to post, but you can see in the code, is this: I want to search for a particular entry. What's the best way to do that, given the code I've written and my fairly basic understanding of Perl thusfar?
Re: Learning Exercise
by Abstraction (Friar) on Jul 18, 2002 at 13:39 UTC
      Ah. That is helpful. In fact, that helps solve another problem, too. Many thanks.
Re: Learning Exercise
by fsn (Friar) on Jul 18, 2002 at 14:59 UTC
    You are doing a lot of opening and reading and closing and so on. Modern OSes have lots of diskcache, so it's not a problem, but it probably doesn't scale very well. Also, I personally find it hard to work with data that way.
    I've always considered hashtables as the best thing since sliced bread. Hashtables were always hard to store on disk, but with the excellent DB_File module you can tie a hash to a diskstored db-database, and use it as if it were an in-memory hash table. No need to worry about the file format or loading-and-storeing the database. Just remember to untie it before you terminate the script.
    And since it is a hash you can perform all the usual hashtabel operations, assigning, retrieving and deleteing (which solves your original problem). You can do foreach $key {sort keys %hash} and do a linear search if you want.

    And then you can do hashes of different tied-in hashes, stored in different files. I find that a combination of perverse and convenient.

      Hmm...interesting thoughts. Sounds like a good idea, overall. I'm going to try to finish this programme as is, though (just for the sake of completion) and then work on a better one using the methods you described. People keep saying that hashes are the way to go here.
Re: Learning Exercise
by PhiRatE (Monk) on Jul 19, 2002 at 00:28 UTC
    I have something to say. Abstraction. Wait, let me repeat that a few times.

    Ok, got that out of my sysAbstractiontem.

    You do not have one problem, that of a piece of software that allows you to make entries in an address book, you have two.

    1. A piece of software that allows me to persistantly store rows of structured data.
    2. A piece of software that provides a user interface to 1.

    The reason for breaking it up into concepts like that (a large application will have tens or hundreds of such "problems") is that you can then go through and replace entire sections wholesale with already-available applications.

    It doesn't take a genius to realise that 1. is A Database. Thats exactly what you want, a database. Now, maybe for convenience you'd rather not install MySQL or PostgreSQL or DB2 just to run your address book, and with the specs as they stand I wouldn't blame you (I already have MySQL everywhere so I'd use that if I were solving the same problem), however that doesn't mean you can't take advantage of other, simpler database systems, notably the Berkley DB system (see perldoc NDBM_File and others suggested in these comments).

    Ok, so, for the sake of argument/learning, lets say you decide to implement both 1 and 2 yourself. The important thing to understand is that that doesn't mean that they're no longer 1 an 2, they're still best implemented as seperate entities. An example interface would be this:

    sub list_entries { my ($filename) = @_; ... return { $key => { ...}, $key => {...}, .. ]; } sub add_entry { my ($filename, $key, %data) = @_; ... sub delete_entry { my ($filename, $key) = @_; ... sub update_entry { my ($filename, $key, %data) = @_; ....

    Where $key represents the unique identifier for that record (you might consider it the first name + last name, I tend to use numerics under such circumstances since duplicate names are all too common). Although to be honest, I think it would work out better as an object or tie.

    Once you have defined that interface, a lot of the pain goes away. You have a nicely defined set of places where, if you so chose, you could convert the code to NDBM or MySQL without screwing up unrelated interface code, you also have an obvious set of logical operations, so it is easy to visualise which may require locking or equivalent to work properly, and further abstraction of the file opening is made simple by the common interface.

    You could then wipe your mind clean of everything to do with low-level locks, file operations, seeks() etc, and write the user interface. Trust me when I say that one of the best qualities a good coder can have is the ability to write code in such a way that, writing or reading, it requires as little in-brain memory as possible. Abstraction, OO, structured programming, all these are essentially attempts at 1) reusing code 2) allowing the programmer to forget things.

    If I were writing this same app, under these circumstances, I would write it like this:

    package MyDatabase; sub new { my $class = shift; my $self = {}; bless $self, $class; return $self; }; sub set_schema { my ($self, @fields) = @_; $self->{'fields'} = [@fields]; }; sub set_filename { my ($self, $filename) = @_; $self->{'filename'} = $filename; }; sub read_entries { my ($self) = @_; $self->{'data'} = {}; open(F,"<".$self->{'filename'}) || die "Could not open ".$self->{' +filename'}." to read"; while (my $line = <F>) { my ($key, @field_values) = split(/:/,$line); for (@{$self->{'fields'}}) { $self->{'data'}{$key}{$_} = shift(@field_values); } } close(F); }; sub write_entries { my ($self) = @_; open(F,">".$self->{'filename'}) || die "Could not open ".$self->{' +filename'}." for writing"; for my $key (keys %{$self->{'data'}}) { # Prints out the key then the data fields in order. print F join(':',($key, map {$self->{'data'}{$key}{$_}} @{$sel +f->{'fields'}}))."\n"; } close(F); } sub list_entries { my ($self) = @_; $self->read_entries(); return %{$self->{'data'}}; } sub set_entry { my ($self, $key, %data) = @_; eval { $self->read_entries(); }; # drop exceptions for the read on + set, if the db doesn't exist, we'll create it. $self->{'data'}{$key} = \%data; $self->write_entries(); } sub delete_entry { my ($self, $key) = @_; $self->read_entries(); delete $self->{'data'}{$key}; $self->write_entries(); }

    Holy Smoke! you say, Thats almost as long as my entire program previously! and to that I say, well, yes. But, it works. Not only does it work, but it will work in the general case of wanting to store rows of data, not just the trivial case of wanting to store address data. Further, you can update a given element correctly, you don't have globals flying around all over the place risking life and sanity as they conflict with other parts of the code, it is trivial to modify to utilise more efficient backends, such as a RDBMS or NDBM, no code is duplicated, we open for reading only in one place, we open for writing only in one place, thus locking would be a trivial addition. AND you can just plug it in to any other application with similar requirements.

    The objectives of good programming are to avoid ever writing the same thing twice. Unfortunately, this has to balance against the objective of getting things done now. Obsessive generalisation results in you writing a turing machine :). This kind of problem however is absolutely ripe for effective generalisation, as I have demonstrated.

    I recommend considering any programming problem in this light. How much more time will it take you to make it work for the general case? it took me 9 minutes and 25 seconds to write and debug that code (I counted :), longer perhaps than if I had simply integrated it all into the GUI and done both, but now I can add locking in a couple of lines, add NDBM support with a couple of minor changes, etc etc. Time saved in the future is often far more valuable than time saved now. As always, careful when you bet :)

    Regarding the general style of your code, I highly recommend reading one of the many styleguides on the 'net. I of course prefer my own ( but thats not to say there aren't many valid styles which promote readable, maintainable code. Any exercise in learning programming should be accompanied by an exercise in making the code readable and usable. Add to that concepts like automated test-suites and in-code documentation (w00t! POD) when you can, for extra bonus points and feelings of pride that you just can't share with anyone who isn't a programmer (:/).

    Most of all, enjoy yourself, programming is an art as well as a science, making something beautiful is all in the knowing and the taking the time.

    (bugs in the above class are possible, although naturally I like to think myself above such things :)


    my $db = new MyDatabase(); $db->set_filename("test.db"); $db->set_schema("first name","last name","email"); $db->set_entry("testentry",( "first name" => "Phi", "last name" => "RatE", "email" => '')); my %entries = $db->list_entries(); for (keys %entries) { for my $field (keys %{$entries{$_}}) { print "$field -> ".$entries{$_}{$field}."\n"; } }
Re: Learning Exercise
by debiandude (Scribe) on Jul 18, 2002 at 18:55 UTC
    Okay. Ill play. You really shouldn't need all those externs, they're completely unnecessary. Also their is more than one type of loop, you should have to rely on setting that $x variable. Here is a quick hack I wrote up:
    #!/usr/bin/perl use strict; do { cls(); print "Address Book\n"; $_ = <>; if (/l/) { entry_list(); } elsif (/a/) { entry_add(); } elsif (/r/) { entry_rem(); } elsif (/s/) { entry_sea(); } else { print "Invalid Entry\n" if not /[lars(q|quit)]/; } } until(/q|quit/); exit; sub entry_list { cls(); open ADBK, "<adbk.db" || die "Datebase file does not exist\n"; while(<ADBK>) { print join(" ", split /:/); } close ADBK; print "Press any key to contine...\n"; $_ = <>; } sub entry_rem { my $tmp = (); my $num = 0; print "Enter the infomation:\n"; foreach("First Name: ", "Last Name: ", "Email: ") { print; chomp(my $line = <>); $tmp .= "$line:"; } cls(); open ADBK, "<adbk.db" || die "Could not open database to read\n"; my @file = <ADBK>; close ADBK; open ADBK, ">adbk.db" || die "Could not open database to open\n"; foreach ( @file) { next if /$tmp/; $num++; print ADBK $_; } close ABDK; print "No entries found\n" if $num == 0; print "Press any key to contine...\n"; $_ = <>; } sub entry_sea { my $tmp = (); my $num = 0; print "Enter the infomation:\n"; foreach("First Name: ", "Last Name: ", "Email: ") { print; chomp(my $line = <>); $tmp .= "$line:"; } cls(); open ADBK, "<adbk.db" || die "Could not open database to read\n"; while(<ADBK>) { if(/$tmp/) { print "Entry ", ++$num, ":\n", join("\n", split /:/) +; } } close ABDK; print "No entries found\n" if $num == 0; print "Press any key to contine...\n"; $_ = <>; } sub entry_add { my $tmp = (); foreach("First Name: ", "Last Name: ", "Email: ") { print; chomp(my $input = <>); $tmp .= $input.":"; } open ADBK, ">>adbk.db" || die "Could not open database to append\n"; print ADBK $tmp, "\n"; close ABDK; } sub cls { print "\033[2J"; print "\033[0;0f"; }
      I like a function dispatch hash better:
      # Before the loop my %action = ( l => \&entry_list, a => \&entry_add, r => \&entry_rem, s => \&entry_sea, o => sub { print "Invalid Entry\n" }, ); # Within the loop ($action{lc()} || $action{o})->();
      Hmm...that makes a whole lot of sense. I do, however, have some questions:

      1. I understand how if (/l/) { entry_list(); } works, but can you explain the premise behind /l/? Is that just how you list things in perl? Also...I knew there were cases when you could leave the & off when calling a sub, but how does that work?

      2. || is equilivent to or?

      3. Kind of like #2, does | just allow you to specify more than one option in an 'or' fashion?

      4. What's the deal with subroutine cls?!

      5. my $tmp = (); just sets $tmp to undef, right?

      Alright, I believe that's all I have for now. Forgive the inexperience, please!
        • 1a. /l/ says if there is an 'l' in $_ return true. This means that if you type ' l ' ('l' with some spaces) it'll still match. It will also match on 'list', 'look' and 'isn't that a nice whale over there?'. Personally I'd have done:
          s/\s//; # remove all whitespace from $_ if($_ eq "l") { entry_list(); } elsif($_ eq "a") { addentry();} #etc
          $_ is equivalent to your $option.
        • 1b. &s can be left off subroutine calls if the call is unambiguously a subroutine call. In this case (calling the subroutine with a pair of parens) the parens make it unambiguous. Calling the subroutine without parens but with & means that the subroutine gets given the contents of @_ as it's argument list. This is discouraged because it often frightens people.

          So there are these ways to call a subroutine (ignoring object methods):

          subroutine; # doesn't pass strict. &subroutine; # passes @_ as argument list &subroutine(); # looks weird, passes arguments in () subroutine(); # generally preferred method
        • 2. || does mean "or" but they are not equivalent in associativity. Look at "perldoc perlop" for more information.
        • 3. | allows alternation in regular expressions (those things between the //s). So yes, it does allow you to specify more than one option in an or-like fashion.
        • 4. Looks like it probably clears the screen/terminal. Have you tried using it?
        • 5. Yep. I think debiandude had originally intended tmp to be an array. my $tmp; has essentially the same effect. Of you can use my $tmp = undef;.
        Hope it helps


        >>2. || is equilivent to or? Yes also, && is and, ne is !=, not is ! and so on...

        >>. Kind of like #2, does | just allow you to specify more than one option in an 'or' fashion? | is the regex is the aleternation operator. Its pretty cool. You should check it out, very powerful :-)

        >> 4. What's the deal with subroutine cls?! Well you were using system("clear") or something like that which is not really portable. Those prints are the ansi escape sequence for clearing the screen and putting the cursor on the upper left corner.

        >> 5. my $tmp = (); just sets $tmp to undef, right? Yes. I was orgianally going to do something with it, besides being just a scalar but I didn't have enough time. So I tured it back into a scalar and never changed the parens, they're unnecessary. As I said it was a quick hack.
Re: Learning Exercise
by John M. Dlugosz (Monsignor) on Jul 18, 2002 at 17:55 UTC
    It might not be what you're specifically studying, but you could look into the persistant database stuff. Tie a hash to the file and the rest is magic--delete the entry from the hash and its persistant representation goes away, too. I don't know off hand if there is anything for tying an array to a file, but you could always assign "record numbers" as the hash key.