Delete from array

by Smaug (Pilgrim)
on Aug 07, 2003 at 09:07 UTC ( [id://281798]=perlquestion )

Smaug has asked for the wisdom of the Perl Monks concerning the following question:

Hello providers of great wisdom,

I am just starting off with Perl and have what I can only imagine is a really stupid question!

I have two arrays. The one contains a list of files that should exist in a given directory (@shouldexist). The second array contains all the files that actually do exist.

What I want is to print out something like "The file xxxx.yyy exists!" for each file and then, once I have run through all those that do exist (@doexist), print out "These files should be there but aren't: aaaa.bbb, cccc.ddd"

I thought I would do this by deleting from @shouldexist all values in @doexist while printing out the file exists. Once that is done, @shouldexist will only contain values of files that do not exist, and I could then print these.

How do I delete an element from an array based on its existence in another array? And is this the 'correct' way to do this?
I am not looking for somebody to do this for me, just for some advice.

Thanks!!

Replies are listed 'Best First'.
Re: Delete from array
by Aristotle (Chancellor) on Aug 07, 2003 at 09:17 UTC
    Whenever you think "exists", think of exists - which means using a hash. Put the existing files as keys in a hash (the value of the entry is irrelevant). Then grep against the should-exist list.
    my %have_file;
    undef @have_file{glob("*")};
    my @dont_exist = grep !exists $have_file{$_}, @should_exist;

    On the second line, what I'm using is called a slice. Rather than selecting a single entry from the hash, I select a list - here, the list of filenames returned by glob (of course any other way you get your file names is applicable as well). Then I undef all of these hash entries into existence.
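A minimal sketch of the slice idea, for illustration only. The file names are made up and stand in for whatever glob("*") would return, and the keys are created by assigning an empty list to the slice rather than with undef, which is a substitution on my part and not the code from the post above.

```perl
use strict;
use warnings;

# Hypothetical file names standing in for the list returned by glob("*").
my @names        = qw(a.txt b.txt c.txt);
my @should_exist = qw(a.txt x.txt c.txt y.txt);

my %have_file;
# Hash slice: addresses all three entries at once. Assigning the empty
# list creates every key with an undefined value.
@have_file{@names} = ();

# exists tests for the key itself, so the undefined values do not matter.
my @dont_exist = grep { !exists $have_file{$_} } @should_exist;

print "Keys: ", join(' ', sort keys %have_file), "\n";
print "Missing: @dont_exist\n";
```

The point of the slice is that one statement populates the whole lookup table; the grep then only does hash lookups.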

    Actually the way I'd write this in my own code is like so:

    my @dont_exist = do {
        my %have_file;
        undef @have_file{glob("*")};
        grep !exists $have_file{$_}, @should_exist;
    };

    Makeshifts last the longest.

Re: Delete from array
by liz (Monsignor) on Aug 07, 2003 at 09:19 UTC
    Don't use an array for @shouldexist, but use a hash for that:
    my %shouldexist = map { $_ => undef } @shouldexist;
    Now you have a hash in which the filenames are keys. Since we're only interested in the existence of a key, I've used the value "undef" as the value associated with the key.

    Then, whenever you find that a file exists, you remove the corresponding key from the hash.

    delete $shouldexist{ $filename };
    Then, when you're done, you check which keys are still left: that is then your list of files that are missing:
    print "Missing:\n";
    foreach (sort keys %shouldexist) {
        print "$_\n";
    }
    Note that I added the "sort" here because the order of the keys in the hash is indeterminate (even random in 5.8.1).

    I leave the checking for the existence of the file as an exercise to the reader.
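For readers who want a starting point on that exercise, here is one possible sketch, not necessarily what the author had in mind. It assumes the -e file test is an acceptable existence check, and it builds a throwaway directory with File::Temp so it can run anywhere; the file names are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Throwaway directory with two files, so the sketch is self-contained.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(report.txt data.csv)) {
    my $path = File::Spec->catfile($dir, $name);
    open my $fh, '>', $path or die "Cannot create $path: $!";
    close $fh;
}

# Hypothetical list of expected files.
my @shouldexist = qw(report.txt data.csv notes.doc);
my %shouldexist = map { $_ => undef } @shouldexist;

for my $filename (@shouldexist) {
    # -e is true if the path exists on disk.
    if (-e File::Spec->catfile($dir, $filename)) {
        print "The file $filename exists!\n";
        delete $shouldexist{$filename};
    }
}

my @missing = sort keys %shouldexist;
print "These files should be there but aren't: ", join(', ', @missing), "\n";
```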

    Liz

      Note this does not preserve the order of files in @should_exist.
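To illustrate the difference (a sketch with made-up file names): filtering the original array with grep keeps the entries in their original order, whereas reading the leftovers back out of a hash with keys does not.

```perl
use strict;
use warnings;

my @should_exist = qw(zeta.txt alpha.txt mid.txt beta.txt);
my @doexist      = qw(alpha.txt mid.txt);

# Lookup table of the files that are present.
my %present;
@present{@doexist} = ();

# grep walks @should_exist front to back, so survivors keep their order.
my @missing = grep { !exists $present{$_} } @should_exist;

print "Missing, in original order: @missing\n";
```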

      Makeshifts last the longest.

Re: Delete from array
by Corion (Patriarch) on Aug 07, 2003 at 09:32 UTC

    This is a problem I tackle daily (in Python). After I had implemented the two-way compare you mentioned, I longed for the remaining two classes of elements as well, and ended up with four classes of entries in two lists:

    • Elements that are in both lists and are equal
    • Elements that are in both lists and are not equal
    • Elements that are only in list A
    • Elements that are only in list B
    In your problem, the second class would always be empty, as you have only the name of a file. If you also compared, for example, the size of each file, the name would still be the unique identifier, but two files could have the same name and still differ in size.
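As a rough illustration of the four classes, here is a simplified sketch. It assumes each file name is unique within a list (so no bags are needed) and uses hypothetical name-to-size hashes; the sizes are invented.

```perl
use strict;
use warnings;

# Hypothetical inventories: file name => size in bytes.
my %list_a = ('a.txt' => 100, 'b.txt' => 200, 'c.txt' => 300);
my %list_b = ('b.txt' => 200, 'c.txt' => 999, 'd.txt' => 400);

my (@equal, @different, @only_a, @only_b);

for my $name (sort keys %list_a) {
    if (!exists $list_b{$name}) {
        push @only_a, $name;        # only in list A
    }
    elsif ($list_a{$name} == $list_b{$name}) {
        push @equal, $name;         # in both lists, same size
    }
    else {
        push @different, $name;     # in both lists, sizes differ
    }
}

# Anything in B that A has never heard of.
@only_b = grep { !exists $list_a{$_} } sort keys %list_b;

print "equal:     @equal\n";
print "different: @different\n";
print "only in A: @only_a\n";
print "only in B: @only_b\n";
```

With names only, as in the original question, the "different" class can never be populated, which is the point made above.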

    I looked around, but there is no module for comparing two arrays and dividing them up into the four classes in Perl (there also was nothing comparable in Python, but with Python, I'm used to writing my own stuff :-)).

    So here is the algorithm I use:

    1. We need a method to extract some semblance of key from each item
    2. Since we cannot know whether each element has a unique key, we must compare bags of items that share the same key. We can select any item (e.g. the first) from a bag to pair.
    3. Read all items from list A, and put them in a hash. The key is the key extracted for each item. As this key is not unique, the hash entries will be lists of items.
    4. For each item in list B, look in the hash :
      • If there is an item in the hash entry, remove that item and put both items in the "found" part of the result list.
      • If there is no item in the hash entry, put the item into the "only in list B" part of the result list.
    5. Put all remaining items from the hash into the "only in list A" part of the result list.
    6. Now divide the "found" part up into "equal" and "different" parts by comparing the items closer.
    7. Return the "equal","different","only in list A" and "only in list B" parts.

    Work interferes, so I won't write up the implementation - watch this space for an update

    Update:(untested though)

    # extract_key takes an element from a list and returns a scalar that is
    # the key element. Think of MD5.
    sub extract_key {
        # blindly return the item itself, stringified.
        return "@_";
    }

    sub compare_items {
        # plain string identity comparison
        return $_[0] eq $_[1];
    }

    sub compare_lists {
        my ($list_a, $list_b) = @_;
        my %dict;
        my %result = (
            equal     => [],
            different => [],
            only_a    => [],
            only_b    => [],
        );

        # Put the items of list A into bags (array refs) by key.
        for my $item_a (@$list_a) {
            my $key = extract_key($item_a);
            $dict{$key} = [] unless exists $dict{$key};
            push @{ $dict{$key} }, $item_a;
        }

        my @found;
        for my $item_b (@$list_b) {
            my $key = extract_key($item_b);
            if (exists $dict{$key} && @{ $dict{$key} }) {
                # Pair with the first unmatched item from list A.
                my $item_a = shift @{ $dict{$key} };
                push @found, [ $item_a, $item_b ];
            }
            else {
                # No bag, or the bag is already empty: only in list B.
                push @{ $result{only_b} }, $item_b;
            }
        }

        # Whatever is left in the bags was never matched by list B.
        push @{ $result{only_a} }, @{ $dict{$_} } for keys %dict;

        for my $pair (@found) {
            if (compare_items($pair->[0], $pair->[1])) {
                push @{ $result{equal} }, $pair;
            }
            else {
                push @{ $result{different} }, $pair;
            }
        }

        return %result;
    }
    perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider ($c = $d->accept())->get_request(); $c->send_response( new #in the HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
      (there also was nothing comparable in Python, but with Python, I'm used to writing my own stuff :-))

      In my mind, that would be a reason to avoid Python. *shrugs* TMTOWTDI means different languages too, I suppose.

      ------
      We are the carpenters and bricklayers of the Information Age.

      The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

      Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

Re: Delete from array
by Zaxo (Archbishop) on Aug 07, 2003 at 10:10 UTC

    Aside from the good advice you've gotten on dealing with file system listings, there is an easy way to get the names which should exist and don't. Start, as already suggested, with a hash keyed by the names which should exist. We'll use hash slices,

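    my %should_be;
    @should_be{@shouldexist} = ();
    delete @should_be{@doexist};
    print q(These files should be there but aren't: );
    {
        local $, = ', ';
        print keys %should_be;
    }
    The bare block and localization of the output field separator ($,) is just for dressing up the output. The label is printed outside the block so that no separator ends up between it and the first file name.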
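A small sketch of what localizing $, does, assuming made-up file names. Output goes to an in-memory filehandle so the result can be inspected as a string; that is only for demonstration and not part of the approach above.

```perl
use strict;
use warnings;

my @missing = qw(aaaa.bbb cccc.ddd);

# Capture the output in a string so the effect of $, is easy to see.
my $buffer = '';
open my $out, '>', \$buffer or die "Cannot open in-memory handle: $!";

{
    local $, = ', ';          # in effect only until the end of this block
    print {$out} @missing;    # print puts ', ' between its arguments
}
print {$out} "\n";
print {$out} @missing;        # $, is undefined again: nothing in between
close $out;

print $buffer, "\n";

# join gives the same first line without touching any global variable.
my $joined = join ', ', @missing;
print $joined, "\n";
```

Whether to use local $, or join is mostly a matter of taste; join keeps the separator next to the data it applies to.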

    After Compline,
    Zaxo

Re: Delete from array
by cianoz (Friar) on Aug 07, 2003 at 09:23 UTC
    why don't you use a hash to store shouldexist so you can delete by name?
    something like
    %shouldexist = ( 'filename1' => 1, .. 'filenamen' => 1 );
    #or just
    %shouldexist = map {$_ => 1} @shouldexist;
    so you can simply do
    for(@doexist) { delete $shouldexist{$_}; }
    if you want to do the same using an array, you have to scan @shouldexist on each iteration and delete when there is a match (slower)
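For comparison, a sketch of the array-only variant described above, with made-up file names. It rebuilds @shouldexist with grep once per existing file, which is one way (of several) to do the scan.

```perl
use strict;
use warnings;

my @shouldexist = qw(a.txt b.txt c.txt d.txt);
my @doexist     = qw(b.txt d.txt e.txt);

for my $found (@doexist) {
    # Rescan the whole array for every existing file. With n expected
    # and m existing files this costs on the order of n * m comparisons.
    @shouldexist = grep { $_ ne $found } @shouldexist;
}

print "Still missing: @shouldexist\n";
```

The hash version does one lookup per existing file instead of one full scan, which is why it scales better for long lists.
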
Re: Delete from array
by Skeeve (Parson) on Aug 07, 2003 at 10:40 UTC
    I do it like this:
    my @shouldexist = qw( ismissing ismissing2 isthere isthere2 );
    my @doexist     = qw( isthere isthere2 istoomuch istoomuch2 );
    my (%missing, %toomuch);
    @missing{@shouldexist} = ();   # now all files that should exist have an entry
    delete @missing{@doexist};     # now all that do exist don't have an entry anymore
    # now the same the other way round to find superfluous files
    @toomuch{@doexist} = ();
    delete @toomuch{@shouldexist};
    print "Missing: ", join(' ', keys %missing), $/;
    print "Toomuch: ", join(' ', keys %toomuch), $/;
