Compare two lists of words

nasa has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Simple for some. by fruiture (Curate) on Aug 18, 2002 at 13:15 UTC
Short question, short answer:

```perl
my @words   = qw/foo bar/;   # your @A
my @strings = <DATA>;        # your @B
my $contains = 0;            # a boolean
{
    my $words = join '|', map quotemeta, @words;
    for (@strings) {
        ++$contains and last if /$words/;
    }
}
print $contains ? 'yes' : 'no', "\n";
__DATA__
abc def ghi foo jkl mno
```

To get a better answer: what does "array @B contains the words in array @A" mean? In which case is this fulfilled? I have assumed that one element of @B must contain one element of @A...

-- http://fruiture.de
Re: Simple for some. by hotshot (Prior) on Aug 18, 2002 at 13:10 UTC
For the simple case of two entries in @A, this should be enough:

```perl
my $flag = 0;
for (@B) {
    if (/$A[0]/ || /$A[1]/) {
        # do this
        $flag = 1;
        last;
    }
}
if (!$flag) {
    # didn't find it in the array
    # do the other thing
}
```

Thanks. Hotshot
Re: Re: Simple for some. by nasa (Beadle) on Aug 18, 2002 at 14:33 UTC
To clarify the question: I have a data file which I bring in as an array. It prints out like this:

```
1028804576 nasa@dodo.com.au name password
1028804590 nasa@dodo.com.au name2 pass2
1028804598 who@where.com again anouther
```

The other array I have is from a form with two entries. I need to check the form entries against the database file to see if they exist or not, as password protection. Of course the two form entries can end up $hit and $strike; the database ends up @raw_data. I tried this:

```perl
my $flag = 0;
for (@raw_data) {
    if (/$hit[0]/ || /$strike[1]/) {
        print "Found it.";
        $flag = 1;
        last;
    }
}
if (! $flag) {
    # didn't find it in the array
    print "didnt find it";
}
```

No matter what I type in the form or what's in the database, it always comes back "Found it."

May god bless you guys... Nasa.

edited: Sun Aug 18 15:35:38 2002 by jeffa - added code tags, removed unnecessary br tags
(jeffa) 3Re: Simple for some. by jeffa (Bishop) on Aug 18, 2002 at 15:16 UTC
Sounds like you need a better data structure, like a hash of hashes (HoH). Also, I would store the two form variables as separate values - $user and $pass, but I will keep that part of your requirements intact:

```perl
use strict;

my %passwd;
while (<DATA>) {
    my ($id, $email, $user, $passwd) = split;
    $passwd{$user} = {
        id     => $id,
        email  => $email,
        passwd => $passwd,
    };
}

my @other_array = @ARGV[0,1];
my $user = $passwd{$other_array[0]} || die "bad user";
die "bad password" unless $other_array[1] eq $user->{passwd};

__DATA__
1028804576 nasa@dodo.com.au name password
1028804590 nasa@dodo.com.au name2 pass2
1028804598 who@where.com again anouther
```

But I would use a relational database to handle this sort of problem - at least consider using DBD::CSV. This example assumes the data file is named users.csv:

```perl
use strict;
use DBI;

# The attribute hash is the fourth argument to connect();
# username and password are not needed for DBD::CSV.
my $dbh = DBI->connect(
    "DBI:CSV:f_dir=.;csv_eol=\n;csv_sep_char= ;",
    undef, undef,
    {RaiseError=>1},
);

$dbh->{csv_tables}->{'users'} = {
    file      => 'users.csv',
    col_names => [qw(id email user passwd)],
};

my $user = shift;
my $pass = shift;

my $sth = $dbh->prepare("
    select id,email,user,passwd
    from users
    where user = ?
    and passwd = ?
");
$sth->execute($user,$pass);

my $valid = $sth->fetchrow_hashref;
die "bad user/pass" unless $valid;

my $id    = $valid->{id};
my $email = $valid->{email};
```

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)
Re(3): Simple for some. by Arien (Pilgrim) on Aug 18, 2002 at 15:23 UTC
What you are describing now looks fairly different from what you seemed to be describing before. I'm not even sure this can be answered without resorting to crystal balls...

I'll assume that you want to check whether two values you enter in a form appear as two of the fields on the same line of your data file (name and password). To do this, process the lines one by one, checking the two fields against the values that were entered. If they match, set a flag and break out of the loop. After the loop, check the flag to see if the input was "correct":

```perl
my $found = 0;
for (@lines) {
    my (undef, undef, $theName, $thePass) = split;
    if ($theName eq $name && $thePass eq $pass) {
        # set the flag, then leave the loop
        # ($found++ and last would never reach "last", since $found++ returns 0 the first time)
        $found = 1;
        last;
    }
}
# check for value of $found, etc, etc
```

-- Arien
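For anyone who wants to run the line-by-line check described above, here is a minimal self-contained sketch. The sub name `credentials_match` and the sample `@lines` are made up for illustration, and the field order (id, email, name, password) is assumed from the data file shown earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if some line has exactly this name and
# password as its third and fourth whitespace-separated fields, else 0.
sub credentials_match {
    my ($name, $pass, @lines) = @_;
    return 0 unless defined $name && defined $pass;
    for my $line (@lines) {
        my (undef, undef, $the_name, $the_pass) = split ' ', $line;
        next unless defined $the_name && defined $the_pass;   # skip short lines
        return 1 if $the_name eq $name && $the_pass eq $pass; # stop at first hit
    }
    return 0;
}

# Sample records in the assumed "id email name password" layout.
my @lines = (
    '1028804576 nasa@dodo.com.au name password',
    '1028804590 nasa@dodo.com.au name2 pass2',
);

print credentials_match('name', 'password', @lines) ? "ok\n" : "denied\n";
print credentials_match('name', 'pass2',    @lines) ? "ok\n" : "denied\n";
print credentials_match('.',    '.',        @lines) ? "ok\n" : "denied\n";
```

Because the comparison uses `eq` rather than a regex, input such as `.` is treated as plain text, and a name from one line cannot be paired with a password from another.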
Re: Re: Re: Simple for some. by BrowserUk (Patriarch) on Aug 18, 2002 at 15:24 UTC
Your description is still not very clear.

"Of course the two form entries can end up $hit and $strike": why "of course"? You talk about `$hit` and `$strike` and then go on to use them as if they are arrays with `if (/$hit[0]/ || /$strike[1]/) {`. You also fail to mention what $hit and $strike actually represent.

Making a lot of assumptions about what your code is trying to do: assuming one of these vars contains a userid and the other the associated password, and that the code's intent is to check the password for the given userid, your method is fundamentally flawed. You appear to be looping through the array and checking if either appears anywhere in the file! This means that only one of the two has to be somewhere in the array and your `$flag` will be set true. In other words, I would only have to guess any userid or any password to pass your test!! Not good.

Update: Example code withdrawn.

All of that said, you really need to think about the way you are implementing this, as it is full of holes as far as a security mechanism is concerned. I strongly urge you to read perlsec and find out about the -T switch.

What's this about a "crooked mitre"? I'm good at woodwork!
Re(4): Simple for some. by Arien (Pilgrim) on Aug 18, 2002 at 15:29 UTC
Re: Re: Re: Simple for some. by BrowserUk (Patriarch) on Aug 18, 2002 at 18:54 UTC
The first problem with your code, and the reason it always comes back with "Found it.", is this line:

```perl
if (/$hit[0]/ || /$strike[1]/) {
```

Here you are asking if either `$hit[0]` or `$strike[1]` is found within this line. That means it only needs one of these to appear anywhere in the file, and "Found it." will be printed. In order to check that both exist in the same line you would need to do something like:

```perl
if (/$hit[0]/ && /$strike[1]/) {
```

However, that's still not good enough (I'm going to use `$userid` instead of `$hit[0]` and `$password` instead of `$strike[1]`). Let's say $userid = 'fred' and $password = 'mother' and the associated email_id is fred@yahoo.com. When the line of your array containing

```
1028804576 fred@yahoo.com fred mother
```

is checked, then `if ( /$userid/ && /$password/) {` will match and you're on your way. But what happens if a line earlier in your file contained

```
1028804571 mum@yahoo mother fred        # This would also match, and would be found earlier
#                    ^^^^^^ ^^^^
1028804572 wilfred@hotmail wilf mother  # This would also match and be found earlier
#             ^^^^              ^^^^^^
```

So, the next step is to make sure that a) they both appear in the right order and b) they are at the end of the line, something like this:

```perl
if ( /$userid[ \t]+$password$/ ) {
    # [ \t]+ means one or more spaces or tabs; the final $ means at the end of the line.
```

This is better, but now think about what happens if some nasty user fills in "." for $userid and $password. The regex `/$userid[ \t]+$password$/` then becomes `/.[ \t]+.$/`, which matches any line ending in whitespace followed by a single character, whatever that character is; with ".*" in both fields it becomes `/.*[ \t]+.*$/`, which matches any line that has a space or a tab in it! Again, not what you want at all. The way to avoid this is to use the quotemeta function or the \Q\E metaquote pairings; see perlman:perlre.

```perl
$userid   = quotemeta $userid;
$password = quotemeta $password;
if ( /$userid[ \t]+$password$/ ) {
```

or just

```perl
if ( /\Q$userid\E[ \t]+\Q$password\E$/ ) {
```

So, a version of your code with the minimum changes to make it work* would look something like the following. Updated again to cover the hole Arien points out below!

```perl
#! perl -w
my $flag = 0;
for (@raw_data) {
    if (    defined $hit[0]    and $hit[0]       # has a value that is non-null
        and defined $strike[1] and $strike[1]    # ditto
        and /\Q$hit[0]\E[ \t]+\Q$strike[1]\E$/ ) {
        print "Found it.";
        $flag = 1;
        last;
    }
}
if (! $flag) {
    # didn't find it in the array
    print "didnt find it";
}
```

That said, reading your database into a hash as described in jeffa's and other answers is almost certainly a better way of doing what you want to do, and my earlier advice about reading perlman:perlsec and using the -T switch still stands.

Finally, it would be a good idea to read Writeup Formatting Tips before you post another question, to save the editors from having to reformat your posts to make them readable.

What's this about a "crooked mitre"? I'm good at woodwork!
Re(4): Simple for some. by Arien (Pilgrim) on Aug 18, 2002 at 19:14 UTC
Re: Re: Simple for some. by erikharrison (Deacon) on Aug 18, 2002 at 18:20 UTC
Actually, this will only work for very simple strings, as you are treating `$A[0]` and `$A[1]` as regexes, and pattern metacharacters will be treated as such. When interpolating a string literally in a pattern, wrap it in `\Q...\E`. The counterintuitiveness of this (especially in a world with qr{}) is what inspired the change in Perl 6.

Cheers,
Erik

Light a man a fire, he's warm for a day. Catch a man on fire, and he's warm for the rest of his life. - Terry Pratchett
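To see the difference in runnable form, here is a small sketch. The variable names and sample strings are invented for illustration; only the `\Q...\E` escape itself comes from the post above.

```perl
use strict;
use warnings;

my $needle = 'a.c';     # hypothetical user input containing a metacharacter
my $text   = 'xabcx';   # contains "abc" but not the literal text "a.c"

# Interpolated as a pattern: "." matches any character, so this matches "abc".
my $as_regex = $text =~ /$needle/ ? 1 : 0;

# Interpolated literally: \Q...\E escapes the ".", so this does not match.
my $as_literal = $text =~ /\Q$needle\E/ ? 1 : 0;

print "as regex: $as_regex, as literal: $as_literal\n";
```

The same escaping is available as a function, `quotemeta`, when the pattern is assembled from several strings.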
Re: Simple for some. by Arien (Pilgrim) on Aug 18, 2002 at 13:32 UTC
To find if a word `$a` is in a string `$b` you can use a regex to match a word boundary, the word you are looking for and a word boundary: `$b =~ /\b$a\b/;`.

You can have Perl build such a regex using the words of your array by joining the words with the pipe symbol to separate the alternatives. (You will have to group this part of the regex to require the word boundaries on either side of a word.)

Using the regex Perl built for you, you can `grep` through the strings in `@B` to find the number of times there's a match, leading to:

```perl
$words = join '|' => @A;
if (grep /\b(?:$words)\b/, @B) {
    # do this
} else {
    # do that
}
```

-- Arien

Edit: To just find out if there is a match, it is faster to use a loop and break out of it (after setting a flag) when you find a match, like fruiture does.
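A sketch combining the two ideas above: build the alternation from the word list and stop at the first string that matches. The `quotemeta` call is an addition not in the post, so that metacharacters in the words are taken literally; the sub name `any_word_in` and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if any string in @$strings contains any
# word from @$words as a whole word, else 0.
sub any_word_in {
    my ($words, $strings) = @_;
    return 0 unless @$words;    # an empty alternation would match almost anything
    my $alt = join '|', map { quotemeta($_) } @$words;
    my $re  = qr/\b(?:$alt)\b/; # compile the pattern once
    for my $s (@$strings) {
        return 1 if $s =~ $re;  # break out at the first match
    }
    return 0;
}

my @A = qw(foo bar);
my @B = ('abc def', 'ghi foo', 'jkl mno');

print any_word_in(\@A, \@B)      ? "yes\n" : "no\n";
print any_word_in(\@A, ['food']) ? "yes\n" : "no\n";
```

The second call shows what the word boundaries buy you: `food` contains `foo` but not as a whole word, so it is not counted as a match.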

