string comparison

Madam has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: string comparison by BrowserUk (Patriarch) on May 27, 2005 at 07:23 UTC
You could just rejoin your strings after splitting and sorting, and then compare them:

```perl
#! perl -slw
use strict;

sub reorder {
    my $string = shift;
    return join ',', sort { $a cmp $b } split ',', $string;
}

my $string1 = '~cake,pastry';
my $string2 = 'pastry,~cake';
my $string3 = 'cake,pastry';

print "$_->[ 0 ] eq $_->[ 1 ]",
    reorder( $_->[ 0 ] ) eq reorder( $_->[ 1 ] ) ? ' match ' : ' dont match'
    for [ $string1, $string2 ], [ $string1, $string3 ], [ $string2, $string3 ];

__END__
[ 8:16:14.39] P:\test>junk2
~cake,pastry eq pastry,~cake match
~cake,pastry eq cake,pastry dont match
pastry,~cake eq cake,pastry dont match
```

Not hugely efficient, but probably not too bad if there are only a few elements in your strings.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
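If many strings have to be compared against each other, one possible extension of the sort-and-join idea is to compute the sorted form once per string and use it as a hash key, so that equivalent strings end up in the same bucket. This is only a sketch; the names `canonical` and `group_equivalent` are made up for illustration and do not appear in the thread.

```perl
use strict;
use warnings;

# Canonical form: the comma-separated words in sorted order.
# (Hypothetical helper, same idea as reorder() above.)
sub canonical {
    my ($string) = @_;
    return join ',', sort split /,/, $string;
}

# Group strings by canonical form. Each hash value is an array ref
# holding all input strings that contain the same words.
sub group_equivalent {
    my @strings = @_;
    my %groups;
    push @{ $groups{ canonical($_) } }, $_ for @strings;
    return \%groups;
}

my $groups = group_equivalent( '~cake,pastry', 'pastry,~cake', 'cake,pastry' );
for my $key ( sort keys %$groups ) {
    print "$key: @{ $groups->{$key} }\n";
}
```

Each string is split and sorted exactly once, however many other strings it is later compared with.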
Re^2: string comparison by reasonablekeith (Deacon) on May 27, 2005 at 08:51 UTC
Seems like an overly complicated example. Isn't it just as simple as...?

```perl
my $string1 = '~cake,pastry';
my $string2 = 'pastry,~cake';

if ( sort_text($string1) eq sort_text($string2) ) {
    print "match\n";
}

# split on commas, sort the words, and join them back together
sub sort_text { join( ',', sort split( /,/, $_[0] ) ) }
```

--- my name's not Keith, and I'm not reasonable.
Re^3: string comparison by tphyahoo (Vicar) on May 27, 2005 at 10:02 UTC
Extending this logic a bit, and using <DATA> to read in the test data:

```perl
use strict;
use warnings;

my ( $first_line, $current_line );
while (<DATA>) {
    chomp;    # get rid of newline
    if ( $. == 1 ) {
        # get the first line sorted
        $first_line = join ",", sort( split /,/, $_ );
    }
    else {
        # die if it doesn't match the first line sorted
        $current_line = join ",", sort( split /,/, $_ );
        die "$_ didn't match first line" unless $current_line eq $first_line;
    }
}
__DATA__
~cake,pastry,donuts,meringue
pastry,~cake,meringue,donuts
meringue, donuts,pastry,~cake,meringue
```

This dies on the third line, as it should.
Re: string comparison by monarch (Priest) on May 27, 2005 at 07:12 UTC
It appears that you want to know if all the elements in one string are in the other.

Assumptions: the following strings will match:
- "cake,cake,patty", "cake,patty,cake"
- "bread,water", "water,bread"

The following strings won't match:
- "cake,cake,patty", "patty,cake"
- "brad,water", "water,bread"

```perl
# check_strings - takes two parameters, returns true if same
sub check_strings( $ $ ) {
    # arguments
    my ( $string1, $string2 ) = @_;

    my @elements1 = sort split( ',', $string1 );
    my @elements2 = sort split( ',', $string2 );

    if ( scalar(@elements1) != scalar(@elements2) ) {
        return( undef );    # differing number of elements
    }

    for ( my $i = 0; $i < scalar(@elements1); $i++ ) {
        if ( $elements1[$i] ne $elements2[$i] ) {
            return( undef );    # element differs
        }
    }

    return( 1 );    # elements are all the same
}
```
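The four example pairs listed in the assumptions can be turned into a small self-check. The sketch below is an assumption-laden stand-in rather than the code from the post: `same_elements` is a hypothetical name for a compact sub that does the same sorted, element-by-element comparison.

```perl
use strict;
use warnings;

# Hypothetical stand-in for check_strings: compare the sorted word lists.
sub same_elements {
    my ( $string1, $string2 ) = @_;
    my @elements1 = sort split /,/, $string1;
    my @elements2 = sort split /,/, $string2;
    return 0 if @elements1 != @elements2;    # differing number of elements
    for my $i ( 0 .. $#elements1 ) {
        return 0 if $elements1[$i] ne $elements2[$i];    # element differs
    }
    return 1;
}

# Each case: first string, second string, expected result (1 = match).
my @cases = (
    [ 'cake,cake,patty', 'cake,patty,cake', 1 ],
    [ 'bread,water',     'water,bread',     1 ],
    [ 'cake,cake,patty', 'patty,cake',      0 ],
    [ 'brad,water',      'water,bread',     0 ],
);

for my $case (@cases) {
    my ( $s1, $s2, $expected ) = @$case;
    my $got = same_elements( $s1, $s2 );
    printf "%-16s vs %-16s => %-8s (%s)\n",
        $s1, $s2,
        $got ? 'match' : 'no match',
        $got == $expected ? 'as assumed' : 'UNEXPECTED';
}
```

Note that duplicates count here: "cake,cake,patty" and "patty,cake" differ because the number of elements differs.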
Re: string comparison by ihb (Deacon) on May 27, 2005 at 10:52 UTC
There are no restrictions on these techniques. The words needn't be unique in the string, nor do they have to be of equal length.

Solution 1: Cancel out common elements.

```perl
use List::Util qw/ first /;

my %count;
$count{$_}++ for split /,/, $str1;
$count{$_}-- for split /,/, $str2;

my $different = defined first { $_ } values %count;
```

The last line says that they're different if there is any value that is non-zero, i.e. any element not cancelled out.

Solution 2: Use `&Test::More::eq_array`. Since `eq_array` compares the arrays element by element, in order, sort both lists first.

```perl
require Test::More;

my $equal = Test::More::eq_array(
    [ sort split ',', $str1 ],
    [ sort split ',', $str2 ],
);
```

`ihb` See perltoc if you don't know which perldoc to read!
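A side effect of the cancelling approach is that whatever is left in the count hash says which words differ and by how much. A minimal sketch of that, assuming a made-up helper named `word_diff` (not from the post):

```perl
use strict;
use warnings;

# Returns a hash ref of word => surplus count.
# Positive: more occurrences in the first string; negative: more in the second.
# An empty hash means the two strings contain the same words.
sub word_diff {
    my ( $str1, $str2 ) = @_;
    my %count;
    $count{$_}++ for split /,/, $str1;
    $count{$_}-- for split /,/, $str2;

    # Drop everything that cancelled out.
    delete $count{$_} for grep { $count{$_} == 0 } keys %count;
    return \%count;
}

my $diff = word_diff( 'cake,cake,patty', 'patty,cake,bread' );
print "$_: $diff->{$_}\n" for sort keys %$diff;
```

Here "cake" is left with a surplus of 1 and "bread" with -1, while "patty" cancels out completely.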
Re: string comparison by davidj (Priest) on May 27, 2005 at 07:22 UTC
Assuming that strings "match" if each comma-delimited word occurs the same number of times in both, the following will do the trick:

```perl
#!/usr/bin/perl
use strict;

my ( %str1, %str2 );
my $str1 = "~cake,pastry";
my $str2 = "pastry,~cake";
my @str1 = split( ",", $str1 );
my @str2 = split( ",", $str2 );

# count how often each word occurs in each string
$str1{$_}++ for @str1;
$str2{$_}++ for @str2;

# look at every word seen in either string, so extra words in $str2 are caught too
my %seen = ( %str1, %str2 );
foreach my $key ( keys %seen ) {
    print "not equal\n" if ( $str1{$key} || 0 ) != ( $str2{$key} || 0 );
}
```

davidj
Re: string comparison by robartes (Priest) on May 27, 2005 at 07:16 UTC
Here's one, naive and probably inefficient, solution:

```perl
#!/usr/local/bin/perl -w
use strict;

my @strings = ( "~cake,pastry", "pastry,~cake", "cake,pastry" );

foreach my $i ( 0 .. 2 ) {
    for ( 0 .. 2 ) {
        next if ( $i == $_ );
        my %left  = map { $_ => 'nevermind' } split /,/, $strings[$i];
        my %right = map { $_ => 'nevermind' } split /,/, $strings[$_];

        # remove every key common to both hashes from both of them
        for my $key ( keys(%left) ) {
            next unless exists $right{$key};
            delete $left{$key};
            delete $right{$key};
        }
        print "String $i matches $_!\n" unless ( keys(%left) || keys(%right) );
    }
}

__END__
Output:
String 0 matches 1!
String 1 matches 0!
```

Never mind the inefficient loop within loop within loop. The important bit to this method is splitting the strings into a hash, then walking over one hash and deleting the keys that exist in the other. If anything is left, the strings do not match.

One quick optimisation is to immediately stop and declare inequality when you find that a key in one hash does not exist in the other - that will save you iterating over the entire hash in the worst case.

*Update:* Check both hashes after delete step. See reply to this post from whatluo as to why. Note that the more efficient solution to this would be as whatluo suggests -- report inequality when the number of keys does not match between the two hashes before going into the delete loop. The solution I offer leaves the differing keys in the hashes, so you can do something with them.

CU Robartes-
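The two optimisations mentioned above (comparing the number of keys first, and stopping at the first key missing from the other hash) could look roughly like this. It is a sketch only; `same_word_set` is a hypothetical name, and because it works on hash keys it treats the strings as sets, so repeated words are ignored.

```perl
use strict;
use warnings;

# Returns 1 if both strings contain the same set of words, 0 otherwise.
sub same_word_set {
    my ( $str1, $str2 ) = @_;
    my ( %left, %right );
    $left{$_}  = 1 for split /,/, $str1;
    $right{$_} = 1 for split /,/, $str2;

    # Cheap check first: a different number of distinct words cannot match.
    return 0 if keys(%left) != keys(%right);

    # Same number of keys, so it is enough to look in one direction.
    # Stop at the first word that is missing from the other hash.
    for my $word ( keys %left ) {
        return 0 unless exists $right{$word};
    }
    return 1;
}

for my $pair ( [ '~cake,pastry', 'pastry,~cake' ], [ '~cake,pastry', 'cake,pastry' ] ) {
    my ( $s1, $s2 ) = @$pair;
    print "$s1 / $s2: ", ( same_word_set( $s1, $s2 ) ? 'match' : 'no match' ), "\n";
}
```

Unlike the delete-based version, this does not leave the differing keys behind for later inspection; it only answers yes or no.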
Re^2: string comparison by whatluo (Novice) on May 27, 2005 at 09:18 UTC
Read more... (330 Bytes)
Re^3: string comparison by robartes (Priest) on May 27, 2005 at 09:51 UTC
You're right - thanks. I've updated my code to check both hashes for leftover keys, which should catch that problem. CU Robartes-
Re: string comparison by tphyahoo (Vicar) on Jun 23, 2005 at 11:07 UTC
See also comparsion of 2 arrays, also by madam. Seems to me to belong to the same problem space, and there is a bit of additional insight there.

