
string comparison

by Madam (Sexton)
on May 27, 2005 at 06:57 UTC ( [id://460957]=perlquestion )

Madam has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I want to find out whether two strings match. Consider these two strings:

my $string1 = '~cake,pastry';
my $string2 = 'pastry,~cake';

The result should be that $string1 and $string2 match, as they have the same words in a different order. But if the two strings are

my $string1 = 'cake,pastry';
my $string2 = '~cake,pastry';

then the result should be that the two strings do not match: though the words are in the same order, "cake" is different from "~cake". I searched the Q&A section but did not find the correct solution. Sorry to bother you if the answer is already present here.

-Madam

Replies are listed 'Best First'.
Re: string comparison
by BrowserUk (Patriarch) on May 27, 2005 at 07:23 UTC

    You could just rejoin your strings after splitting and sorting, and then compare them:

    #! perl -slw
    use strict;

    sub reorder {
        my $string = shift;
        return join ',', sort{ $a cmp $b } split ',', $string;
    }

    my $string1 = '~cake,pastry';
    my $string2 = 'pastry,~cake';
    my $string3 = 'cake,pastry';

    print "$_->[ 0 ] eq $_->[ 1 ]",
        reorder( $_->[ 0 ] ) eq reorder( $_->[ 1 ] ) ? ' match ' : ' dont match'
        for [ $string1, $string2 ], [ $string1, $string3 ], [ $string2, $string3 ];

    __END__
    [ 8:16:14.39] P:\test>junk2
    ~cake,pastry eq pastry,~cake match
    ~cake,pastry eq cake,pastry dont match
    pastry,~cake eq cake,pastry dont match

    Not hugely efficient, but probably not too bad if there are only a few elements in your strings.

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
      Seems like an overly complicated example. Isn't it just as simple as... ?
      my $string1 = '~cake,pastry';
      my $string2 = 'pastry,~cake';

      if (sort_text($string1) eq sort_text($string2)) {
          print "match\n";
      }

      sub sort_text { join(',', sort split(/,/, $_[0]) ) }
      my name's not Keith, and I'm not reasonable.
        Extending this logic a bit, and using <DATA> to read in the test data:
        use strict;
        use warnings;

        my ($first_line, $current_line);
        while (<DATA>) {
            chomp;    # get rid of newline
            # get the first line sorted
            if ($. == 1) {
                $first_line = join ",", sort( split /,/ );
            }
            else {
                # die if doesn't match the first line sorted.
                $current_line = join ",", sort( split /,/ );
                die "$_ didn't match first line" unless $current_line eq $first_line;
            }
        }
        __DATA__
        ~cake,pastry,donuts,meringue
        pastry,~cake,meringue,donuts
        meringue, donuts,pastry,~cake,meringue
        This dies on the third line, as it should.
Re: string comparison
by monarch (Priest) on May 27, 2005 at 07:12 UTC
    It appears that you want to know if all the elements in one string are in the other.

    Assumptions: the following strings will match:
    - "cake,cake,patty", "cake,patty,cake"
    - "bread,water", "water,bread"
    The following strings won't match:
    - "cake,cake,patty", "patty,cake"
    - "brad,water", "water,bread"

    # check_strings - takes two parameters, returns true if same
    sub check_strings( $ $ ) {
        # arguments
        my ( $string1, $string2 ) = @_;

        my @elements1 = sort split(',',$string1);
        my @elements2 = sort split(',',$string2);

        if ( scalar(@elements1) <=> scalar(@elements2) ) {
            return( undef ); # differing number of elements
        }

        for ( my $i = 0; $i < scalar(@elements1); $i++ ) {
            if ( $elements1[$i] cmp $elements2[$i] ) {
                return( undef ); # element differs
            }
        }

        return( 1 ); # elements are all the same
    }
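
    As a quick check of the assumptions listed above, the four example pairs can be run through a comparison that sorts the elements and compares the joined results. This is only a minimal sketch: the name same_elements is made up for illustration and is not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper (name assumed for this sketch): true if both strings
# hold the same comma-separated elements, duplicates included, in any order.
sub same_elements {
    my ( $string1, $string2 ) = @_;
    my $sorted1 = join ',', sort { $a cmp $b } split /,/, $string1;
    my $sorted2 = join ',', sort { $a cmp $b } split /,/, $string2;
    return $sorted1 eq $sorted2;
}

my @pairs = (
    [ 'cake,cake,patty', 'cake,patty,cake' ],    # assumed to match
    [ 'bread,water',     'water,bread'     ],    # assumed to match
    [ 'cake,cake,patty', 'patty,cake'      ],    # assumed not to match
    [ 'brad,water',      'water,bread'     ],    # assumed not to match
);

for my $pair (@pairs) {
    my ( $left, $right ) = @$pair;
    printf "%-16s vs %-16s : %s\n", $left, $right,
        same_elements( $left, $right ) ? 'match' : 'no match';
}
```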
Re: string comparison
by ihb (Deacon) on May 27, 2005 at 10:52 UTC

    There are no restrictions on these techniques. The words needn't be unique in the string, nor do they have to be of equal length.

    Solution 1: Cancel out common elements.

    use List::Util qw/ first /;

    my %count;
    $count{$_}++ for split /,/, $str1;
    $count{$_}-- for split /,/, $str2;

    my $different = defined first { $_ } values %count;
    The last line says that they're different if there is any value that is non-zero, i.e. any element not cancelled out.
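
    The cancelling idea can be wrapped in a small subroutine for reuse. The following is a sketch under assumptions: words_differ is a name invented here, and the input is taken to be plain comma-separated words with no surrounding whitespace.

```perl
use strict;
use warnings;
use List::Util qw( first );

# Hypothetical wrapper (name assumed for this sketch): true if the two
# strings differ in their comma-separated words, counting duplicates.
sub words_differ {
    my ( $str1, $str2 ) = @_;
    my %count;
    $count{$_}++ for split /,/, $str1;    # +1 for each word in the first string
    $count{$_}-- for split /,/, $str2;    # -1 for each word in the second string
    # Any non-zero count is a word that was not cancelled out.
    return defined( first { $_ } values %count );
}

print words_differ( '~cake,pastry', 'pastry,~cake' ) ? "different\n" : "same\n";    # prints "same"
print words_differ( 'cake,pastry',  '~cake,pastry' ) ? "different\n" : "same\n";    # prints "different"
```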

    Solution 2: Use &Test::More::eq_array

    require Test::More;

    # eq_array compares position by position, so sort both lists first
    my $equal = Test::More::eq_array(
        [ sort split ',', $str1 ],
        [ sort split ',', $str2 ],
    );


    See perltoc if you don't know which perldoc to read!

Re: string comparison
by davidj (Priest) on May 27, 2005 at 07:22 UTC
    Assuming that strings "match" if they contain the same comma-delimited words the same number of times, the following will do the trick:

    #!/usr/bin/perl
    use strict;

    my (%str1, %str2);
    my $str1 = "~cake,pastry";
    my $str2 = "pastry,~cake";

    # count how often each word occurs in each string
    $str1{$_}++ for split(",", $str1);
    $str2{$_}++ for split(",", $str2);

    # look at the words of both strings, so extras on either side are caught
    foreach my $key (keys %str1, keys %str2) {
        if ( ($str1{$key} || 0) != ($str2{$key} || 0) ) {
            print "not equal\n";
            last;
        }
    }
Re: string comparison
by robartes (Priest) on May 27, 2005 at 07:16 UTC

    Here's one, naive and probably inefficient, solution:

    #!/usr/local/bin/perl -w
    use strict;

    my @strings=("~cake,pastry","pastry,~cake","cake,pastry");

    foreach my $i ( 0..2 ) {
        for ( 0 .. 2) {
            next if ( $i == $_ );
            my %left=map { $_ => 'nevermind' } split /,/,$strings[$i];
            my %right=map { $_ => 'nevermind' } split /,/,$strings[$_];
            for (keys( %left )) {
                # remove keys common to both hashes from both of them
                if ( exists $right{$_} ) {
                    delete $left{$_};
                    delete $right{$_};
                }
            }
            print "String $i matches $_!\n" unless ( keys %left || keys %right );
        }
    }

    __END__
    Output:
    String 0 matches 1!
    String 1 matches 0!

    Never mind the inefficient loop within loop within loop. The important bit to this method is splitting the strings into a hash, then walking over one hash and deleting the keys that exist in the other. If anything is left, the strings do not match.

    One quick optimisation is to immediately stop and declare inequality when you find that a key in one hash does not exist in the other - that will save you iterating over the entire hash in the worst case.
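
    That early exit could look roughly like the sketch below. The name same_word_set is invented for illustration; like the hash approach above, it treats each string as a set and so ignores how often a word is repeated.

```perl
use strict;
use warnings;

# Hypothetical helper (name assumed for this sketch): compares the sets of
# words in two comma-separated strings, giving up at the first mismatch.
sub same_word_set {
    my ( $string1, $string2 ) = @_;
    my %left  = map { $_ => 1 } split /,/, $string1;
    my %right = map { $_ => 1 } split /,/, $string2;

    for my $word ( keys %left ) {
        return 0 unless exists $right{$word};    # stop at the first missing key
        delete $right{$word};
    }

    # Anything still in %right exists only in the second string.
    return scalar( keys %right ) == 0;
}

print same_word_set( '~cake,pastry', 'pastry,~cake' ) ? "match\n" : "no match\n";
print same_word_set( 'cake,pastry',  '~cake,pastry' ) ? "match\n" : "no match\n";
```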

    Update: Check both hashes after delete step. See reply to this post from whatluo as to why. Note that the more efficient solution to this would be as whatluo suggests -- report inequality when the number of keys does not match between the two hashes before going into the delete loop. The solution I offer leaves the differing keys in the hashes, so you can do something with them.
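
    To illustrate the last point, here is a sketch that hands back the leftover keys from both sides instead of just printing a verdict. The name word_differences and its return convention are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (name assumed for this sketch): returns two array
# references, the words found only in the first string and the words found
# only in the second. Both lists are empty when the word sets match.
sub word_differences {
    my ( $string1, $string2 ) = @_;
    my %left  = map { $_ => 1 } split /,/, $string1;
    my %right = map { $_ => 1 } split /,/, $string2;

    for my $word ( keys %left ) {
        next unless exists $right{$word};
        delete $left{$word};     # common word: remove it from both sides
        delete $right{$word};
    }
    return ( [ sort keys %left ], [ sort keys %right ] );
}

my ( $only1, $only2 ) = word_differences( 'cake,pastry', '~cake,pastry' );
print "only in first:  @$only1\n";    # only in first:  cake
print "only in second: @$only2\n";    # only in second: ~cake
```

    If only a yes/no answer is needed, comparing the number of keys in the two hashes before the loop lets a mismatch be reported without walking the hashes at all.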


        You're right - thanks. I've updated my code to check both hashes for leftover keys, which should catch that problem.


Re: string comparison
by tphyahoo (Vicar) on Jun 23, 2005 at 11:07 UTC
    See also comparsion of 2 arrays, also by madam. Seems to me to belong to the same problem space, and there is a bit of additional insight there.

Node Type: perlquestion [id://460957]
Approved by monkfan