PerlMonks  

Parsing Challenge

by symŽ (Acolyte)
on May 24, 2001 at 22:16 UTC (#83042=perlquestion)

symŽ has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,
I need to parse a string that looks something like this:
$data="key=value key=value value key=value value value key=value key=value key=value value";

As you can see, some keys have multiple values separated by spaces. I'm not sure of a good way to build a hash from this.
I am able to parse out the values with this bit of code:
my @values=split(/\S+=/,$data);
but I can't figure out a good way to get the keys because of the varying format of the values with spaces.
Any ideas?

Replies are listed 'Best First'.
Re: Parsing Challenge
by Masem (Monsignor) on May 24, 2001 at 22:44 UTC
    my @off_pairs = split '=', $input;
    my %hash;
    my $lastkey = $off_pairs[0];
    foreach ( 1 .. @off_pairs - 2 ) {
        my @array = split ' ', $off_pairs[ $_ ];
        my $newlastkey = pop @array;
        $hash{ $lastkey } = join ' ', @array;
        $lastkey = $newlastkey;
    }
    $hash{ $lastkey } = $off_pairs[-1];

    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain

      Yeah, that is more what I was thinking. Here is a similar method:

      my $key;
      for( split /(?<==)/, $input ) {   # Split after each "="
          s#(^|\s+)(\S+)=$##;           # Remove trailing key
          if( defined $key ) {          # Not first time thru:
              $hash{$key}= $_;
          }
          $key= $2;                     # Note next key to use.
      }

              - tye (but my friends call me "Tye")
Re: Parsing Challenge
by gumpu (Friar) on May 24, 2001 at 22:54 UTC

    How about:

    my $data = "aap=10 20 30 noot=50 60 mies=20 teun=3";
    my %foo;
    while ($data =~ m/(\S+)=(\S+($|([^=]+\s+)+))/g) {
        $foo{$1}=$2;
    }

    If needed, $2 can be further split into individual values.
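That further split might look something like the sketch below. It assumes %foo already holds space-separated strings of the kind the loop above produces; the hash name %values_for is made up for illustration.

```perl
use strict;
use warnings;

# Assumed contents: value strings as captured in $2, trailing space included
my %foo = (
    aap  => '10 20 30 ',
    noot => '50 60 ',
    mies => '20 ',
    teun => '3',
);

# split ' ' splits on runs of whitespace and ignores leading and
# trailing whitespace, so the stray trailing space does no harm
my %values_for;
$values_for{$_} = [ split ' ', $foo{$_} ] for keys %foo;

for my $key (sort keys %values_for) {
    print "$key: ", join(', ', @{ $values_for{$key} }), "\n";
}
```

Each key then maps to an array reference, so a single value is reachable as $values_for{aap}[1].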

    Have Fun

      Though we have arrived at very similar solutions, I think yours fails to pick up multiple values for the last key:
      use strict;
      use warnings;
      use Data::Dumper;

      my $data="key1=value1 key2=value2 value3 key3=value4 value5 value6 key4=value7 key5=value8 key6=value9 value10";

      # Albannach version
      my %hash;
      while ($data =~ /(\w+=)((\w+( |$)+)+)/g) {
          $hash{$1} = $2;
      }
      print Dumper \%hash;

      # gumpu version:
      my %foo;
      while ($data =~ m/(\S+)=(\S+($|([^=]+\s+)+))/g) {
          $foo{$1}=$2;
      }
      print Dumper \%foo;
      Produces:
      $VAR1 = {
                'key4=' => 'value7 ',
                'key5=' => 'value8 ',
                'key1=' => 'value1 ',
                'key6=' => 'value9 value10',
                'key2=' => 'value2 value3 ',
                'key3=' => 'value4 value5 value6 '
              };
      $VAR1 = {
                'key1' => 'value1 ',
                'key2' => 'value2 value3 ',
                'key3' => 'value4 value5 value6 ',
                'key4' => 'value7 ',
                'key5' => 'value8 ',
                'key6' => 'value9 '
              };

      Update: The = in the key? You are right of course - silly me! Thanks gumpu!

      --
      I'd like to be able to assign to an luser

        It does not pick up the last value for the last key, should have tested it better. I also think that your version is more readable. Though I would leave out the = from the key.

        $hash{$1} = $2 while ($data =~ /(\w+)=((\w+( |$)+)+)/g);

        Have Fun

Re: Parsing Challenge
by myocom (Deacon) on May 24, 2001 at 22:48 UTC

    This sounds like a job for a hash of hashes! Using the code below, you can get at the keys by referring to keys %hash and you can get at all of the values of key2, for example, by referring to keys %{$hash{key2}}

    use strict;

    my $data = "key1=valuea key2=valuea valueb key3=valuea valueb valuec key4=valuea key5=valuea key6=valuea valueb";

    my %hash = ParseData($data);

    print "All of the keys are: ", join(" ",keys %hash),"\n\n";
    print "Key2's keys are: ", join(" ",keys %{$hash{key2}});

    sub ParseData {
        my $datastring = shift;
        my %data;
        my $currkey;

        foreach (split(' ',$datastring)) {
            if (/=/) {
                my $val;
                ($currkey, $val) = split('=');
                $data{$currkey}{$val}++;
            }
            else {
                $data{$currkey}{$_}++;
            }
        }
        return %data;
    }
Re: Parsing Challenge
by MeowChow (Vicar) on May 25, 2001 at 05:49 UTC
    I'm partial to these methods:
    my %hash = split /=|\b(?=\S+=)/, $data;
    $hash{$_} = [ split / /, $hash{$_} ] for keys %hash;
    or
    my %hash;
    $hash{$1} = [split / /, $2] while $data =~ /(\S+)=(.*?)(?=\S+=|$)/g;
    or to support repeated keys:
    my %hash;
    push @{$hash{$1}}, split / /, $2 while $data =~ /(\S+)=(.*?)(?=\S+=|$)/g;
       MeowChow                                   
                   s aamecha.s a..a\u$&owag.print
Re: Parsing Challenge
by converter (Priest) on May 25, 2001 at 07:57 UTC
    Assuming unique key values:
      DB<1> $data = 'key1=val val key2=val key3=val val val val key4=val val'
      DB<2> %hash = $data =~ /([^ =]+)=([^ =]+(?: [^ =]+(?= |$))*)/g
      DB<3> x %hash
    0  'key1'
    1  'val val'
    2  'key2'
    3  'val'
    4  'key3'
    5  'val val val val'
    6  'key4'
    7  'val val'
Re: Parsing Challenge
by gregor42 (Parson) on May 24, 2001 at 22:56 UTC

    Wow... When I started working on this there were no posts. Now there's 3.

    Well I'll post my own solution also, though I think I've already been outdone.

    My approach is simple. First split on the space to an array & then iterate over it looking for '='

    When it's found, assign to that key. Every value that follows is assumed to belong to that key until another is mentioned.

    You did say the purpose here was to assign multiple values to an individual key, yes?

    Well then, assuming that your values don't contain commas, you could do something similar when extracting. Test for a ',' & if it exists, split on it.

    Anyway, here's my effort, I hope it helps:

    #!e:\perl\bin\perl.exe -w
    use strict;

    my $data="key1=value1 key2=value2 value3 key4=value4 value5 value6 key5=value7 key6=value8 key7=value9 value10";
    my @values=split(/ /,$data);
    my ($h,$value, $v, $recentkey);
    my %output;
    my @hashkeys;

    for $value(@values) {
        if ($value =~ /=/) {
            # A "key=value" token starts a new key.
            ($recentkey, $v) = split(/=/,$value);
            $output{$recentkey} = $v;
        } else {
            # A bare token is one more value for the most recent key.
            $output{$recentkey} .= ",$value";
        }
    }

    @hashkeys = keys %output;
    for $h(@hashkeys) {
        print "$h: $output{$h} \n";
    }
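The extraction step described above (split on ',' when reading a key back out) could be sketched roughly as follows. The data is hypothetical, shaped like the comma-joined %output hash, and %list_for is an illustrative name. As in the post, this assumes no value contains a comma.

```perl
use strict;
use warnings;

# Hypothetical data, shaped like the comma-joined %output hash
my %output = (
    key1 => 'value1',
    key2 => 'value2,value3',
    key4 => 'value4,value5,value6',
);

# split hands back the whole string when it finds no comma,
# so single values need no separate test
my %list_for;
$list_for{$_} = [ split /,/, $output{$_} ] for keys %output;

for my $key (sort keys %list_for) {
    print "$key: ", scalar @{ $list_for{$key} }, " value(s)\n";
}
```

Since split copes with the no-comma case on its own, the explicit test for ',' is optional here.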


    Wait! This isn't a Parachute, this is a Backpack!
Re: Parsing Challenge
by symŽ (Acolyte) on May 25, 2001 at 00:33 UTC
    All great ideas, thanks! I tried many of them and settled on a hybrid. I had to make a few somewhat ugly tweaks because some of the key names were repeated with different values (they are intended to be different keys with the same name, funky), so I just tacked an increment number onto the key name. Hey, whatever works, right?
    Here is the code
    my $data="fruit=pear meat=chicken legs fruit=orange slices cheese=monterey jack meal=lunch meat=ribs bread= ";
    my %foo;
    while ($data =~ m/(\S+)=(\S*($|([^=]+\s+)*))/g) {
        my ($tmpkey,$tmpval);
        my $count=1;
        $tmpkey=$1;
        $tmpval=$2;
        while (defined $foo{$tmpkey}){
            $count++;
            $tmpkey=~s/\_\d+$//;
            $tmpkey.="_$count";
        }
        $foo{$tmpkey}=$tmpval;
    }
    for (sort (keys %foo)){
        if ($_!~/_\d+/){
            my $rm=$_;
            $_.="_1";
            $foo{$_}=$foo{$rm};
            delete $foo{$rm};
        }
        chomp ($_,$foo{$_});
        print "$_->$foo{$_}\n";
    }
Re: Parsing Challenge
by blue_cowdawg (Monsignor) on May 24, 2001 at 22:28 UTC

    How about this:

    #!/usr/bin/perl -w
    ###############################################################
    use strict;
    use Data::Dumper;

    my $data=qq(key1=value1 key2=value2 key3=value3 key4=value4);
    my $v={};

    foreach my $pair (split(' ',$data)) {
        my($key,$value)=split('=',$pair);
        $v->{$key}=$value;
    }
    print Dumper($v);

    Yields:

    $VAR1 = {
              'key1' => 'value1',
              'key2' => 'value2',
              'key3' => 'value3',
              'key4' => 'value4'
            };

    Alternatively you can do the following with the same results

    #!/usr/bin/perl -w
    ###############################################################
    use strict;
    use Data::Dumper;

    my $data=qq(key1=value1 key2=value2 key3=value3 key4=value4);
    my $v={};

    map {
        my($k,$vl) = split('=',$_);
        $v->{$k}=$vl;
    } split(' ',$data);
    print Dumper($v);

    HTH


    Peter L. Berghold, Schooner Technology Consulting, Inc.
    Peter@Berghold.Net  www.berghold.net
      Some of the keys have multiple values, though. You could remember the last key...
      foreach (split " ", $data) {
          if (/^([^=]*)=(.*)/)  { $hash{$last=$1}=$2 }
          elsif (defined $last) { $hash{$last} .= " $_" }
      }
      Since "key=" is easy to recognize, another way would be to use split:
      %hash = ("JUNK", split /([^=\s]*)=/, $data)
      if you don't mind the JUNK, that is.
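If the JUNK does bother you, it can be deleted afterwards. The sketch below uses made-up sample data. The leading empty field that split returns is what pairs up with the "JUNK" placeholder, and each value keeps the whitespace that preceded the next key, so it trims that too.

```perl
use strict;
use warnings;

# Sample data, made up for illustration
my $data = "key1=value1 key2=value2 value3 key3=value4";

# The capturing group makes split return the keys as well as the values.
# The empty field before the first key becomes the value of "JUNK".
my %hash = ("JUNK", split /([^=\s]*)=/, $data);
delete $hash{JUNK};

# values() aliases the hash values, so s/// edits them in place
s/\s+$// for values %hash;

print "$_ => '$hash{$_}'\n" for sort keys %hash;
```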

        What if the literal "key" is not always present? What if the pairs become:

        my $data="fruit=apple meat=steak ....";


        Peter L. Berghold, Schooner Technology Consulting, Inc.
        Peter@Berghold.Net  www.berghold.net
