PerlMonks  

Re: Dive data with automatic array indexing

by Loops (Curate)
on Oct 27, 2014 at 05:25 UTC ( [id://1105077]=note )


in reply to Dive data with automatic array indexing

Hi Chris, I'm not sure exactly what your end goal is, or whether the $VAR1 data from the Dumper output you show is valuable to you at all. If it isn't, you might consider using one of the config file parsers from CPAN. It might take some coercing to get one to conform to your needs, but you seem to have the flexibility to change the format of your files, so an automated translation should be doable. Plus you'll get the escaping and quoting features that you mention are lacking in your code.

Here's a quick example:

    use strict;
    use warnings;
    use Data::Dumper;
    use Config::Scoped;

    my $cs = Config::Scoped->new();
    my @lines = ("data{all = [\n", <DATA>, "]}\n");
    my $Config = $cs->parse(text => join('', @lines))->{data}->{all};
    print Dumper($Config);

    __DATA__
    {
        name = john
        location = uk
        interests = [ programming cycling ]
    }
    # An ignored comment
    {
        name = laura
        interests = [ knitting tennis dancing ]
    }
    {
        location = canada
        interests = [[ dogs horses ] cars]
    }

Which displays:

    $VAR1 = [
      {
        'interests' => [
          'programming',
          'cycling'
        ],
        'location' => 'uk',
        'name' => 'john'
      },
      {
        'interests' => [
          'knitting',
          'tennis',
          'dancing'
        ],
        'name' => 'laura'
      },
      {
        'location' => 'canada',
        'interests' => [
          [
            'dogs',
            'horses'
          ],
          'cars'
        ]
      }
    ];

If you absolutely need the keys that have undefined values, as in your example output, the code can be changed to include them. Writing a translator from your existing files into this format should be as easy as writing one for the new syntax you are proposing. However, other CPAN options might be more to your liking.

Re^2: Dive data with automatic array indexing
by peterp (Sexton) on Oct 27, 2014 at 06:25 UTC

    Hi Loops,

    Firstly, thank you for your reply.

    Your alternative solution is brilliant and is an option I will seriously consider when I get round to porting to a more practical format. I didn't know such a configuration parser existed.

    As briefly mentioned in my question, there will be complexities in doing this, and it is out of scope for the task I have been assigned.

    Some of the reasons behind this are:

    - This was assigned as a quick-fix task, which can be revisited when there is more time in a few months. The more I have to change, the more time goes into development and testing.
    - There are multiple configuration files that contain unrelated data but are all parsed by the same parser. I do not have permission to update these configuration files just yet, although implementing two parsers might be an option.
    - The particular configuration file I am dealing with contains unrelated data, notably Log::Log4perl configuration data, which as far as I am aware must be in its documented format (selector based). Mixing formats might pose an issue, although from the documentation it looks as though you can init with a ref, which could be derived from the configuration file.
    - The same parser is used to process application/x-www-form-urlencoded multidimensional HTTP parameters. The task includes implementing automatic array indexing for these too, while retaining their existing explicit array indexing usage. Therefore, either way I'll have to perfect the above approach.
    etc

    Lastly, I haven't got around to properly going through the production code that handles this stuff just yet; the demo was just something I devised in my own time while it was fresh in my head. I certainly hope the todos I marked have already been implemented!

    Chris

      Okay. Have a better appreciation of your constraints. Have played around with your code a bit and I think it covers all of the cases that you showed well. There may be a bit of an opportunity to make the config file syntax a little less visually cluttered though.

      If you can trust that there are empty lines between the blocks and not within, you can strip off the need for the > or < symbols at the start of each line. Also if you just assume an empty selector means >, then you only have to place the < selectors in cases where you don't want to nudge the index forward. Check out the example DATA below the code. The code will still parse your original example as well if you don't like these changes.

      All bugs are mine after twisting your code around like this:

          use strict;
          use warnings;
          use Data::Diver qw( DiveVal DiveError );
          use Data::Dumper qw( Dumper );

          my $state = { };
          my $ref = [ ];
          my $prefix = '<';

          while (<DATA>) {
              chomp;
              next if /^\s*#.*$/;
              if (/^\s*$/) { $prefix = '>'; next; };
              my ($selector, $value) = split /\s*=\s*/;
              next unless defined $selector;
              my @selector = split /\./, $selector =~ s/\.$/.>/r;
              unshift(@selector, $prefix) unless $selector =~ /^[><]\./;
              _dive( $ref, \@selector, $value );
              $prefix = '<';
          }
          print Dumper $ref;

          sub _dive {
              my ( $ref, $selector, $value ) = @_;
              return unless defined $ref and defined $selector and scalar @$selector;
              my @selector_b;
              for ( @$selector ) {
                  if ( /^(>|<|)$/ ) {
                      my $selector_b = join '.', @selector_b;
                      if ( $1 eq '<') {
                          push @selector_b, $state->{$selector_b} //= 0;
                      } else {
                          $state->{$selector_b} += defined $state->{$selector_b} ? 1 : 0;
                          push @selector_b, $state->{$selector_b};
                      }
                  } else {
                      push @selector_b, $_;
                  }
              }
              DiveVal( $ref, @selector_b ) = $value;
              my ( $error ) = DiveError( );
              $error and warn $error;
              return 1;
          }

          __DATA__
          name = john
          location = uk
          interests. = programming
          interests. = cycling

          name = laura
          location =
          interests. = knitting
          interests. = tennis
          interests. = dancing

          name
          location = canada
          interests.. = dogs
          interests.<. = horses
          interests. = cars
          # test.error = blah

        Hi Loops,

        Thank you once again.

        I haven't yet tested your version, but have spent some time reading and understanding its functionality. It appears to be a much more elegant solution. I particularly like its ability to try to do the right thing when a selectee is blank or missing; this will be beneficial in shortening long URL query strings. I now wonder if it's possible to do something similar with repetitive hash keys. I also thought the block handling the += was nicely thought out; I was unable to come up with a one-line notation myself, though I will probably combine the push to remain consistent with the corresponding if. Finally, because I am on an outdated Perl version, I will have to write an alternative to the regexp /r flag (handle it in two steps, or use do).
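For the /r point, here is a minimal sketch of the two-step alternative, assuming the only goal is to reproduce `split /\./, $selector =~ s/\.$/.>/r` on a Perl older than 5.14, where /r is not available. The helper name selector_parts is hypothetical and does not appear in the code above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the thread's code): splits a selector
# into its components after turning a trailing "." into ".>", without
# relying on the /r flag introduced in Perl 5.14.
sub selector_parts {
    my ($selector) = @_;

    # Step 1: copy the selector, then substitute on the copy so the
    # caller's string is left untouched (this is what /r did in one go).
    ( my $copy = $selector ) =~ s/\.$/.>/;

    # Step 2: split the modified copy on literal dots.
    return split /\./, $copy;
}

print join( '|', selector_parts('interests.') ),   "\n";
print join( '|', selector_parts('interests.<.') ), "\n";
print join( '|', selector_parts('name') ),         "\n";
```

The `( my $copy = $selector ) =~ s/.../.../` form keeps the original $selector intact, which matters because the calling loop tests it again on the next line.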

        I will have to perform rigorous testing and get back to you if I have any outstanding concerns, but this was exactly what I needed. Between your two replies you've killed two birds with two stones, and I'm very grateful for your expertise and time. I'm also working on an _undive function (reverse dive / ref to selector), which is proving to be tricky, but I'm getting there.
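One possible shape for such a reverse dive is sketched below. It is not the poster's implementation: it assumes that explicit numeric indexes are acceptable in the output, that keys never contain a literal ".", and that empty hashes and arrays can be dropped. The sample data is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical _undive: walks nested hash/array refs and returns a list of
# [ selector, value ] pairs, with array positions as explicit indexes.
sub _undive {
    my ( $data, @path ) = @_;
    my @pairs;
    if ( ref $data eq 'HASH' ) {
        # Sort the keys so the output order is stable between runs.
        push @pairs, _undive( $data->{$_}, @path, $_ ) for sort keys %$data;
    }
    elsif ( ref $data eq 'ARRAY' ) {
        push @pairs, _undive( $data->[$_], @path, $_ ) for 0 .. $#$data;
    }
    else {
        # Leaf value: join the accumulated path into a dotted selector.
        push @pairs, [ join( '.', @path ), $data ];
    }
    return @pairs;
}

# Illustrative data only, loosely modelled on the examples in this thread.
my $ref = [
    { name => 'john', interests => [ 'programming', 'cycling' ] },
    { location => 'canada', interests => [ [ 'dogs', 'horses' ], 'cars' ] },
];

for my $pair ( _undive($ref) ) {
    my ( $selector, $value ) = @$pair;
    print "$selector = ", ( defined $value ? $value : '' ), "\n";
}
```

Emitting the shorter automatic-index form (a trailing "." instead of a number) would need an extra pass that tracks the last index seen per path, which this sketch leaves out.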

        Chris
