Re^3: Unable to get the paragraph in the list of hashes. Getting single lines instead.

by GrandFather (Saint)
on Sep 20, 2020 at 22:58 UTC ([id://11121978])


in reply to Re^2: Unable to get the paragraph in the list of hashes. Getting single lines instead.
in thread Unable to get the paragraph in the list of hashes. Getting single lines instead.

You can break out the record fields by processing the record string as a list of lines:

    use strict;
    use warnings;
    use Data::Dumper;

    my %records;

    local $/ = "\n\n";

    while (<DATA>) {
        next if !/^(\d+):(.*)/s;

        my ($id, $tail) = ($1, $2);

        local $/ = "\n";
        open my $recIn, '<', \$tail;

        while (<$recIn>) {
            chomp;
            #next if !/(\w+)\s*=\s*(.*)/;
            #
            #$records{$id}{$1} = $2;
            next if !/^\s*?([^=]+)\s*=\s*(.*)/;

            my ($key, $value) = ($1, $2);

            s/^\s+|\s+$//g for $key, $value;
            $records{$id}{$key} = $value;
        }
    }

    print Dumper(\%records);

    __DATA__
    ...

Using the previous data prints:

    $VAR1 = {
              '1' => {
                       'Capacity' => '288196762624 (268.4G)',
                       'WWN' => '06:00:00:00:05:00:00:00:00:00:00:00:00:00:00:03',
                       'Pool' => 'performance',
                       'Model' => 'STE30065 CLAR300',
                       'Maximum speed' => '6 Gbps',
                       'Health details' => '"The component is operating normally. No action is required."',
                       'Vendor capacity' => '322122547200 (300.0G)',
                       'Part number' => '005049273',
                       'Enclosure' => 'DPE_0',
                       'Health state' => 'OK (5)',
                       'Serial number' => '6SJ2C6MV',
                       'Slot' => '0',
                       'Type' => 'SAS',
                       'User capacity' => '236420176896 (220.2G)',
                       'ID' => 'disk_dpe_0_0',
                       'Manufacturer' => 'SEAGATE',
                       'Name' => 'DPE Disk 0',
                       'Current speed' => '6 Gbps',
                       'Rotational speed' => '15000 rpm',
                       'Firmware revision' => 'ES0E'
                     },
              '2' => {
                       'WWN' => '06:00:00:00:05:00:00:00:01:00:00:00:01:00:00:03',
                       'Pool' => 'performance',
                       'Capacity' => '288196762624 (268.4G)',
                       'Slot' => '1',
                       'Serial number' => '6SJ28QF3',
                       'Health state' => 'OK (5)',
                       'Part number' => '005049273',
                       'Enclosure' => 'DPE_0',
                       'Model' => 'STE30065 CLAR300',
                       'Maximum speed' => '6 Gbps',
                       'Health details' => '"The component is operating normally. No action is required."',
                       'Vendor capacity' => '322122547200 (300.0G)',
                       'ID' => 'disk_dpe_0_1',
                       'User capacity' => '236420176896 (220.2G)',
                       'Manufacturer' => 'SEAGATE',
                       'Type' => 'SAS',
                       'Rotational speed' => '15000 rpm',
                       'Firmware revision' => 'ES0E',
                       'Current speed' => '6 Gbps',
                       'Name' => 'DPE Disk 1'
                     }
            };

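Note that local overrides the variable's value only until the end of the enclosing block, so setting $/ inside the main loop body doesn't affect while (<DATA>): the paragraph-mode value is back in place before the next record is read. open my $recIn, '<', \$tail; opens an in-memory file handle, so the string in $tail can be read line by line as if it were a file.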
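Both points can be tried in isolation. The following is only a minimal sketch: the sample string and the variable names ($sample, @counts, $rec_in) are made up for illustration and are not part of the code above.

```perl
use strict;
use warnings;

# Made-up sample: two records separated by a blank line.
my $sample = "1:\nName = DPE Disk 0\nSlot = 0\n\n2:\nName = DPE Disk 1\nSlot = 1\n";

my @counts;
{
    local $/ = "\n\n";    # paragraph-at-a-time, for this block only
    open my $in, '<', \$sample or die "Cannot open in-memory file: $!";

    while (my $record = <$in>) {
        local $/ = "\n";  # line-at-a-time until the loop body ends
        open my $rec_in, '<', \$record or die "Cannot open in-memory file: $!";

        # Read the record as lines, ignoring the blank separator line.
        my @lines = grep { /\S/ } <$rec_in>;
        push @counts, scalar @lines;
    }
}

print "Lines per record: @counts\n";
```

If the inner local leaked into the outer read, the second record would arrive one line at a time instead of as a single three-line paragraph.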

Update: Replaced the (now commented out) code to fix the issue caught by tybalt89, where only a single word was matched for each key.
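For anyone wondering what changed, here is a small sketch that runs the commented-out pattern and its replacement against one sample line. The variable names ($line, $old_key, $new_key) are chosen for illustration only.

```perl
use strict;
use warnings;

my $line = 'Maximum speed = 6 Gbps';

# Commented-out pattern: \w+ cannot span the space and the match is
# not anchored, so only the last word before '=' is captured.
my ($old_key) = $line =~ /(\w+)\s*=\s*(.*)/;

# Replacement pattern: everything before '=' is the key, then the
# surrounding white space is trimmed.
my ($new_key, $value) = $line =~ /^\s*?([^=]+)\s*=\s*(.*)/;
s/^\s+|\s+$//g for $new_key, $value;

print "old key: '$old_key'\n";
print "new key: '$new_key', value: '$value'\n";
```

With the old pattern, 'Maximum speed', 'Current speed' and 'Rotational speed' would all end up under the same hash key and overwrite each other.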

Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

Re^4: Unable to get the paragraph in the list of hashes. Getting single lines instead.
by tybalt89 (Monsignor) on Sep 21, 2020 at 06:48 UTC

    This only gets the last word of multi-word keys - i.e. "speed" - when there are three different "speeds" in the data.

      Nice catch! Updated in the node.

Re^4: Unable to get the paragraph in the list of hashes. Getting single lines instead.
by pritesh_ugrankar (Monk) on Sep 21, 2020 at 19:49 UTC

    Hi,

    Sorry for the late reply. Just logged in after work. This is awesome!! Truly a genius solution!! There's a lot I've learnt from your code.

    Just so that I get it right, when you say next if !/^(\d+):(.*)/s; it means, just move next if you find a line that starts with one or more digits, followed by a colon, and then some stuff. I am not sure what the /s does though. Does it mean "spill" this regex over even if there is a new line?

    Further down, the next if !/^\s*?([^=]+)\s*=\s*(.*)/ I guess means:

    next if !   -> Move to the next line if the line does NOT
    /
    ^\s*?       -> begin with one or more space (? makes this lazy I guess)
    ([^=]+)     -> Does not include the literal "equal to" sign & create capture group of whatever text is there.
    \s*         -> Some more space.
    =           -> Literal "equal to"
    \s*         -> Some more space.
    (.*)        -> Second capture group of the remaining stuff.
    /

    Please let me know if my understanding is right.

    The line my ($id, $tail) = ($1, $2) and the entire code in the second while loop are simply amazing and an eye opener!! I have no words to express my gratitude for showing this amazing stuff!!

      next if !/^(\d+):(.*)/s; means: if the record does not match the regex, skip the remainder of the loop body and start the next iteration - in this case, deal with the next record.

      The \s*? isn't actually required at all. The ? does mean "lazy" - match the fewest white space characters possible and still match. See perlre for regular expression documentation.
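A quick way to see what the lazy quantifier does at that spot. This is a sketch with a made-up input line, and the greedy variant is shown only for contrast; it is not in the node's code.

```perl
use strict;
use warnings;

my $line = "   Slot = 0";

# Lazy: \s*? is happy to match nothing, so the capture keeps the
# leading white space.
my ($lazy) = $line =~ /^\s*?([^=]+)/;

# Greedy: \s* takes the leading white space before the capture starts.
my ($greedy) = $line =~ /^\s*([^=]+)/;

print "lazy:   '$lazy'\n";
print "greedy: '$greedy'\n";
```

Either way the later trim cleans up the key, which is why the \s*? can simply be dropped.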

      The /s (s at the end of the regex) means treat the string as a single line: . then matches newline characters too. That lets (.*) capture the remainder of the record regardless of the line breaks in it.
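The effect of /s is easy to check on a small multi-line string. In this sketch the record text and the variable names ($without, $with) are made up for illustration.

```perl
use strict;
use warnings;

my $record = "1:\nName = DPE Disk 0\nSlot = 0";

# Without /s, '.' stops at the first newline, so nothing is captured
# after the colon.
my ($without) = $record =~ /^\d+:(.*)/;

# With /s, '.' also matches newlines, so the rest of the record is
# captured in one go.
my ($with) = $record =~ /^\d+:(.*)/s;

print "without /s: ", length($without), " characters captured\n";
print "with /s:    ", length($with), " characters captured\n";
```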

      next and last are loop control statements. They skip to the next iteration or terminate the loop respectively. They are very important to understand to help write clear succinct code.
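A minimal sketch of the two statements side by side. The input list and the 'STOP' marker are invented for the example.

```perl
use strict;
use warnings;

my @lines = ('# comment', 'Name = DPE Disk 0', '', 'Slot = 0', 'STOP', 'Type = SAS');

my @kept;
for my $line (@lines) {
    last if $line eq 'STOP';      # leave the loop entirely
    next if $line =~ /^\s*#/;     # skip comment lines
    next if $line !~ /\S/;        # skip blank lines
    push @kept, $line;            # only reached for lines worth keeping
}

print join('|', @kept), "\n";
```

Putting the tests at the top of the loop body keeps the interesting code free of nested if blocks.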

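      Your analysis of the regex is spot on. The regex is not so great though. In particular, both the \s* matches could be omitted because they have no effect on the result: [^=]+ already swallows any white space before the =, and the trim that follows strips white space from both the key and the value anyway - maybe my coffee levels were dropping by the time I wrote that?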
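To check that claim, the sketch below compares the pattern as posted with a stripped-down one. The simplified pattern is an assumption about what the regex would look like without the white space matches; the sub names (parse_posted, parse_simplified) and the sample lines are made up.

```perl
use strict;
use warnings;

# Pattern as posted in the node, followed by the trim step.
sub parse_posted {
    my ($line) = @_;
    return if $line !~ /^\s*?([^=]+)\s*=\s*(.*)/;
    my ($key, $value) = ($1, $2);
    s/^\s+|\s+$//g for $key, $value;
    return ($key, $value);
}

# Assumed simplification: no \s* (or \s*?) in the pattern at all.
sub parse_simplified {
    my ($line) = @_;
    return if $line !~ /^([^=]+)=(.*)/;
    my ($key, $value) = ($1, $2);
    s/^\s+|\s+$//g for $key, $value;
    return ($key, $value);
}

for my $line ('  Health state = OK (5)', 'Slot=0', "Name =\tDPE Disk 0  ") {
    my $posted     = join '|', parse_posted($line);
    my $simplified = join '|', parse_simplified($line);
    print "$posted: ", ($posted eq $simplified ? 'same' : 'different'), "\n";
}
```

The trim does all the white space handling, so the pattern only has to find the first = sign.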

