Parse a file and store it in hash of hashes

Sonali has asked for the wisdom of the Perl Monks concerning the following question:

Re: Parse a file and store it in hash of hashes by afoken (Chancellor) on Jan 16, 2017 at 07:12 UTC
Hi. You are aware that PerlMonks is neither a code-writing service nor a job exchange, aren't you? Show what you have tried so far, and we'll help you with the remaining problems.

A great part of using Perl is using CPAN. The file format looks very much like a Windows INI file, and that's a solved problem. Go to http://search.cpan.org and search for "INI". You will find many modules that can handle those files. Follow the links to the module documentation and find the one that fits best. Then open a command prompt, and type `cpan install Your::Favorite::INI::Module`.

If you insist on reinventing the wheel, but have no code yet, look up the documentation of strict, warnings, open, readline, split, and autodie, and at least try to write a piece of code that reads the file line by line and splits each data line into key and value.

As with most other "simple" computer problems: explain the problem in plain English, as you would for a very stupid human. As in: "Open the file foobar.ini. If that fails, stop. Else, read a line ...". From there, translating English to any computer language is quite easy.

Alexander
-- Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
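The plain-English recipe above might translate into something like the following. This is only a minimal sketch: the sample text and the sub name `read_pairs` are made up for illustration, and it reads from an in-memory string rather than `foobar.ini` so that it runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use autodie;    # open() and close() die with a useful message on failure

# Hypothetical sample data; a real script would open a file name instead.
my $text = <<'END';
FIRST = "TEST"
THIRD = 123
END

# Read a filehandle line by line and split each line into key and value.
sub read_pairs {
    my ($fh) = @_;
    my %pairs;
    while ( defined( my $line = readline $fh ) ) {
        chomp $line;
        next unless $line =~ /=/;    # skip lines that are not "key = value"
        my ( $key, $value ) = split /\s*=\s*/, $line, 2;
        $pairs{$key} = $value;
    }
    return %pairs;
}

open my $fh, '<', \$text;    # in-memory file; autodie handles failure
my %pairs = read_pairs($fh);
close $fh;

print "$_ => $pairs{$_}\n" for sort keys %pairs;
```

The limit of 2 passed to `split` is a design choice: it keeps any further `=` characters inside the value instead of cutting the value short.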
Re^2: Parse a file and store it in hash of hashes by Sonali (Novice) on Jan 16, 2017 at 08:24 UTC
This is my code snippet. When I run this program there is no output at all. I am not able to figure out why.

    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;
    my $filename = 'tester.txt';
    my %HoH;
    my $key;
    my $value;
    open(my $fh, '<:encoding(UTF-8)', $filename) or die "Could not open file '$filename' $!";
    while ( <$fh> ) {
        next unless s/^\[(.*?)\]\s*//;
        $rec = $1;
        for my $field ( split /\n/) {
            ($key, $value) = split /\s*=\s*/, $field;
            $HoH{$rec}{$key} = $value;
        }
    }
    print Dumper %HoH;
Re^3: Parse a file and store it in hash of hashes by Corion (Patriarch) on Jan 16, 2017 at 08:34 UTC
I find this highly unlikely. When I run your code, I get the following output:

    Global symbol "$rec" requires explicit package name at q:\tmp.pl line 12.
    Global symbol "$rec" requires explicit package name at q:\tmp.pl line 15.
    Execution of q:\tmp.pl aborted due to compilation errors.

If I declare `$rec` as a lexical variable and create an empty file named `tester.txt`, I get no output. This is because you're not using Data::Dumper properly:

    print Dumper %HoH;
    # should be
    print Dumper \%HoH;

Please post the actual code you are using. Also, look at Config::IniFiles, which already does all of what you're doing.
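The difference between the two `Dumper` calls can be seen in a small experiment. This is a sketch with made-up sample data, not the poster's file; `$Data::Dumper::Sortkeys` is set only to make the output order stable.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order, easier to compare runs

# Made-up sample structure for illustration
my %HoH = ( CELL_NAME => { FIRST => '"TEST"', THIRD => 123 } );

# Passing the hash itself flattens it into a list of separate values,
# so Dumper reports $VAR1, $VAR2, ... one per key or value.
my $flattened = Dumper(%HoH);

# Passing a reference keeps the structure together as a single $VAR1.
my $as_ref = Dumper( \%HoH );

print "Flattened:\n$flattened\nAs reference:\n$as_ref";

# An empty hash flattens to an empty list, so there is nothing to dump.
my %empty;
my $nothing = Dumper(%empty);
print "Empty hash, flattened: '", $nothing, "'\n";
```

With a reference, even an empty hash is still one value to dump, which makes it easier to tell "parsed nothing" apart from "printed nothing".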
Re^4: Parse a file and store it in hash of hashes by Sonali (Novice) on Jan 16, 2017 at 10:07 UTC
Re^3: Parse a file and store it in hash of hashes by Discipulus (Canon) on Jan 16, 2017 at 08:43 UTC
Hello Sonali, and welcome to the monastery and to the wonderful world of Perl!

First of all, follow the wise suggestions of the precise monk afoken.

That said, with the code you posted, and in particular `$rec = $1`, I get the error `Global symbol "$rec" requires explicit package name at pm16012017.pl line 12.` but that is probably a typo. In addition, I think you just need a hash, not a hash of hashes.

Now about your code: if `next unless s/^\[(.*?)\]\s*//;` is intended to skip the first line, it should probably be `next if s/^\[(.*?)\]\s*//;`

Even with this you get warnings about undefined values, `Use of uninitialized value in hash element at inifile16012017.pl line 15,` for each line of data, and the following data structure:

    $VAR1 = '';
    $VAR2 = {
              'FIFTH' => '12345',
              'COMMENT' => '"Perl parsing"',
              'SEVENTH' => 'QWERTY',
              'FOURTH' => '"RANDOM"',
              'SECOND' => '"ID"',
              'FIRST' => '"TEST"',
              'THIRD' => '123',
              'SIXTH' => '6789'
            };

If you intended to have `CELL_NAME` as the root element, you must not skip the line containing it, and `$rec` must be declared outside the loop so that it is at your disposal during the loop:

    my $rec;
    while ( <$fh> ) {
        if (s/^\[(.*?)\]\s*//){$rec = $1}

The resulting data structure (dumped with the prettier `dd` from Data::Dump) will be:

    (
      "CELL_NAME",
      {
        COMMENT => "\"Perl parsing\"",
        FIFTH   => 12345,
        FIRST   => "\"TEST\"",
        FOURTH  => "\"RANDOM\"",
        SECOND  => "\"ID\"",
        SEVENTH => "QWERTY",
        SIXTH   => 6789,
        THIRD   => 123,
      },
    )

L*

There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
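Put together, the corrections discussed in this sub-thread could look roughly like this. It is a sketch, not the original poster's final code: the in-memory `$text` is a hypothetical stand-in for `tester.txt`, and the two `next unless` guards are additions of this sketch.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;

# Hypothetical stand-in for tester.txt, so the sketch runs on its own
my $text = <<'END';
[CELL_NAME]
COMMENT = "Perl parsing"
FIRST = "TEST"
THIRD = 123
END

open my $fh, '<', \$text or die "Could not open in-memory data: $!";

my %HoH;
my $rec;    # declared outside the loop so it survives from line to line

while ( my $line = <$fh> ) {
    chomp $line;
    if ( $line =~ /^\[(.*?)\]/ ) {
        $rec = $1;    # section header: remember it for the lines that follow
        next;
    }
    next unless defined $rec;    # ignore anything before the first header
    next unless $line =~ /=/;    # ignore blank or malformed lines
    my ( $key, $value ) = split /\s*=\s*/, $line, 2;
    $HoH{$rec}{$key} = $value;
}
close $fh;

print Dumper( \%HoH );    # pass a reference, not the flattened hash
```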
Re^4: Parse a file and store it in hash of hashes by Sonali (Novice) on Jan 16, 2017 at 09:51 UTC
Re^5: Parse a file and store it in hash of hashes by Discipulus (Canon) on Jan 16, 2017 at 10:06 UTC
Re: Parse a file and store it in hash of hashes by 1nickt (Canon) on Jan 16, 2017 at 14:09 UTC
Hello Sonali,

I appreciate that you are trying to learn how to do some basics in Perl, and you want to understand how things work. But one of the very best reasons to use a CPAN module for a common task is that it has probably considered all the "edge cases" that you might encounter in your data. Reputable CPAN modules come with a test suite that demonstrates this.

So the risk of writing your own solution is that you may miss a special case, and you won't have a test for it to reveal your error. At the least you should compare the results you get with the results from another processor.

Here is a solution using Config::Tiny::Ordered:

    use strict;
    use warnings;
    use Config::Tiny::Ordered;

    my $file = '1179628.txt';
    my $config = Config::Tiny::Ordered->read( $file );

    foreach my $section_name( sort keys %{ $config } ) {
        print "SECTION: $section_name\n";
        foreach my $item( @{ $config->{ $section_name } } ) {
            printf ( " %7s : %s \n", $item->{'key'}, $item->{'value'} );
        }
        print "\n";
    }
    __END__

Output:

    $ perl 1179628.pl
    SECTION: CELL_NAME1
     COMMENT : "Perl parsing"
       FIRST : "TEST1"
      SECOND : "ID1"
       THIRD : 123
      FOURTH : "THREE"
       FIFTH : 12345
       SIXTH : 6789
     SEVENTH : QWERTY

    SECTION: CELL_NAME2
     COMMENT : "Tester"
       FIRST : "TEST2"
      SECOND : "ID2"
       THIRD : 1234
      FOURTH : "FOUR"
       FIFTH : 12345
       SIXTH : BOARD
     SEVENTH : MOUSE

    SECTION: CELL_NAME3
     COMMENT : "Parser"
       FIRST : "TEST3"
      SECOND : "ID3"
       THIRD : 12345
      FOURTH : "FIVE"
       FIFTH : 12345
       SIXTH : PAD
     SEVENTH : KEY

Hope this helps!

The way forward always starts with a minimal test.
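For anyone who does write a parser by hand, the point about edge cases and tests could be put into practice along these lines. This is a sketch using the core Test::More module; `parse_ini` is a hypothetical hand-rolled parser, and the edge cases shown (comment lines, empty values, a `=` inside a value, extra whitespace) are merely examples chosen for this sketch, not a complete list.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical hand-rolled parser under test: takes the file contents as a
# string and returns a hash-of-hashes reference.
sub parse_ini {
    my ($text) = @_;
    my ( %HoH, $section );
    for my $line ( split /\n/, $text ) {
        next if $line =~ /^\s*(?:[;#].*)?$/;    # blank lines and comments
        if ( $line =~ /^\s*\[(.*?)\]\s*$/ ) {
            $section = $1;
            next;
        }
        next unless defined $section;
        if ( $line =~ /^\s*([^=]+?)\s*=\s*(.*?)\s*$/ ) {
            $HoH{$section}{$1} = $2;
        }
    }
    return \%HoH;
}

my $got = parse_ini(<<'END');
; a comment line
[CELL_NAME1]
FIRST = "TEST1"
EMPTY =
URL = a=b

[CELL_NAME2]
THIRD   =   1234
END

is( $got->{CELL_NAME1}{FIRST}, '"TEST1"', 'quoted value keeps its quotes' );
is( $got->{CELL_NAME1}{EMPTY}, '',        'empty value is an empty string' );
is( $got->{CELL_NAME1}{URL},   'a=b',     'only the first = separates key and value' );
is( $got->{CELL_NAME2}{THIRD}, '1234',    'extra whitespace is trimmed' );
is( scalar keys %$got,         2,         'two sections found' );

done_testing();
```

Running the same input through a CPAN module and comparing the two results would be a natural next check.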
Re^2: Parse a file and store it in hash of hashes by Sonali (Novice) on Jan 17, 2017 at 04:00 UTC
Yes, I tried it and it is way easier. Thank you!
Re: Parse a file and store it in hash of hashes by rahulruns (Scribe) on Jan 16, 2017 at 10:13 UTC
A few ideas: your data would be better organized if you used XML rather than a plain file. You could then parse the XML and store each element name as a key and the element's content as the value for that key. If you want to stick with a plain file, you could split each line on the `=` sign and store the parts in a hash, something like:

    while ( my $line = <$fh> ) {
        chomp $line;                              # drop the trailing newline
        my ($key, $value) = split /=/, $line, 2;  # split on the first = only
        $hash{$key} = $value;
    }

Remember, this is not complete code; it is just to give you a hint.
Re^2: Parse a file and store it in hash of hashes by Sonali (Novice) on Jan 16, 2017 at 10:21 UTC
No, I don't want it in XML format. Thanks for the hint!!
Re: Parse a file and store it in hash of hashes by tybalt89 (Monsignor) on Jan 16, 2017 at 20:17 UTC
I find YAML is usually easier to read for debugging structures than Data::Dumper.

    #!/usr/bin/perl

    # http://perlmonks.org/?node_id=1179628

    use strict;
    use warnings;

    my $rec;
    my %HoH;

    while(<DATA>) {
        if( /^\[(.*?)\]/ ) {
            $rec = $1;
        }
        elsif( defined $rec and /(\S+)\s*=\s*(".*"|\S+)/ ) {
            $HoH{$rec}{$1} = $2;
        }
    }

    use YAML;
    print Dump \%HoH;

    __DATA__
    [CELL_NAME1]
    COMMENT = "Perl parsing"
    FIRST = "TEST1"
    SECOND = "ID1"
    THIRD = 123
    FOURTH = "THREE"
    FIFTH = 12345
    SIXTH = 6789
    SEVENTH = QWERTY

    [CELL_NAME2]
    COMMENT = "Tester"
    FIRST = "TEST2"
    SECOND = "ID2"
    THIRD = 1234
    FOURTH = "FOUR"
    FIFTH = 12345
    SIXTH = BOARD
    SEVENTH = MOUSE

    [CELL_NAME3]
    COMMENT = "Parser"
    FIRST = "TEST3"
    SECOND = "ID3"
    THIRD = 12345
    FOURTH = "FIVE"
    FIFTH = 12345
    SIXTH = PAD
    SEVENTH = KEY
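Once the data is in a hash of hashes, reading it back is a matter of two nested lookups. The following sketch uses made-up data in the same shape as the structure built above; the section name `CELL_NAME9` is deliberately one that does not exist.

```perl
use strict;
use warnings;

# Made-up data in the same shape the parser above produces
my %HoH = (
    CELL_NAME1 => { FIRST => '"TEST1"', THIRD => 123 },
    CELL_NAME2 => { FIRST => '"TEST2"', THIRD => 1234 },
);

# A single value: section first, then key
print "CELL_NAME2/THIRD is $HoH{CELL_NAME2}{THIRD}\n";

# Every section and every key, in sorted order
my @lines;
for my $section ( sort keys %HoH ) {
    for my $key ( sort keys %{ $HoH{$section} } ) {
        push @lines, "$section.$key = $HoH{$section}{$key}";
    }
}
print "$_\n" for @lines;

# Looking up an inner key autovivifies a missing outer key, so check the
# section first; the && short-circuits before the inner lookup happens.
print "no such section\n"
    unless exists $HoH{CELL_NAME9} && exists $HoH{CELL_NAME9}{FIRST};
```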

