perly_white has asked for the wisdom of the Perl Monks concerning the following question:

I have a text file of the form:

    Filename1 Item1 - Answer
    Filename1 Item2 - Answer
    ....
    Filename1000 Item1 - Answer

I am trying to create an individual HTML file with a table for each item for each file. Obviously I need a loop. However, I am unsure how to read and ignore the repetitive format of the file, in which Filename1 occurs on every single line item related to Filename1, etc.

I need to know the filename because whenever I encounter a new filename, it is time to save the existing HTML file and begin a new table for the next file's items. I don't want a table with the filename in each row, so I want to ignore it after the first occurrence. However, I need to keep reading because I need the Item value from each line.

Any suggestions on how to handle this? Thanks so much!

Re: How to ignore and retrieve certain values from text file which is to be split into multiple HTML files with tables?
by Athanasius (Archbishop) on Apr 23, 2016 at 03:40 UTC

    Hello perly_white, and welcome to the Monastery!

    The requirements are not entirely clear. If filenames can appear out-of-order in the input file:

        Filename1 Item1 - Answer
        Filename2 Item1 - Answer
        Filename1 Item2 - Answer

    then you will need to either (1) read the whole file into a suitable data structure before writing tables, or (2) keep track of each open file, associating the filename with the handle. For (1), you could use a hash of arrays [1] like this:

        my %files;
        $files{Filename1} = [ 'Item1 - Answer' ];
        push @{ $files{Filename1} }, 'Item2 - Answer';
        ...

    For (2), you would need a simple hash with filename/filehandle key/value pairs.
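Approach (2) could be sketched roughly as follows. This is only an illustration under stated assumptions: the sub name route_lines, the $out_dir parameter, and the ".html" suffix are hypothetical and not part of the original question, and the item text is written out as-is rather than as table rows.

```perl
use strict;
use warnings;

# Hypothetical helper: send each "Filename Item - Answer" line to a
# per-filename output handle, opening each output file only once.
# $out_dir and the ".html" suffix are assumptions for illustration.
sub route_lines {
    my ($in_fh, $out_dir) = @_;
    my %handle_for;    # filename => open output filehandle

    while (my $line = <$in_fh>) {
        chomp $line;
        my ($filename, $item) = split ' ', $line, 2;
        next unless defined $item;    # skip blank or malformed lines

        # Open the output file the first time this filename is seen
        unless ($handle_for{$filename}) {
            open $handle_for{$filename}, '>', "$out_dir/$filename.html"
                or die "Cannot open $out_dir/$filename.html: $!";
        }
        print { $handle_for{$filename} } "$item\n";
    }

    # Close every handle once all input has been read
    for my $filename (keys %handle_for) {
        close $handle_for{$filename}
            or die "Cannot close output for $filename: $!";
    }

    # Return the filenames seen (call this in list context)
    return sort keys %handle_for;
}
```

One caveat with this design: every output file stays open until the end, so with around 1000 filenames you may run into the operating system's limit on open files per process. In that case approach (1) is probably the safer choice.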

    However, it appears from the question that you know in advance that filenames cannot appear out-of-order. If that’s the case, the following skeleton script should provide a straightforward approach:

        use strict;
        use warnings;
        use autodie;

        # open the data file for reading
        my $data_filename = 'data.txt';
        open my $in_fh, '<', $data_filename;

        # output files
        my $current_filename = '';
        my $out_fh;

        while (<$in_fh>)    # process one line of data
        {
            chomp;          # drop the trailing newline so it does not end up in $item
            my ($new_filename, $item) = split ' ', $_, 2;
            next unless defined $new_filename;    # skip blank lines

            if ($new_filename ne $current_filename)
            {
                # a new filename starts: finish the previous table, begin the next
                finalize_table($out_fh) if defined $out_fh;
                open $out_fh, '>', $new_filename;
                $current_filename = $new_filename;
                initialize_table($out_fh);
            }

            add_row($out_fh, $item);
        }

        close $in_fh;
        finalize_table($out_fh) if defined $out_fh;

        sub initialize_table { ... }
        sub add_row          { ... }

        sub finalize_table
        {
            my ($fh) = @_;
            # ...
            close $fh;      # close the handle that was passed in
        }
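The three stubbed subs in the skeleton could be filled in along these lines. This is a minimal sketch, not the only way to do it: the escape_html helper, the optional $title argument, and the assumption that each item looks like "Item - Answer" are illustrative additions, not something the original post specifies.

```perl
use strict;
use warnings;

# Hypothetical helper: minimal HTML escaping so that answers containing
# &, <, > or " do not break the markup.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Write the page header and open the table. The optional $title argument
# (for example the filename) is an assumption; the skeleton passes only $fh.
sub initialize_table {
    my ($fh, $title) = @_;
    $title = 'Results' unless defined $title;
    $title = escape_html($title);
    print {$fh} "<html>\n<head><title>$title</title></head>\n<body>\n",
                "<h1>$title</h1>\n<table>\n";
}

# Write one row. The item is assumed to look like "Item1 - Answer";
# anything after the first " - " goes into the second cell.
sub add_row {
    my ($fh, $item) = @_;
    $item = '' unless defined $item;
    $item =~ s/\s+\z//;    # strip a trailing newline or spaces
    my ($name, $answer) = split /\s+-\s+/, $item, 2;
    $name   = '' unless defined $name;
    $answer = '' unless defined $answer;
    printf {$fh} "<tr><td>%s</td><td>%s</td></tr>\n",
        escape_html($name), escape_html($answer);
}

# Close the table and the page, then close the handle that was passed in.
sub finalize_table {
    my ($fh) = @_;
    print {$fh} "</table>\n</body>\n</html>\n";
    close $fh or die "Cannot close output file: $!";
}
```

Because the filename never reaches add_row, it appears at most once per file (in the heading, if you pass it as the title) and never in the table rows.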

    Update: [1] See perldsc#HASHES-OF-ARRAYS.

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: How to ignore and retrieve certain values from text file which is to be split into multiple HTML files with tables?
by tangent (Parson) on Apr 23, 2016 at 10:45 UTC
    This is a way you could do it using Template Toolkit. Using a template system allows you to keep the HTML out of your code and will also handle writing the new files for you:
        use strict;
        use warnings;
        use Template;

        my $template = Template->new;
        my $tmpl     = 'table.tmpl';
        my %names;

        # collect the rows for each filename
        while ( my $line = <DATA> ) {
            chomp $line;
            my ($name, $item, $answer) = ( $line =~ m/^(\w+)\s+(\w+)\s*-\s*(.*)/ );
            next unless defined $name;    # skip lines that do not match
            push( @{ $names{$name} }, { item => $item, answer => $answer } );
        }

        # write one HTML file per filename
        for my $name ( keys %names ) {
            my $table = { title => $name, rows => $names{$name} };
            $template->process( $tmpl, $table, "$name.html" )
                || die $template->error();
        }

        __DATA__
        Filename1 Item1 - Answer
        Filename1 Item2 - Answer
        Filename2 Item1 - Answer
        Filename2 Item2 - Answer
    This will create two new files "Filename1.html" and "Filename2.html".

    The content part of the template file "table.tmpl" would look like this:
        <h1>[% title %]</h1>
        <table>
        [% FOREACH row IN rows %]
          <tr>
            <td>[% row.item %]</td>
            <td>[% row.answer %]</td>
          </tr>
        [% END %]
        </table>
    Obviously, you need to add the html and body tags around this, and you can also add CSS and other static elements.
Re: How to ignore and retrieve certain values from text file which is to be split into multiple HTML files with tables?
by Anonymous Monk on Apr 23, 2016 at 00:13 UTC
    Please provide representative sample data in code tags, 20 lines max.