PerlMonks  

losing html in print calling function.

by Librum (Initiate)
on Nov 21, 2011 at 01:01 UTC ( [id://939120]=perlquestion )

Librum has asked for the wisdom of the Perl Monks concerning the following question:

I am a noob at Perl, and I have a 'glitch' I just can not figure out. The code is for a mini site search, where we identify each file to be searched, one by one.

Here is a sample of some calling print statements...

    if (open(THE_FILE, $prestring . "WWM.txt")) {
        print "<center><img border=0 src=/article_separator.png><br><B>WWM: Windmills and Wind Motors. 1910.</B></center><br>";
        fn1();
    } else {
        print "<center><img border=0 src=/article_separator.png><br><b>Windmills and Wind Motors, 1910,</b> has no ATOCI.</b></center><br>"
    };
    if (open(THE_FILE, $prestring . "ycmi.htm")) {
        print "<center><img border=0 src=/article_separator.png><br><B>YCMI: You Can Make It. 1929-1931.</b></center><br>";
        fn1();
    } else {
        print "<center>YCMI file not found.</center><br>"
    };
    if (open(THE_FILE, $prestring . "youngs.htm")) {
        print "<center><img border=0 src=/article_separator.png><br><B>YOUNGS: Youngs Demonstrative Translation of Scientific Secrets. 1861.</b></center><br>";
        fn1();
    } else {
        print "<center>Youngs.htm Missing.</center><br>"
    };

And the function they call:

    sub fn1 {
        while (<THE_FILE>) {
            $record = $_;
            $_ = uc($_);
            if (/$ucsearch/) {
                print "$record";
                $count++;
            }
        }
        close (THE_FILE);
    }

Now, about bingo 200, we lose the html from the calling print lines in the output, but not the text. Plus the process completes. It is not the server, as it does this both under Linux (web server) and XP XAMPP. It is not data slipping in and glitching, as the calling lines were rearranged, and still faulted around bingo 200. I think it is some setting, or buffer, but have no clue...

Live example can be seen at http://www.librum.us/1-card.htm

For a good run, use 'grape'. For a 200+ bingo type 'apple'.

Help?

Sarah, of the Librum

Replies are listed 'Best First'.
Re: losing html in print calling function.
by chromatic (Archbishop) on Nov 21, 2011 at 01:59 UTC

    Within the function, you're modifying variables not declared within the function—is it possible you're clobbering variables declared elsewhere? In specific, the use of $record looks suspicious. What happens if you say:

        sub fn1 {
            my ($fh, $ucsearch) = @_;
            my $count = 0;
            while (my $record = <$fh>) {
                next unless uc($record) =~ /$ucsearch/;
                print $record;
                $count++;
            }
            return $count;
        }

    You'll have to change how you call the function (and you'll have to use its return value), but the lexical encapsulation may help avoid weird action at a distance.
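As a rough illustration of the changed calling convention, here is a minimal sketch. The in-memory sample data and the `$total` variable are assumptions made so the example runs without any of the real index files; only `fn1` itself comes from the reply above.

```perl
use strict;
use warnings;

# chromatic's version of fn1: the filehandle and the upper-cased
# search term are passed in, and the match count is returned.
sub fn1 {
    my ($fh, $ucsearch) = @_;
    my $count = 0;
    while (my $record = <$fh>) {
        next unless uc($record) =~ /$ucsearch/;
        print $record;
        $count++;
    }
    return $count;
}

# Hypothetical stand-in for one index file: an in-memory filehandle,
# so the sketch does not depend on WWM.txt being present.
my $sample = "Apples, 216\nPears, 300\nPineapple, 287\n";
open(my $fh, '<', \$sample) or die "Cannot open in-memory file: $!";

# The caller, not the function, now owns the running total.
my $total = 0;
$total += fn1($fh, uc 'apple');
close($fh);

print "$total records found.\n";
```

Because `$fh`, `$record`, and `$count` are all lexical, nothing inside `fn1` can overwrite a variable of the same name elsewhere in the script.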

    If that doesn't help, you'll have to show more code.


    Improve your skills with Modern Perl: the free book.

      Apple seemed to work fine for me on your search page

      Definitely a +1 from me on Chromatic's suggestion, which made your intention much clearer to me.

      It is wise to encapsulate actions as suggested and I would also urge you to review the way you are managing the HTML. If you decide at some point in the future to modify the HTML output, then you've got a bug maintenance/consistency problem.

      My principle is that if I start copying and pasting bits of code then it needs to be extracted and made a subroutine. The first if statement block is similar all the way through with the content variation dependent on the file name. The second block, aka the else block, differs but is presumably still a function of the file + some other component.

      So I'd take chromatic's code and add in

          sub fn1 {
              my ($fh, $ucsearch) = @_;
              my $count = 0;
              while (my $record = <$fh>) {
                  next unless uc($record) =~ /$ucsearch/;
                  print $record;
                  $count++;
              }
              return $count;
          }

          ## data is a reference to an array of hashes data structure containing
          ## the full file name, the file basename, title, and something that helps
          ## decide what you are going to print if the file doesn't open.
          ## setdata is a subroutine which creates and fills the data structure,
          ## then returns a reference to it
          ## hint: look at the core module File::Basename
          ## (http://perldoc.perl.org/File/Basename.html) to extract the file basename
          my $data = setdata();
          foreach my $datum (@$data) {
              if (open(my $fh, $prestring . $datum->{'filename'})) {
                  print "<center><img border=0 src=/article_separator.png><br><B>" . $datum->{'basename'} . ":" . $datum->{'title'} . "</B></center><br>";
                  fn1($fh, $ucsearch);
              }
              else {
                  ## print according to the characteristic that decides if the file
                  ## has no ATOCI or does not exist
              }
          }

      That helps with maintaining the HTML a little. It would be better to then look at HTML::Template, (which has a tutorial here), for a way to separate your HTML from your code.
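One possible shape for the `setdata` subroutine mentioned in the comments is sketched below. The two file entries are taken from the original post; the `missing` key and the exact layout of each hash are assumptions for illustration, not something the thread specifies.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical setdata(): builds an array of hashes, one per index file.
# The 'missing' field is an assumed name for the text to print when
# the file cannot be opened.
sub setdata {
    my @files = (
        [ 'WWM.txt',  'Windmills and Wind Motors. 1910.', 'has no ATOCI.'   ],
        [ 'ycmi.htm', 'You Can Make It. 1929-1931.',      'file not found.' ],
    );
    my @data;
    for my $entry (@files) {
        my ($filename, $title, $missing) = @$entry;
        # fileparse returns (name, directory, suffix); the pattern
        # strips whatever extension the file has.
        my ($basename) = fileparse($filename, qr/\.[^.]*/);
        push @data, {
            filename => $filename,
            basename => uc($basename),
            title    => $title,
            missing  => $missing,
        };
    }
    return \@data;
}

my $data = setdata();
for my $datum (@$data) {
    print "$datum->{basename}: $datum->{title}\n";
}
```

With the per-book details held in one structure, adding a book means adding one line to the list rather than copying another if/else block.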

        Apple now works.

        The problem has cleared, but I am not sure just which change fixed it. Chromatic I think hit it, I changed from $record to $record1 inside the function.

        Now I have inconsistency in how many page breaks are displayed. I suspect a function to trim non-printing characters from the strings will need to be added.

        Let me experiment a bit more before I report back.

      Chromatic,

      No, no redeclaring of functions declared elsewhere. Triple checked. But I think you have pointed me in the right direction. The $ucsearch function... I wonder if it uses $record.

      Going back to test that now...

      Thanks!

Re: losing html in print calling function.
by remiah (Hermit) on Nov 21, 2011 at 02:22 UTC
    I am sorry for my poor understanding of your problem. If you allow me to ask...

    we lose the html from the calling print lines in the output, but not the text

    What do you mean by "losing html"? I guess you mean the img tags in the if/else blocks, but I couldn't see where that happens when I looked at the output for "apple" in my browser. Does "not the text" mean you don't care if print "$record" prints nothing?

      Remiah,

      I knew that was clumsy, let me try again.

      this is one of the calling print lines.

          if (open(THE_FILE, $prestring . "WWM.txt")) {
              print "<center><img border=0 src=/article_separator.png><br><B>WWM: Windmills and Wind Motors. 1910.</B></center><br>";
              fn1();
          } else {
              print "<center><img border=0 src=/article_separator.png><br><b>Windmills and Wind Motors, 1910,</b> has no ATOCI.</b></center><br>"
          };

      In the output, it does not render the center, the img, the line break, or the bold. It does render the text "WWM: Windmills and Wind Motors. 1910". It then does not render the closing bold, the closing center, or the line break. It does render the HTML tags printed from inside the function.

      Back to testing... And thanks for responding.

        It does seem a strange bug. Have you checked the value of $prestring? I would not suspect fn1 in this case.

        I took another look at your page and checked the source again.

        Where a list of results occurs, e.g. search for apple and look at the data found in "FCOA: Farmers Cyclopedia of Agriculture. 1911", there is malformed HTML:

            <center><img border=0 src=/article_separator.png><br><B>FCOA: Farmers Cyclopedia of Agriculture. 1911.</b></center><br><center><TR><td>APPLE&#65533;Root-grafting, root vs. top-grafting, location and soil, planting the trees, cultivation and cover crops, manuring, pruning, harvesting and storage, utilization of waste apples, varieties, general scheme for spraying apple trees, enemies</td><td> 216</td></tr>
            <br><center><TR><td>PINEAPPLE&#65533;Locations, propagation, fertilizers, growing under sheds, varieties, enemies</td><td> 287</td></tr>
            <br><center>Apples, 216<Br>
            <br><center>Apples, Aphis, 228<Br>
            <br><center>Apples, Bitter Rot, 223<Br>
            <br><center>Apples, Borers, 226, 270<Br>
            <br><center>Apples, Brown Rot, 224, 275<Br>
            <br><center>Apples, Bud Moth, 227<Br>Rot, 223<Br>

        with some table markup being thrown in and no closing center tag on the record data returned from the file.

        Your initial file title is there

            <center><img border=0 src=/article_separator.png><br><B>FCOA: Farmers Cyclopedia of Agriculture. 1911.</b></center><br>

        but I would assume the sudden appearance of table markup followed by unterminated center tags isn't helping your output. In this case I would say the error is probably in your original fn1 function, where you print the record out, and the table markup is in the data you pick up. Neither of these is a Perl problem, although they can be solved with Perl.

        If I haven't identified your problem, would you please post a snippet of HTML that shows the problem?

Re: losing html in print calling function.
by Util (Priest) on Nov 21, 2011 at 17:27 UTC
    Here are a few more pointers:
    • LesleyB++ ; I second that templating advice.
    • You are using the search string directly as a regular expression, which is probably not what you mean to do. For example, if I search for "a..le", I get the hits for "apple", "battle", and more. As I demonstrate below, the quotemeta function can be used to escape special characters in the string, forcing it to just be a simple text search.
    • Since your HTML form is (appropriately) using GET instead of POST, you can easily split your testing into two pieces from the commandline:
      1. curl -o test_results.html 'http://www.librum.us/cgi-bin/satoci.cgi?sitem=apple'
      2. open test_results.html (or just `test_results.html`, or double-click on the file; however you open a document on your system)
      You can inspect test_results.html in an editor, to look for problems with the HTML, or make sure that what your HTTP server sent is what your `curl` client received, or to change the HTML by hand to see what effect such changes make to the rendered output. This is what allowed me to quickly find your CENTER tag problem.
    • You are using lots of repeated code, such as many calls to open and fn1(). The DRY principle will guide you to separate the parts that differ in each copy from the parts that are identical across all the copies. The Perl docs on Array of Arrays can help you understand how to manipulate the data that is extracted from each of the copies, such that you have a single call to open and a single call to fn1(), but within a loop that walks over a list of the books. This will also clean up many inconsistencies in your output, such as "Youngs.htm Missing" vs "YCMI file not found" vs "has no ATOCI".
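To make the quotemeta point from the list above concrete, here is a small sketch. The three sample records are made up for illustration; only the search string "a..le" comes from the reply.

```perl
use strict;
use warnings;

# Hypothetical index records, not taken from the real data files.
my @records = ('Apples, 216', 'Battle of the bugs, 12', 'Pears, 300');
my $search  = 'a..le';    # what a visitor might type into the form

# Used directly as a pattern, each "." matches any single character.
my @loose = grep { /$search/i } @records;

# With quotemeta the dots are escaped, so only a literal "a..le" matches.
my $literal = quotemeta $search;
my @strict  = grep { /$literal/i } @records;

print scalar(@loose),  " match(es) when the input is treated as a regex\n";
print scalar(@strict), " match(es) when the input is treated as plain text\n";
```

The same effect can be had inline with `/\Q$search\E/i`, which applies quotemeta to the interpolated variable.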

    To explore your original problem, I refactored your code into the style that I would write (except that I would go further to use some template module). I include it below, in case anyone finds it useful.

        #!/usr/bin/env perl
        use strict;
        use warnings;
        use Carp;

        my $prestring = '???';
        $prestring = '.';    # Overridden for PerlMonks testing

        my $wanted_string = 'apple';    # really take from param 'sitem';

        my $index_dir = $prestring;

        my @book_indexes_titles = (
            [ '1800.txt'   , '1800: Mechanical Movements, Powers and Devices, 1911' ],
            [ '507.txt'    , '507: Five Hundred and Seven Mechanical Movements. 1893' ],
            [ '970.txt'    , '970: Mechanical Appliances, Mechanical Movements and Novelties of Construction. 1904' ],
            [ 'ACAM.txt'   , 'ACAM: Appletons Cyclopedia of Applied Mechanics, 1880 ' ],
            [ '???.txt'    , 'Audels Carpenters and Builders Guide I' ],
            [ '???.txt'    , 'Audels Carpenters and Builders Guide II' ],
            [ '???.txt'    , 'Audels Carpenters and Builders Guide III' ],
            [ '???.txt'    , 'Audels Carpenters and Builders Guide IV' ],
            [ 'AMOM.txt'   , 'AMOM: A Manual of Mending and Repairing. 1907' ],
            [ '???.txt'    , 'Farm Blacksmithing (1901)' ],
            [ 'FCOA.txt'   , 'FCOA: Farmers Cyclopedia of Agriculture. 1911' ],
            [ 'FCOLS.txt'  , 'FCOLS: Farmers Cyclopedia of Live Stock. 1908' ],
            [ 'FDAD.txt'   , 'FDAD: Furniture Design and Draughting. 1900' ],
            [ '???.txt'    , 'ICS Reference 14' ],
            [ '???.txt'    , 'Pottery for /artists, Craftsmen, and Teachers (1930)' ],
            [ '???.txt'    , 'Practical Poultry Production (1910)' ],
            [ 'TJCB.txt'   , 'TJCB: Thomas Jeffersons Cook Book. 1937' ],
            [ 'TMOL.txt'   , 'TMOL: The Making of Leather. 1914' ],
            [ 'TPSD.txt'   , 'TPSD: The Practical Stock Doctor. 1912' ],
            [ 'TSC.txt'    , 'TSC: The Settlement Cookbook. 1903' ],
            [ 'TSI.txt'    , 'TSI: The Shoe Industry. 1916' ],
            [ '???.txt'    , 'Univeral Household Assistant (1884)' ],
            [ 'VIRG.txt'   , 'VIRG: Virginia Housewife or Methodical Cook. 1899' ],
            [ 'WBH.txt'    , 'WBH: Window Blinds. 1907. (Hasluck)' ],
            [ '???.txt'    , 'Wood Finishing, 1903' ],
            [ 'WHCB.txt'   , 'WHCB: White House Cook Book. 1887' ],
            [ 'WWM.txt'    , 'Windmills and Wind Motors, 1910' ],
            [ 'YCMI.htm'   , 'YCMI: You Can Make It. 1929-1931' ],
            [ 'YOUNGS.htm' , 'YOUNGS: Youngs Demonstrative Translation of Scientific Secrets. 1861' ],
        );

        sub search_the_file {
            croak("Wrong number of arguments") if @_ != 2;
            my ( $fh, $search_string ) = @_;

            my $search_qm = quotemeta $search_string;
            my $search_re = qr{$search_qm}i;    # i for insensitive-to-case

            my @found;
            while (<$fh>) {
                chomp;
                push @found, $_ if uc($_) =~ /$search_re/;
            }
            return @found;
        }

        my $separator_line = '<center><img border="0" src="/article_separator.png"></center><br>' . "\n";
        $separator_line = "---\n";    # Overridden for PerlMonks testing

        my $count_total = 0;
        BOOK:
        for my $book_aref (@book_indexes_titles) {
            my ( $index_file, $book_title ) = @{$book_aref};

            print $separator_line;

            open my $fh, '<', "$index_dir/$index_file" or do {
                print "<center>ATOCI not found for book <b>$book_title</b></center><br>\n";
                next BOOK;
            };

            my @lines = search_the_file( $fh, $wanted_string );
            close $fh;

            $count_total += scalar @lines;

            print "<center><b>$book_title</b></center><br>\n";
            for my $line (@lines) {
                print "<center>$line</center><br>\n";
            }
        }
        print $separator_line;
        print "$count_total records found.<br>\n";
        print $separator_line;

      Thanks for all the responses.

      All good points.

      And I found another, and pass it on for what it may be worth. I had two bang statements, one for localhost and one for server, with the unused one double hashed to comment out. The double hash does not comment it out.

      It works fine now, in localhost. Now I am awaiting the host provider to fix a configuration problem and that should be that.

      Thanks again.

Re: losing html in print calling function.
by Util (Priest) on Nov 21, 2011 at 16:44 UTC
    For other monks: The rendering problem shows up for me in Firefox, but not Opera or Safari.

    Your HTML looks like this:

        <br><center>Apples, 216<Br>
        <br><center>Apples, Aphis, 228<Br>
        <br><center>Apples, Bitter Rot, 223<Br>
        <br><center>Apples, Borers, 226, 270<Br>
        <br><center>Apples, Brown Rot, 224, 275<Br>

    In HTML, the BR tag, like an IMG tag, stands alone; no closing /BR is needed, or even allowed. The CENTER tag, like most tags in HTML, is *paired*, and requires a closing /CENTER tag. By using so many (over 256?) CENTER tags without closing them, you are creating so many nested levels for the browser to keep track of, that it gives up trying.

    When I changed your HTML to close (or "balance") the tags, it renders properly:

        <center>Apples, 216</center><br>
        <center>Apples, Aphis, 228</center><br>
        <center>Apples, Bitter Rot, 223</center><br>
        <center>Apples, Borers, 226, 270</center><br>
        <center>Apples, Brown Rot, 224, 275</center><br>
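On the Perl side, one way to keep the tags balanced is to emit each record through a single helper, and to sanity-check the output by counting tags. The sketch below assumes two hypothetical helpers, `centered_line` and `unclosed_count`; neither appears in the original script, and the counting is a plain regex tally rather than real HTML parsing.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap one record in a balanced CENTER pair.
sub centered_line {
    my ($record) = @_;
    chomp $record;
    return "<center>$record</center><br>\n";
}

# Hypothetical check: number of opening tags minus closing tags for
# one element name. This only counts tags; it is not an HTML parser.
sub unclosed_count {
    my ($html, $tag) = @_;
    my $open  = () = $html =~ /<\Q$tag\E\b[^>]*>/gi;
    my $close = () = $html =~ /<\/\Q$tag\E\s*>/gi;
    return $open - $close;
}

# Two lines in the unbalanced style from the live page ...
my $broken = "<br><center>Apples, 216<Br>\n<br><center>Apples, Aphis, 228<Br>\n";

# ... and the same records emitted through the helper.
my $fixed = join '', map { centered_line($_) } "Apples, 216\n", "Apples, Aphis, 228\n";

print "broken: ", unclosed_count($broken, 'center'), " unclosed center tag(s)\n";
print "fixed:  ", unclosed_count($fixed,  'center'), " unclosed center tag(s)\n";
```

Running a check like this against the saved test_results.html would show how deep the nesting gets for a search with many hits.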
