losing html in print calling function.

Librum has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: losing html in print calling function. by chromatic (Archbishop) on Nov 21, 2011 at 01:59 UTC
Within the function, you're modifying variables that are not declared within the function. Is it possible you're clobbering variables declared elsewhere? Specifically, the use of `$record` looks suspicious. What happens if you say:

    sub fn1 {
        my ($fh, $ucsearch) = @_;
        my $count = 0;

        while (my $record = <$fh>) {
            next unless uc($record) =~ /$ucsearch/;
            print $record;
            $count++;
        }

        return $count;
    }

You'll have to change how you call the function (and you'll have to use its return value), but the lexical encapsulation may help avoid weird action at a distance. If that doesn't help, you'll have to show more code.

Improve your skills with Modern Perl: the free book.
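A minimal sketch of the calling side that chromatic's version implies. The sample data, the in-memory filehandle, and the final message are assumptions made only so the example runs on its own; the real script would open one of the index files instead.

```perl
use strict;
use warnings;

# chromatic's function: everything it touches is passed in or declared with my.
sub fn1 {
    my ($fh, $ucsearch) = @_;
    my $count = 0;
    while (my $record = <$fh>) {
        next unless uc($record) =~ /$ucsearch/;
        print $record;
        $count++;
    }
    return $count;
}

# Hypothetical stand-in for one index file, read through an in-memory filehandle.
my $sample = "Apples, 216\nPears, 301\nPineapple, 287\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";

my $ucsearch = uc 'apple';            # the search term, upper-cased once by the caller
my $count    = fn1($fh, $ucsearch);   # prints each matching record
close $fh;

print "$count records found\n";
```

Because the count comes back as a return value, the caller can add it to a running total without sharing any global variable with the function.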
Re^2: losing html in print calling function. by LesleyB (Friar) on Nov 21, 2011 at 16:17 UTC
Apple seemed to work fine for me on your search page.

Definitely a +1 from me on chromatic's suggestion, which made your intention much clearer to me. It is wise to encapsulate actions as suggested, and I would also urge you to review the way you are managing the HTML. If you decide at some point in the future to modify the HTML output, you will have a maintenance and consistency problem, because the same markup is repeated in many places. My principle is that if I start copying and pasting bits of code, that code needs to be extracted into a subroutine.

The first `if` block is similar all the way through, with the content varying according to the file name. The second block, the `else` block, differs but is presumably still a function of the file plus some other component. So I'd take chromatic's code and add to it:

    sub fn1 {
        my ($fh, $ucsearch) = @_;
        my $count = 0;

        while (my $record = <$fh>) {
            next unless uc($record) =~ /$ucsearch/;
            print $record;
            $count++;
        }

        return $count;
    }

    ## $data is a reference to an array of hashes containing the full file
    ## name, the file basename, the title, and something that helps decide
    ## what you are going to print if the file doesn't open.
    ## setdata is a subroutine which creates and fills the data structure,
    ## then returns a reference to it.
    ## Hint: look at the core module File::Basename
    ## (http://perldoc.perl.org/File/Basename.html) to extract the file basename.
    my $data = setdata();
    foreach my $datum (@$data) {
        if (open(my $fh, '<', $prestring . $datum->{'filename'})) {
            print "<center><img border=0 src=/article_separator.png><br><B>"
                . $datum->{'basename'} . ":" . $datum->{'title'}
                . "</B></center><br>";
            fn1($fh, $ucsearch);
        }
        else {
            ## print according to the characteristic that decides whether
            ## the file has no ATOCI or does not exist
        }
    }

That helps with maintaining the HTML a little. It would be better to then look at HTML::Template (which has a tutorial here) for a way to separate your HTML from your code.
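LesleyB leaves `setdata` undefined. One possible shape is sketched here, assuming a hard-coded list of books; the `@books` array and its two entries are illustrative, and the field that decides what to print when a file is missing is left out.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical book list: index file name and title (two entries borrowed from the thread).
my @books = (
    [ 'FCOA.txt', 'Farmers Cyclopedia of Agriculture. 1911' ],
    [ 'WWM.txt',  'Windmills and Wind Motors. 1910' ],
);

# Build the array-of-hashes structure and return a reference to it.
sub setdata {
    my @data;
    for my $book (@books) {
        my ($filename, $title) = @{$book};
        # fileparse returns (name, directory, suffix); the pattern strips the extension.
        my ($basename) = fileparse($filename, qr/\.[^.]*/);
        push @data, {
            filename => $filename,
            basename => $basename,
            title    => $title,
        };
    }
    return \@data;
}

my $data = setdata();
for my $datum (@{$data}) {
    print "$datum->{basename}: $datum->{title}\n";
}
```

Adding a book then means adding one row to the list rather than copying another `if`/`else` block.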
Re^3: losing html in print calling function. by Librum (Initiate) on Nov 21, 2011 at 16:29 UTC
Apple now works. The problem has cleared, but I am not sure just which change fixed it. I think chromatic hit it: I changed `$record` to `$record1` inside the function. Now I have an inconsistency in how many page breaks are displayed. I suspect I will need to add a function that trims non-printing characters from the string. Let me experiment a bit more before I report back.
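A trimming helper of the kind Librum mentions might look like the following. This is only a sketch: the name `clean_record` is hypothetical, and it assumes the stray characters are control characters such as carriage returns or form feeds, which the thread does not confirm.

```perl
use strict;
use warnings;

# Hypothetical helper: remove control characters, then trim both ends.
# Note that [[:cntrl:]] also matches tabs, so tab-separated fields would be joined.
sub clean_record {
    my ($record) = @_;
    $record =~ s/[[:cntrl:]]//g;    # drops \r, \n, form feeds, and similar
    $record =~ s/^\s+|\s+$//g;      # trims remaining whitespace at both ends
    return $record;
}

my $cleaned = clean_record("  Apples, Aphis, 228\r\n\f");
print "[$cleaned]\n";
```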
Re^2: losing html in print calling function. by Librum (Initiate) on Nov 21, 2011 at 16:17 UTC
Chromatic, no, no redeclaring of functions declared elsewhere. Triple checked. But I think you have pointed me in the right direction. The `$ucsearch` function... I wonder if it uses `$record`. Going back to test that now... Thanks!
Re: losing html in print calling function. by remiah (Hermit) on Nov 21, 2011 at 02:22 UTC
I am sorry for my poor understanding of your problem. If you allow me to ask...

"we lose the html from the calling print lines in the output, but not the text"

What do you mean by "losing html"? I guess you mean the img tags in the if/else blocks, but I couldn't tell where they were missing when I looked at the output for "apple" in my browser. Does "not the text" mean you don't care if print "$record" prints nothing?
Re^2: losing html in print calling function. by Librum (Initiate) on Nov 21, 2011 at 16:10 UTC
Remiah, I knew that was clumsy; let me try again. This is one of the calling print lines:

    if (open(THE_FILE, $prestring . "WWM.txt")) {
        print "<center><img border=0 src=/article_separator.png><br><B>WWM: Windmills and Wind Motors. 1910.</B></center><br>";
        fn1();
    }
    else {
        print "<center><img border=0 src=/article_separator.png><br><b>Windmills and Wind Motors, 1910,</b> has no ATOCI.</b></center><br>";
    };

In the output, it does not render the center, the img, the line break, or the bold. It does render the "WWM: Windmills and Wind Motors. 1910". Then it does not render the closing bold, the closing center, or the line break. It does render the html tags from the print line inside the function. Back to testing... And thanks for responding.
Re^3: losing html in print calling function. by LesleyB (Friar) on Nov 21, 2011 at 16:24 UTC
It does seem a strange bug. Have you checked the value of `$prestring`? I would not suspect `fn1` in this case.
Re^3: losing html in print calling function. by LesleyB (Friar) on Nov 21, 2011 at 16:48 UTC
I took another look at your page and checked the source again. Where a list of results occurs (e.g. search for apple and look at the data found in "FCOA: Farmers Cyclopedia of Agriculture. 1911"), there is malformed HTML:

    <center><img border=0 src=/article_separator.png><br><B>FCOA: Farmers Cyclopedia of Agriculture. 1911.</b></center><br><center><TR><td>APPLE�Root-grafting, root vs. top-grafting, location and soil, planting the trees, cultivation and cover crops, manuring, pruning, harvesting and storage, utilization of waste apples, varieties, general scheme for spraying apple trees, enemies</td><td> 216</td></tr>
    <br><center><TR><td>PINEAPPLE�Locations, propagation, fertilizers, growing under sheds, varieties, enemies</td><td> 287</td></tr>
    <br><center>Apples, 216<Br>
    <br><center>Apples, Aphis, 228<Br>
    <br><center>Apples, Bitter Rot, 223<Br>
    <br><center>Apples, Borers, 226, 270<Br>
    <br><center>Apples, Brown Rot, 224, 275<Br>
    <br><center>Apples, Bud Moth, 227<Br>Rot, 223<Br>

with some table markup being thrown in and no closing center tag on the record data returned from the file. Your initial file title is there:

    <center><img border=0 src=/article_separator.png><br><B>FCOA: Farmers Cyclopedia of Agriculture. 1911.</b></center><br>

but I would assume the sudden appearance of table markup followed by unterminated center tags isn't helping your output. In this case I would say the error is probably in your original `fn1` function, where you print the record out, and the table markup is in the data you pick up. Neither of these is a Perl problem, although both can be solved with Perl.

If I haven't identified your problem, would you please post a snippet of HTML that shows the problem?
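One way Perl could address both points is to clean each record before printing it. The helper name `format_record` and the choice to discard the table tags, rather than build a real table, are assumptions for illustration and not something the thread settles on.

```perl
use strict;
use warnings;

# Hypothetical helper: drop table tags found in the data and wrap the
# remaining text in a balanced pair of center tags.
sub format_record {
    my ($record) = @_;
    chomp $record;
    $record =~ s{</?(?:table|tr|td|th)\b[^>]*>}{ }gi;   # replace each table tag with a space
    $record =~ s/\s+/ /g;                                # collapse runs of whitespace
    $record =~ s/^ | $//g;                               # trim the ends
    return "<center>$record</center><br>\n";
}

my $line = format_record("<TR><td>PINEAPPLE Locations, propagation</td><td> 287</td></tr>\n");
print $line;
```

A regex is adequate here only because the records are short, known index lines; arbitrary HTML would call for a proper parser.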
Re: losing html in print calling function. by Util (Priest) on Nov 21, 2011 at 17:27 UTC
Here are a few more pointers:

LesleyB++; I second that templating advice.

You are using the search string directly as a regular expression, which is probably not what you mean to do. For example, if I search for "a..le", I get the hits for "apple", "battle", and more. As I demonstrate below, the quotemeta function can be used to escape special characters in the string, forcing it to be a simple text search.

Since your HTML form is (appropriately) using GET instead of POST, you can easily split your testing into two pieces from the command line:

    curl -o test_results.html 'http://www.librum.us/cgi-bin/satoci.cgi?sitem=apple'
    open test_results.html

(or just `test_results.html`, or double-click on the file; however you open a document on your system).

You can inspect test_results.html in an editor to look for problems with the HTML, to make sure that what your HTTP server sent is what your `curl` client received, or to change the HTML by hand to see what effect such changes have on the rendered output. This is what allowed me to quickly find your CENTER tag problem.

You are using lots of repeated code, such as many calls to `open` and `fn1()`. The DRY principle will guide you to separate the parts that differ in each copy from the parts that are identical across all the copies. The Perl docs on Array of Arrays can help you understand how to manipulate the data that is extracted from each of the copies, such that you have a single call to `open` and a single call to `fn1()`, but within a loop that walks over a list of the books. This will also clean up many inconsistencies in your output, such as "Youngs.htm Missing" vs "YCMI file not found" vs "has no ATOCI".

To explore your original problem, I refactored your code into the style that I would write (except that I would go further and use some template module). I include it below, in case anyone finds it useful.
    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Carp;

    my $prestring = '???';
    $prestring = '.';    # Overridden for PerlMonks testing

    my $wanted_string = 'apple';    # really take from param 'sitem';

    my $index_dir = $prestring;

    my @book_indexes_titles = (
        [ '1800.txt'   , '1800: Mechanical Movements, Powers and Devices, 1911' ],
        [ '507.txt'    , '507: Five Hundred and Seven Mechanical Movements. 1893' ],
        [ '970.txt'    , '970: Mechanical Appliances, Mechanical Movements and Novelties of Construction. 1904' ],
        [ 'ACAM.txt'   , 'ACAM: Appletons Cyclopedia of Applied Mechanics, 1880' ],
        [ '???.txt'    , 'Audels Carpenters and Builders Guide I' ],
        [ '???.txt'    , 'Audels Carpenters and Builders Guide II' ],
        [ '???.txt'    , 'Audels Carpenters and Builders Guide III' ],
        [ '???.txt'    , 'Audels Carpenters and Builders Guide IV' ],
        [ 'AMOM.txt'   , 'AMOM: A Manual of Mending and Repairing. 1907' ],
        [ '???.txt'    , 'Farm Blacksmithing (1901)' ],
        [ 'FCOA.txt'   , 'FCOA: Farmers Cyclopedia of Agriculture. 1911' ],
        [ 'FCOLS.txt'  , 'FCOLS: Farmers Cyclopedia of Live Stock. 1908' ],
        [ 'FDAD.txt'   , 'FDAD: Furniture Design and Draughting. 1900' ],
        [ '???.txt'    , 'ICS Reference 14' ],
        [ '???.txt'    , 'Pottery for Artists, Craftsmen, and Teachers (1930)' ],
        [ '???.txt'    , 'Practical Poultry Production (1910)' ],
        [ 'TJCB.txt'   , 'TJCB: Thomas Jeffersons Cook Book. 1937' ],
        [ 'TMOL.txt'   , 'TMOL: The Making of Leather. 1914' ],
        [ 'TPSD.txt'   , 'TPSD: The Practical Stock Doctor. 1912' ],
        [ 'TSC.txt'    , 'TSC: The Settlement Cookbook. 1903' ],
        [ 'TSI.txt'    , 'TSI: The Shoe Industry. 1916' ],
        [ '???.txt'    , 'Universal Household Assistant (1884)' ],
        [ 'VIRG.txt'   , 'VIRG: Virginia Housewife or Methodical Cook. 1899' ],
        [ 'WBH.txt'    , 'WBH: Window Blinds. 1907. (Hasluck)' ],
        [ '???.txt'    , 'Wood Finishing, 1903' ],
        [ 'WHCB.txt'   , 'WHCB: White House Cook Book. 1887' ],
        [ 'WWM.txt'    , 'Windmills and Wind Motors, 1910' ],
        [ 'YCMI.htm'   , 'YCMI: You Can Make It. 1929-1931' ],
        [ 'YOUNGS.htm' , 'YOUNGS: Youngs Demonstrative Translation of Scientific Secrets. 1861' ],
    );

    sub search_the_file {
        croak("Wrong number of arguments") if @_ != 2;
        my ( $fh, $search_string ) = @_;

        my $search_qm = quotemeta $search_string;
        my $search_re = qr{$search_qm}i;    # i for insensitive-to-case

        my @found;
        while (<$fh>) {
            chomp;
            # The /i flag already ignores case, so no uc() is needed here.
            push @found, $_ if /$search_re/;
        }

        return @found;
    }

    my $separator_line
        = '<center><img border="0" src="/article_separator.png"></center><br>' . "\n";
    $separator_line = "---\n";    # Overridden for PerlMonks testing

    my $count_total = 0;
    BOOK:
    for my $book_aref (@book_indexes_titles) {
        my ( $index_file, $book_title ) = @{$book_aref};

        print $separator_line;

        open my $fh, '<', "$index_dir/$index_file" or do {
            print "<center>ATOCI not found for book <b>$book_title</b></center><br>\n";
            next BOOK;
        };

        my @lines = search_the_file( $fh, $wanted_string );
        close $fh;

        $count_total += scalar @lines;

        print "<center><b>$book_title</b></center><br>\n";
        for my $line (@lines) {
            print "<center>$line</center><br>\n";
        }
    }
    print $separator_line;
    print "$count_total records found.<br>\n";
    print $separator_line;
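The effect of quotemeta that Util describes can be seen in isolation. This sketch uses three made-up records and the `\Q...\E` escape, which applies quotemeta inline; the record text is an assumption chosen to show the difference.

```perl
use strict;
use warnings;

# Made-up records; only the third contains the literal text "a..le".
my @records = ('Apples, 216', 'Battle of the bugs, 12', 'a..le (literal), 99');
my $search  = 'a..le';

# Used directly as a pattern, each dot matches any character.
my @as_regex = grep { /$search/i } @records;

# With \Q...\E (the inline form of quotemeta), the dots match only dots.
my @as_literal = grep { /\Q$search\E/i } @records;

print scalar(@as_regex), " regex hits, ", scalar(@as_literal), " literal hits\n";
```

Escaping the search term also guards against patterns that fail to compile, such as a lone opening parenthesis typed into the search form.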
Re^2: losing html in print calling function. by Librum (Initiate) on Dec 03, 2011 at 03:53 UTC
Thanks for all the responses. All good points. And I found another, which I pass on for what it may be worth. I had two "bang" (#!) lines, one for localhost and one for the server, with the unused one double hashed to comment it out. The double hash does not comment it out. It works fine now on localhost. Now I am waiting for the host provider to fix a configuration problem, and that should be that. Thanks again.
Re: losing html in print calling function. by Util (Priest) on Nov 21, 2011 at 16:44 UTC
For other monks: The rendering problem shows up for me in Firefox, but not Opera or Safari.

Your HTML looks like this:

    <br><center>Apples, 216<Br>
    <br><center>Apples, Aphis, 228<Br>
    <br><center>Apples, Bitter Rot, 223<Br>
    <br><center>Apples, Borers, 226, 270<Br>
    <br><center>Apples, Brown Rot, 224, 275<Br>

In HTML, the `BR` tag, like an `IMG` tag, stands alone; no closing `/BR` is needed, or even allowed. The `CENTER` tag, like most tags in HTML, is paired, and requires a closing `/CENTER` tag. By using so many (over 256?) `CENTER` tags without closing them, you are creating so many nested levels for the browser to keep track of that it gives up trying. When I changed your HTML to close (or "balance") the tags, it rendered properly:

    <center>Apples, 216</center><br>
    <center>Apples, Aphis, 228</center><br>
    <center>Apples, Bitter Rot, 223</center><br>
    <center>Apples, Borers, 226, 270</center><br>
    <center>Apples, Brown Rot, 224, 275</center><br>
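A quick way to spot this kind of imbalance in saved output is to count opening and closing tags. The helper `tag_balance` is a hypothetical name, and simple counting is a rough check under the assumption that the markup is as plain as in the samples; it is not an HTML validator.

```perl
use strict;
use warnings;

# Count opening and closing tags of one kind in a chunk of HTML.
sub tag_balance {
    my ($html, $tag) = @_;
    my $open  = () = $html =~ /<\Q$tag\E\b[^>]*>/gi;   # e.g. <center> or <CENTER>
    my $close = () = $html =~ /<\/\Q$tag\E\s*>/gi;     # e.g. </center>
    return ($open, $close);
}

my $broken = "<br><center>Apples, 216<Br>\n<br><center>Apples, Aphis, 228<Br>\n";
my $fixed  = "<center>Apples, 216</center><br>\n<center>Apples, Aphis, 228</center><br>\n";

for my $html ($broken, $fixed) {
    my ($open, $close) = tag_balance($html, 'center');
    print "center: $open opened, $close closed\n";
}
```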

