PerlMonks  

how to force HTML to display as text

by Summers_Azin_pr0n (Initiate)
on Jan 24, 2002 at 12:31 UTC ( #141173=perlquestion )

Summers_Azin_pr0n has asked for the wisdom of the Perl Monks concerning the following question:

This node falls below the community's threshold of quality. You may see it by logging in.

Replies are listed 'Best First'.
Re: how to force HTML to display as text
by grep (Monsignor) on Jan 24, 2002 at 12:38 UTC
    Take a look at HTML::Parser; it already does what you want to do. You'll be better off if you don't reinvent this very complicated wheel.

    If you really want to write a parser for academic reasons, I would recommend at least using Parse::RecDescent, written by TheDamian; this will get you through the tags correctly.

    Otherwise, you might just be looking for a text-based browser like lynx; it will run on *nix and M$ Win.

    As for your post, you can use <code> </code> tags to get your spacing and wrapping correct. It also lets other monks DL your code. There are several resources on Perlmonks that can help you out, such as turnstep's home node.
    Enjoy Perlmonks

    UPDATE: as crazyinsomniac pointed out, I would be remiss if I did not mention that a few searches could have helped you find similar nodes, such as:
  • parsing HTML
  • Dump Text from HTML
    Search is your friend :)
    Some other nodes to read: On asking for help and How to ReadTheFineManual

    grep
    grep> cd pub
    grep> more beer
Re (tilly) 1: how to force HTML to display as text
by tilly (Archbishop) on Jan 24, 2002 at 17:51 UTC
    If you just want to turn HTML into something that will display as plain text, then you don't really need to parse it. Instead, just use the HTML::Entities module's encode_entities() function. (It ships with the HTML-Parser distribution, which many Perl installations include. Type perldoc HTML::Entities at the command line; if that gives you documentation, then you have it.) If you have it, then you can use it as follows:

    use HTML::Entities;
    # Insert much code here
    my $escaped_html = encode_entities($raw_html);
    # Insert the rest of your program

    If you want to retain the formatting, then you have to work harder. The simplest solution is to just surround the included HTML with pre tags. Another common solution is to turn returns into br tags and leading spaces into &nbsp; entities. You can also use the textarea solution that was already given, but it only works with arbitrary HTML if you have already escaped it. (Otherwise nothing stops the HTML from having a closing textarea tag to mess you up.)
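
    A minimal sketch of the "returns into br tags and leading spaces into &nbsp;" idea described above. The helper names escape_html and preserve_layout are made up for illustration, and escape_html is a hand-rolled stand-in used only to keep the sketch free of non-core modules; in real code encode_entities() from HTML::Entities would normally take its place.

```perl
use strict;
use warnings;

# Hypothetical stand-in for encode_entities(): escapes only the four
# characters that matter when showing markup as text.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Escape the markup, then make indentation and line breaks survive
# HTML whitespace collapsing.
sub preserve_layout {
    my ($text) = @_;
    my $out = escape_html($text);
    $out =~ s/^( +)/'&nbsp;' x length($1)/gme;    # leading spaces -> &nbsp;
    $out =~ s/\n/<br>\n/g;                        # newlines -> <br>
    return $out;
}

print preserve_layout("<ul>\n  <li>Thing One</li>\n</ul>\n");
```

    Escaping happens before the &nbsp; substitution on purpose: done the other way round, the ampersand in each &nbsp; would itself be escaped.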
Re: how to force HTML to display as text
by JayBonci (Curate) on Jan 24, 2002 at 15:45 UTC
    Alternatively, if you want the entire page shown as-is without escaping your HTML at all, you can begin your CGI or mod_perl web page script with:
    print "Content-type: text/plain\n\n";
    This will fool your browser into thinking it opened a text file instead of an HTML file on your server. I'm not sure if you wanted it embedded in another HTML page, but going through the trouble of escaping it would be silly if you could merely do a little MIME magic and have it all work out... this of course depends on your intended use of the aforementioned file. Good luck!
Re: how to force HTML to display as text
by Chrisf (Friar) on Jan 24, 2002 at 16:04 UTC
    TAAFWWTDI - There's always a far worse way to do it...
    #!/usr/bin/perl -wT
    use strict;

    print "Content-type: text/html\n\n";
    print <<HEADER;
    <html>
    <head>
    <title>My HTML</title>
    </head>
    <body>
    <p>Here's the HTML:</p>
    <textarea cols="50" rows="20">
    HEADER

    open HTML, "<data.html" or die "Can't open data.html: $!";
    while (<HTML>) {
        print $_;
    }
    close HTML;

    print <<FOOTER;
    </textarea>
    </body>
    </html>
    FOOTER

    That works in some browsers, doesn't work in others.

    p.s. Make sure to follow grep's suggestion about code tags ;-)

Re: how to force HTML to display as text
by Hero Zzyzzx (Curate) on Jan 24, 2002 at 23:59 UTC

    Or, since you're using CGI.pm already to parse your form variables (of course you are, right!?), you can just use its escapeHTML function, too:

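    use CGI;
    my $q = CGI->new();
    my $file_loc = $sorc;    # $sorc is assumed to hold the path to the file
    # fopen/fread/fclose are PHP functions, not Perl; open the file and slurp it instead
    open my $fh, '<', $file_loc or die "Can't open $file_loc: $!";
    my $file_contents = do { local $/; <$fh> };
    close $fh;
    print $q->header();
    print $q->pre($q->escapeHTML($file_contents));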

    which allows you to display your HTML very reliably, without IE idiotically second-guessing your MIME types. Plus, you can wrap this in whatever real HTML you like.

    HTH

    -Any sufficiently advanced technology is
    indistinguishable from doubletalk.

Re: how to force HTML to display as text
by beebware (Pilgrim) on Jan 24, 2002 at 23:04 UTC
    Using:
    print "Content-type:text/plain\n\n";
    should work. However, some versions of Internet Explorer try to be clever and might recognise an HTML file as such even with the text/plain MIME type header. It is a bug in Internet Explorer (and, in fact, can be exploited), but I don't think Microsoft is willing to do anything about it.
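
    A hedged sketch of the text/plain approach with one addition that is not from this thread: the X-Content-Type-Options: nosniff response header, which later browsers (Internet Explorer 8 onward, and current browsers) treat as a request not to guess the content type. It does nothing for the older IE versions described above. The function name plain_text_response is made up for illustration.

```perl
use strict;
use warnings;

# Build a complete CGI response that serves $body as plain text.
# The nosniff header is an assumption added here, not something the
# thread itself suggests; older browsers simply ignore it.
sub plain_text_response {
    my ($body) = @_;
    return "Content-type: text/plain\n"
         . "X-Content-Type-Options: nosniff\n"
         . "\n"          # blank line ends the header block
         . $body;
}

print plain_text_response("<b>shown literally, not rendered</b>\n");
```

    If the output has to display correctly in browsers that ignore both headers, escaping the markup and serving it as text/html remains the safer route.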
Re: how to force HTML to display as text
by chipmunk (Parson) on Jan 25, 2002 at 08:07 UTC
    A quick way to embed literal HTML inside an HTML document is with an xmp tag. I would have put an actual example here, but apparently PerlMonks doesn't like the xmp tag. Here's how you would use it:
    <xmp>
    HTML list example:
    <ul>
    <li>Thing One</li>
    <li>Thing Two</li>
    </ul>
    </xmp>

    Of course, if the literal HTML contains an XMP tag, you've got problems. Encoding the angle brackets and ampersands is the most robust solution.
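
    To illustrate the failure mode just mentioned, here is a rough sketch that uses xmp only when the content cannot close it early, and otherwise falls back to encoding angle brackets and ampersands. The names contains_xmp_close and embed_literal are hypothetical, and the check is a simple pattern match, not a full HTML parse.

```perl
use strict;
use warnings;

# True if $html contains something a browser would likely read as a
# closing xmp tag (case-insensitive).
sub contains_xmp_close {
    my ($html) = @_;
    return $html =~ m{</xmp\b}i ? 1 : 0;
}

# Wrap in <xmp> when that is safe; otherwise escape and wrap in <pre>.
sub embed_literal {
    my ($html) = @_;
    return "<xmp>\n$html\n</xmp>" unless contains_xmp_close($html);

    (my $escaped = $html) =~ s/&/&amp;/g;    # ampersands first
    $escaped =~ s/</&lt;/g;
    $escaped =~ s/>/&gt;/g;
    return "<pre>$escaped</pre>";
}

print embed_literal("<ul><li>Thing One</li></ul>"), "\n";
print embed_literal("sneaky </xmp> content"), "\n";
```

    In practice, always taking the escaping branch is simpler and matches the "most robust solution" described above; the xmp branch is kept only to show where it breaks.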

      Yes, thank you all for your very insightful responses, but why the hell did my post get downvoted? Why not up?! Come on, I got 5 exp!!! I used to have 10! I WANT TO VOTE!
