
Re: Having HTML::Parser problem

by pfaut (Priest)
on May 23, 2003 at 01:30 UTC

in reply to Having HTML::Parser problem

From the HTML::Parser docs:

$p->unbroken_text( $bool )

By default, blocks of text are given to the text handler as soon as possible (but the parser makes sure to always break text at the boundary between whitespace and non-whitespace so single words and entities always can be decoded safely). This might create breaks that make it hard to do transformations on the text. When this attribute is enabled, blocks of text are always reported in one piece. This will delay the text event until the following (non-text) event has been recognized by the parser.

Note that the offset argspec will give you the offset of the first segment of text and length is the combined length of the segments. Since there might be ignored tags in between, these numbers can't be used to directly index in the original document file.
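To see what the quoted passage means in practice, here is a minimal sketch that feeds the same HTML to the parser in pieces, once with the default setting and once with unbroken_text enabled. The helper name text_chunks and the sample markup are made up for illustration; only HTML::Parser's documented new, unbroken_text, parse and eof calls and the dtext argspec are assumed.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: returns every chunk the text handler receives.
sub text_chunks {
    my ($unbroken, @pieces) = @_;
    my @chunks;
    my $p = HTML::Parser->new(
        api_version => 3,
        # 'dtext' passes the entity-decoded text to the handler
        text_h      => [ sub { push @chunks, shift }, 'dtext' ],
    );
    $p->unbroken_text($unbroken);
    $p->parse($_) for @pieces;    # feed the document piece by piece
    $p->eof;                      # flush anything still buffered
    return @chunks;
}

# Sample input, deliberately split in the middle of a word.
my @pieces = ('<p>Hello wo', 'rld, this is ', 'one paragraph</p>');

my @split  = text_chunks(0, @pieces);
my @joined = text_chunks(1, @pieces);

print scalar(@split),  " chunk(s) with the default setting\n";
print scalar(@joined), " chunk(s) with unbroken_text: '$joined[0]'\n";
```

With unbroken_text on, the handler should be called once per run of text between tags, so a transformation such as a regex substitution sees the whole string rather than fragments.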

90% of every Perl application is already written.

Re: Re: Having HTML::Parser problem
by nysus (Vicar) on May 23, 2003 at 01:47 UTC
    That did the trick. I will have to bone up on this module, because I don't yet understand how or why it works. Beautiful, thanks.

    $PM = "Perl Monk's";
    $MCF = "Most Clueless Friar Abbot Bishop";
    $nysus = $PM . $MCF;
