PerlMonks  

Re: Keeping bad HTML bad

by fruiture (Curate)
on Aug 23, 2002 at 20:32 UTC ( [id://192441]=note )


in reply to Keeping bad HTML bad

First of all: check the HTML in your post ;)

Secondly, it's simple: you cannot handle something as HTML that isn't HTML. Call me stubborn, but if someone enters such wrong (and deprecated) markup, he'll have to live with the consequences.

--
http://fruiture.de

Re: Re: Keeping bad HTML bad
by trs80 (Priest) on Aug 23, 2002 at 20:52 UTC
    This isn't a matter of choice for me. I am getting badly formatted HTML and I have to do certain tasks with it. The code may come from "popular" editors, and if it displays in a browser when I get it, it has to display the same way when it leaves. What specifically are you referring to in the HTML in my post? That would help me explain whether it is the problem or simply my mistake. If you are referring to the misplaced center tag, that is unfortunately exactly as it appears in one of the documents I am working with.
      if it displays in a browser when I get it, it has to display the same way when it leaves

      Then you don't want to use HTML::TreeBuilder on it if it's not interpreting your bad HTML correctly. If you expand on which sections of the HTML you are allowing the user to change, maybe we can help with a solution.
        It allows for editing meta tags, the title tag, the title attribute in anchor tags, and alt attributes in image tags. That part works fine, and HTML::TreeBuilder makes it fairly simple. The problem lies in the reconstruction of pages with non-compliant HTML.

        I have looked at HTML::Parser and HTML::TokeParser, but they don't do the heavy lifting that HTML::TreeBuilder does for me. I am trying to avoid reinventing the wheel by using HTML::TreeBuilder. According to the HTML::TreeBuilder change log, there was a known bug with handling out-of-place information inside a table, but I doubt they envisioned anything as bad as non-table tags sitting inside a table but outside of its th, tr, caption, td, etc. tags.
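
For the kind of edit described in this thread (changing alt attributes without disturbing broken markup), one possible approach is a pass-through filter built on HTML::Parser rather than a tree: every event is written back using its original source text, and only the tags being edited are rebuilt. The following is a minimal sketch under that assumption, not the poster's code. The helper names `set_img_alt` and `rebuild_img` are hypothetical, and only the alt attribute of img tags is handled.

```perl
use strict;
use warnings;
use HTML::Parser   ();
use HTML::Entities qw(encode_entities);

# Hypothetical helper (not from the thread): replace the alt text of every
# <img> tag and copy everything else through exactly as it was received.
sub set_img_alt {
    my ($html, $new_alt) = @_;
    my $out = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h     => [
            sub {
                my ($tag, $attr, $attrseq, $text) = @_;
                if ($tag eq 'img') {
                    # Only the edited tag is regenerated.
                    $out .= rebuild_img($attr, $attrseq, $new_alt);
                }
                else {
                    # Any other start tag keeps its original source text.
                    $out .= $text;
                }
            },
            'tagname, attr, attrseq, text',
        ],
        # Text, end tags, comments and declarations: emit the source text.
        default_h => [
            sub {
                my ($text) = @_;
                $out .= $text if defined $text;
            },
            'text',
        ],
    );

    $parser->parse($html);
    $parser->eof;    # flush any text still buffered by the parser
    return $out;
}

# Rebuild an <img> tag from its parsed attributes, keeping their order.
sub rebuild_img {
    my ($attr, $attrseq, $new_alt) = @_;
    my %value = (%$attr, alt => $new_alt);

    # A trailing "/" (as in <img ... />) may be reported as an attribute.
    my @names = grep { $_ ne '/' } @$attrseq;
    push @names, 'alt' unless exists $attr->{alt};

    my $tag = '<img';
    for my $name (@names) {
        $tag .= sprintf ' %s="%s"', $name,
            encode_entities($value{$name}, '<>&"');
    }
    $tag .= ' /' if exists $attr->{'/'};
    return "$tag>";
}

my $bad = '<table><center><img src="a.gif" alt="old"></center><tr><td>x</table>';
print set_img_alt($bad, 'new'), "\n";
```

Because nothing is re-nested or auto-closed, a misplaced center tag inside a table leaves the filter in the same position it arrived in. The trade-off is that the rebuilt img tag is normalized (lowercased attribute names, double-quoted values), and the tree navigation that HTML::TreeBuilder provides is not available.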
