This isn't a matter of choice for me. I am getting badly formatted HTML and I have to do certain tasks with it. The code may come from "popular" edits, and if it displays in a browser when I get it, it has to display the same way when it leaves. What specifically are you referring to in the HTML in my post? Then I can better explain whether it is the problem or simply my mistake. If you are referring to the misplaced center tag, that is unfortunately exactly as it is in one of the documents I am working with.
| [reply] |
if it displays in a browser when I get it, it has to display the same way when it leaves
Then you don't want to use HTML::TreeBuilder on it if it's not interpreting your bad HTML correctly. Maybe if you expanded on which sections of the HTML you are allowing the user to change, we could help with a solution.
| [reply] |
It allows editing of meta tags, the title tag, title attributes in anchor tags, and alt attributes in image tags. That part works fine, and HTML::TreeBuilder makes it fairly simple. The problem lies in the reconstruction of pages with non-compliant HTML.
I have looked at HTML::Parser and HTML::TokeParser, but they don't do the heavy lifting that HTML::TreeBuilder does for me. I am trying to avoid reinventing the wheel by using HTML::TreeBuilder. According to the HTML::TreeBuilder change log, there was a known bug with handling out-of-place content inside a table, but I doubt they envisioned anything as bad as non-table tags sitting outside of th, tr, caption, td, etc. within a table.
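
For comparison, here is a rough sketch of what the token-level route might look like for just the alt-attribute case. It assumes HTML::TokeParser and HTML::Entities (both from the HTML-Parser distribution) are installed. The names rewrite_alt and build_start_tag, and the idea of passing a hash that maps an image's src to its new alt text, are made up for illustration and are not part of any module. The point is that every token carries its original source text, so anything that is not edited can be written back untouched, misplaced tags included.

```perl
use strict;
use warnings;
use HTML::TokeParser;
use HTML::Entities qw(encode_entities);

# Hypothetical helper: rebuild a start tag from its (edited) attributes.
# Tag and attribute names come back lowercased from the parser.
sub build_start_tag {
    my ($tag, $attr, $attrseq) = @_;
    my $out = "<$tag";
    for my $name (grep { $_ ne '/' } @$attrseq) {
        my $value = $attr->{$name};
        if ($value eq $name) {
            $out .= " $name";    # boolean attribute such as "ismap"
        }
        else {
            $out .= sprintf ' %s="%s"', $name,
                encode_entities($value, '<>&"');
        }
    }
    # A trailing "/" (as in <img ... />) may be reported as an
    # attribute named "/"; if so, keep it at the end of the tag.
    $out .= ' /' if exists $attr->{'/'};
    return "$out>";
}

# Hypothetical helper: $new_alt maps an image src to its new alt text.
sub rewrite_alt {
    my ($html, $new_alt) = @_;
    my $p = HTML::TokeParser->new(\$html) or die "Cannot set up parser";
    my $out = '';
    while (my $t = $p->get_token) {
        my $type = $t->[0];
        if ($type eq 'S') {
            my ($tag, $attr, $attrseq, $text) = @{$t}[1 .. 4];
            if ($tag eq 'img'
                && defined $attr->{src}
                && exists $new_alt->{ $attr->{src} })
            {
                push @$attrseq, 'alt' unless exists $attr->{alt};
                $attr->{alt} = $new_alt->{ $attr->{src} };
                $text = build_start_tag($tag, $attr, $attrseq);
            }
            $out .= $text;    # untouched tags keep their original text
        }
        elsif ($type eq 'E' || $type eq 'PI') {
            $out .= $t->[2];  # original text of end tag or processing instruction
        }
        else {
            $out .= $t->[1];  # T (text), C (comment), D (declaration)
        }
    }
    return $out;
}

# Example: the misplaced center tag inside the table is left where it is.
my $page = '<table><center><tr><td><img src="a.gif" alt="old">'
         . '</td></tr></center></table>';
print rewrite_alt($page, { 'a.gif' => 'new' }), "\n";
```

Only the start tags that actually get edited are re-serialized, so the rest of the page should leave exactly as it arrived. The cost is the one already mentioned: the meta, title and anchor cases would each need their own branch, which is the heavy lifting HTML::TreeBuilder otherwise does.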
| [reply] |