Re^2: read a file and insert closing tags if not present

by valavanp (Curate)
on Mar 29, 2007 at 07:32 UTC


in reply to Re: read a file and insert closing tags if not present
in thread read a file and insert closing tags if not present

Hi GrandFather, this is the code I tried:
require HTML::TokeParser;
$p = HTML::TokeParser->new("output.xml") || die "Can't open: $!";
$p->empty_element_tags(1);
open(FH, "output.xml");
print FH $p;
close FH;
output.xml:
<greeting class="simple">Hello, world!
The above is a sample file in which I tried to insert the closing tag for the greeting element. My actual file contains 500 lines of tagged text. For example, it has a tag named <to> that is never closed, and I have to insert the closing tag. This is just an example. Thanks for your suggestion.

Re^3: read a file and insert closing tags if not present
by GrandFather (Saint) on Mar 29, 2007 at 08:00 UTC

    HTML::TreeBuilder handles that simple case:

    use strict;
    use warnings;
    use HTML::TreeBuilder;

    my $sgml = <<SGML;
    <greeting class="simple">Hello, world!
    SGML

    my $root = HTML::TreeBuilder->new ();
    $root->ignore_unknown (0);
    $root->parse ($sgml);
    print $root->guts (0)->as_XML ();

    Prints:

    <greeting class="simple">Hello, world!</greeting>

    although I'd not guarantee it will accept everything a real SGML document may contain.


    DWIM is Perl's answer to Gödel
      Hi GrandFather, your solution works for that case. But when I give it input like this, extra tags are inserted. How can I avoid this?

      use strict;
      use warnings;
      use HTML::TreeBuilder;

      my $sgml = <<SGML;
      <html>
      <greeting class="simple">Hello, world!<head>heading</head>
      </html>
      SGML

      my $root = HTML::TreeBuilder->new ();
      $root->ignore_unknown (0);
      $root->parse ($sgml);
      print $root->guts (0)->as_XML ();

      Thanks for your suggestion.
Re^3: read a file and insert closing tags if not present
by f00li5h (Chaplain) on Mar 29, 2007 at 07:46 UTC

    You can guess sometimes, but there is no way of knowing where the right place for it is.

    In the example <p> foo <p> bar, you can see where the </p>s should go, because you can't nest p tags. But if you have <span style="rly">Oh, rly<span style="ya">ya, rly, there is no real way of knowing where the </span>s should go, because they can legally be nested.

    You'll most likely have to write rules for how (and where) to end each tag, so that you don't mess up the nesting of things (like finding your whole document inside an <a href="foo"> or something).
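
A minimal sketch of what such rules could look like, in core Perl only. The names close_tags and %no_nest are made up for this illustration and are not part of any module. It assumes the rule set is just "these tags may not contain themselves", and it guesses that anything still open belongs closed at the very end. It also splits on a naive tag pattern, so comments or attribute values containing ">" would confuse it.

```perl
use strict;
use warnings;

# close_tags() and %no_nest are hypothetical names for this sketch.
# %no_nest lists the tags that may not contain themselves (the "rules").
sub close_tags {
    my ($text, $no_nest) = @_;
    my @open;        # stack of currently open tag names
    my $out = '';

    # The capturing group makes split return the tags as well as the text.
    for my $piece (split /(<[^>]*>)/, $text) {
        if ($piece =~ m{^</([\w:-]+)}) {
            my $name = $1;
            if (grep { $_ eq $name } @open) {
                # Close whatever is still open inside this element first.
                $out .= '</' . pop(@open) . '>' while $open[-1] ne $name;
                pop @open;
            }
        }
        elsif ($piece =~ m{^<([\w:-]+)} && $piece !~ m{/>$}) {
            my $name = $1;
            # Rule: a tag that cannot nest ends the previous one of its kind.
            if ($no_nest->{$name} && grep { $_ eq $name } @open) {
                $out .= '</' . pop(@open) . '>' while $open[-1] ne $name;
                $out .= '</' . pop(@open) . '>';
            }
            push @open, $name;
        }
        $out .= $piece;
    }

    # Whatever is still open gets closed at the end, which is only a guess.
    $out .= '</' . pop(@open) . '>' while @open;
    return $out;
}

my %no_nest = (p => 1);
print close_tags('<p> foo <p> bar', \%no_nest), "\n";
# prints: <p> foo </p><p> bar</p>
print close_tags('<span style="rly">Oh, rly<span style="ya">ya, rly', \%no_nest), "\n";
# prints: <span style="rly">Oh, rly<span style="ya">ya, rly</span></span>
```

The span case shows the limitation described above: the sketch simply closes both spans at the end, which is legal nesting but may not be where the author meant them to go. For a tag like <to>, adding it to %no_nest would at least end each one before the next begins.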

    @_=qw; ask f00li5h to appear and remain for a moment of pretend better than a lifetime;;s;;@_[map hex,split'',B204316D8C2A4516DE];;y/05/os/&print;
