PerlMonks  

how can i avoid double quotes in options of start tag

by phoenix007 (Sexton)
on Apr 23, 2019 at 10:22 UTC

phoenix007 has asked for the wisdom of the Perl Monks concerning the following question:

How can I avoid the double quotes in the output of the following script? I have tried different options to disable this quoting. Whenever a start tag has an attribute, the script wraps the attribute's value in double quotes.

use HTML::TreeBuilder;

my $row_html = '<!DOCTYPE html> <body> <p class=3D"MsoNormal">https://www.google.com<o:p></o:p></p> </body>';

my $html = HTML::TreeBuilder->new;
$html->ignore_ignorable_whitespace(0);
$html->no_space_compacting(1);
$html->store_comments(1);
$html->parse($row_html);

# i will do some modifications to HTML here

my $output_html = $html->as_HTML(undef,undef,{});
print $output_html;

Current output (note the extra double quotes around the value of class):

<!DOCTYPE html> <html><head></head><body> <p class="3D"MsoNormal"">https://www.google.com<o:p></o:p></p> </body> </html>

I am expecting it without the added double quotes:

<!DOCTYPE html> <body> <p class=3D"MsoNormal">https://www.google.com<o:p></o:p></p> </body>

Replies are listed 'Best First'.
Re: how can i avoid double quotes in options of start tag
by Corion (Patriarch) on Apr 23, 2019 at 11:12 UTC
    <p class=3D"MsoNormal">https://www.google.com</p>

    Your HTML is encoded as "quoted-printable". The appropriate steps are to first decode the HTML from Quoted-Printable, for example using MIME::QuotedPrint. Then, munge the HTML as appropriate. Afterwards, optionally re-encode the HTML as quoted-printable if your mail client does not do that already.
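The steps above can be sketched roughly as follows. This is a minimal sketch, assuming the mail body really is quoted-printable; it uses only `decode_qp` and `encode_qp` from MIME::QuotedPrint. The variable names and the pass-through "modification" step are placeholders for illustration, not part of the original script.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp encode_qp);

# Sample line from the question, still quoted-printable encoded.
my $raw_html = qq{<p class=3D"MsoNormal">https://www.google.com</p>\n};

# Step 1: decode, so that "=3D" becomes a plain "=".
my $decoded = decode_qp($raw_html);
print $decoded;

# Step 2: parse and modify $decoded here (for example with
# HTML::TreeBuilder). Placeholder: this sketch changes nothing.
my $modified = $decoded;

# Step 3: optionally re-encode before handing the body back to the mailer.
my $encoded = encode_qp($modified);
print $encoded;
```

After decoding, the attribute reads `class="MsoNormal"`, so a parser sees an ordinary quoted value and no longer treats `3D"MsoNormal"` as the class name.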

Re: how can i avoid double quotes in options of start tag
by LanX (Saint) on Apr 23, 2019 at 10:48 UTC
    I took a glimpse into the source of HTML::TreeBuilder and it seems to be hardcoded: $val = qq{"$val"};

    IMHO, better to try escaping the double quotes:

    "3D\"MsoNormal\""

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Perl is like chess, only without the dice
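To illustrate what the reply above describes, here is a small standalone sketch. Both helpers are hypothetical and are not code from HTML::TreeBuilder; the first mimics the quoted `qq{"$val"}` line, the second assumes "escaping" means replacing embedded double quotes with the `&quot;` entity.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap an attribute value in double quotes as-is,
# mirroring the hardcoded line quoted in the reply.
sub quote_attr_naive {
    my ($val) = @_;
    return qq{"$val"};
}

# Hypothetical variant: escape "&" and embedded double quotes first,
# so the wrapped value stays a single well-formed attribute value.
sub quote_attr_escaped {
    my ($val) = @_;
    $val =~ s/&/&amp;/g;
    $val =~ s/"/&quot;/g;
    return qq{"$val"};
}

# Attribute value as the parser sees it when the input is not decoded.
my $val = '3D"MsoNormal"';

print quote_attr_naive($val),   "\n";   # "3D"MsoNormal""
print quote_attr_escaped($val), "\n";   # "3D&quot;MsoNormal&quot;"
```

Escaping only makes the output well-formed. The class name would still contain the stray `3D` and quote characters, which is why decoding the quoted-printable input first is the cleaner fix.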

Re: how can i avoid double quotes in options of start tag
by marto (Cardinal) on Apr 23, 2019 at 10:48 UTC

    Curious, what is invalidating this HTML? 3DMsoNormal (not 3D"MsoNormal") is a class generated by Microsoft Office products.

      Yes. As a sample, I posted one line from a large HTML file generated by a Microsoft product.

        But they produce a valid class name <p class=MsoNormal> derp !</p>.

        Please provide real samples.

        Cheers Rolf
