PerlMonks  

Re^2: Problems with XML encoding

by Sombrerero_loco (Beadle)
on Dec 29, 2009 at 11:49 UTC


in reply to Re: Problems with XML encoding
in thread Problem with quotes, speciao characters and so on, reading a xml file

Hi. I don't really need to read it as XML, because I only want to do some substitutions. This is the weird line as it appears in the XML file:
<hwAssetUserField3 type="attrib">CENTRO DE APOYO INFORMáTICO </hwAssetUserField3>
As you can see, it seems to be in a valid format in the XML file. I don't care about the encoding because I'm reading the file as a normal file, not as an XML file; that is, line by line, to do some "raw" operations and write the result to another file. Thanks anyway.

Re^3: Problems with XML encoding
by FalseVinylShrub (Chaplain) on Dec 29, 2009 at 12:14 UTC

    Hi

    Hmm, in that case I think I misunderstood your problem, though I still think you should use some XML technology ;-) If you are doing simple substitutions, could you do them using XSLT?

    However, perhaps your problem is not with XML representations but with reading Unicode in. Assuming you're using Perl v5.8-v5.10, how are you opening the file? You need to tell Perl the encoding - presumably UTF-8.

    You can do this in a number of ways:
        # use binmode on the filehandle
        open my $fh, '<', "file" or die "... $!";
        binmode $fh, ':utf8';

        # open $fh for reading UTF-8
        open(my $fh, "<:encoding(UTF-8)", "file") or die "... $!";

        # Use the open pragma to open all input files as UTF-8
        # see http://perldoc.perl.org/open.html
        use open IN => ':utf8';

        # or you can manually use ...
        use Encode qw(decode_utf8);
        $str = decode_utf8( $str );
        # on each data item

    In your case, easiest to use binmode on the filehandle - at least to find out if this is the problem.
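    To make the line-by-line approach concrete, here is a minimal sketch. It assumes the file really is UTF-8; the file names, the sample line and the substitution are made up for illustration. The point is only that the input is decoded on the way in and encoded again on the way out, so the regex works on characters rather than bytes.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical paths in a scratch directory; substitute your own files.
my $dir      = tempdir( CLEANUP => 1 );
my $in_path  = "$dir/in.xml";
my $out_path = "$dir/out.xml";

# Create a small sample input modelled on the line from the question.
# \x{E1} is a lowercase a with acute accent; the layer writes it as UTF-8.
open my $seed, '>:encoding(UTF-8)', $in_path or die "Cannot write $in_path: $!";
print {$seed} qq{<hwAssetUserField3 type="attrib">CENTRO DE APOYO INFORM\x{E1}TICO </hwAssetUserField3>\n};
close $seed or die "Cannot close $in_path: $!";

# Decode UTF-8 when reading and encode it again when writing.
open my $in,  '<:encoding(UTF-8)', $in_path  or die "Cannot read $in_path: $!";
open my $out, '>:encoding(UTF-8)', $out_path or die "Cannot write $out_path: $!";

while ( my $line = <$in> ) {
    # Example "raw" substitution: uppercase the accented letter.
    # The pattern matches one character, not two bytes.
    $line =~ s/\x{E1}/\x{C1}/g;
    print {$out} $line;
}

close $in  or die "Cannot close $in_path: $!";
close $out or die "Cannot close $out_path: $!";
```

    If the output still looks wrong with both layers in place, the input is probably not UTF-8 after all, and the layer name would need to match whatever encoding the file actually uses.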

    There are many documents trying to explain Unicode in Perl. I quite like this one. Be aware that Unicode support and the surrounding issues have changed quite a lot between Perl versions. v5.6 is completely different from the above, for example.

    FalseVinylShrub

    Disclaimer: Please review and test code, and use at your own risk... If I answer a question, I would like to hear if and how you solved your problem.

Re^3: Problems with XML encoding
by Jenda (Abbot) on Dec 30, 2009 at 11:14 UTC

    No matter whether you want to extract data or do some transformations, you should NOT attempt to do it without an XML parser. If XSLT looks incomprehensible to you (it does to me), and XML::LibXML::SAX does as well, try, for example, XML::Twig or XML::Rules. Maybe one of them will make sense to you. There are examples on this site and elsewhere.

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.
