Encode has been a core module since perl 5.7.3 and handles pretty much any encoding you can think of. You should use it.

To write a file with a specific encoding, it's enough to do things like this:

    open(my $file_handle, '>:encoding(UTF-8)', $filename) or die $!;
    print $file_handle $text_string;
    close $file_handle;

(Update: I didn't see you already used that, sorry for the noise.)
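For the read side, the same kind of I/O layer works on an input handle. The following is only a minimal sketch of a round trip: the helper names `write_text` and `read_text` are made up for illustration, and UTF-16LE is an arbitrary choice to show that the layer is not limited to UTF-8.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helpers for illustration; not part of any module.
sub write_text {
    my ($filename, $encoding, $text) = @_;
    open(my $out, ">:encoding($encoding)", $filename)
        or die "Cannot write $filename: $!";
    print {$out} $text;
    close($out) or die "Cannot close $filename: $!";
}

sub read_text {
    my ($filename, $encoding) = @_;
    open(my $in, "<:encoding($encoding)", $filename)
        or die "Cannot read $filename: $!";
    local $/;                  # slurp mode: read the whole file at once
    my $text = <$in>;
    close($in);
    return $text;
}

# Round trip through a temporary file.
my (undef, $path) = tempfile(UNLINK => 1);
my $original   = "Gr\x{fc}\x{df}e, \x{4e16}\x{754c}\n";   # character string, not bytes
write_text($path, 'UTF-16LE', $original);
my $round_trip = read_text($path, 'UTF-16LE');

print "round trip ", ($round_trip eq $original ? "ok" : "FAILED"), "\n";
```

The point of the layers is that the program only ever sees character strings; encoding and decoding happen at the file boundary.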

Guessing character encodings can be done with Encode::Guess, but it can never be done reliably.
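As a rough illustration of both halves of that statement, here is a small sketch using `guess_encoding` from Encode::Guess. The sample byte strings and the suspect list are my own choices, not something from the original question.

```perl
use strict;
use warnings;
use Encode::Guess;   # exports guess_encoding()

# Case 1: the UTF-8 bytes for "café". With the default suspects
# this is expected to be recognised as UTF-8.
my $utf8_bytes = "caf\xC3\xA9";
my $utf8_guess = guess_encoding($utf8_bytes);
my $utf8_name  = ref($utf8_guess) ? $utf8_guess->name : "failed: $utf8_guess";

# Case 2: a lone 0xE9 byte is valid in both ISO-8859-1 and ISO-8859-2,
# so the guesser cannot choose between the two suspects.
my $latin_bytes = "caf\xE9 au lait";
my $latin_guess = guess_encoding($latin_bytes, qw(iso-8859-1 iso-8859-2));

# guess_encoding() returns an encoding object on success
# and a plain error string on failure.
my $ambiguous = ref($latin_guess) ? 0 : 1;

print "case 1: $utf8_name\n";
print "case 2: ", ($ambiguous ? "could not decide ($latin_guess)" : $latin_guess->name), "\n";
```

Always check `ref()` on the return value before calling methods on it; a failed guess is reported as a string, not as an exception.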

I know of no module that combines encoding guessing with file slurping, so writing one might be worth the effort. But don't write any encoding-handling code by hand; it has all been done before and properly tested.
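To make the idea concrete, here is one way such a combination might look, built only from Encode::Guess and a raw file read. This is a sketch, not an existing module: the name `slurp_guessing` is hypothetical, and it simply dies when the guess fails.

```perl
use strict;
use warnings;
use Encode::Guess;
use File::Temp qw(tempfile);

# Hypothetical helper: slurp a file as raw bytes, guess the encoding,
# and return the decoded text together with the encoding's name.
sub slurp_guessing {
    my ($filename, @suspects) = @_;
    open(my $in, '<:raw', $filename) or die "Cannot read $filename: $!";
    my $bytes = do { local $/; <$in> };
    close($in);

    my $decoder = guess_encoding($bytes, @suspects);
    ref($decoder) or die "Cannot guess encoding of $filename: $decoder";

    return ($decoder->decode($bytes), $decoder->name);
}

# Demo: write a small UTF-8 file, then read it back without naming the encoding.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode($fh, ':encoding(UTF-8)');
print {$fh} "na\x{ef}ve caf\x{e9}\n";
close($fh);

my ($text, $encoding) = slurp_guessing($path);
print "guessed: $encoding\n";
```

The returned encoding name could then be reused in an output layer such as `>:encoding(...)` to write the modified text back in the same encoding, with the caveat that a wrong guess will silently corrupt the file.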

See perluniintro, perlunitut and perlunicode for details. I also wrote a short article on the subject.

(Update: Fixed article link; thanks for reporting, Rudif.)

In reply to Re: Module to read - modify - write text files in any unicode encoding by moritz
in thread Module to read - modify - write text files in any unicode encoding by Rudif
