http://qs321.pair.com?node_id=645176


in reply to Re^3: PDF::Template and character encodings
in thread PDF::Template and character encodings

I've updated the original node with a trimmed-down xml template. Essentially,  <VAR NAME='LAST_NAME'> is the problem; it contains the (misrendered) accented character.

Changing the encoding to iso8859-1 *does* fix the PDF_findfont error, but it doesn't fix the problem with the accented characters.

Re^5: PDF::Template and character encodings
by shmem (Chancellor) on Oct 16, 2007 at 15:13 UTC
    As per your OP, do you really get "~Aj", or is it perchance "Ã¡" (which is á, an a-acute)?

    If that is the case, you are getting utf-8 from your database - run that data through Encode. Alternatively, try using iso10646-1 (or utf8 without the hyphen).
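    To illustrate why UTF-8 data shows up as two odd characters, here is a minimal sketch. The byte string is a stand-in for whatever the database returns (an assumption, since the actual query isn't shown); only core Encode calls are used.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "á" (U+00E1) as the two UTF-8 octets a database might hand back (assumed input).
my $octets = "\xC3\xA1";

# Read as Latin-1, each octet becomes its own character:
# 0xC3 is "Ã" and 0xA1 is "¡", i.e. the misrendering.
my $misread = decode('iso-8859-1', $octets);
printf "misread: %d chars (U+%04X U+%04X)\n",
    length($misread), map { ord } split //, $misread;

# Decoded as UTF-8, the same octets give the single intended character.
my $char = decode('UTF-8', $octets);
printf "decoded: %d char (U+%04X)\n", length($char), ord $char;

# Re-encoded as Latin-1 for an iso8859-1 font: one octet, 0xE1.
my $latin1 = encode('iso-8859-1', $char);
printf "latin1: %d octet (0x%02X)\n", length($latin1), ord $latin1;
```

    If the data really is UTF-8, the "misread" line should report two characters and the "decoded" line one.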

    Using those fonts might fail since it seems likely that the strings coming from the database don't have the internal UTF8 flag set.
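    A quick way to check the flag, as a sketch only: the literal below stands in for a fetched column value (an assumption), and utf8::is_utf8 reports Perl's internal flag, which is a debugging aid rather than something to build logic on.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw bytes standing in for a value from a DBI fetch (assumed input).
my $octets = "\xC3\xA1";
my $before = utf8::is_utf8($octets) ? 1 : 0;

# decode() turns the octets into a character string.
my $text  = decode('UTF-8', $octets);
my $after = utf8::is_utf8($text) ? 1 : 0;

print "flag before decode: $before, length ", length($octets), "\n";
print "flag after decode:  $after, length ", length($text), "\n";
```

    A length of 2 with the flag off means Perl is treating the value as two bytes, not as one character.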

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      Re the OP and the problematic character: the top half of that 'pipe' character looks more like a dot, which is why I thought it was a 'j'. (It's supposed to be an a-accent, not an a-grave.)

      Setting pdf_encoding='utf8' also blows up with the same "can't find encoding" error.

      I'm reading up on Encode, though I'm not sure if I need an encode/decode sequence or a simpler transform.

        Erm, yes, it's an a-accent (or a-acute). Fixed in previous post.

        Looks definitely like utf-8 data not passed as such. Try

        use Encode qw(from_to);
        ...
        while ($r = $sth->fetchrow_hashref()) {
            # convert every column in place from UTF-8 octets to Latin-1 octets
            from_to($r->{$_}, "utf8", "latin1") for keys %$r;
        }

        or such, and try with iso8859-1. How did the iso10646-1 font work?
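        On the encode/decode question: from_to is the in-place shortcut for a decode followed by an encode, so either form should do. A minimal sketch comparing the two; the %row hash and its values are hypothetical, standing in for what fetchrow_hashref returns.

```perl
use strict;
use warnings;
use Encode qw(decode encode from_to);

# Hypothetical row standing in for $sth->fetchrow_hashref();
# the values are UTF-8 octets.
my %row = (LAST_NAME => "Hern\xC3\xA1ndez", FIRST_NAME => "Ana");

# Two-step form: octets -> characters -> Latin-1 octets.
my %two_step;
$two_step{$_} = encode('iso-8859-1', decode('UTF-8', $row{$_})) for keys %row;

# One-step form: from_to rewrites its first argument in place.
my %one_step = %row;
from_to($one_step{$_}, 'utf8', 'latin1') for keys %one_step;

for my $k (sort keys %row) {
    printf "%s: %s (%d octets)\n", $k,
        ($two_step{$k} eq $one_step{$k} ? "same" : "different"),
        length $one_step{$k};
}
```

        Both forms should agree for valid UTF-8 input; LAST_NAME shrinks from 10 octets to 9 because the two-octet á becomes a single Latin-1 octet.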

        I have no experience with pdflib and PDF::Template (is pdflib an external library?), and there might be more settings that interfere. For example, what is your system's default charset? Is the charset of your shell the system charset, or does it differ?

        --shmem
