PerlMonks  

PDF::API2 / unicode characters

by PerlSci (Initiate)
on Feb 17, 2012 at 02:32 UTC

PerlSci has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm trying to print utf-8 characters (greek letters primarily) using PDF::API2's core fonts, and I'm having no luck (which seems strange, because the .pm's for Verdana and Georgia seem to have the characters I need listed...).

Here are the primary things I'm having trouble figuring out:

-Are utf-8 characters actually supported by the core fonts, or do I need to import additional fonts to get an extended character set?

-Where can I get additional fonts to import for use, if I need them (or, how are they built/configured, if I can't get them off the shelf/ where the heck do I find doc on this??)

-Is there any documentation on using utf-8 with PDF::API2, or any good documentation on PDF::API2 in general, anywhere out there? (Or, alternatively... is there a better PDF printing package you'd recommend? I've already done a lot of work with PDF::API2, but if there's truly something better...)

Thank you in advance for the help!

Andrea

Replies are listed 'Best First'.
Re: PDF::API2 / unicode characters
by LonelyPilgrim (Beadle) on Feb 17, 2012 at 16:32 UTC

    Hello Andrea,

    I am fairly new to PDF::API2 -- I took a crack at figuring it out a few weeks ago, but decided it was more trouble than it was worth for what I needed -- but I saw your lonely post and felt inclined to offer what help I could, being a lover of Unicode and of Greek characters especially.

    To your questions:

    -- According to the PDF::API2::Resource::Font::CoreFont manpage, the core fonts they list (Times Roman, Verdana, etc.) *ought* to support UTF8, and the standard Greek characters.

    -- According to the PDF::API2 manpage, you should be able to import standard TrueType fonts.

    -- PDF::API2 seems to be the best interface for PDF that I've found. But I am, as I said, a newb.

    I just followed what appeared to be the procedure for assigning a Unicode font with PDF/API2/Resource/UniFont.pm, but it doesn't seem to be working. I can see why you are frustrated! I need to get to class, but my interest is now piqued, and I'll try to work more on this later today. For what it's worth, and for anyone else who comes along to help, here's what I'm working with:

    #!/usr/bin/perl
    use strict;
    use warnings 'all';
    use PDF::API2;
    use PDF::API2::Resource::UniFont;

    my $test = 'test.pdf';

    # Create a blank PDF file
    my $pdf = PDF::API2->new();

    # Add a blank page
    my $page = $pdf->page();

    # Add a built-in font to the PDF
    my $font = $pdf->corefont('Times-Roman');

    # Register a Unicode font
    my $ufont = PDF::API2::Resource::UniFont->new($pdf, {font => $font});

    # Create a Unicode string (Greek)
    my @uhex = qw(03a7 03b1 03b9 03c1 03b5 0021);
    my @uchars = map(hex, @uhex);
    my $ustring = pack("U*", @uchars);

    # Add some text to the page
    my $text = $page->text();
    $text->font($ufont, 20);
    $text->translate(80, 710);
    $text->text($ustring);

    # Save the PDF
    $pdf->saveas($test);

    All it puts on the page, presently, is the ASCII, non-Unicode exclamation point.
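Before blaming the font, it can help to confirm that the string itself is what you expect. The following is a minimal sketch using only core Perl (no PDF::API2 involved); the variable names are illustrative. It builds the same six characters three ways and prints their code points.

```perl
use strict;
use warnings;

# Three ways to write the same six-character string (five Greek letters
# plus an ASCII exclamation point).
my @uhex    = qw(03a7 03b1 03b9 03c1 03b5 0021);
my $packed  = pack("U*", map { hex $_ } @uhex);
my $escaped = "\x{03a7}\x{03b1}\x{03b9}\x{03c1}\x{03b5}!";
my $named   = "\N{U+03A7}\N{U+03B1}\N{U+03B9}\N{U+03C1}\N{U+03B5}!";

# Report each character's code point so the contents can be checked by eye.
my @codepoints = map { sprintf "U+%04X", ord $_ } split //, $packed;

binmode STDOUT, ':encoding(UTF-8)';
print "same string: ",
    ($packed eq $escaped && $escaped eq $named ? "yes" : "no"), "\n";
print "length in characters: ", length($packed), "\n";
print "code points: @codepoints\n";
```

If all three forms compare equal and the length is 6, the string is a proper character string, which suggests the problem lies in how the font is set up rather than in how the text was built.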

      I believe the core fonts (i.e. what you get via $pdf->corefont(...) ) simply don't work with Unicode (investigation of the sources and extensive playing around essentially confirmed my suspicion that the generated font descriptions are always classical "Type 1" with an 8-bit encoding vector).  I'd be happy to be proven wrong, though!
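To illustrate what an 8-bit encoding vector implies, here is a rough sketch using the core Encode module. It is an analogy only, not PDF::API2's internal code: ISO-8859-1 is used as a stand-in for a Latin 8-bit encoding, and `representable_chars` is a hypothetical helper written for this example.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Return only those characters of $string that the named encoding can represent.
sub representable_chars {
    my ($encoding, $string) = @_;
    my $kept = '';
    for my $char (split //, $string) {
        my $copy  = $char;    # encode() with a CHECK value modifies its input
        my $bytes = encode($encoding, $copy, Encode::FB_QUIET);
        $kept .= $char if length $bytes;
    }
    return $kept;
}

my $ustring   = "\x{03a7}\x{03b1}\x{03b9}\x{03c1}\x{03b5}!";
my $latin1_ok = representable_chars('iso-8859-1', $ustring);
my $greek_ok  = representable_chars('iso-8859-7', $ustring);

binmode STDOUT, ':encoding(UTF-8)';
print "iso-8859-1 keeps: $latin1_ok\n";
print "iso-8859-7 keeps: $greek_ok\n";
```

With the Latin encoding only the exclamation point survives, which matches the symptom described above; a Greek 8-bit encoding such as ISO-8859-7 keeps all six characters, but any single 8-bit table can cover only a few hundred characters at most.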

      OTOH, when you use an appropriate TTF font which has the required glyphs, things work just fine:

      #!/usr/bin/perl
      use strict;
      use warnings;
      use PDF::API2;

      # Create a blank PDF file
      my $pdf = PDF::API2->new();

      # Add a blank page
      my $page = $pdf->page();

      my $font = $pdf->ttfont('DejaVuSans.ttf');

      my $ustring = "\x{03a7}\x{03b1}\x{03b9}\x{03c1}\x{03b5}!";

      # Add some text to the page
      my $text = $page->text();
      $text->font($font, 20);
      $text->translate(80, 710);
      $text->text($ustring);

      # Save the PDF
      $pdf->saveas('test.pdf');

      DejaVu is a free font with very good Unicode coverage (see the unicover.txt file that comes with the package for details).  And if you feel like designing your own glyphs, you can even download the font's FontForge source files...

      P.S.: older versions of PDF::API2 (up to the maintainer change with 2.016) shipped with the DejaVu fonts included, so you may already have them under PDF/API2/fonts/.   Otherwise, if you install them somewhere else, you might want to specify the full path to the respective .ttf file.
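If you are unsure where the .ttf file lives, one option is to look through a few likely directories and pass the first match on as a full path. This is only a sketch: `find_font_file` is a hypothetical helper, and the directories listed are guesses that will differ between systems.

```perl
use strict;
use warnings;
use File::Spec;

# Return the full path of the first directory that contains $file,
# or undef if none of the directories has it.
sub find_font_file {
    my ($file, @dirs) = @_;
    for my $dir (@dirs) {
        my $path = File::Spec->catfile($dir, $file);
        return $path if -f $path;
    }
    return;
}

# Hypothetical search locations; adjust for your own system.
my @font_dirs = (
    '/usr/share/fonts/truetype/dejavu',
    '/Library/Fonts',
    'C:/Windows/Fonts',
);

my $font_path = find_font_file('DejaVuSans.ttf', @font_dirs);
print defined $font_path
    ? "Found font: $font_path\n"
    : "DejaVuSans.ttf not found in the listed directories\n";

# With a path in hand, it would be passed to PDF::API2 as in the script above:
#   my $font = $pdf->ttfont($font_path);
```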

        It works! I think my (and perhaps Andrea's) mistake was assuming that the PDF::API2 "core fonts" were the same as the standard TrueType fonts of the same name -- in my case, the version of Times New Roman in the Windows Fonts folder. When I changed your font line in the script above to this:

        my $font = $pdf->ttfont('C:/Windows/Fonts/times.ttf');

        I got the appropriate Unicode text in my PDF, in Times New Roman. The standard TrueType versions of Times New Roman, Verdana, and other fonts that come with Windows and probably other systems do support at least the basic Greek character set. That was why I assumed that the core fonts "should" support Greek. Thanks for your help!
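To get a feel for how much coverage a font needs for a given text, you can ask Perl which Unicode blocks the characters belong to. A small sketch with the core Unicode::UCD module follows; `blocks_used` is a hypothetical helper, and U+1FC6 is included only as a sample of an accented letter outside the basic Greek block.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charblock);

# Count how many characters of $string fall into each Unicode block.
sub blocks_used {
    my ($string) = @_;
    my %count;
    $count{ charblock(ord $_) }++ for split //, $string;
    return \%count;
}

my $basic    = blocks_used("\x{03a7}\x{03b1}\x{03b9}\x{03c1}\x{03b5}!");
my $extended = blocks_used("\x{1fc6}");    # eta with perispomeni (circumflex)

for my $block (sort keys %$basic) {
    print "$block: $basic->{$block}\n";
}
print "U+1FC6 is in: ", join(', ', keys %$extended), "\n";
```

A font that covers only the basic Greek block will handle the test string but not polytonic text, whose accented letters sit in a separate block.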

        Thanks, too, for the recommendation for the DejaVu fonts. They seem like nice ones! I am kind of a font hoarder, especially for useful Unicode ones. For the extended Greek character set, diacritics and such, I also like the New Athena fonts (one of which supports the inverted breve circumflex, as opposed to the tilde circumflex many use).

        http://www.fontspace.com/american-philological-association/new-athena-unicode

        Correction once again: New Athena is a nice one, but it's Gentium and GentiumAlt that have the inverted breve circumflex:

        http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=gentium
