Locale::Maketext Lexicon Opinions

by jk2addict (Chaplain)
on Jun 04, 2003 at 15:30 UTC ( #263023=perlquestion )

jk2addict has asked for the wisdom of the Perl Monks concerning the following question:

I'm currently working on a new app and decided that just for giggles, I would start out with localization in mind, if for no other reason than because-I-can. :-)

With a quick scan of Locale::Maketext I created MyApp::L10N and MyApp::L10N::en_us files. Everything works fine. Happy happy.

Now back to my question. In everyone's opinion, which is the most preferred way to do your %Lexicon?

I. Use the primary language as the keys and rely on _AUTO to do what you want until the lexicons are filled in.

    package MyApp::L10N::en_us;
    ...
    %Lexicon = (
        'Item Not Found' => 'Item Not Found'
    );

    package MyApp::L10N::jibberish;
    ...
    %Lexicon = (
        'Item Not Found' => 'sdfrty dfr5 ffgdfg'
    );

II. Use more constant-like key names and create your entire primary language lexicon from the start.

    package MyApp::L10N::en_us;
    ...
    %Lexicon = (
        'ITEM_NOT_FOUND' => 'Item Not Found'
    );

    package MyApp::L10N::jibberish;
    ...
    %Lexicon = (
        'ITEM_NOT_FOUND' => 'sdfrty dfr5 ffgdfg'
    );

I'm really torn about which method to use. On the one hand, making my lexicon keys the primary language statements (I.) makes sense. I can delay the completion of my primary lexicon until I'm finished, or allow _AUTO to do the right thing. But the programmer in me screams that I shouldn't force other language lexicons to have the English version as their key.
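A minimal sketch of approach I, assuming a base class MyApp::L10N whose lexicon sets _AUTO. The French class and the 'Cart Is Empty' phrase are made up for illustration and stand in for the "jibberish" lexicon above:

```perl
use strict;
use warnings;

# Project base class. A true _AUTO entry means any key that is not
# found is treated as its own translation.
package MyApp::L10N;
use parent 'Locale::Maketext';
our %Lexicon = ( _AUTO => 1 );

# Primary language: nothing filled in yet, lookups fall through to _AUTO.
package MyApp::L10N::en_us;
use parent -norequire, 'MyApp::L10N';
our %Lexicon = ();

# A second language (hypothetical), keyed by the English phrase.
package MyApp::L10N::fr;
use parent -norequire, 'MyApp::L10N';
our %Lexicon = ( 'Item Not Found' => 'Article introuvable' );

package main;

my $en = MyApp::L10N->get_handle('en-us') or die "no English handle";
my $fr = MyApp::L10N->get_handle('fr')    or die "no French handle";

my $en_text         = $en->maketext('Item Not Found');  # supplied by _AUTO
my $fr_text         = $fr->maketext('Item Not Found');  # translated entry
my $fr_untranslated = $fr->maketext('Cart Is Empty');   # falls back to the key

print "$en_text\n$fr_text\n$fr_untranslated\n";
```

With this layout an untranslated phrase shows up in the primary language instead of failing, which is the "fill it in later" behaviour described above.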

Consider this: I misspell my English lexicon key as "Ite Not Found". Now every separate language lexicon has to alter its key to correct the spelling mistake. That's more work than needed if we had just created the key as ITEM_NOT_FOUND instead. Sure, I could misspell my keys here too, but I'm more likely to get key names right than entire English sentences. :-)

I've seen both ways touted, usually the first method more than the latter. I just wanted to see what the prevailing practice seems to be.


Replies are listed 'Best First'.
Re: Locale::Maketext Lexicon Opinions
by december (Pilgrim) on Jun 04, 2003 at 17:22 UTC

    I don't know what the 'prevailing practice' would be here, but I use the latter method: a simple, short key name that refers to the function of the (error) message. This makes it much easier to change the actual error message, even in English, because you have all the values in one hash that can be revised, clarified, corrected and spell-checked later on, when real people (as opposed to programmer-aliens) have to be able to make sense of the error message. Not to mention that some error messages would make terribly long (and error-prone) identifiers. Imagine you want to be more specific about 'Item Not Found'; you could end up with keys like 'Proc DoSomething: Item Not Found: Empty Record: End Of File?'. What are the odds you match that exact string without double-checking and/or copy-n-paste?...

    No, my money is on very short, conceptual keys that can make you understand with one or two concatenated words where in your program's code things went wrong, like e.g. eofDatabase, paramFileName, validateFirstName, etc. It doesn't take a genius to find out what those error keys refer to, and (at least to me) they seem very hard to spell wrong.

    Hope this makes sense,


    PS: Nice to see some (voluntary) use of internationalization.

Re: Locale::Maketext Lexicon Opinions
by TomDLux (Vicar) on Jun 04, 2003 at 17:22 UTC

    Short, succinct constant names are easier to get right and more meaningful. Otherwise, don't sweat it.

    What if you misspell your constant name as ITE_NOT_FOUND and, since it's only used once, or you cut-and-paste it into the two or three places it is used, you don't detect the typo? Maybe you should spell-check your document.
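One way to catch such a typo without spell-checking source code is to compare the key sets of the lexicons. This is only a sketch: the two hashes and the missing_keys helper are hypothetical stand-ins for the real %Lexicon hashes, and CART_EMPTY is an invented second key:

```perl
use strict;
use warnings;

# Hypothetical lexicons with constant-style keys; the second one
# carries the ITE_NOT_FOUND typo.
my %en = (
    ITEM_NOT_FOUND => 'Item Not Found',
    CART_EMPTY     => 'Cart Is Empty',
);
my %jibberish = (
    ITE_NOT_FOUND => 'sdfrty dfr5 ffgdfg',
    CART_EMPTY    => 'qwe rty',
);

# Return, sorted, the keys present in $reference but absent from $candidate.
sub missing_keys {
    my ($reference, $candidate) = @_;
    my @absent = grep { !exists $candidate->{$_} } keys %$reference;
    return sort @absent;
}

my @missing = missing_keys(\%en, \%jibberish);   # keys the translation lacks
my @unknown = missing_keys(\%jibberish, \%en);   # keys the primary lexicon lacks

print "missing from translation: @missing\n";
print "unknown to primary:       @unknown\n";
```

The same comparison works for either keying style, since it looks only at hash keys and not at what they mean.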

    What language are your constants? Other than capitalization and s/ /_/g, they are identical to the 'English' constants.

    It's kind of you to be concerned about others, but non-programming users don't care about what's inside your program. To a Korean grandmother, the text of your script is as comprehensible as the binary of a compiled program. As far as she's concerned, neither uses the alphabet ... our alphabet being meaningless squiggles to her, just as hers is to us. As for non-English programmers, they are already forced to use English, whether they use C, Java, Python or Perl: if, open, use.

    As a Canadian, I get peeved by US-oriented software standards that make me misspell words. When I use JavaScript or generate HTML in code, I sometimes write things like:

    <body bgcolor="$bgcolour">

    Luckily, some weird bug will usually distract me before I start fuming.

      Maybe you should spell-check your document.

      I think trying to spell check source code is cruel and unusual punishment. :-)

Re: Locale::Maketext Lexicon Opinions
by duelafn (Parson) on Jun 04, 2003 at 18:22 UTC

    I don't know what the prevailing practice is, as I just started on the multilingual programming trail as well. I, however, decided to use full statements as the keys for three reasons:

    • I can use _AUTO.
    • While developing, I'm likely to change my messages frequently.
    • Full English (in my case) statements carry more context and tell the translator more about how the phrase should be worded without having to resort to reading the code, or invoking that part of the program to see what the key means.

    The third reason is probably the "best" reason, though the first two were the ones that sold me (since I'm also doing it "for fun").

    Good Day,

    If we didn't reinvent the wheel, we wouldn't have rollerblades.

Re: Locale::Maketext Lexicon Opinions
by richyboy (Acolyte) on Jun 05, 2003 at 10:54 UTC
    I think it depends on how ambitious your projects are.

    If they're small scale and are unlikely to get used by the wider community, use whatever way you're happier with.

    But if you're looking to get projects used by the wider community, you need to make it as easy as possible for other programmers and translators to work on your codebase.

    This means going with I, and I would seriously consider using Locale::Maketext::Lexicon.

    When writing your app, use localization function names (typically __ or x) to mark all strings that need i18n. You then run an extraction tool over your source to pull these strings into a .pot (Portable Object Template) file, and that's the file used for creating the .po files containing the translated strings.

    Why all this hassle? PO files are pretty much the universal standard for translation files, and are used by most programming languages. There are many free and commercial tools for editing such files - check out kbabel if you run Linux, and read the kbabel help to get a good introduction to how the whole thing works.

    If you go with this suggestion, even if translators know little or nothing about your code, they still have the .pot file and can translate this without problem...almost!

    Say you go with II and have the key FILE_ERROR. What can the translator do with that?

    Not a lot really, but if your key is:

    "A file write error has occurred while trying to save %1. Please check that device %2 is available and has enough free space to complete this action"

    The translator knows exactly what you mean and can provide a good, accurate translation without having to look at the source code or contact you directly, and is far more likely to do the work rather than say "stuff it, it's too much hassle!"
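A sketch of how such a parameterised key behaves in plain Locale::Maketext, assuming an _AUTO base lexicon. Core Maketext writes placeholders as [_1] and [_2] in its bracket notation rather than %1 and %2; the file and device names are invented, and the message is shortened:

```perl
use strict;
use warnings;

# Base class with _AUTO so the English key doubles as the message text.
package MyApp::L10N;
use parent 'Locale::Maketext';
our %Lexicon = ( _AUTO => 1 );

package MyApp::L10N::en_us;
use parent -norequire, 'MyApp::L10N';
our %Lexicon = ();

package main;

my $lh = MyApp::L10N->get_handle('en-us') or die "no language handle";

# [_1] and [_2] are replaced by the first and second extra arguments.
my $message = $lh->maketext(
    'A file write error has occurred while trying to save [_1]. '
  . 'Please check that device [_2] is available.',
    'report.txt', '/dev/sdb1',
);

print "$message\n";
```

A translated lexicon would map the same key to a sentence that can place [_1] and [_2] in whatever order the target language needs.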

    Anyway, it's great to hear that you're going to i18n your app - I really hope Perl coders start to consider this more often - modern open source apps that ignore i18n are doomed to fail, and rightly so IMHO.

    Cheers, and hope this helps,

Re: Locale::Maketext Lexicon Opinions
by hatter (Pilgrim) on Jun 05, 2003 at 10:46 UTC
    I use localisation not just for error messages, but also for bodies of text (primarily for use on a website), so I use the same policy for both types of text: short, meaningful, programmer-oriented names. invalid_email should be clearer to someone whose first language isn't English than email_address_typed_incorrectly, which might be what you tell the English user, but isn't technically spot-on.

    the hatter

Node Type: perlquestion [id://263023]
Approved by Thelonius
Front-paged by Thelonius