PerlMonks  

Pyuuta: Programming in Japanese

by miyagawa (Chaplain)
on Oct 10, 2001 at 09:56 UTC ( [id://117938]=perlmeditation )

Pyuuta is a well-known(?) Japanese programming environment for Basic. It lets you write Basic code in natural Japanese.

So here is Pyuuta for Perl: Filter::Pyuuta (tarball: http://bulknews.net/lib/archives/Filter-Pyuuta-0.01.tar.gz)

You can write Perl5 in Japanese. How effective, if you're Japanese!

--
Tatsuhiko Miyagawa
miyagawa@cpan.org

Replies are listed 'Best First'.
Re (tilly) 1: Pyuuta: Programming in Japanese
by tilly (Archbishop) on Oct 10, 2001 at 19:07 UTC
    Nice, but the regular expression engine still won't understand Kanji.

    Personally if I spoke Japanese and needed to process Japanese text (which I admittedly do not), I would be inclined to use Ruby. It is a scripting language whose regular expression engine does understand Kanji. (What character set it uses is configurable.)

      90 percent of my programming directly involves processing Japanese (kanji, kana, alphabetic) strings, and I have to admit I was thinking of deserting the flag and doing more in Ruby, which has some appeal to Perl programmers, I presume.

      While for one-offs I sometimes use JPerl, especially for things like tr/a-n/A-N/ (read a-n as Hiragana and A-N as Katakana), I almost exclusively use standard Perl. Of course, regular expressions will work with kanji (read: Shift JIS, euc-jp), but they are more complicated to implement and debug.
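The tr idea above can also be sketched in stock Perl by working on decoded Unicode strings rather than JPerl's native multibyte handling. This is only a sketch under assumptions: it needs a Unicode-aware Perl (5.8 or later), and hira_to_kata is a hypothetical helper name, not something from JPerl.

```perl
use strict;
use warnings;

# Hypothetical helper: map Hiragana (U+3041..U+3093) to the corresponding
# Katakana (U+30A1..U+30F3) in a decoded character string.
sub hira_to_kata {
    my ($str) = @_;
    (my $out = $str) =~ tr/\x{3041}-\x{3093}/\x{30A1}-\x{30F3}/;
    return $out;
}

# Hiragana "a" and "i", written as code points so the source stays ASCII.
my $kata = hira_to_kata("\x{3042}\x{3044}");

# Print the result as code points to avoid depending on terminal encoding.
print join(" ", map { sprintf "U+%04X", ord } split //, $kata), "\n";
```

The two ranges have the same length, so tr pairs them up position by position; anything outside the Hiragana range passes through unchanged.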

      Definitely, Perl is not the best (= easy to learn, easy-to-maintain scripts) text processing language if you do a lot of Japanese information processing. For some, Ruby or JPerl may be a good alternative to Perl.

      Why do I use Perl? Because it is well documented (free manpages, free websites, excellent dead-tree books); there are clpm and PerlMonks; it's hard to tell your clients you want to deliver Ruby applications but easy to say Perl is necessary; and while there is an RAA (Ruby Application Archive), CPAN is just unbeatable.

      I am happy with Perl, and I will be much happier when Unicode becomes a widely used standard. At the moment almost all my files are in sjis, euc-jp or jis. Round-trip conversion from euc to Unicode and, after processing, back to euc costs too much time for me to use the nice Unicode features for easy text processing.
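For reference, the round trip described above might look like the following. This is a sketch under assumptions, not something available when the post was written: it uses the Encode module that ships with Perl 5.8 and later, and process_euc is a hypothetical helper name.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode EUC-JP bytes to characters, apply a
# transformation (a code reference), and encode the result back to EUC-JP.
sub process_euc {
    my ($euc_bytes, $transform) = @_;
    my $chars = decode('euc-jp', $euc_bytes);
    $chars = $transform->($chars);
    return encode('euc-jp', $chars);
}

# \xA4\xA2 is Hiragana "a" in EUC-JP; the transform maps Hiragana to Katakana.
my $out = process_euc("\xA4\xA2", sub {
    my ($s) = @_;
    $s =~ tr/\x{3041}-\x{3093}/\x{30A1}-\x{30F3}/;
    return $s;
});

# Show the resulting bytes as hex.
print unpack("H*", $out), "\n";
```

Every call pays for one decode and one encode, which is the cost the post complains about; whether it matters depends on how much data goes through.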

      Hanamaki

      Ruby seems really cool, but in some places their support base is so anti-Perl, I kinda hesitate to learn it :-) ( Actually, I would start learning once the american o'reilly comes up with a good book to read... I find Japanese technical books to be harder to understand )

      Personally, I haven't had much problem with using regexps on Japanese characters. Of course, the approach I take is:

      • Convert to euc
      • Write down the expression that I want to use
      • Use unpack( "H*", $string ) to find the byte values for the Japanese portion of my regex
      • Use the byte values to match

      Yes, it's kind of annoying, and yes, it's a hackish approach, but it works for me.
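The steps above could be sketched roughly as follows. This is an illustration only, not the code the poster refers to; the sample bytes and variable names are assumptions chosen for the example.

```perl
use strict;
use warnings;

# The Japanese part of the pattern, already in EUC-JP:
# Hiragana "a" (A4 A2) followed by Hiragana "i" (A4 A4).
my $japanese = "\xA4\xA2\xA4\xA4";

# Dump the byte values as a hex string.
my $hex = unpack("H*", $japanese);

# Turn each pair of hex digits into a \x escape and compile a byte-level regex.
(my $escaped = $hex) =~ s/(..)/\\x$1/g;
my $re = qr/$escaped/;

# Text that has already been converted to EUC-JP.
my $text = "abc" . $japanese . "def";

if ($text =~ $re) {
    print "matched at byte offset ", $-[0], "\n";
}
```

In practice the escaped form would be pasted into the script by hand; building it at run time here just keeps the example in one file.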

      Update: posted code

        is so anti-Perl, I kinda hesitate to learn it

        Well, I'm coming from Perl and I feel fine coding in Ruby. And it's not really anti-Perl; they took a good load of the good Perl stuff :)

        once the american o'reilly comes up with a good book to read...

        I took "Programming Ruby" by David Thomas and Andrew Hunt (which is from Addison Wesley) and it gave me a very good start. The only problem I have is the small code base to look at...

        Regards... Stefan
        you begin bashing the string with a +42 regexp of confusion

        A problem with your approach.

        Kanji is stored in a multi-byte character set. It is possible for Perl to find a match that starts in the middle of a character, straddling two characters that are not the ones you are looking for. With long strings it is not likely, but it is still possible, and confusing when it happens.
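A small example of the problem, using EUC-JP kana so the byte values are easy to check. The sample strings are assumptions picked to trigger the effect.

```perl
use strict;
use warnings;

# EUC-JP bytes: Katakana "i" (A5 A4) followed by Hiragana "a" (A4 A2).
my $text = "\xA5\xA4\xA4\xA2";

# EUC-JP bytes for Hiragana "i" (A4 A4), which does NOT occur in $text.
my $needle = "\xA4\xA4";

# A byte-level search still finds the needle: it pairs the second byte of
# the first character with the first byte of the second character.
my $pos = index($text, $needle);
print "false match at byte offset $pos\n" if $pos >= 0;
```

The reported offset is odd, which is the giveaway here: in a string made only of two-byte characters, a real character can only start at an even offset.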

        As for Ruby, this book is quite good. And yes, there are morons who like Ruby and hate Perl. But my experience was that the core Ruby people (people like Matz and Dave Thomas) by and large didn't share that attitude.

        My personal take on Ruby is that it is an interesting language. I am glad I learned it. I think it is more cleanly structured than Perl, it is more cleanly extensible and I believe that I could more rapidly bring someone up to speed on Ruby than Perl. However it does not have Perl's broad application support, it lacks CPAN, you will have to train people, and I didn't find it compelling enough to recode an existing application base. The single biggest "Uh, oh" for me is that it doesn't have an equivalent to strict.pm.

        However learning Ruby made me see and understand certain aspects of Perl better, so even if I never use it, I still think it was a good thing to do.

        For some short introductions on how to use regular expressions with multibyte character sets, I would like to recommend Ken Lunde's excellent papers on this topic. Have a look at all the PDF files you will find in the Perl ftp directory for the book CJKV Information Processing.

        Hanamaki
        Sorry, I don't get it. How is your approach supposed to work? Unless you forgot to mention one or two important steps, you definitely have big problems using regexps on euc strings. How do you anchor your string and keep in sync (= how do you know whether your byte is the only, first or second byte of a character)?
        But maybe I misunderstood you, and it would be nice to see an example.
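One common way to stay in sync is to anchor at the start of the string and consume whole characters before the pattern is allowed to match. The sketch below assumes well-formed EUC-JP input; euc_index and $euc_char are hypothetical names, and this is not necessarily what any poster in this thread uses.

```perl
use strict;
use warnings;

# One EUC-JP character: ASCII, half-width Katakana (0x8E + 1 byte),
# JIS X 0212 (0x8F + 2 bytes), or a two-byte JIS X 0208 character.
my $euc_char = qr/[\x00-\x7F]|\x8E[\xA1-\xDF]|\x8F[\xA1-\xFE][\xA1-\xFE]|[\xA1-\xFE][\xA1-\xFE]/;

# Hypothetical helper: byte offset of the first occurrence of $needle that
# starts on a character boundary, or -1 if there is none.
sub euc_index {
    my ($text, $needle) = @_;
    # Skip as few whole characters as possible, then require the needle.
    return $text =~ /\A((?:$euc_char)*?)\Q$needle\E/ ? length($1) : -1;
}

# Katakana "i" + Hiragana "a": the bytes A4 A4 appear only across a boundary.
print euc_index("\xA5\xA4\xA4\xA2", "\xA4\xA4"), "\n";

# Hiragana "a" + Hiragana "i": a genuine occurrence of Hiragana "i".
print euc_index("\xA4\xA2\xA4\xA4", "\xA4\xA4"), "\n";
```

Because the prefix can only be built from complete characters, the needle is never tested at an offset inside a character. The price is that a stray invalid byte before the needle makes the whole match fail.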

        Hanamaki
Re: Pyuuta: Programming in Japanese
by lestrrat (Deacon) on Oct 10, 2001 at 19:20 UTC

    Cool stuff! It would have been cooler if you had more documentation :-)

    On a related topic, though, I definitely feel that multibyte languages suffer from a significant disadvantage when it comes to using them for typing-intensive tasks like programming.

    I realize this module is not intended to be a real module that everybody would be using, but I personally would never use a language that requires conversion from hiragana to kanji, katakana, or what have you for programming. It takes way too long to type!

    ( By the way, I seem to remember somebody coming up with a programming language whose programming interface was completely in Japanese... anybody remember what that was called? )

    Of course, I also tend to find that broken-up Japanese like that is just so ugly.... It takes away the beauty of the language, I think. But that's just me :-)

Re: Pyuuta: Programming in Japanese
by cacharbe (Curate) on Oct 10, 2001 at 17:09 UTC
    segoi no, yo!

    I don't know that I could handle actually programming in nihon go, but don't think I'm not going to try now.

    Very interesting.

    C-.
