PerlMonks  

Re^3: Japanese character in Linux

by Corion (Patriarch)
on Jul 07, 2011 at 14:20 UTC ([id://913197])


in reply to Re^2: Japanese character in Linux
in thread Japanese character in Linux

You need to check five things:

  1. In what encoding is the data stored in the Sybase database?
  2. Does your script Encode::decode the data from the proper encoding?
  3. In what encoding is the data stored in the Oracle database?
  4. Does your script Encode::encode the data to the proper encoding?
  5. Does your script output to the console in the encoding that the console uses?
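
Points 2, 4 and 5 can be sketched with the core Encode module. This is only a minimal sketch, not the poster's script: the byte string is a hypothetical stand-in for a value fetched from Sybase, and cp932 as the source and UTF-8 as the target are assumptions you would replace with the answers to points 1 and 3.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for a value fetched from the source database:
# the Shift-JIS bytes for the two characters of "Nihon" (Japan).
my $raw_bytes = "\x93\xfa\x96\x7b";

# Point 2: decode the raw bytes into Perl characters.
# 'cp932' (Microsoft's Shift-JIS variant) is an assumption here.
my $text = decode('cp932', $raw_bytes);

# Point 4: encode the characters into whatever the target expects.
# UTF-8 is assumed purely for illustration.
my $utf8_bytes = encode('UTF-8', $text);

# Point 5: encode on output to match the console. This only displays
# correctly if the terminal really uses UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');
print "$text\n";
```

The important part is that decoding happens once on input and encoding once on output; everything in between works on Perl characters rather than raw bytes.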

Replies are listed 'Best First'.
Re^4: Japanese character in Linux
by prafulltc (Acolyte) on Jul 08, 2011 at 06:04 UTC
    In Sybase, the Japanese data columns are stored in Shift-JIS encoding.
    
    We retrieve this data using DBI:
    
    use DBI qw(:sql_types);
    
    if ( @row = $dbFOX_sth->fetchrow_array ) {
        ( $sInstrumentNameJ, $sInstrumentShortJ ) = @row;
    }
    
    When we print this data on the Unix console, it comes out as junk.
    
    After storing the value in a variable, we pass it to a stored procedure that inserts it into an Oracle NVARCHAR2 column.
    
    There it shows up as inverted question marks.
    
    Please advise.
    
    
    
    

      I guess we'll have to go step by step. First, add "use Encode;". After you've fetched the values from the DB, check whether they have been converted to Perl's internal encoding using

      print Encode::is_utf8($sInstrumentNameJ), "\n";

      If this prints "1", the value has been converted to Perl's internal form, and we should check how you output it to the terminal. If it prints an empty string, the driver has not converted the value, and you have to convert it manually.

      In either case, we need to know which locale is active in your terminal emulator. Normally it should be some UTF-8 locale, but who knows. Please provide the output of the "locale" command.

      Also, if is_utf8 returns an empty string, it would be good to post a hexdump of the value you get from the database, for example like this:

      print unpack("H*", $sInstrumentNameJ), "\n";

      Please also post the Japanese text it should correspond to.
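
The two checks above can be combined into one small helper. This is a sketch only; describe_value is a hypothetical name, not part of Encode, and the sample bytes are made up for illustration.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: report whether Perl's UTF8 flag is set on a value
# and show its contents as hex.
sub describe_value {
    my ($value) = @_;
    my $flagged = Encode::is_utf8($value);
    # For a flagged (character) string, dump its UTF-8 encoded bytes;
    # for an unflagged string, dump the bytes as they are.
    my $bytes = $flagged ? Encode::encode('UTF-8', $value) : $value;
    return sprintf 'utf8 flag: %s, hex: %s',
        ( $flagged ? 'on' : 'off' ), unpack( 'H*', $bytes );
}

# Two raw bytes, as a database driver might return them undecoded.
print describe_value("\x81\x69"), "\n";
```

A flag that is off together with multi-byte hex output suggests the driver handed back raw bytes that still need Encode::decode.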

        is_utf8 returns nothing, and the output of unpack is 81698a94816a8bc9976d.
        
        We also tried the find_encoding function of the Encode module, and its output is Encode::XS=SCALAR(0xaad27d0).
        
        We are not getting the actual output when we use the encode and decode functions.
        
        The output of the locale command is as follows:
        
        LANG=C
        LC_CTYPE="C"
        LC_NUMERIC="C"
        LC_TIME="C"
        LC_COLLATE="C"
        LC_MONETARY="C"
        LC_MESSAGES="C"
        LC_PAPER="C"
        LC_NAME="C"
        LC_ADDRESS="C"
        LC_TELEPHONE="C"
        LC_MEASUREMENT="C"
        LC_IDENTIFICATION="C"
        LC_ALL=
        
        Please suggest where we are going wrong.
        
        Thanks,
        Prafull
        

      Please see points 2 to 5 of my reply.
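
Applied to the hexdump posted above, points 2 and 4 might look like the following. This is a sketch under assumptions: it takes the statement that the column is Shift-JIS at face value (using cp932, the Microsoft variant, which may or may not match the database), and it picks UTF-8 as the target encoding purely for illustration. Note that find_encoding returns an encoding object, which is why printing it shows Encode::XS=SCALAR(...); you call methods on it rather than print it.

```perl
use strict;
use warnings;
use Encode qw(find_encoding encode);

# Rebuild the raw bytes from the hexdump quoted in the thread.
my $raw        = pack( 'H*', '81698a94816a8bc9976d' );
my $byte_count = length $raw;

# find_encoding returns an object; cp932 is an assumed source encoding.
my $sjis = find_encoding('cp932')
    or die "cp932 encoding not available\n";

# Point 2: bytes -> Perl characters.
my $chars = $sjis->decode($raw);
printf "%d bytes decoded to %d characters\n", $byte_count, length $chars;

# Point 4: characters -> bytes for the target (UTF-8 assumed here).
my $utf8 = encode( 'UTF-8', $chars );
print 'UTF-8 hex: ', unpack( 'H*', $utf8 ), "\n";
```

For point 5, keep in mind that with LANG=C the console is not declared as UTF-8, so correctly encoded output may still look wrong until the terminal locale matches the encoding the script writes.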
