Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris

Postgres UTF-8 woes

by 1nickt (Abbot)
on May 08, 2019 at 15:58 UTC ( #1233460=perlquestion: print w/replies, xml ) Need Help??

1nickt has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I've run into an issue with sending UTF-8 to Postgres, using DBIx::Class and DBD::Pg.


INSERT INTO method_data ( method_id, params) VALUES ( ?, ? ) RETURNING method_data_id: '59807', '{"addressLine1":"堅尼地道105號","ms-correlationid":"D2559182-719E-11E9-9E71-271EACAC0E00","city":"灣仔","country":"HK","user":"ut-testadmin"}'
     -> bind_param for DBD::Pg::st (DBI::st=HASH(0x7f982408c7f0)~0x7f982408d2b8 1 59813 undef)
    <- bind_param= ( 1 ) 1 items at line 1891
    -> bind_param for DBD::Pg::st (DBI::st=HASH(0x7f982408c7f0)~0x7f982408d2b8 2 '{"addressLine1":"�.尼�.��..105�..","country":"HK","city":"�.��.","ms-correlationid":"019A09E2-71A3-11E9-971A-C122ACAC0E00","user":"ut-testadmin"}' undef)
Making the query shown by DBIC_TRACE in the CLI succeeds normally.

Is this a known problem? I couldn't find much on the interwebs about it. I don't see anything in the source for DBD::Pg concerning encoding: are the C libs doing something under the hood?

Thanks in advance.

The way forward always starts with a minimal test.

Replies are listed 'Best First'.
Re: Postgres UTF-8 woes
by choroba (Archbishop) on May 08, 2019 at 18:57 UTC
    Do you send decoded unicode or encoded (e.g. UTF-8)? Have you set client or server encoding?

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

      Hi choroba, thank you for replying.

      Client and server encoding are both set to UTF-8:

      Activity=# \l List of databases Name | Owner | Encoding | Collate | Ctype | Access +privileges -----------+---------+----------+-------------+-------------+--------- +------------ Activity | ******* | UTF8 | en_US.UTF-8 | en_US.UTF-8 | postgres | ******* | UTF8 | en_US.UTF-8 | en_US.UTF-8 | template0 | ******* | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/***** +** + | | | | | *******= +CTc/******* template1 | ******* | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/***** +** + | | | | | *******= +CTc/******* (6 rows) Activity=# show client_encoding; client_encoding ----------------- UTF8 (1 row)
      The data is passed decoded from UTF-8, which I believe is correct. DBIx seems to handle it as expected (given the encodings specified in the schema), but the SQL then produced by DBD::Pg/DBI is garbled, as shown. Passing data encoded to UTF-8 to DBIx results in even more garbled content in the DB.

      The way forward always starts with a minimal test.
        What version of DBD::Pg do you use? From the Changes it seems 3.3.0 is the minimum version with reasonable UTF-8 support, and probably even 3.6.0.

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: Postgres UTF-8 woes
by erix (Parson) on May 08, 2019 at 21:48 UTC

    I don't really see through your problem but have you tried various settings for pg_enable_utf8 ? (There's some info in perldoc DBD::Pg).

    In a quick test with your data and DBI/DBD::Pg but without DBIx::Class, I needed $dbh->{pg_enable_utf8} = 0;

      Thanks erix, I'll try that. My understanding of pg_enable_utf8 was that it only affects data retrieved from the DB, and is not normally needed with Pg >3.0, but I did try setting it to 1 from the default -1. I'll see if 0 affects it.

      The way forward always starts with a minimal test.

        Thank you very much indeed for the tip Master erix. Your help solved my problem.

        The way forward always starts with a minimal test.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1233460]
Approved by LanX
Front-paged by Corion
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others examining the Monastery: (3)
As of 2020-08-14 23:35 GMT
Find Nodes?
    Voting Booth?
    Which rocket would you take to Mars?

    Results (76 votes). Check out past polls.