Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

DBD::pg considers input and output ISO8859

by apz (Initiate)
on Sep 25, 2018 at 18:22 UTC ( [id://1222990]=perlquestion: print w/replies, xml ) Need Help??

apz has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

Many moons ago I wrote a Perl script for data entry into PostgreSQL database over a web GUI. The system ran on Ubuntu server 14.04 and it never had any issues with encoding. What was posted from a web form had encoding that would make it readable from PostgreSQL's shell client.

The system however needs to be upgraded to 18.04 and here's where the issue arose. Now all the output I get from DBD:Pg is in ISO8859. The database itself is in UTF8, the server and client both report UTF8 as encoding. I can run queries interactively or from one shot shell commands and receive the data in correct encoding. But not if used from DBI.

I installed a fresh 18.04 installation to debug the issue and the problem persists. Here's an example of the issue:

First creation of a demo table:

demo=> create table demo (content varchar); CREATE TABLE demo=> \d demo Table "public.demo" Column | Type | Collation | Nullable | Default ---------+-------------------+-----------+----------+--------- content | character varying | | | demo=> insert into demo (content) values ('äöåÄÖÅ'); INSERT 0 1 demo=> insert into demo (content) values ('aaaaaa'); INSERT 0 1 demo=> select * from demo; content --------- äöåÄÖÅ aaaaaa (2 rows)

So far so good. If the last select query is redirected to a text file, examining it with a hex editor reveals it's UTF8, file-command agrees.

Next we try the same with DBI:

use DBI qw(:utils); $DBcon=DBI->connect("DBI:Pg:dbname=demo", "demo", "demo-pass"); $DBhandle=$DBcon->prepare("SELECT * FROM demo"); $DBhandle->execute(); if ($DBI::rows > 0) { print $_->{'content'}." - ".data_string_desc($_->{'content'})."\n" w +hile $_ = $DBhandle->fetchrow_hashref(); } $DBcon->disconnect();
Instead of replicating what came out in PostgreSQL's interactive shell, we get:
������ - UTF8 on, non-ASCII, + 6 characters 12 bytes aaaaaa - UTF8 on, ASCII, 6 characters 6 bytes

If redirected to a text file, hex editor shows that this isn't UTF8 and again file-command agrees. If I insert any data into the database with DBI, it will be entered as double-UTF8'd. The system has PostgreSQL 10, Perl 5.26.1, DBI 1.640 and DBD::pg 3.7.0.

So dear monks, what's going on?

Replies are listed 'Best First'.
Re: DBD::pg considers input and output ISO8859
by choroba (Cardinal) on Sep 25, 2018 at 19:17 UTC
    Where do you set the encoding of the output file handle?
    binmode *STDOUT, ':encoding(UTF-8)'; # or use open OUT => ':encoding(UTF-8)', ':std';

    You might also need to correctly handle the pg_enable_utf8 attribute (see DBD::Pg).

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1222990]
Front-paged by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others about the Monastery: (3)
As of 2024-04-24 19:12 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found