http://qs321.pair.com?node_id=11115272


in reply to Convert JSON to Perl and back with unicode

G'day bliako,

I'm assuming that by "long JSON" you're referring to JSON with most of the whitespace removed which, I agree, can be almost impossible to read. To make it more readable, you could use a formatter. There are several free ones available; I have "JSON Formatter and Validator" bookmarked. I do use it a fair bit, but mostly for the validation functionality.
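If you'd rather not paste your data into a website, the same reformatting can be done locally. Here's a minimal sketch, assuming the core JSON::PP module (it has the same object interface as JSON for these calls); the sample string and variable names are purely illustrative.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use JSON::PP;

# Hypothetical compact input; in practice, read this from a file.
my $compact = '{"b":[1,2,{"c":null}],"a":"x"}';

# pretty() adds newlines and indentation; canonical() sorts hash keys
# so that the output is stable from one run to the next.
my $json = JSON::PP->new->pretty->canonical;

my $pretty = $json->encode($json->decode($compact));
print $pretty;
```

The input here is plain ASCII, so no encoding handling is needed; for data containing Unicode, the bytes-versus-characters question has to be dealt with as well.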

If by "edit the Perl" you're talking about modifying the Perl data structure programmatically, you could use something along the following lines.

#!/usr/bin/env perl

use strict;
use warnings;
use autodie;
use utf8;

use JSON;

my $json_in = 'pm_11115241_uni_greek.json';
my $json_out = 'pm_11115241_uni_greek_edit.json';

_print_json_file($json_in);
my $json_text = read_json($json_in);
my $perl_ref = decode_json $json_text;
_print_perl_json($perl_ref);
edit_perl_json($perl_ref);
_print_perl_json($perl_ref);
write_json($perl_ref, $json_out);
_print_json_file($json_out);

sub read_json {
    my ($file) = @_;

    # No encoding layer here: decode_json expects UTF-8 encoded bytes.
    open my $fh, '<', $file;
    local $/;
    return <$fh>;
}

sub write_json {
    my ($perl, $file) = @_;

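    # Without ->utf8, encode() returns characters, not bytes;
    # the output layer below does the UTF-8 encoding.
    my $json_text = JSON->new->pretty->encode($perl);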

    open my $fh, '>:encoding(UTF-8)', $file;
    print $fh $json_text;
}

sub edit_perl_json {
    my ($perl) = @_;

    my $greek_key = 'ΙΚΛΜΝΞΟΠ';
    my $greek_val = 'ικλμνξοπ';

    $perl->{$greek_key} = $greek_val;
}

sub _print_json_file {
    my ($file) = @_;

    print "*** Contents of '$file' ***\n";

    system cat => $file;
}

sub _print_perl_json {
    my ($perl) = @_;

    print "*** Perl from JSON ***\n";

    use open OUT => qw{:encoding(UTF-8) :std};

    for (sort keys %$perl) {
        print $_, ' = ', $perl->{$_}, "\n";
    }
}

Here's a sample run:

$ ./pm_11115241_uni_json_perl.pl
*** Contents of 'pm_11115241_uni_greek.json' ***
{
    "ΑΒΓΔΕΖΗΘ" : "αβγδεζηθ"
}
*** Perl from JSON ***
ΑΒΓΔΕΖΗΘ = αβγδεζηθ
*** Perl from JSON ***
ΑΒΓΔΕΖΗΘ = αβγδεζηθ
ΙΚΛΜΝΞΟΠ = ικλμνξοπ
*** Contents of 'pm_11115241_uni_greek_edit.json' ***
{
   "ΑΒΓΔΕΖΗΘ" : "αβγδεζηθ",
   "ΙΚΛΜΝΞΟΠ" : "ικλμνξοπ"
}
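The script treats JSON in two forms: as UTF-8 encoded bytes on input and as a string of characters on output. Here's a minimal sketch of that distinction, assuming the core JSON::PP module in place of JSON; the data and variable names are mine, for illustration only.

```perl
#!/usr/bin/env perl

use strict;
use warnings;
use utf8;

use JSON::PP;

my $data = { 'ΑΒΓ' => 'αβγ' };

# Without utf8(), encode() returns a character string:
# print it through an :encoding(UTF-8) layer.
my $chars = JSON::PP->new->encode($data);

# With utf8() (which is what encode_json and decode_json use),
# the result is UTF-8 encoded bytes: print it to a raw handle.
my $bytes = JSON::PP->new->utf8->encode($data);

# Each Greek letter is one character but two bytes in UTF-8.
printf "characters: %d\n", length $chars;
printf "bytes:      %d\n", length $bytes;

# Both forms decode back to the same Perl data.
my $from_chars = JSON::PP->new->decode($chars);
my $from_bytes = decode_json($bytes);
print "round trip OK\n"
    if $from_chars->{'ΑΒΓ'} eq $from_bytes->{'ΑΒΓ'};
```

Mixing the two forms, for instance printing the byte string through an encoding layer, is a common source of double-encoded output.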

Although <code> tags are generally preferred for code and output, they convert non-ASCII characters to HTML entities. When dealing with Unicode, <pre> tags will not do that to your characters. For inline, as opposed to block, markup, I use <tt> tags for the same purpose.

— Ken