PerlMonks  

Re^2: CPAN's URI.pm versus Japanese as Unicode?

by mldvx4 (Friar)
on Dec 11, 2022 at 12:21 UTC ( [id://11148735] )


in reply to Re: CPAN's URI.pm versus Japanese as Unicode?
in thread CPAN's URI.pm versus Japanese as Unicode?

Thanks, though adding use utf8 does not affect the result. Perhaps I need to convert from Punycode. Is there a module for converting from Punycode to Unicode? Working with the host names as Punycode is not really an option, as far as I can tell, because the host name needs to remain human-readable.

The goal is to extract the host name from the URI and the host name happens to be Japanese as Unicode, as is wont to happen.

Re^3: CPAN's URI.pm versus Japanese as Unicode?
by haukex (Archbishop) on Dec 11, 2022 at 12:50 UTC
    Thanks, though adding use utf8 does not affect the result

    Yes, it does.

    ... the host name needs to remain human-readable. The goal is to extract the host name from the URI and the host name happens to be Japanese as Unicode, ...

    Corion already pointed you to Net::IDN::Encode as one possibility.

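    use warnings;
    use strict;
    use utf8;                            # the source contains literal Japanese characters
    use open qw/:std :encoding(UTF-8)/;  # use UTF-8 on STDIN, STDOUT and STDERR
    use URI;
    use Net::IDN::Encode qw/domain_to_unicode/;

    my $href = "https://マリウス.com/";
    my $uri = URI->new($href);
    # $uri->host is the ASCII (Punycode) form; convert it back to Unicode
    my $domain = domain_to_unicode($uri->host);
    print $domain, "\n";  # prints "マリウス.com"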
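
    As a possible alternative that is not part of the reply above: URI itself documents an ihost method that returns the host in Unicode form, so the extra module may not be needed. This is a minimal sketch, assuming the installed URI version is recent enough to provide ihost; the variable names $ascii and $unicode are only illustrative.

```perl
use warnings;
use strict;
use utf8;                            # the source contains literal Japanese characters
use open qw/:std :encoding(UTF-8)/;  # use UTF-8 on STDIN, STDOUT and STDERR
use URI;

my $uri = URI->new("https://マリウス.com/");

# host gives the ASCII (Punycode) form of an internationalized name,
# whose first label starts with "xn--".
my $ascii = $uri->host;

# ihost (assumed available in the installed URI) turns any such
# labels back into their Unicode form.
my $unicode = $uri->ihost;

print "$ascii\n";
print "$unicode\n";
```

    Whether to use ihost or Net::IDN::Encode is a judgment call: ihost avoids a dependency, while Net::IDN::Encode also works on host names that did not come from a URI object.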