PerlMonks  

Webscraping ISO language codes with Mojo

by moritz (Cardinal)
on Mar 26, 2011 at 21:41 UTC ( [id://895699]=CUFP )

So I needed a hash of ISO-639 2-letter language codes and their English names. A quick Google search turned up this list, and instead of searching for a plain text list I decided to use it directly:

use 5.012;
use Mojo::UserAgent;
use warnings;

binmode STDOUT, ':encoding(UTF-8)';

my $ua = Mojo::UserAgent->new();
my $r  = $ua->get('http://www.loc.gov/standards/iso639-2/php/code_list.php');

# Each data row holds the three-letter code, the two-letter code,
# and the English and French names.
for my $row ($r->res->dom('tr')->each) {
    my ($three, $two, $english_name, $french_name)
        = map $_->text, $row->find('td')->each;
    # Skip rows without <td> cells and languages without a two-letter code.
    say " $two => '$english_name'," if defined $two && length($two) == 2;
}

__END__
Output:
 aa => 'Afar',
 ab => 'Abkhazian',
 af => 'Afrikaans',
 ak => 'Akan',
 sq => 'Albanian',
 ...
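Since the goal was a hash rather than printed text, the same loop can fill one directly. Here is a minimal sketch that runs offline: the inline HTML is a made-up sample standing in for the fetched page (the real table may be laid out differently), and %name_for is just an illustrative name.

```perl
use 5.012;
use warnings;
use Mojo::DOM;

# Hypothetical sample markup standing in for the fetched page.
my $html = <<'HTML';
<table>
  <tr><th>ISO 639-2</th><th>ISO 639-1</th><th>English</th><th>French</th></tr>
  <tr><td>aar</td><td>aa</td><td>Afar</td><td>afar</td></tr>
  <tr><td>ace</td><td>&nbsp;</td><td>Achinese</td><td>aceh</td></tr>
  <tr><td>afr</td><td>af</td><td>Afrikaans</td><td>afrikaans</td></tr>
</table>
HTML

my %name_for;
for my $row (Mojo::DOM->new($html)->find('tr')->each) {
    # Only the second and third cells are needed here.
    my (undef, $two, $english_name) = map { $_->text } $row->find('td')->each;

    # Header rows have no <td> cells; rows without a 2-letter code are skipped.
    next unless defined $two && $two =~ /\A[a-z]{2}\z/;
    $name_for{$two} = $english_name;
}

say "$_ => $name_for{$_}" for sort keys %name_for;
```

Matching on exactly two lowercase letters, instead of only checking the length, also rejects cells that hold nothing but whitespace or a placeholder character.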

Only afterwards did I find that use Locales; Locales->new->code2language("qu") also does what I needed.

Such is the life of the Perl developer: we reinvent stuff because it's just so easy to do.

Replies are listed 'Best First'.
Re: Webscraping ISO language codes with Mojo
by hossman (Prior) on Mar 27, 2011 at 17:49 UTC

    If you wanted a hash, why did you start with a google search?

    I would have started with a CPAN search, which, after following a few links, led me to Locale::Language.
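For comparison, a rough sketch of the Locale::Language route, assuming the Locale-Codes distribution is installed from CPAN. It uses the module's default exports code2language and all_language_codes; %name_for is only an illustrative name.

```perl
use 5.012;
use warnings;
use Locale::Language;    # part of the Locale-Codes distribution

# Look up a single English name from its two-letter code.
say code2language('en');

# Build the full code => name hash; two-letter codes are the default code set.
my %name_for = map { $_ => code2language($_) } all_language_codes();
say scalar(keys %name_for), ' two-letter codes loaded';
```

This avoids the network request entirely, at the cost of depending on whatever version of the code list ships with the installed module.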
