PerlMonks  

Re: Regex to Array lookup question

by Marshall (Canon)
on Apr 07, 2020 at 03:20 UTC


in reply to Regex to Array lookup question

I am having a hard time understanding the problem. You write, "no established list of all the possibilities". I looked at the JSON output of https://api.weather.gov/icons. That looks like the possibilities to me. I decoded the JSON and simplified that into a more straightforward translation table. I do not show the LWP code, but I guess you know how to do that.
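For completeness, here is a minimal sketch of the fetch-and-decode step that I left out. It is not the LWP code I used: it swaps in HTTP::Tiny and JSON::PP only because both ship with core Perl. The sub names fetch_icons_json and build_table, the agent string, and the --live switch are all made up for this example, and https requests through HTTP::Tiny assume IO::Socket::SSL is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;                # core module; https also needs IO::Socket::SSL
use JSON::PP 'decode_json';    # core module, standing in for JSON::Parse

# Fetch the raw JSON text. The agent string is a made-up placeholder.
sub fetch_icons_json {
    my ($url) = @_;
    my $http = HTTP::Tiny->new(agent => 'icon-lookup-sketch/0.1', timeout => 10);
    my $res  = $http->get($url);
    die "GET $url failed: $res->{status} $res->{reason}\n" unless $res->{success};
    return $res->{content};
}

# Reduce the decoded structure to a flat abbreviation => description table.
sub build_table {
    my ($json_text) = @_;
    my $icons = decode_json($json_text)->{icons};
    my %table = map { ( $_ => $icons->{$_}{description} ) } keys %$icons;
    return \%table;
}

# Offline demo on a two-entry sample; pass --live to query the real endpoint.
my $json = ( @ARGV && $ARGV[0] eq '--live' )
    ? fetch_icons_json('https://api.weather.gov/icons')
    : '{"icons":{"bkn":{"description":"Mostly cloudy"},"fog":{"description":"Fog/mist"}}}';

my $table = build_table($json);
print "$_ => $table->{$_}\n" for sort keys %$table;
```

Run without arguments, this only exercises the table-building step on the inline sample, so it can be tried without network access.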

In general, for things like this, I have found that using the textual description on the website is better than "rolling your own". If, say, you need different text for tsra_sct than the website provides (perhaps you want "tsra_sct" and "tsra_hi" to translate into the same thing), then we are into a different discussion about how to maintain such a table. That is a significant ongoing hassle that I don't recommend.
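Purely for illustration, since I don't recommend going this way, a local override layer might look something like the sketch below. The %override hash, the describe() sub and the "Thunderstorm" wording are hypothetical; the three %site_desc entries are copied from the website's data. Every override has to be kept in step with the website by hand, which is the maintenance hassle I mean.

```perl
use strict;
use warnings;

# Descriptions as published by the website (three entries only).
my %site_desc = (
    tsra_sct => 'Thunderstorm (medium cloud cover)',
    tsra_hi  => 'Thunderstorm (low cloud cover)',
    bkn      => 'Mostly cloudy',
);

# Hypothetical local overrides that collapse two abbreviations into one text.
my %override = (
    tsra_sct => 'Thunderstorm',
    tsra_hi  => 'Thunderstorm',
);

# Prefer the local override, fall back to the website's text.
sub describe {
    my ($abbrev) = @_;
    return $override{$abbrev} // $site_desc{$abbrev} // "unknown ($abbrev)";
}

print "$_ => ", describe($_), "\n" for qw(tsra_sct tsra_hi bkn);
```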

This gets the textual definition of each abbreviation from the website and translates the first "word" of the last path segment in the URL into that definition. I translated each of the URLs you provided. Please explain what else you need...

    #!/usr/bin/perl
    use strict;
    use warnings;
    use JSON::Parse 'parse_json';

    my $json = do{local $/ = undef;<DATA>};
    my $out = parse_json $json;

    my %xlated_abbrev;  #simple abbreviation table => description

    foreach my $key (keys %{$out->{icons}})  #gen simple xlate table
    {
        $xlated_abbrev{$key} = $out->{icons}{$key}{description};
    }

    my @urls = (
        'https://api.weather.gov/icons/land/day/tsra_sct,20/tsra_sct,40?size=medium',
        'https://api.weather.gov/icons/land/day/rain_showers,30/tsra_hi,30?size=medium',
        'https://api.weather.gov/icons/land/night/rain_showers,30/rain_showers?size=medium',
        'https://api.weather.gov/icons/land/day/bkn?size=medium'
    );

    foreach my $url (@urls)
    {
        my $last_path = (split('/',$url))[-1];
        my ($abbrev_to_xlate) = $last_path =~ /^(\w+)/;
        print "URL = $url\n";
        print "   $abbrev_to_xlate => \'$xlated_abbrev{$abbrev_to_xlate}\'\n\n";
    }

    =PRINTS:
    URL = https://api.weather.gov/icons/land/day/tsra_sct,20/tsra_sct,40?size=medium
       tsra_sct => 'Thunderstorm (medium cloud cover)'

    URL = https://api.weather.gov/icons/land/day/rain_showers,30/tsra_hi,30?size=medium
       tsra_hi => 'Thunderstorm (low cloud cover)'

    URL = https://api.weather.gov/icons/land/night/rain_showers,30/rain_showers?size=medium
       rain_showers => 'Rain showers (high cloud cover)'

    URL = https://api.weather.gov/icons/land/day/bkn?size=medium
       bkn => 'Mostly cloudy'

    =cut

    #Data returned from: https://api.weather.gov/icons
    __DATA__
    {
        "@context": [],
        "icons": {
            "skc": { "description": "Fair/clear" },
            "few": { "description": "A few clouds" },
            "sct": { "description": "Partly cloudy" },
            "bkn": { "description": "Mostly cloudy" },
            "ovc": { "description": "Overcast" },
            "wind_skc": { "description": "Fair/clear and windy" },
            "wind_few": { "description": "A few clouds and windy" },
            "wind_sct": { "description": "Partly cloudy and windy" },
            "wind_bkn": { "description": "Mostly cloudy and windy" },
            "wind_ovc": { "description": "Overcast and windy" },
            "snow": { "description": "Snow" },
            "rain_snow": { "description": "Rain/snow" },
            "rain_sleet": { "description": "Rain/sleet" },
            "snow_sleet": { "description": "Rain/sleet" },
            "fzra": { "description": "Freezing rain" },
            "rain_fzra": { "description": "Rain/freezing rain" },
            "snow_fzra": { "description": "Freezing rain/snow" },
            "sleet": { "description": "Sleet" },
            "rain": { "description": "Rain" },
            "rain_showers": { "description": "Rain showers (high cloud cover)" },
            "rain_showers_hi": { "description": "Rain showers (low cloud cover)" },
            "tsra": { "description": "Thunderstorm (high cloud cover)" },
            "tsra_sct": { "description": "Thunderstorm (medium cloud cover)" },
            "tsra_hi": { "description": "Thunderstorm (low cloud cover)" },
            "tornado": { "description": "Tornado" },
            "hurricane": { "description": "Hurricane conditions" },
            "tropical_storm": { "description": "Tropical storm conditions" },
            "dust": { "description": "Dust" },
            "smoke": { "description": "Smoke" },
            "haze": { "description": "Haze" },
            "hot": { "description": "Hot" },
            "cold": { "description": "Cold" },
            "blizzard": { "description": "Blizzard" },
            "fog": { "description": "Fog/mist" }
        }
    }

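In case what you need is every condition in a URL rather than only the last one, here is a minimal sketch of that variation. It assumes the condition segments are whatever follows a "day" or "night" path segment, which holds for the four URLs you posted but is not something I have confirmed for the API in general. The sub name conditions_in_url is made up, and %desc holds just two entries from the website's table.

```perl
use strict;
use warnings;

# Return the condition abbreviations that follow the day/night segment.
sub conditions_in_url {
    my ($url) = @_;
    ( my $path = $url ) =~ s/\?.*\z//s;          # drop the query string
    my @segments = split m{/}, $path;

    # Assumption: conditions come right after a "day" or "night" segment.
    my ($i) = grep { $segments[$_] =~ /\A(?:day|night)\z/ } 0 .. $#segments;
    return () unless defined $i;

    # Keep the leading "word" of each segment, dropping ",20" style suffixes.
    return map { /\A(\w+)/ ? $1 : () } @segments[ $i + 1 .. $#segments ];
}

# Two entries copied from the website's table, enough for the demo URL.
my %desc = (
    rain_showers => 'Rain showers (high cloud cover)',
    tsra_hi      => 'Thunderstorm (low cloud cover)',
);

my $url = 'https://api.weather.gov/icons/land/day/rain_showers,30/tsra_hi,30?size=medium';
print join( ' / ', map { $desc{$_} // "unknown ($_)" } conditions_in_url($url) ), "\n";
```

For the demo URL this prints both descriptions joined by " / ", where the script above reports only tsra_hi.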
Replies are listed 'Best First'.
Re^2: Regex to Array lookup question
by Aldebaran (Curate) on Apr 07, 2020 at 05:18 UTC
    This gets the textual definition of each abbreviation from the website and translates the first "word" of the last path segment in the URL into that definition. I translated each of the URLs you provided. Please explain what else you need...

    Shoot, Marshall, I want to replicate this interesting script, but I can't see any braces or underscores out of place. I did snip off the documentation to try to shake this error, but it remains unchanged:

        $ ./1.marshall.pl
        JSON error at line 110, byte 2639/2647: Unexpected character '_' parsing initial state: expecting whitespace: '\n', '\r', '\t', ' ' at ./1.marshall.pl line 8, <DATA> line 1.
        $ cat 1.marshall.pl
        #!/usr/bin/perl -w
        use 5.016;
        use JSON::Parse 'parse_json';

        my $json = do{local $/ = undef;<DATA>};
        my $out = parse_json $json;

        my %xlated_abbrev;  #simple abbreviation table => description

        foreach my $key (keys %{$out->{icons}})  #gen simple xlate table
        {
            $xlated_abbrev{$key} = $out->{icons}{$key}{description};
        }

        my @urls = (
            'https://api.weather.gov/icons/land/day/tsra_sct,20/tsra_sct,40?size=medium',
            'https://api.weather.gov/icons/land/day/rain_showers,30/tsra_hi,30?size=medium',
            'https://api.weather.gov/icons/land/night/rain_showers,30/rain_showers?size=medium',
            'https://api.weather.gov/icons/land/day/bkn?size=medium'
        );

        foreach my $url (@urls)
        {
            my $last_path = (split('/',$url))[-1];
            my ($abbrev_to_xlate) = $last_path =~ /^(\w+)/;
            print "URL = $url\n";
            print "   $abbrev_to_xlate => \'$xlated_abbrev{$abbrev_to_xlate}\'\n\n";
        }

        __DATA__
        {
            "@context": [],
            "icons": {
                "skc": { "description": "Fair/clear" },
                "few": { "description": "A few clouds" },
                "sct": { "description": "Partly cloudy" },
                "bkn": { "description": "Mostly cloudy" },
                "ovc": { "description": "Overcast" },
                "wind_skc": { "description": "Fair/clear and windy" },
                "wind_few": { "description": "A few clouds and windy" },
                "wind_sct": { "description": "Partly cloudy and windy" },
                "wind_bkn": { "description": "Mostly cloudy and windy" },
                "wind_ovc": { "description": "Overcast and windy" },
                "snow": { "description": "Snow" },
                "rain_snow": { "description": "Rain/snow" },
                "rain_sleet": { "description": "Rain/sleet" },
                "snow_sleet": { "description": "Rain/sleet" },
                "fzra": { "description": "Freezing rain" },
                "rain_fzra": { "description": "Rain/freezing rain" },
                "snow_fzra": { "description": "Freezing rain/snow" },
                "sleet": { "description": "Sleet" },
                "rain": { "description": "Rain" },
                "rain_showers": { "description": "Rain showers (high cloud cover)" },
                "rain_showers_hi": { "description": "Rain showers (low cloud cover)" },
                "tsra": { "description": "Thunderstorm (high cloud cover)" },
                "tsra_sct": { "description": "Thunderstorm (medium cloud cover)" },
                "tsra_hi": { "description": "Thunderstorm (low cloud cover)" },
                "tornado": { "description": "Tornado" },
                "hurricane": { "description": "Hurricane conditions" },
                "tropical_storm": { "description": "Tropical storm conditions" },
                "dust": { "description": "Dust" },
                "smoke": { "description": "Smoke" },
                "haze": { "description": "Haze" },
                "hot": { "description": "Hot" },
                "cold": { "description": "Cold" },
                "blizzard": { "description": "Blizzard" },
                "fog": { "description": "Fog/mist" }
            }
        }
        __END__
        $
      When using a __DATA__ segment, you can't also add an __END__ segment after it: everything that follows __DATA__, including that __END__ line, is read through the DATA filehandle. Delete the __END__ line that you added. This is why I embedded the output as POD inside the code instead of attaching it after an __END__ line.

      Also, add "use strict;" to the code like I did. This will help you as you experiment with the code.

      Update: the error "JSON error at line 110, byte 2639/2647: Unexpected character '_' parsing initial state: expecting whitespace: '\n', '\r', '\t', ' ' at ./1.marshall.pl line 8, <DATA> line 1." is complaining about the first underscore in the __END__ line that you added. With that line included, the text read from DATA is no longer valid JSON. Error messages are often hard to figure out.

        My mistake:

            $ ./1.marshall.pl
            URL = https://api.weather.gov/icons/land/day/tsra_sct,20/tsra_sct,40?size=medium
               tsra_sct => 'Thunderstorm (medium cloud cover)'

            URL = https://api.weather.gov/icons/land/day/rain_showers,30/tsra_hi,30?size=medium
               tsra_hi => 'Thunderstorm (low cloud cover)'

            URL = https://api.weather.gov/icons/land/night/rain_showers,30/rain_showers?size=medium
               rain_showers => 'Rain showers (high cloud cover)'

            URL = https://api.weather.gov/icons/land/day/bkn?size=medium
               bkn => 'Mostly cloudy'

            $
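The __END__ pitfall from this sub-thread can be reproduced without a __DATA__ section at all. The sketch below is only an approximation of the situation: two strings stand in for what <DATA> would return with and without the stray line, and JSON::PP (core) is used in place of JSON::Parse, so the exact error text will differ from the one quoted above. The sub name try_decode is made up for the example.

```perl
use strict;
use warnings;
use JSON::PP 'decode_json';    # core module, used here in place of JSON::Parse

# Return the decoded structure, or undef if the text is not valid JSON.
sub try_decode {
    my ($text) = @_;
    my $data = eval { decode_json($text) };
    return $data;
}

# Stand-ins for what <DATA> would hand back in each case.
my $clean    = qq({ "icons": { "bkn": { "description": "Mostly cloudy" } } }\n);
my $with_end = $clean . "__END__\n";

for my $case ( [ 'without __END__', $clean ], [ 'with __END__', $with_end ] ) {
    my ( $label, $text ) = @$case;
    print "$label: ", ( defined try_decode($text) ? 'parses' : 'rejected' ), "\n";
}
```

The second case is rejected because the parser finds extra text after the closing brace, which is the same situation as the script with the added __END__ line.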
Re^2: Regex to Array lookup question
by Aldebaran (Curate) on Apr 08, 2020 at 00:32 UTC
    I am having a hard time understanding the problem. You write, "no established list of all the possibilities".

    We didn't hear back from OP, but we can investigate the problem it suggests in our minds.

    I looked at the JSON output of https://api.weather.gov/icons. That looks like the possibilities to me. I decoded the JSON and simplified that into a more straightforward translation table. I do not show the LWP code, but I guess you know how to do that.

    Marshall's post broke this open for me in a way that I wanted to pursue, first in the LWP direction and then with some means of seeing the things we're talking about. I'm taking WWW::Mechanize::Chrome through its paces. I ended up actually being able to see these things: 5 WMC screenshots. This gets a bit verbose, so I'll use readmore tags:

    Anyways, I find using perl to access these APIs very interesting.

    Update: Cropped screenshots and typo fixed here.
