PerlMonks  

LWP::UserAgent Alternatives for API data

by johnfl68 (Scribe)
on Jul 05, 2020 at 19:14 UTC

johnfl68 has asked for the wisdom of the Perl Monks concerning the following question:

Hello, and as always thank you for help and support.

I have been using LWP::UserAgent with the NWS API using mirror to save API data for several locations once an hour.

Lately the API has been having issues (many other people are affected, not just me), but I think part of my problem comes from using mirror. Occasionally I get an older data file from NWS that ends up with a future date as its file date (July 27, 2020, for example). After that the file stops updating, because my local copy now appears newer than anything the server offers, even though it contains old data.
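One way to get out of the stuck state described above, sketched under assumptions: mirror sets the local file's modification time from the server's Last-Modified header and sends that time back as If-Modified-Since on the next run, so a file dated in the future keeps getting "not modified" answers. The helper name `reset_future_mtime` and the one-day fallback are hypothetical choices for illustration, not part of LWP.

```perl
use strict;
use warnings;

# Hypothetical helper: if the local copy's mtime lies in the future,
# move it into the past so the next conditional request (mirror) is
# no longer blocked by a bogus timestamp.
# Returns 1 if the timestamp was reset, 0 otherwise.
sub reset_future_mtime {
    my ($path, $now, $fallback_age) = @_;
    $now          //= time;
    $fallback_age //= 24 * 60 * 60;    # ASSUMPTION: one day back is "old enough"

    return 0 unless -e $path;
    my $mtime = (stat $path)[9];
    return 0 if $mtime <= $now;

    my $past = $now - $fallback_age;
    utime $past, $past, $path or die "utime $path: $!";
    return 1;
}

# Example (file name is a placeholder), called just before $ua->mirror(...):
# reset_future_mtime('forecast_city01.json');
```

This keeps the old file in place as a cache and only touches its timestamp; deleting the file before the next mirror call would have a similar effect.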

I am supposed to send a User-Agent header with API requests to NWS. I could go back to just using getstore, but I don't want to have to switch to a UserAgent solution again at some point in the future if I get blocked for not sending it.

I have looked around and there are dozens of different ways to get and store a file over HTTP, but some may be better suited for this case than others. So before I recode this to use a get-and-store approach instead of mirror (I don't really need to check whether I already have the same data at this point), are there any other ways/modules that you would recommend? I am not getting webpages, just JSON and XML data sets.

Stick with LWP::UserAgent and use 'get' and then just output the response to a file? HTTP::Tiny? Mechanize? Something completely different?

Thank you!

Re: LWP::UserAgent Alternatives for API data
by haukex (Bishop) on Jul 05, 2020 at 19:30 UTC

    LWP::UserAgent is extremely well established, so yes, you could just stick with that if you figure out the correct options*. Other than that, I'd recommend HTTP::Tiny, or, if you want to step it up a bit, Mojo::UserAgent, which has the advantage that it gives you easy access to JSON and XML/HTML parsing features via Mojo::JSON ($ua->get($url)->result->json) and Mojo::DOM ($ua->get($url)->result->dom).

    * LWP is not my area of expertise, and you also don't say exactly in which way the requests are going wrong (SSCCE, How do I post a question effectively?). Tracing a good request and comparing with a bad one using Wireshark might be very useful here.
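A minimal sketch of the HTTP::Tiny route, assuming a plain GET (no If-Modified-Since check) is all that is needed. The helper names `save_atomically` and `fetch_and_store`, the 30-second timeout, and the contact string are illustrative assumptions. HTTP::Tiny needs IO::Socket::SSL and Net::SSLeay installed for https URLs.

```perl
use strict;
use warnings;
use HTTP::Tiny;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Write content to a temp file in the target directory, then rename it
# into place, so a reader never sees a half-written file.
sub save_atomically {
    my ($path, $content) = @_;
    my ($fh, $tmp) = tempfile(DIR => dirname($path));
    binmode $fh;
    print {$fh} $content or die "write $tmp: $!";
    close $fh            or die "close $tmp: $!";
    rename $tmp, $path   or die "rename $tmp -> $path: $!";
    return $path;
}

# Unconditional GET that identifies the client via the User-Agent header.
sub fetch_and_store {
    my ($url, $path, $agent) = @_;
    my $http = HTTP::Tiny->new(agent => $agent, timeout => 30);
    my $res  = $http->get($url);
    die "GET $url failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    return save_atomically($path, $res->{content});
}

# Example call (URL path and contact string are placeholders):
# fetch_and_store('https://api.weather.gov/...', 'forecast.json',
#                 '(example.com, contact@example.com)');
```

Because the file is only replaced after a successful response, a failed request leaves the previous copy untouched, which doubles as a simple cache.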

Re: LWP::UserAgent Alternatives for API data
by marto (Cardinal) on Jul 05, 2020 at 19:53 UTC
Re: LWP::UserAgent Alternatives for API data
by perlfan (Vicar) on Jul 06, 2020 at 00:50 UTC
    >but it will put a future date for the file date for some reason (July 27, 2020 for example), and then it does not update anymore because the file on the server is newer (but old data)

    Before you switch your web "getter", make your script more resilient to this failure case. Clearly you can't trust the date in the data anymore, therefore you can't trust the data. Once you can inspect the data and detect "old data" (specific case of bad data), then you can retry. You may even wish to keep statistics on when this occurs and its duration. It's clear that their data producing code has its own issues, but that doesn't mean you just have to accept what they give you. Some simple QC in the loop will allow you to detect this and correct it (again via retries).
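A rough sketch of the kind of QC check described above. It assumes the payload is JSON with an ISO 8601 timestamp at `properties.updateTime`; that field name, the five-minute clock tolerance, and the helper names are assumptions for illustration, so adjust them to whatever the data actually contains.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);
use Time::Local qw(timegm);

# Convert an ISO 8601 timestamp such as 2020-07-05T18:32:11+00:00 to
# epoch seconds. Returns undef if the string does not match.
sub iso8601_to_epoch {
    my ($ts) = @_;
    return unless defined $ts;
    my ($y, $mo, $d, $h, $mi, $s, $tz) = $ts =~
        /^(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)(?:\.\d+)?(Z|[+-]\d\d:?\d\d)$/
        or return;
    my $epoch = timegm($s, $mi, $h, $d, $mo - 1, $y);
    if ($tz ne 'Z') {
        my ($sign, $oh, $om) = $tz =~ /^([+-])(\d\d):?(\d\d)$/;
        my $offset = ($oh * 60 + $om) * 60;
        # A +02:00 local time is two hours ahead of UTC, so subtract.
        $epoch += $sign eq '+' ? -$offset : $offset;
    }
    return $epoch;
}

# QC verdict for one payload: 'ok', 'stale', 'future', or 'unparseable'.
sub check_freshness {
    my ($json_text, $now, $max_age) = @_;
    my $data = eval { decode_json($json_text) };
    return 'unparseable' unless ref($data) eq 'HASH';

    # ASSUMPTION: the update timestamp lives at properties.updateTime.
    my $props   = $data->{properties};
    my $stamp   = ref($props) eq 'HASH' ? $props->{updateTime} : undef;
    my $updated = iso8601_to_epoch($stamp);
    return 'unparseable' unless defined $updated;

    return 'future' if $updated > $now + 300;       # allow 5 min clock skew
    return 'stale'  if $now - $updated > $max_age;
    return 'ok';
}
```

Anything other than 'ok' can then be logged with a timestamp (for the statistics mentioned above) and queued for a later retry, while the previous good file stays in use.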

Re: LWP::UserAgent Alternatives for API data
by johnfl68 (Scribe) on Jul 06, 2020 at 17:04 UTC

    Thanks for the input.

    Sticking with LWP::UserAgent for the time being, in hope that NWS solves all the issues sometime soon.

    I have been using DarkSky for years and there were seldom any issues like this, but that is going away in the future (thanks Apple).

    I reported the issues to NWS, and got added to a large support ticket with many other people having similar issues, which confirms it's not me. I have added retry code for all the 500s and 502s, and I am trying to figure out the best way to check and retry when old data is returned, but I'm not sure an immediate retry will help. I am also adding some code downstream that keeps the slightly older (cached) data instead of updating when the new data is out of date.
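For illustration, the retry logic described above could look roughly like this. It is a sketch, not the code actually in use: the function name `fetch_with_retry`, its parameters, the default status list, and the doubling delay are all assumptions. The fetcher is passed in as a code ref that returns an HTTP::Tiny-style response hash (`success`, `status`, `content`), so the same loop works with any client.

```perl
use strict;
use warnings;

# Call $arg{fetch} up to max_tries times, doubling the pause between
# attempts. Only the statuses in retry_on trigger another attempt; any
# other failure (a 404, say) is returned immediately.
sub fetch_with_retry {
    my (%arg) = @_;
    my $fetch     = $arg{fetch};
    my $max_tries = $arg{max_tries} // 4;
    my $delay     = $arg{delay}     // 2;                   # seconds before 2nd try
    my $sleep     = $arg{sleep}     // sub { sleep $_[0] };
    my %retryable = map { $_ => 1 } @{ $arg{retry_on} // [500, 502, 503] };

    my $res;
    for my $try (1 .. $max_tries) {
        $res = $fetch->();
        return $res if $res->{success};
        last unless $retryable{ $res->{status} };
        last if $try == $max_tries;
        $sleep->($delay);
        $delay *= 2;
    }
    return $res;    # last failure; the caller keeps using its cached file
}

# Example with HTTP::Tiny ($http and $url defined elsewhere):
# my $res = fetch_with_retry(fetch => sub { $http->get($url) });
# warn "keeping cached data\n" unless $res->{success};
```

Spacing the attempts out, rather than retrying immediately, gives the server a chance to recover and avoids adding load while it is already returning errors.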

    It's a nightmare. Getting data for 30 cities is taking almost 2 minutes now from a Tier-1 data center (the files are only about 13KB each). I can get the same from DarkSky in seconds with essentially the same code. OpenWeather's API works for some things, but does not provide some of the forecast wording that DarkSky and NWS provide.

    Not many good weather APIs left; they all keep going away. AccuWeather costs about twice what I am paying for DarkSky. That's my next option, but I am trying to avoid paying that much if I can.

    Again thanks for the help and wisdom!



    Update from NWS: "Since an upgrade API isn't behaving optimally, the 503 errors are a known issue protecting the application. A fix for the application's database is expected no later than late this summer. We apologize for the inconvenience. This ticket will remain open until the issue is resolved. If anyone would like to be removed from the distribution list, please let us know via reply email. Otherwise, you will remain on the distribution list and updates will be provided when available, perhaps in several weeks. (CJJ)"

    So the issues are not going to get fixed anytime soon, so AccuWeather API it is.

      I think marto's reply to a similar question from me is relevant: Re: polishing up a json fetching script for weather data.

      # the API docs says you must identify yourself, please make this something legit
      my $name = '(example.com, contact@example.com)';
      my $ua = Mojo::UserAgent->new;
      $ua->transactor->name( $name );

      # get JSON response
      my $json = $ua->get( $url )->res->json->{properties};

      I wonder if we're looking at the same API...are you shy to post your source?

        I said it was the NWS API. Not hiding anything.

        https://api.weather.gov/

        I am working with 'Forecast' data. In marto's referenced reply, they are trying to work with 'Latest Observations', which has even more issues than the Forecast data. I am sticking with the Latest Observations XML data from the NWS, which comes from a different source than the NWS API. It has been much more reliable. Unfortunately they don't provide the forecast data in XML like the latest observations, or I would use it instead right now.

        The NWS API is newer but it has been out for a few years now. It should be much more stable than it is at this point.
