Paging with REST::Client?

by Argel (Prior)
on Jan 11, 2020 at 01:10 UTC

Argel has asked for the wisdom of the Perl Monks concerning the following question:

After so much fun with forking and shelling out to curl (and thanks for all the help), I now have access to a RHEL 7 box with lots of packaged modules, including REST::Client, which someone in my company managed to get officially blessed and packaged. So, great!! Much cleaner than using curl, etc. It allowed me to write a new daemon that checks for queued changes and decides whether to kick off a DHCP restart. It usually works great, but there are times when we are importing a ton of changes (e.g. 1800+) and I'm hitting the max amount of data REST::Client will accept.

When I was shelling out to curl, the way I was handling this was using _max_results and _paging with the REST calls (and checking if I needed to loop over the _paging calls). So, I'm wondering if there's an easy way to pull this off with REST::Client? It doesn't look like it, however I can pass in my own LWP::UserAgent object -- is there a way to have LWP::UserAgent handle the paging under the hood? Guessing not based on my supersearch and duckduckgo searches, but thought I'd ask. Assuming not, is there another Perl module designed to handle this while providing a clean, simple interface?

Elda Taluta; Sarks Sark; Ark Arks
My deviantART gallery

Replies are listed 'Best First'.
Re: Paging with REST::Client?
by bliako (Monsignor) on Jan 11, 2020 at 15:06 UTC
    and I'm hitting the max amount of data REST::Client will accept. 

    What do you mean? Does the module REST::Client impose a limit or does the server impose a limit?

    What I understand is that either the server has paging built into its API, so that you can do something like $client->GET('http://example.com/dir/file.xml?page=10&num_pages=1'); in a loop to get all the pages, OR the server has no paging to offer and you get all the data at once. In the latter case, REST::Client has the option to save the response to a file if handling it in Perl, as a variable, would cause you problems. The file can then easily be paged in the usual ways.
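A minimal sketch of the first case described above, assuming a hypothetical server that accepts a `page` query parameter, returns a JSON array per page, and returns an empty array once the pages run out. The helper names `fetch_all_pages` and `rest_client_getter`, the parameter name, and the stop condition are all assumptions for illustration; none of them come from REST::Client or from any particular API.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Collect every record from a page-numbered endpoint.
# $get is a code ref that takes a URL and returns ($http_status, $body);
# injecting it keeps the loop independent of the HTTP client.
sub fetch_all_pages {
    my ($get, $base_url, $max_pages) = @_;
    $max_pages //= 1000;    # safety stop so a misbehaving server cannot loop forever
    my @records;
    for my $page (1 .. $max_pages) {
        my ($status, $body) = $get->("$base_url?page=$page");
        die "GET page $page failed with status $status\n" unless $status == 200;
        my $rows = decode_json($body);
        last unless @$rows;     # assumed convention: an empty page means done
        push @records, @$rows;
    }
    return \@records;
}

# One way to build $get on top of REST::Client. The module is loaded
# lazily so the loop above can be exercised without it installed.
sub rest_client_getter {
    my (%config) = @_;
    require REST::Client;
    my $client = REST::Client->new({%config});
    return sub {
        my ($url) = @_;
        $client->GET($url);
        return ($client->responseCode, $client->responseContent);
    };
}
```

With a real server this would be called as `fetch_all_pages(rest_client_getter(timeout => 30), 'http://example.com/dir/file.json')`; passing the getter in separately also makes the loop easy to test with a stub.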

    LWP::UserAgent allows you to provide your own callback functions to be called before and after each phase of handling the requests. But I can't see how this can be useful in your case. What I thought would be possible is to ask user-agent to give you a data-socket and do what you want with it but don't know how or if it makes sense.

    Edit: How to make REST::Client save to a file from its documentation:

    # request responses can be written directly to a file
    $client->setContentFile( "FileName" );

    # or call back method
    $client->setContentFile( \&callback_method );

      Thanks for taking a closer look. I skimmed over the callback stuff in LWP::UserAgent and it didn't leave me optimistic. Might be more useful if I write this using just LWP::UserAgent, but I don't think I will have time for that.

      I'll look into writing to a file. My guess is that I'm going to hit the max data cap regardless, unless REST::Client is setting a cap to protect the user.

      Hmm, maybe I'm thinking about this the wrong way. If I hit max data then I know I have pending changes, so a restart is needed. So once I know I hit max data, I can proceed with my processing (checking if a restart is already in progress). I'll check what error information REST::Client gets me and see what I can do with that.

      Elda Taluta; Sarks Sark; Ark Arks
      My deviantART gallery

        I'll look into writing to a file. My guess is that I'm going to hit the max data cap regardless, unless REST::Client is setting a cap to protect the user.

        Hi

        Can you explain what "max data cap" means?

        Because it's not in the URL vocabulary, HTTP vocabulary, REST vocabulary, and predictably, not even in the Perl LWP module family or REST::Client vocabulary.

        What is it that you're describing exactly?

        I know it's not a sticker on the window of your car, right?

Re: Paging with REST::Client?
by Your Mother (Archbishop) on Jan 13, 2020 at 16:09 UTC

    I don’t know how simple this can be made so the best thing to do is make it clean, as you say. I have used Data::Page to manage this in search results. It’s built-in for DBIx::Class::ResultSet->pager but you can wrangle it manually for anything. It’s not automatic for your case, you have to do all the plugging in of the data and wrapping the requests, but it is semantic and clear/clean to use the actual paging objects.

    This may just be one of those spaces that is variable enough, har-har, to elude easy encapsulation/automation. Regarding hitting the max amount of data REST::Client will accept, RC->isa("LWP::UserAgent") (<- update, double checked that’s not right, the top level object has a UA, it isn’t inheriting from it) so it has no inherent limit unless you set $client->max_size(…). If there is another way to check on why it’s bottoming out, there might be a way around it.
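A minimal sketch of one way to check whether a size limit on the user agent is what is cutting responses short. It relies on LWP::UserAgent marking a response truncated by `max_size` with a `Client-Aborted: max_size` header, and on REST::Client's `responseHeader` and `getUseragent` methods. The helper names are hypothetical, and whether `max_size` is actually the cause in this thread is an assumption that the check is meant to confirm or rule out.

```perl
use strict;
use warnings;

# True if LWP cut the last response short because of max_size.
# LWP::UserAgent marks such responses with "Client-Aborted: max_size".
sub truncated_by_max_size {
    my ($client) = @_;
    my $aborted = $client->responseHeader('Client-Aborted');
    return (defined($aborted) && $aborted eq 'max_size') ? 1 : 0;
}

# Report the limit configured on the underlying LWP::UserAgent.
# undef means no limit is set.
sub configured_max_size {
    my ($client) = @_;
    return $client->getUseragent->max_size;
}
```

After a `$client->GET(...)`, calling `truncated_by_max_size($client)` tells you whether the body is incomplete. If it is, the limit can be lifted with `$client->getUseragent->max_size(undef)`, since the setting lives on the user agent object rather than on the REST::Client object itself.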

      I'll see if I can control the max amount of data in REST::Client. If not, then Data::Page looks interesting. I'm hoping to merge another script into the restart daemon in the next couple of months, so I may have time to overhaul the script then. Thanks for the suggestions!!

      Elda Taluta; Sarks Sark; Ark Arks
      My deviantART gallery

Re: Paging with REST::Client?
by Anonymous Monk on Jan 11, 2020 at 07:51 UTC

    When I was shelling out to curl, the way I was handling this was using _max_results and _paging with the REST calls (and checking if I needed to loop over the _paging calls). So, I'm wondering if there's an easy way to pull this off with REST::Client? It doesn't look like it, however I can pass in my own LWP::UserAgent object -- is there a way to have LWP::UserAgent handle the paging under the hood? Guessing not based on my supersearch and duckduckgo searches, but thought I'd ask. Assuming not, is there another Perl module designed to handle this while providing a clean, simple interface?

    Hi

    So are _paging and _max_results curl options? HTTP headers? Something else?

      I'm not an expert on RESTful APIs, so I'm not sure what specifically is supplying the functionality, but it's on the server side of things. It's something passed in via the URL. Here's an example URL from some debug output I have lying around:
      https://nios.mycompany.com/wapi/v2.7/range?network=10.20.30.0/24\&start_addr=10.20.30.40\&_return_fields%2B=disable,extattrs,network\&_paging=1\&_max_results=500\&_return_as_object=1

      Elda Taluta; Sarks Sark; Ark Arks
      My deviantART gallery

        Just a shot in the dark: the server may be sending data keyed on "address" (10.20.30.40). In order to get to the next 500 results you would have to take the last "address" you got from the previous results page and send it as "start_addr".

        Sort of! In the link below, the server returns a reference that you send back to get the next page. Perhaps you can figure it out by searching for paging in this page: https://ipam.illinois.edu/wapidoc/additional/sample.html?highlight=paging
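A minimal sketch of that "send the reference back" loop, under assumptions worth checking against the WAPI documentation for your version: that the first reply (requested with `_paging=1`, `_max_results` and `_return_as_object=1`, as in the URL above) is a JSON object with a `result` array and a `next_page_id` field, that later pages are requested with `_page_id=<id>`, and that the last page has no `next_page_id`. The helper name `wapi_fetch_all` and the `$get` callback are hypothetical.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Walk a cursor-paged WAPI query and return all result rows.
# $get:        code ref taking a URL, returning ($http_status, $body)
# $object_url: e.g. "https://nios.mycompany.com/wapi/v2.7/range"
# $query:      search part of the first request, without paging options
sub wapi_fetch_all {
    my ($get, $object_url, $query, $page_size) = @_;
    $query     //= '';
    $page_size //= 500;
    my $paging = "_paging=1&_max_results=$page_size&_return_as_object=1";
    my $url = "$object_url?" . (length($query) ? "$query&" : '') . $paging;
    my @rows;
    while (defined $url) {
        my ($status, $body) = $get->($url);
        die "WAPI request failed with status $status\n" unless $status == 200;
        my $reply = decode_json($body);
        push @rows, @{ $reply->{result} || [] };
        my $next = $reply->{next_page_id};
        # Assumed: the last page carries no next_page_id.
        $url = defined $next ? "$object_url?_page_id=$next" : undef;
    }
    return \@rows;
}
```

With REST::Client, `$get` can be a small closure that calls `$client->GET($url)` and returns `($client->responseCode, $client->responseContent)`. Keeping each page at a few hundred rows also keeps any single response well below whatever size limit is being hit.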

        Hi try adding &page=2


        The way forward always starts with a minimal test.

        I'm not an expert on RESTful APIs, so I'm not sure what specifically is supplying the functionality, but it's on the server side of things. It's something passed in via the URL. Here's an example URL from some debug output I have lying around:

        Hi

        ;) Have you read what you wrote?

        Surely you know exactly the actions you took, because you say

        "passed in via URL"

        Well, there you go, the module does accept URLs: https://metacpan.org/pod/REST::Client#GET-(-$url,-%5B%25$headers%5D-)

        So what happened when you did that (pass via url)?

        "paging" ain't nothing but a loop (foreach) where you ask for some more

Re: Paging with REST::Client?
by 1nickt (Canon) on Jan 15, 2020 at 15:30 UTC

    Just want to say props to bliako and the anonymonk for sticking with it here!


    The way forward always starts with a minimal test.