Read file after download

kepler has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Read file after download by hippo (Bishop) on Jul 01, 2020 at 22:00 UTC
I guess you mean something like the below. Obviously add your own error handling but this should point you in the right direction I would hope.

    #!/usr/bin/env perl
    use strict;
    use warnings;

    use LWP::UserAgent;
    use Digest::SHA 'sha256_hex';

    my $response = LWP::UserAgent->new->get ('https://www.perlmonks.org/?node_id=11118766');
    my $digest = sha256_hex ($response->decoded_content);
    print "The digest is '$digest'\n";
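For the error handling mentioned above, one possible shape is sketched here. The helper name `fetch_digest` and the 30-second timeout are assumptions made for illustration. The sketch also hashes the body as bytes (`charset => 'none'`), on the assumption that a digest of the raw content is what is wanted; Digest::SHA rejects strings containing wide characters, which charset-decoded text can have.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use LWP::UserAgent;
use Digest::SHA 'sha256_hex';

# fetch_digest is a hypothetical helper name, not part of LWP.
sub fetch_digest {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(timeout => 30);    # timeout value is an assumption

    # get() blocks: when it returns, the whole body has been received.
    my $response = $ua->get($url);
    die "GET $url failed: ", $response->status_line, "\n"
        unless $response->is_success;

    # charset => 'none' undoes any Content-Encoding (e.g. gzip)
    # but leaves the body as bytes, which is what sha256_hex expects.
    my $bytes = $response->decoded_content(charset => 'none');
    die "Could not decode the body of $url\n" unless defined $bytes;

    return sha256_hex($bytes);
}

# Pass one or more URLs on the command line.
print "The digest is '", fetch_digest($_), "'\n" for @ARGV;
```

Because the processing happens after `get()` returns, no callback is needed to guarantee that the download has finished.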
Re: Read file after download by marto (Cardinal) on Jul 01, 2020 at 18:22 UTC
What is the data? HTML, XML, MP3? What error are you getting? Please see "How do I post a question effectively?".
Re^2: Read file after download by kepler (Scribe) on Jul 01, 2020 at 19:01 UTC
Hi, thanks for answering. The type of data doesn't matter: it can be JSON, a text file, etc. Downloading (fetching) the full data is slower than the routine that processes it, and that routine must only be called once the download has finished. I once achieved this with a callback function in an LWP HTTP request, but I have not been able to repeat it.
Re^3: Read file after download by marto (Cardinal) on Jul 01, 2020 at 19:34 UTC
Do you have some example URLs? How many do you have? Can you show an example of the processing routine? Perhaps Mojo::UserAgent in conjunction with Mojo::Promise (see the Mojo::UA example) or Mojo::IOLoop would help with the run time.
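A minimal sketch of the Mojo::UserAgent plus Mojo::Promise approach might look like the following. It assumes Mojolicious is installed and that the URLs arrive on the command line; `process_body` and `fetch_all` are hypothetical names, and the digest is only a stand-in for the real processing routine, which was not posted.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use Mojo::UserAgent;
use Mojo::Promise;
use Digest::SHA 'sha256_hex';

# Hypothetical stand-in for the routine that must run after a download.
sub process_body {
    my ($url, $body) = @_;
    return sha256_hex($body) . " $url";
}

# Start all downloads concurrently; each body is processed only after
# its own download has completed.
sub fetch_all {
    my @urls = @_;
    return unless @urls;

    my $ua = Mojo::UserAgent->new(max_redirects => 5);
    my @results;
    my @promises = map {
        my $url = $_;
        $ua->get_p($url)->then(sub {
            my $tx = shift;
            # result() throws on connection errors, rejecting the promise.
            push @results, process_body($url, $tx->result->body);
        });
    } @urls;

    # Block until every download has finished or one of them has failed.
    Mojo::Promise->all(@promises)
        ->catch(sub { warn "Download failed: $_[0]\n" })
        ->wait;
    return @results;
}

print "$_\n" for fetch_all(@ARGV);
```

Whether this helps depends on how many URLs there are: with a single download there is nothing to overlap, and a blocking request is simpler.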
Re: Read file after download by bliako (Monsignor) on Jul 02, 2020 at 15:12 UTC
LWP::UserAgent blocks until all data is received (perhaps you are used to JavaScript's Ajax). So when it unblocks, you have all your data.

But since you also mentioned a callback in one of your replies: LWP::UserAgent additionally offers two alternative ways of processing downloaded content, especially suited to LARGE files.

The first is to specify a save-to filename in the `get()` call, in the form of a pseudo-header directive (`:content_file`). The benefit is that the LARGE content goes straight to the filesystem and does not clog your memory.

The second is to specify a callback function to be called whenever some content has been received (think LARGE chunked downloads), again as a pseudo-header (`:content_cb`). This is useful for on-the-fly processing of streamed data, say when you want to uncompress data as it is received.

Both methods are documented in LWP::UserAgent; search for `:content_cb`. There is also the `progress()` callback, which is called occasionally during the request to report on the progress of the download.
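The two pseudo-header options can be sketched roughly as follows. The helper names `save_to_file` and `digest_stream`, the timeout, and the choice of an incremental SHA-256 as the per-chunk processing are all assumptions for illustration; only `get()` and the `:content_file` / `:content_cb` directives come from LWP::UserAgent.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use LWP::UserAgent;
use Digest::SHA;

# Option 1: stream the body straight to a file instead of into memory.
# (save_to_file is a hypothetical helper name.)
sub save_to_file {
    my ($ua, $url, $path) = @_;
    my $response = $ua->get($url, ':content_file' => $path);
    die "GET $url failed: ", $response->status_line, "\n"
        unless $response->is_success;
    return $path;    # get() has returned, so writing has finished
}

# Option 2: process each chunk as it arrives; here an incremental SHA-256.
# (digest_stream is a hypothetical helper name.)
sub digest_stream {
    my ($ua, $url) = @_;
    my $sha = Digest::SHA->new(256);
    my $response = $ua->get(
        $url,
        ':content_cb' => sub {
            my ($chunk, $res, $protocol) = @_;
            $sha->add($chunk);
        },
    );
    die "GET $url failed: ", $response->status_line, "\n"
        unless $response->is_success;

    # If the callback died, LWP records the message in the X-Died header.
    if (my $died = $response->header('X-Died')) {
        die "Callback aborted the transfer: $died\n";
    }
    return $sha->hexdigest;
}

# Usage: script.pl URL OUTPUT_FILE
if (@ARGV == 2) {
    my ($url, $path) = @ARGV;
    my $ua = LWP::UserAgent->new(timeout => 30);    # timeout is an assumption
    print "Saved to ", save_to_file($ua, $url, $path), "\n";
    print "SHA-256: ", digest_stream($ua, $url), "\n";
}
```

In both cases `get()` still blocks, so any code placed after the call runs only once the transfer is over.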
Re: Read file after download by perlfan (Vicar) on Jul 02, 2020 at 05:38 UTC
I recommend using HTTP::Tiny's `mirror` method. LWP::Simple also has `getstore`. You'll want to check the `status` of the response to determine whether the file was fully saved. You can also get the expected length of the file from the headers and check it once the download finishes without error. Some sort of verification of the downloaded file is always a good idea, however you do it. I don't know how either module deals with chunked content. This won't matter unless you're pulling from an endpoint that may chunk its responses.
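As a rough sketch of that verification with HTTP::Tiny: the helper `download_problem` is a hypothetical name, and comparing the `Content-Length` header with the size on disk is only one possible check, which is skipped when the server sends no length (as with chunked responses).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use HTTP::Tiny;

# Hypothetical helper: returns nothing if the mirrored file looks
# complete, otherwise a short description of the problem.
sub download_problem {
    my ($response, $path) = @_;

    return "request failed: $response->{status} $response->{reason}"
        unless $response->{success};
    return "no file at $path" unless -f $path;

    # 304 Not Modified: mirror() left the existing local copy alone.
    return if $response->{status} == 304;

    # Chunked responses carry no Content-Length, so there is nothing
    # to compare against in that case.
    my $expected = $response->{headers}{'content-length'};
    return unless defined $expected;

    my $actual = -s $path;
    return "size mismatch: expected $expected bytes, got $actual"
        if $expected != $actual;
    return;
}

# Usage: script.pl URL OUTPUT_FILE
if (@ARGV == 2) {
    my ($url, $path) = @ARGV;
    my $response = HTTP::Tiny->new->mirror($url, $path);
    if (defined(my $problem = download_problem($response, $path))) {
        die "$url: $problem\n";
    }
    # Only at this point is the file handed to the processing routine.
    print "Saved $path\n";
}
```

A checksum published by the server, where one exists, would be a stronger check than the file size alone.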

