Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
I'd argue that 404 should be rechecked too, though most likely, any site that starts off with a 404 error will end up off the list, more so than 408s, 500s, or connection problems. Sometimes, if you've linked 'deep' into a site (anywhere off the front page, or in a user's account), the server's storage might be switched around, and in a short time frame, you might get 404s, but outside, the page would be accessible normally. There's other reasons that I can think of as well, which are not unlikely but are uncommon, that I'd check pages repeatedly regardless of error.

That said, it certainly would not be too hard with such a tool to report in a log file why links were removed, allowing for the person to chase down those that might be recoverable (404s commonly), as opposed to those that are probably lost for good (no connection over serveral attempts).

-----------------------------------------------------
Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
"I can see my house from here!"
It's not what you know, but knowing how to find it if you don't know that's important


In reply to Re: Re: Re: Verifying external web links by Masem
in thread Verifying external web links by nu2perl

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (7)
As of 2024-04-23 16:21 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found