I'd argue that 404s should be rechecked too, even though a site that starts out returning 404 is more likely to end up off the list than one returning 408s, 500s, or connection errors. Sometimes, if you've linked 'deep' into a site (anywhere off the front page, or into a user's account), the server's storage gets switched around, and for a short window you'd get 404s even though the page is normally accessible outside that window. There are other reasons I can think of as well, uncommon but plausible, which is why I'd check pages repeatedly regardless of the error.
That said, it certainly wouldn't be hard for such a tool to record in a log file why each link was removed, so that the person can chase down the ones that might be recoverable (commonly the 404s), as opposed to the ones that are probably lost for good (no connection over several attempts).
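To make that concrete, here's a rough sketch of the recheck-and-log idea in Python. Everything in it is my own assumption, not part of any existing tool: the names `recheck`, `verdict`, and `prune`, the choice of three tries, the use of `None` to mean "no connection", and the `fetch` callable, which you'd have to supply yourself (it takes a URL and returns a status code).

```python
from collections import Counter

# Statuses treated as "the link is fine" (an assumption; extend as needed).
# None stands for "no connection could be made".
OK_STATUSES = {200}


def recheck(url, fetch, tries=3):
    """Call fetch(url) up to `tries` times, stopping early on success.

    `fetch` is a hypothetical callable returning an HTTP status code,
    or None when no connection could be made.
    """
    seen = []
    for _ in range(tries):
        status = fetch(url)
        seen.append(status)
        if status in OK_STATUSES:
            break
    return seen


def verdict(seen):
    """Turn the statuses collected by recheck() into (keep, reason)."""
    if seen and seen[-1] in OK_STATUSES:
        return True, "ok after %d attempt(s)" % len(seen)
    if all(s is None for s in seen):
        return False, "no connection in %d attempts, probably lost" % len(seen)
    # Report the HTTP error seen most often across the attempts.
    most_common, _ = Counter(s for s in seen if s is not None).most_common(1)[0]
    return False, "HTTP %s, possibly recoverable" % most_common


def prune(urls, fetch, tries=3):
    """Return (kept, log_lines); each log line says why a link was removed."""
    kept, log_lines = [], []
    for url in urls:
        keep, reason = verdict(recheck(url, fetch, tries))
        if keep:
            kept.append(url)
        else:
            log_lines.append("%s\t%s" % (url, reason))
    return kept, log_lines
```

Note that the sketch retries immediately; a real checker would want to spread the attempts out over hours or days, since the whole point is to ride out a temporary outage. The log lines can be written to a file as-is, one removed link per line with its reason.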
-----------------------------------------------------
Dr. Michael K. Neylon - mneylon-pm@masemware.com
"You've left the lens cap of your mind on again, Pinky" - The Brain
"I can see my house from here!"
It's not what you know, but knowing how to find it if you don't know that's important