http://qs321.pair.com?node_id=979253

When I opened Worst Nodes today, I was greeted with these two error messages where "Worst nodes of the day" and "Worst nodes of the week" should have been:

Internal problem occurred in get_picked_nodes
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ' 0, 1, 66, 1341080989 )' at line 7

Internal problem occurred in get_picked_nodes
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ' 0, 7, 66, 1341080989 )' at line 7

It usually disappears with a page refresh, but I thought I would bring it to your attention anyway.
Why is this?

~Thomas~
confess( "I offer no guarantees on my code." );

Re: Errors in Worst Nodes
by ww (Archbishop) on Jun 30, 2012 at 19:03 UTC
    The non-technical explanation, in complete detail: "S*** happens!"

    The technical explanation isn't much different. Server hiccups. MySQL glitches. Xmsn errors. Power dips. Flaky RAM. Disk errors....

    A "page refresh" may yield the (apparent) result of clearing the error, but it doesn't have much to do with the underlying problem(s).

    BTW, this is a complicated (by some standards) system. In a prior life, dealing with mechanical critters like airplanes, we frequently cited another truism: "Aircraft are complicated mechanical systems. Complicated systems break." With modest paraphrasing, that applies here, too.

      The technical explanation isn't much different. Server hiccups. MySQL glitches. Xmsn errors. Power dips. Flaky RAM. Disk errors....
      There would have to be quite a few of them -- I've been seeing it almost daily.

      A "page refresh" may yield the (apparent) result of clearing the error, but it doesn't have much to do with the underlying problem(s).
      I know. I just had to take the Microsoft way (where restarting makes problems go away, except that here it is a refresh), as there's not much else I could do at the moment.

      ~Thomas~
      confess( "I offer no guarantees on my code." );
        Yes, sometimes there are an annoying, troublesome, puzzling "quite a few" of them but, OTOH, they could all be thrown because a single adverse occurrence is repeating itself.

        In any case, the keepers of the Monastery are aware... and are using their stockpile of (name your favorite insecticide) to try to stop any further repetitions.

        Thanks for your concern. Wish I had a happier answer.

Re: Errors in Worst Nodes - Leap second bug?
by flexvault (Monsignor) on Jul 01, 2012 at 13:02 UTC

    thomas895 and all Monks,

    I received the following information about last night's (for me, EDT) leap second bug from outages.org:

    Subject: [Re:][outages]Java apps around the globe are crashing...
    Sender: outages-bounces@outages.org
    Date: 06/30/12 10:20 PM
    . . .
    However, it does not seem to be a Java bug -- so far, it looks like something is causing futex() to timeout, instead of telling the thread to sleep [1], causing issues on anything that uses it (e.g., java, chrome, mysql). It's not clear exactly what variable (i.e., kernel version, distro) causes boxes to go haywire. It may just be a race condition which some people hit due to bad luck. But it is certainly related to the leap second.

    [1] https://lkml.org/lkml/2012/6/30/122

    Checking our locations, many 1st and 2nd level ntpd servers are down. Following the thread, it doesn't seem that the same 'fix' works for everyone. Some were able to shut down ntpd and restart, others had to reboot, and some machines still aren't working correctly even after a reboot.

    Our ntpd servers were not affected, but many source servers are off-line.

    To sum it up, if getting a few error messages is the worst that PM has, I think the site did very well.

    --

    Another day in the computer age!

    "Well done is better than well said." - Benjamin Franklin

      Wonderful example of "complicated systems break" (and, since I assume there's no intention of attributing PM issues to the leap second, ++).
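
For what it's worth, the error message itself hints at the same conclusion. The sketch below assumes that the last number in the quoted SQL fragment, 1341080989, is a Unix timestamp (that is a guess; nothing in this thread says what it is) and compares it with the leap second inserted at the end of June 30, 2012 UTC. The names ERROR_EPOCH and MIDNIGHT_AFTER_LEAP are mine, for illustration only.

```python
from datetime import datetime, timezone

# Assumption: the trailing number in the quoted SQL fragment is a Unix timestamp.
ERROR_EPOCH = 1341080989

# The leap second was inserted as 23:59:60 UTC on 2012-06-30, immediately
# before this instant. Unix time does not count leap seconds, so plain
# datetime arithmetic is accurate enough for a comparison at this scale.
MIDNIGHT_AFTER_LEAP = datetime(2012, 7, 1, tzinfo=timezone.utc)

# Interpret the number as seconds since the epoch, in UTC.
error_time = datetime.fromtimestamp(ERROR_EPOCH, tz=timezone.utc)

# How long before the leap second the error was generated.
gap = MIDNIGHT_AFTER_LEAP - error_time

print(error_time.isoformat())  # 2012-06-30T18:29:49+00:00
print(gap)                     # 5:30:11
```

If the assumption holds, the error was generated at 18:29:49 UTC, about five and a half hours before the leap second and roughly half an hour before the first reply in this thread (19:03 UTC).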