
Sometimes it's not heat but "cold".

So I was working at IBM/Kingston on the AIX/ESA project. They were putting in a new box with 6 ESA engines. Unknown to us, for some reason they (NSD) decided to pipe the cold "chiller water" to it using a feed that wasn't right (7 gal/min rather than 14?); it was all they could find without running new pipes. In the beginning it was physically partitioned into two 3-core "boxes", and while one of them was used a bunch, the other was mostly idle for about a month.

So I start to bring up the second partition, and I stress test it. And overheat alarms start going off all over the place. I mean BELLS, audible alarms as well as "hot boxes" on the console. I shut down the stress test and they go away; I start it again and they come back.

I don't understand, but I go and chat about it with my contact to IBM's National Service Division (NSD). He has no clue, so he gets in touch with them and finds out about the pipe mismatch. He tells them they f'ed up and to fix it. In a week or so they say they have, and I run the stress tests on the second partition and have no problems.

Fast forward a few months: BIG demo, IBM VPs, people from computer world, infoweek, etc. in to watch rotating globes and planes flying around, and maybe 6 other compute-intensive tasks on each of about 2 dozen PCs, all coupled to the new box, now configured as a single 6-core box. Everything but the X clients was on the ESA box. It was running at about 99%. It was running for about 3 hours before the honchos came in to watch it and get the PR spiel. About an hour in, the operator comes in and points out an overheat hot-box on the system console in the far corner. NOT GOOD. I shut down a bunch of non-needed tasks, but the hot-box didn't go away. It's warning that it is going to shut down unused sections to lower the heat load.

So me and the op are quietly chatting. I'm sure not going to shut down the demo; we'll just let it do what it wants and hope it doesn't crash. A VP walks over complaining about the noise. I point out the hot-box and explain what we were going to do; he agrees and walks away, but not happy. We quit chatting so as not to draw any further attention. The demo finishes with the hot-box still up, but no magic smoke anywhere.

When I ran the stress tests on the second partition, the first wasn't running at full load. And while we tested the demo, we didn't run the 6-way for hours and hours and hours, so it didn't show the overheat warning. My NSD contact is pissed; it's supposed to be fixed. He calls NSD and they promise to figure it out. And they get back soon. It seems the plumbing is still undersized: only 11 gal/min, more than the 7, but not the 14 we needed. The VP was pissed, the AIX/ESA leader was pissed, my NSD contact was pissed, I was pissed, the operator was pissed. NSD fixed it pretty quick, and I ran the demo for about 12 hours with no problems.

And it turns out the VP was not pissed at me, or the operator. He gave us a nice writeup about how we saved the demo.

While I'm at it, that 6-way figures into another interesting story. We would tightly couple it to 3 other ESA 6-ways (1 local channel, 2 offsite channels over PVM) and run benchmarks to compare to MVS and VM. One night they want to do it again, but it's Thanksgiving morn, 12-6am (and I needed about 90 min to patch it all together first). Can I come in to set it up? They even offered me double-time plus time off to do it, cuz it was Thanksgiving. I said sure.

The IBM Fellow for super-computing was in Kingston. I met him a few times; he was the one that made the benchmarks we ran, and he was an MVS fan. He comes to talk to me later and tells me that on Thanksgiving I had booted the "largest box" in the world (at that time). All the other supercomputers were down for maint, or physically partitioned into smaller sections that day. He was right, it was real cute to know I was running the fastest computer in the world, if just for a little time.

In reply to Re^2: Shared DBI handle supporting threads and processes by huck
in thread Shared DBI handle supporting threads and processes by marioroy
