June 30, 2006
Old Supercomputers Don't Die...
Every office has them. Back rooms with stacks of keyboards, rows of 15-inch monitors, piles of desktop CPUs, and miles of cables, just for starters. That's fine for PCs, but what do you do with over-the-hill supercomputers?
That's the question Sandia National Labs faces as it prepares to put ASCI Red, the world's first teraflop supercomputer, out to pasture.
"I've never buried a computer before," said Justin Rattner, Intel Chief Technology Officer.
Sandia vice-president Rick Stulen added, "ASCI Red broke all records and most importantly ushered the world into the teraflop regime. It still holds the record for the longest continuous rating as the world's fastest computer, four years running." But in the "what have you done for me lately" world that we live in, ASCI Red was still decommissioned.
ASCI Red was a critical part of NNSA's Advanced Simulation and Computing (ASC) program. The simulation capabilities developed by the ASC program, and conducted on supercomputers like ASCI Red, provide the nuclear weapons and materials analysis that NNSA needs to keep the nuclear weapons stockpile safe, secure, and reliable without underground nuclear testing.
ASCI Red first broke the teraflops barrier in December 1996 and topped the world-recognized LINPACK-based TOP500 computer speed ratings seven consecutive times from June 1997 to June 2000. (A teraflop is a trillion floating-point operations per second.) Originally rated at 1.6 teraflops, the machine was boosted to 3.1 teraflops by a chip upgrade just when it looked as though its world supremacy would be lost.
Sandia director Bill Camp said that ASCI Red had the best reliability of any supercomputer ever built, and "was supercomputing's high-water mark in longevity, price, and performance."
"It was almost mystical in scalability," said another Sandia director, Rob Leland. "All these other machines would be tailing off and Red would still be cruising along."
"When we first talked about running a machine with 10,000 processors, it seemed ludicrous," Rattner said, apparently anticipating massive downtimes. But instead of the 27-hour average time between hardware-caused interrupts predicted in the design phase, Red achieved an average of several hundred hours.
Sandia researcher Michael Hannah emphasized that the machine was not being decommissioned because of technical problems. "It's not a reliability issue, because ASCI Red is still reliable," he said. "It is about getting more bang for the buck with nine-year-newer technology and terminating significant costs in space, power, and cooling."
No, old supercomputers like ASCI Red don't die, they just scale down.
Posted by Jon Erickson at 09:58 AM