The US government has commissioned IBM to create a massive supercomputer that will have 1.6 million processor cores and be 15 times faster than today's most powerful machine.
The Sequoia supercomputer is scheduled for operation in 2012 and will be able to perform at 20 petaflops, or 20,000 trillion floating point operations per second, IBM said. The fastest supercomputer today, Roadrunner, also built by IBM, at the Los Alamos National Laboratory, can manage 1.1 petaflops.
The cost of the system has not been disclosed, but is likely to run into hundreds of millions of dollars, analysts said.
IBM is building two supercomputers under the contract. The first one, to be delivered by mid-year, is called Dawn and will operate at around 500 teraflops. It will be used by researchers to help prepare for the larger system.
Sequoia will be based on the Blue Gene/Q IBM supercomputer, which is still under development. It will use approximately 1.6 million processing cores, all IBM Power chips, running Linux, which dominates high-performance computing at this scale. IBM is still developing a 45 nanometer chip for the system and may produce something with eight or 16 cores - or more - for it. Although the final chip configuration has yet to be determined, the system will have 1.6 petabytes of memory and be housed in 96 "refrigerator-sized" racks.
Ordered by the US Department of Energy, it will be located at the Lawrence Livermore National Laboratory in California and used primarily to manage the US's aging stockpile of nuclear weapons.
Those weapons contain highly corrosive and radioactive materials, and Sequoia will allow scientists to run simulations to help determine whether the weapons are stable and safe, and whether they would work properly should the government decide to use them.
"The problem we have with the nuclear stockpile is similar to one you might have at home with a car you've kept in the garage for 20 to 30 years," said Mark Seager, assistant department head for advanced technology at Lawrence Livermore. "How do you carefully maintain the car as it ages so that when you go to start the car, you can be very confident it will start? That the probability that it won't start is less than one in a million? That's a pretty high level of certitude."
The scientists have been working on the problem for several years with the IBM ASC Purple supercomputer, but they need a more powerful system to explore areas of physics they have not yet tackled and calculate the margin of error for results, Seager said.
Sequoia will occupy 96 server racks over an area a bit larger than a tennis court. IBM won't discuss the machine in detail because it is still being developed, but Dave Turek, vice president of the IBM Deep Computing initiative, said it will be similar in design to its predecessor, Blue Gene/P, but on a much larger scale. The system will run a version of the Linux OS, use IBM's embedded Power processors and have 1.6 petabytes of main memory.
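To put those headline numbers in proportion, the quoted totals can be divided down to a per-core and per-rack scale. The short Python sketch below does only that arithmetic. It assumes the round figures reported here (20 petaflops, roughly 1.6 million cores, 1.6 petabytes, 96 racks) and decimal units for memory; the constant and function names are illustrative, and the final machine may well differ.

```python
# Back-of-the-envelope breakdown of Sequoia's quoted specifications.
# All inputs are the approximate, pre-delivery figures reported in the
# article; names below are illustrative only.

PEAK_FLOPS = 20 * 10**15      # 20 petaflops
CORES = 1_600_000             # "approximately 1.6 million" cores
MEMORY_BYTES = 16 * 10**14    # 1.6 petabytes, assuming decimal units
RACKS = 96


def per_core_summary():
    """Return per-core throughput and memory, plus cores per rack."""
    return {
        "gigaflops_per_core": PEAK_FLOPS / CORES / 1e9,
        "gigabytes_per_core": MEMORY_BYTES / CORES / 1e9,
        "cores_per_rack": CORES / RACKS,
    }


if __name__ == "__main__":
    for name, value in per_core_summary().items():
        print(f"{name}: {value:,.1f}")
```

On these assumptions each core would account for about 12.5 gigaflops of the peak figure and about one gigabyte of memory, with something under 17,000 cores in each rack.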
Because a computer this size has never been built, scaling the processor count, memory DIMMs and management subsystems comes with a level of uncertainty, Turek acknowledged. "This is not an exercise for the faint of heart," he said. "When you push the limits of scalability you start to observe problems that were simply unanticipated."
Among IBM's challenges will be how to scale the management subsystems to automate as many tasks as possible, and to allow administrators to keep track of workloads and make the right choices during operation.
Lawrence Livermore will have to write applications that can take advantage of such a massively parallel system. It chose the IBM embedded processors because they are "easier to deal with on our complicated weapons code" than the Cell processors used for Roadrunner, Seager said.
Sequoia will be far more energy efficient than a Blue Gene/P system, according to Turek, but because of its size Lawrence Livermore will still have to double the power supply to its computing center. Sequoia will require six megawatts of power, compared to 1.8 megawatts for ASC Purple, Seager said.
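The power figures can be set against the performance target with a little arithmetic. The Python sketch below is a rough illustration only: it assumes the planned 20 petaflops is delivered at the full six megawatts, which the sources do not state, and its function names are hypothetical. No efficiency figure is computed for ASC Purple, because its performance is not given here.

```python
# Rough energy-efficiency arithmetic from the figures quoted by Seager.
# Assumption: the planned 20-petaflop peak coincides with the full
# six-megawatt draw. Function names are illustrative only.

SEQUOIA_PEAK_FLOPS = 20e15        # 20 petaflops (planned)
SEQUOIA_POWER_WATTS = 6e6         # six megawatts
ASC_PURPLE_POWER_WATTS = 1.8e6    # 1.8 megawatts


def gigaflops_per_watt(flops, watts):
    """Peak floating point operations per second per watt, in gigaflops."""
    return flops / watts / 1e9


def power_ratio(new_watts, old_watts):
    """How many times more power the new system draws than the old one."""
    return new_watts / old_watts


if __name__ == "__main__":
    efficiency = gigaflops_per_watt(SEQUOIA_PEAK_FLOPS, SEQUOIA_POWER_WATTS)
    ratio = power_ratio(SEQUOIA_POWER_WATTS, ASC_PURPLE_POWER_WATTS)
    print(f"Sequoia: about {efficiency:.2f} gigaflops per watt")
    print(f"Sequoia draws about {ratio:.2f} times the power of ASC Purple")
```

Under that assumption Sequoia would deliver a little over three gigaflops per watt while drawing a bit more than three times the power of ASC Purple.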
IBM was picked from five bidders because its costs were lower and it provided a better "risk reduction plan" -- essentially a backup plan if something goes wrong, Seager said. He declined to name the losing bidders but said it was a close contest.
Besides managing nuclear weapons, Sequoia will be used for research into astronomy, energy, the human genome and climate change, IBM said. The system will allow forecasters to predict local weather events that are less than one kilometer across, it said, compared to 10 kilometers today.
While Sequoia is being built, Lawrence Livermore will use a smaller IBM supercomputer called Dawn to develop the weapons applications that will run on it. Dawn will be operational in the coming months and perform at 500 teraflops.
It's not certain that Sequoia will be the most powerful supercomputer in the world by the time it goes into operation, but Turek sounded confident that it will be.
"We expect it to be, Livermore expects it to be," he said. "At this rarefied level of computing there are few clients around the world looking to make the investment on this scale."