Cheap Supercomputers: LANL has 750-node Raspberry Pi Development Clusters
by Ian Cutress on November 14, 2017 2:30 PM EST
Posted in: Servers, HPC, Enterprise, Trade Shows, SC17, Supercomputing 17, Raspberry Pi, Pi, LANL
One of the more esoteric announcements to come out of Supercomputing 17, an annual conference on high-performance computing, is that one of the largest US scientific institutions is investing in Raspberry Pi-based clusters to aid in development work. The Los Alamos National Laboratory’s High Performance Computing Division now has access to 750-node Raspberry Pi clusters, the first step towards a development program to assist in programming much larger machines.
The platform at LANL uses a modular cluster design from BitScope Designs: five rack-mount BitScope Cluster Modules, each holding 150 Raspberry Pi boards along with integrated network switches. With each of the 750 chips packing four cores, it offers a 3000-core highly parallelizable platform that emulates an ARM-based supercomputer, allowing researchers to test development code without requiring a power-hungry machine at significant cost to the taxpayer. The full 750-node cluster, at 2-3 W per processor, draws about 1000 W at idle, 3000 W under typical load and 4000 W at peak (switches included), and is substantially cheaper than a full-scale machine, if also computationally a lot slower. After development on the Pi clusters, frameworks can then be ported to the larger scale supercomputers available at LANL, such as Trinity and Crossroads.
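As a quick sanity check on those figures, the arithmetic can be sketched in a few lines of Python. The constants are the numbers quoted above; the names and the `per_node_share` helper are hypothetical and only for illustration. The per-node value is simply the cluster total divided evenly across boards, so it includes each board's share of the switch power and is not a measured per-board figure.

```python
# Figures quoted in the article; all names here are illustrative only.
MODULES = 5                 # rack-mount Cluster Modules
BOARDS_PER_MODULE = 150     # Raspberry Pi boards per module
CORES_PER_BOARD = 4         # cores per Raspberry Pi
CLUSTER_POWER_W = {"idle": 1000, "typical": 3000, "peak": 4000}

nodes = MODULES * BOARDS_PER_MODULE
cores = nodes * CORES_PER_BOARD


def per_node_share(total_watts: float, node_count: int) -> float:
    """Cluster power divided evenly across nodes (switch power included)."""
    return total_watts / node_count


if __name__ == "__main__":
    print(f"nodes: {nodes}, cores: {cores}")
    for state, watts in CLUSTER_POWER_W.items():
        share = per_node_share(watts, nodes)
        print(f"{state}: {watts} W total, {share:.2f} W per node")
```

At typical load this works out to 4 W per node once the switches are counted, which is higher than the quoted 2-3 W per processor because that figure covers the processor alone.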
“It’s not like you can keep a petascale machine around for R&D work in scalable systems software. The Raspberry Pi modules let developers figure out how to write this software and get it to work reliably without having a dedicated testbed of the same size, which would cost a quarter billion dollars and use 25 megawatts of electricity,” said Gary Grider, leader of the High Performance Computing Division at Los Alamos National Laboratory.
The collaboration between LANL and BitScope was formed after LANL was unable to find a suitable dense server that offered a platform for several-thousand-node networking and optimization: most solutions on the market were too expensive, and anyone offering something like the Pi in a dense form factor was ‘just people building clusters with Tinker Toys and Lego’. Following the collaboration, BitScope, the company behind the modular Raspberry Pi rack and blade designs, plans to sell the 150-node Cluster Modules at retail in the next few months. No prices have been given yet, although BitScope says that each node will be about $120 fully provisioned using the element14 version of the latest Raspberry Pi (normally $35 at retail). That means that a 150-node Cluster Module will come in at around $18k-$20k.
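The lower end of that estimate follows directly from the per-node price. A minimal sketch, assuming cost scales linearly with node count at BitScope's quoted $120 per node; the five-module total is an extrapolation and not a figure from BitScope or LANL, and the function and constant names are hypothetical.

```python
# Prices quoted in the article; names are illustrative only.
NODE_PRICE_USD = 120        # BitScope's approximate per-node price
PI_RETAIL_USD = 35          # typical retail price of the bare board
NODES_PER_MODULE = 150
MODULES_IN_CLUSTER = 5      # size of the LANL cluster


def node_cost(node_count: int, price_per_node: int = NODE_PRICE_USD) -> int:
    """Cost of node_count nodes, assuming purely linear pricing."""
    return node_count * price_per_node


module_cost = node_cost(NODES_PER_MODULE)
# Extrapolation: five modules at the same per-node price.
cluster_cost = node_cost(NODES_PER_MODULE * MODULES_IN_CLUSTER)
# What each node costs beyond the bare Raspberry Pi board.
provisioning_per_node = NODE_PRICE_USD - PI_RETAIL_USD

if __name__ == "__main__":
    print(f"150-node module: ${module_cost:,}")
    print(f"750-node cluster (extrapolated): ${cluster_cost:,}")
    print(f"per-node cost beyond the bare Pi: ${provisioning_per_node}")
```

The $18,000 result matches the bottom of the $18k-$20k range; the remainder presumably reflects uncertainty in the final per-node price.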
The BitScope Cluster Module is on display this week at Supercomputing 17 in Denver, at the University of New Mexico stand.
Sources: BitScope, EurekAlert
26 Comments
jptech7 - Wednesday, November 15, 2017 - link
As someone who built a 64-node Pi 3 cluster (and profiled the performance for an HPC paper), the scaling in typical HPC workloads is quite poor due to the 100 Mbit Ethernet. You can also use the USB interface to a 1 Gbit adapter, but the USB is 2.0 and thus limited to less than half of a real Gigabit solution. The limited RAM per node also doesn't help.
Elstar - Wednesday, November 15, 2017 - link
I think the crappy network bandwidth is what makes this setup interesting and useful to LANL. As the old joke goes: "supercomputers turn computational problems into I/O problems."
Arnulf - Wednesday, November 15, 2017 - link
Why can't they do testing on simple many-core single-chip which costs way less than this useless toy? This is a waste of money ($120 per RaspberryPi board?) and power.
kaidenshi - Wednesday, November 15, 2017 - link
$120 is for each node, and each node has four Raspberry Pi boards, not one. That puts the cost per Pi at well under retail.
Mo3tasm - Tuesday, November 14, 2017 - link
Not sure of the rationale behind using SD cards to boot them, but I could imagine network boot will do a much better job.. but, who knows??!!
wolrah - Tuesday, November 14, 2017 - link
The Pi 0/1/2 do not support network booting, and the Pi 3's network booting has some fairly significant bugs which make it hard to use with high-end switches.
https://www.raspberrypi.org/blog/pi-3-booting-part...
Amusingly it works best with the dumbest switches, but that doesn't really intersect with most of the roles where people would be trying to netboot the things.
There is an updated bootloader available, but there's no way to load it permanently to the hardware. You have to load it off of a SD card (though this does mean it also works on the older models).
Hopefully the Pi 4 or whatever replacement may come around has netbooting done right.
CityBlue - Tuesday, November 14, 2017 - link
Seems odd to build a "dense" cluster from Raspberry Pi Model B PCBs connected into a backplane. If the goal is to maximise density wouldn't it make more sense to use the Compute Module 3 plugged into a backplane that provides IO in a more optimal way (ie. built-in switch)? They could probably double the density, or halve the rack space requirement.
DanNeely - Tuesday, November 14, 2017 - link
If the objective was to fill racks with them yes; but the stated purpose of this appears to be single units provided to people writing supercomputer code to test scaling during development. Enough cores to prove the workload scales approximately linearly with core count and as cheap as possible are equally important. The more they can plug off the shelf components together instead of engineering custom hardware the better for cost.
If at some point in the future someone does express an interest in buying a few dozen racks worth of them then it might be worth engineering more custom parts to ramp the density up.
DanNeely - Tuesday, November 14, 2017 - link
All of that is assuming that the Compute Module was even an announced product when this began. Personally I think they missed the boat in terms of power consumption here. 250 Pis and 1000 cores should still be enough to test scaling; and at only ~1300W it could be plugged into a standard outlet in an office/lab instead of needing to go through the university bureaucracy to get it installed in an official data center.
Qwertilot - Wednesday, November 15, 2017 - link
Well worth having that last bit :)