Over the past fourteen years, a wide range of processors, accelerators
and communications systems have been evaluated.
Current clusters are all based on InfiniBand, with the most recent
clusters at JLab and FNAL including GPU acceleration.
In 2016 the conventional (non-GPU) clusters provide a total throughput
of approximately 62 TFlops on lattice QCD production code.
(This is the equivalent of 280 TFlops in terms of the Linpack benchmark.)
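The Linpack-equivalent figure implies a conversion factor between sustained lattice QCD throughput and Linpack throughput. A minimal sketch of that arithmetic follows; the factor is derived purely from the two numbers quoted above, not an independently published constant:

```python
# Sustained throughput on lattice QCD production code (TFlops), from the text.
qcd_tflops = 62.0
# Quoted Linpack-equivalent throughput (TFlops), from the text.
linpack_tflops = 280.0

# Implied conversion factor: Linpack TFlops per sustained lattice-QCD TFlop.
factor = linpack_tflops / qcd_tflops
print(f"Linpack-equivalent factor: {factor:.2f}")
```

The factor works out to roughly 4.5, reflecting that highly optimized Linpack runs sustain a much larger fraction of peak than memory-bandwidth-bound lattice QCD code.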
GPU-accelerated clusters at JLab and FNAL provide a total of 128
TFlops and an Intel MIC-accelerated cluster at JLab provides a total
of 8 TFlops on lattice QCD production code.
The most recent conventional cluster built at JLab, "12s", is shown
in the figure. It consists of 276 nodes with dual octo-core Intel Sandy
Bridge CPUs connected via a QDR InfiniBand switched network, with a
throughput of 14 TFlops on lattice QCD code.
pi0, the latest conventional cluster built at FNAL, consists of 314
nodes with dual octo-core Intel Ivy Bridge CPUs connected via a QDR
InfiniBand switched network, with a throughput of 19 TFlops on
lattice QCD code.
This cluster is shown at the right.
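The per-node sustained performance of the two conventional clusters can be compared directly from the node counts and aggregate throughputs given above. A small sketch of that arithmetic (cluster names and figures taken from the text):

```python
# (node count, aggregate TFlops on lattice QCD code), from the text.
clusters = {
    "12s (JLab, Sandy Bridge)": (276, 14.0),
    "pi0 (FNAL, Ivy Bridge)": (314, 19.0),
}

for name, (nodes, tflops) in clusters.items():
    # Convert aggregate TFlops to sustained GFlops per node.
    per_node_gflops = tflops * 1e3 / nodes
    print(f"{name}: {per_node_gflops:.1f} GFlops/node")
```

This puts 12s at roughly 51 GFlops per node and pi0 at roughly 61 GFlops per node, consistent with the newer Ivy Bridge generation delivering higher per-node throughput.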
In 2016 BNL is joining JLab and Fermilab in deploying and operating
clusters for use by USQCD. It is installing a cluster containing
around 200 dual-K80 GPU nodes. Brookhaven currently hosts a
Blue Gene/Q for USQCD. It formerly hosted the QCDSP and QCDOC
purpose-built computers. These were designed and constructed with
Columbia University and led to the creation of IBM’s Blue Gene line
of scientific supercomputers.