USQCD: US Lattice Quantum Chromodynamics

Over the past fourteen years, a wide range of processors, accelerators and communications systems have been evaluated. Current clusters are all based on InfiniBand, and the most recent clusters at JLab and FNAL include GPU acceleration. In 2016, the conventional (non-GPU) clusters provide a total throughput of approximately 62 TFlops on lattice QCD production code. (This is equivalent to 280 TFlops on the Linpack benchmark.) GPU-accelerated clusters at JLab and FNAL provide a total of 128 TFlops, and an Intel MIC-accelerated cluster at JLab provides a total of 8 TFlops, on lattice QCD production code. The most recent conventional cluster built at JLab, "12s", is shown at left. It consists of 276 nodes of dual Intel Sandy Bridge (octo-core) CPUs connected via QDR InfiniBand switched networks, with a throughput of 14 TFlops on lattice QCD code. The latest conventional cluster built at FNAL, pi0, consists of 314 nodes of dual Intel Ivy Bridge (octo-core) CPUs connected via QDR InfiniBand switched networks, with a throughput of 19 TFlops on lattice QCD code. This cluster is shown at right.
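
As a rough cross-check of the figures quoted above, the following Python sketch sums the sustained lattice QCD throughput across the three hardware classes and derives the implied ratio between Linpack and lattice QCD performance for the conventional clusters. The variable and function names are illustrative assumptions and not part of any USQCD software; only the numbers come from the text.

```python
# Sustained throughput on lattice QCD production code, in TFlops
# (2016 figures quoted in the text; the dictionary keys are illustrative).
SUSTAINED_TFLOPS = {
    "conventional": 62.0,  # non-GPU clusters
    "gpu": 128.0,          # GPU-accelerated clusters at JLab and FNAL
    "mic": 8.0,            # Intel MIC-accelerated cluster at JLab
}

# Linpack-equivalent throughput of the conventional clusters, in TFlops.
CONVENTIONAL_LINPACK_TFLOPS = 280.0


def total_sustained(tflops_by_class):
    """Sum sustained throughput over all hardware classes."""
    return sum(tflops_by_class.values())


def linpack_ratio(linpack_tflops, sustained_tflops):
    """Ratio of Linpack-equivalent to sustained lattice QCD throughput."""
    return linpack_tflops / sustained_tflops


if __name__ == "__main__":
    total = total_sustained(SUSTAINED_TFLOPS)
    ratio = linpack_ratio(
        CONVENTIONAL_LINPACK_TFLOPS, SUSTAINED_TFLOPS["conventional"]
    )
    print(f"Total sustained throughput: {total:.0f} TFlops")
    print(f"Linpack / lattice QCD ratio (conventional): {ratio:.1f}")
```

With the quoted figures, the three classes sum to 198 TFlops sustained, and the Linpack figure for the conventional clusters is about 4.5 times their lattice QCD throughput.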

In 2016, Brookhaven National Laboratory (BNL) is joining JLab and Fermilab in deploying and operating clusters for use by USQCD. It is installing a cluster containing around 200 dual K80 GPU nodes. BNL currently hosts a Blue Gene/Q for USQCD and formerly hosted the QCDSP and QCDOC purpose-built computers. These were designed and constructed with Columbia University and led to the creation of IBM’s Blue Gene line of scientific supercomputers.

Computing hardware and software

Clusters