Building USQCD Libraries Using qinstall
The various USQCD software
libraries may be built easily using the qinstall utility.
In the steps below, we detail how to get and configure
qinstall to build the full set of libraries (qmp,
qio, qla, qdp, qdp++, qopqdp) as well as
the chroma application.
- Download qinstall.
At the present time, qinstall should be
obtained from the JLab CVS repository.
Anonymous access may be used to check out the qinstall module
as follows:
export CVSROOT=:pserver:anonymous@cvs.jlab.org:/group/lattice/cvsroot
cvs login
(at the password prompt hit return)
cvs co qinstall
cd qinstall
- Configure qinstall.
In the qinstall directory you will find
various profile (.prf) files, as well as a README and
subdirectories for each of the libraries and for the chroma
application. Each .prf file defines a set of
variables that determine where the download, source, build,
and install directories for each product will be located.
The files also list the products to be built, and for each
product the name of the specific configuration file to be
used, the tag to be applied to the download, source, build,
and install directories, and a flag for whether the
regression test of each package should be executed.
The .prf files currently defined are:
- core2.prf and core2single.prf -
specifications for building MPI and single-node
versions for the
Intel core2 architecture
- jlab-6n.prf and jlab-7n.prf - specifications for
building versions for the JLab 6n (dual core
Pentium D with Infiniband) and 7n (quad core
Opteron with Infiniband) clusters
- k8.prf and k8-single.prf - specifications for
building MPI and single-node versions for the AMD
Opteron architecture
- p4.prf and p4-single.prf - specifications for
building MPI and single-node versions for the 32-bit
Intel Pentium architecture.
Pick the .prf file that comes closest to
your target architecture. Feel free to edit the file, for example to
change the tags or the list of products that will be built.
Under each of the product directories, you will find
configure script arguments tailored for the specific
architecture defined by the .prf file. To
illustrate the process, in this document we will build an
MPI version of the libraries for the Opteron
(k8) processor. To do this, start with
k8.prf, modifying the rootdir,
SRCROOT, DLDIR, BLDROOT, and
INSROOT variables to point to where we want to build
and install the products. Let's call the modified file my-k8.prf.
Note that because of the dependencies between the
products, you should invoke qinstall to
download and build the products in this order:
- qmp
- qio
- qla
- qdp
- qdp++
- qopqdp
- chroma
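The build order above can be captured in a small wrapper. The following is a minimal sketch, not part of qinstall: it only prints the qinstall commands in dependency order (a dry run), using the my-k8 profile and the package versions used in this document. The function name print_build_plan is hypothetical.

```shell
#!/bin/sh
# Sketch: print the qinstall invocations in dependency order (dry run only).
# print_build_plan is a hypothetical helper, not part of qinstall.
print_build_plan() {
    prf="$1"   # profile name, given without the .prf extension
    # package:version pairs, listed in the required build order
    for item in qmp:2.1.7 qio:2.2.0 qla:1.6.2 qdp:1.7.0 \
                qdp++:1.25.1 qopqdp:0.10.1 chroma:3.28.3; do
        pkg="${item%%:*}"   # text before the colon
        ver="${item#*:}"    # text after the colon
        echo "./qinstall $prf $pkg $ver"
    done
}

print_build_plan my-k8
```

To turn the dry run into a real build, one option is to replace the echo with the command itself and stop at the first failure, since each product depends on the ones before it.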
- Build QMP.
In this example, k8.prf specifies that the
mpi-mpicc-gcc configure options should be
used to build QMP. Examining
qmp/mpi-mpicc-gcc, we see that the compiler is
specified as simply mpicc, so we must ensure that the
desired mpicc is in our path.
To build a specific version of QMP, we use the
command
./qinstall <prf> <package> <version>
For this example, we will follow the version recommendations
for MILC 7.6.0.1, found here.
So, after making sure that mpicc is in our path, and
that the source, download, build, and install directories
that we gave in our .prf file exist, we
give the command
./qinstall my-k8 qmp 2.1.7
qinstall will download the qmp source tar
file from the USQCD repository, if necessary, configure the
build, perform a make and a make install,
and optionally perform a make check. Note that you
should not include the .prf extension on
my-k8.
qinstall will output the stdout and stderr
of all of the steps used to build the product.
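The two preconditions mentioned above (mpicc in the path, and the configured directories already existing) can be checked before invoking qinstall. This is a minimal sketch; check_tool and check_dirs are hypothetical helper names, and the directory paths in the usage comment are placeholders for whatever you set in your .prf file.

```shell
#!/bin/sh
# Sketch: pre-flight checks to run before qinstall.
# check_tool and check_dirs are hypothetical helpers, not part of qinstall.
check_tool() {
    # Succeeds if the named program can be found on PATH.
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1 -> $(command -v "$1")"
    else
        echo "missing: $1 is not in PATH" >&2
        return 1
    fi
}

check_dirs() {
    # Succeeds only if every directory given as an argument exists.
    status=0
    for d in "$@"; do
        if [ ! -d "$d" ]; then
            echo "missing directory: $d" >&2
            status=1
        fi
    done
    return $status
}

# Example use (placeholder paths; substitute the values from your .prf file):
# check_tool mpicc && check_dirs "$HOME/scidac_build/build" "$HOME/scidac_build/install"
```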
- Build QIO and QLA.
Following the same steps as for QMP above, invoke
qinstall to build QIO and QLA
as follows:
./qinstall my-k8 qio 2.2.0
./qinstall my-k8 qla 1.6.2
Note that building QLA will take considerable time.
- Build QDP and QDP++.
QDP, QDP++, qopqdp, and chroma
require at least version 2.61 of autoconf. On the
Fermilab Kaon cluster, a suitable version is installed in
/opt/bin. Make sure that this version or newer
of autoconf is near enough to the front of your path to be
automatically invoked.
Use qinstall as above to build QDP
and QDP++:
./qinstall my-k8 qdp 1.7.0
./qinstall my-k8 qdp++ 1.25.1
Note that building both of these libraries will take
considerable time.
Note that qdp++ requires a gcc 4.1.x or newer compiler.
On the FNAL Kaon cluster, gcc-4.1.1 is available
at /opt/gcc-4.1.1/bin.
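The autoconf and gcc version requirements above can be checked from the shell before starting a long build. This is a minimal sketch that assumes a sort command supporting the -V (version sort) option, as in GNU coreutils; version_ge is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: compare dotted version numbers using "sort -V" (GNU coreutils).
# version_ge is a hypothetical helper name.
version_ge() {
    # True if version $1 is greater than or equal to version $2.
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    [ "$lowest" = "$2" ]
}

# Report on whichever autoconf and gcc are first in the current PATH.
if command -v autoconf >/dev/null 2>&1; then
    have="$(autoconf --version | head -n 1 | awk '{print $NF}')"
    if version_ge "$have" 2.61; then
        echo "autoconf $have is new enough"
    else
        echo "autoconf $have is older than 2.61"
    fi
fi
if command -v gcc >/dev/null 2>&1; then
    have="$(gcc -dumpversion)"
    if version_ge "$have" 4.1; then
        echo "gcc $have is new enough"
    else
        echo "gcc $have is older than 4.1"
    fi
fi
```

If a check fails on a system laid out like the Kaon cluster described above, prepending /opt/bin and /opt/gcc-4.1.1/bin to PATH and rerunning the script shows whether the newer tools are then picked up.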
- Build QOPQDP.
Invoke qinstall as follows to build
QOPQDP:
./qinstall my-k8 qopqdp 0.10.1
- Build chroma.
Invoke qinstall as follows to build the
chroma application:
./qinstall my-k8 chroma 3.28.3
Note that chroma requires a gcc 4.1.x or newer compiler.
On the FNAL Kaon cluster, gcc-4.1.1 is available
at /opt/gcc-4.1.1/bin.
Building MILC and DWF Benchmarks
- Building the MILC asqtad Benchmark.
- Download
milc_qcd-7.6.0.1.tar.gz from the
MILC distribution page
- Make a new directory, and untar the MILC source tree there:
mkdir milc_sample
cd milc_sample
tar -xzf ../milc_qcd-7.6.0.1.tar.gz
make -f Make_unpack all
rm -f *.tar
- Enter the
libraries/ directory and
edit Make_vanilla. You will want
to modify as necessary sections 1, 2, and 3 to reflect
your hardware and installed software. For a standard
build on an Opteron system, you can use
CC = gcc
OPT = -O3
OCFLAGS = -march=opteron -ffast-math -funroll-loops \
-fprefetch-loop-arrays -fomit-frame-pointer
- Enter the
ks_imp_dyn/ directory and create a
makefile for your build:
cd ks_imp_dyn
cp ../Makefile Makefile.sample
Edit Makefile.sample. First, you
should change MAKEFILE near the top to be
the name of your makefile (Makefile.sample).
Next, to build a parallel (MPI)
binary, in section 1 uncomment MPP = true. In section 4,
define CC to point to your mpicc.
Modify sections 5 and 6 to match your changes to
libraries/Make_vanilla above. In
section 10, change the variables WANTQDP, WANTQIO, and
WANTQMP to true. Edit the
SCIDAC, QIOPAR, QMPPAR,
QDP, and QLA variables to point to your installed
builds of the SciDAC libraries from above. In section
15, edit CTIME to include at least
-DCGTIME.
- In
ks_imp_dyn/, build the
su3_rmd binary using your makefile:
make -f Makefile.sample clean su3_rmd
- Test your binary.
Create a sample input file:
cat > test.in
prompt 0
nflavors1 2
nflavors2 1
nx 8
ny 8
nz 8
nt 8
iseed 5682304
warms 0
trajecs 1
traj_between_meas 1
beta 6.85
mass1 0.01
mass2 0.01
u0 0.8441
microcanonical_time_step 0.01
steps_per_trajectory 5
max_cg_iterations 1000
max_cg_restarts 5
error_per_site 0.5e-4
error_for_propagator 0.5e-4
fresh
forget
^D
Run your binary, replacing <nprocs> with the number of MPI processes:
mpirun -np <nprocs> ./su3_rmd test.in
To vary the problem size, change the values of
nx, ny, nz,
and nt in test.in. The
output log of su3_rmd will include lines
like:
CONGRAD5: time = 5.772495e-02 (fn_qdp F) masses = 1 iters = 22 mflops = 9.264879e+02
This line reports the performance of the conjugate
gradient solver for a single solve. Note that this
performance is per process, so to obtain the aggregate
performance multiply the mflops value by
the number of MPI processes.
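The per-process to aggregate conversion described above can be scripted. This is a minimal sketch that assumes the log lines have the same layout as the CONGRAD5 sample shown earlier; aggregate_mflops is a hypothetical helper name, and the process count of 8 is only an example.

```shell
#!/bin/sh
# Sketch: multiply the per-process mflops value by the number of MPI processes.
# aggregate_mflops is a hypothetical helper name.
aggregate_mflops() {
    # $1 = number of MPI processes; the su3_rmd log is read from stdin.
    awk -v np="$1" '
        /^CONGRAD5:/ {
            # Find the "mflops = <value>" triple anywhere on the line.
            for (i = 1; i + 2 <= NF; i++)
                if ($i == "mflops" && $(i + 1) == "=")
                    printf "%.2f\n", $(i + 2) * np
        }'
}

# Example with the sample line shown above, assuming an 8-process run.
echo "CONGRAD5: time = 5.772495e-02 (fn_qdp F) masses = 1 iters = 22 mflops = 9.264879e+02" |
    aggregate_mflops 8   # prints 7411.90
```

For a real run, redirect the su3_rmd output to a file and feed that file to the function; one aggregate value is printed per CONGRAD5 line.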
- Building the SSE DWF Benchmark.
For benchmarking DWF code, we use the conjugate
gradient timer for SSE code,
t_dwf_cg2 included in the
other_libs/ directory of
chroma. In the SciDAC library
build procedure above, the SSE library necessary for
t_dwf_cg2 is produced, but not the
actual timer binary. To do so:
- Go to the
chroma/other_libs/ build
directory from the procedure above:
cd ~/scidac_build/build/chroma-3.28.3/other_libs
- Edit the
Makefile to change
CC = gcc
to
CC = mpicc
- Build the
t_dwf_cg2 binary:
make t_dwf_cg2
- Test your binary:
mpirun -np 8 ./t_dwf_cg2 -Lx 8 -Ly 8 -Lz 8 -Lt 4 -Ls 16 \
-maxCG 200 -qmp-geom 2 2 2 1
You should see an output like this:
QMP m0,n8@kaon2009: DWF init: Version 1.3.3 (sse float)
Nproc: 8 Global: 8 8 8 4 Ls: 16 Local: 4 4 4 4 : iter: 100 Nflops: 9.5403e+09 \
dt: 0.820318 floppage_total: 11630 flops_per_process: 1453.75
Note that the "floppage_total" and "flops_per_process" numbers reported
by this benchmark are in units of MFlop/sec.
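The numbers in the sample output can be cross-checked with a little arithmetic. The sketch below assumes relationships inferred from that sample rather than from any t_dwf_cg2 documentation: the local lattice is the global lattice divided by the -qmp-geom grid, the process count is the product of the grid dimensions, and floppage_total is Nflops divided by dt, in MFlop/sec. The helper names local_lattice and nodes_needed are hypothetical.

```shell
#!/bin/sh
# Sketch: sanity checks on the t_dwf_cg2 sample output.
# The relationships below are inferred from the sample output (an assumption);
# local_lattice and nodes_needed are hypothetical helper names.
local_lattice() {
    # Arguments: Lx Ly Lz Lt followed by the four -qmp-geom values.
    echo "$(( $1 / $5 )) $(( $2 / $6 )) $(( $3 / $7 )) $(( $4 / $8 ))"
}

nodes_needed() {
    # Arguments: the four -qmp-geom values; their product is the process count.
    echo "$(( $1 * $2 * $3 * $4 ))"
}

local_lattice 8 8 8 4 2 2 2 1   # prints: 4 4 4 4
nodes_needed 2 2 2 1            # prints: 8

# Aggregate rate in MFlop/sec from Nflops and dt in the sample output.
awk 'BEGIN { printf "%.0f\n", 9.5403e+09 / 0.820318 / 1e6 }'   # prints: 11630
```

Dividing the aggregate figure by the 8 processes gives the 1453.75 MFlop/sec per process shown in the sample.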