From: Robert Edwards
Date: January 31, 2012 9:41:57 PM CST
Subject: [sdac] 2012/13 USQCD Call for Proposals
This message is a Call for Proposals for awards of time on the USQCD
computer resources dedicated to lattice QCD and other lattice field
theories. These are the clusters at Fermilab and JLab, the
GPU-clusters at Fermilab and JLab, the BG/Q at BNL, and awards to
USQCD from the INCITE program.
In this call for proposals we expect to distribute about
297.2 M Jpsi-core hours on clusters at JLAB and FNAL
4.9 M GPU-hours on GPU clusters at JLAB and FNAL
737 K BG/Q node hours at BNL
5 M Jpsi-equivalent core-hours which we
expect to charge for disk and tape usage.
22.5 M Jpsi-core hours on XT5 at Oak Ridge (*)
27.0 M Jpsi-core hours on BG/P at ALCF (**)
50.0 M Jpsi-core hours of zero priority time on the
BG/P at ALCF (***)
(*) estimate based on CY2012; this is the second half of the CY2012
allocation and the first half of the estimated allocation for CY2013.
(**) available in the first few months of CY2013 only.
(***) estimate based on CY2011; allocation starts July 1, 2012 and extends
to June 30, 2013.
Further remarks on the nature of the INCITE award and additional
requirements for projects that apply for resources on leadership class
computers are given in section (iv).
All members of the USQCD Collaboration are eligible to submit proposals.
Those interested in joining the Collaboration should contact Paul Mackenzie
Let us begin with some important dates:
January 31: this Call for Proposals
March 9: proposals due
April 13: reports to proponents sent out
May 4/5: All Hands' Meeting at FNAL (ending ~5pm)
May 19: allocations announced
NOTE: INCITE allocations will already have been announced
and will start on April 1 (see section (iv)).
July 1: new allocations start
The Scientific Program Committee (SPC) will request some number of
presentations by the proponents of proposals at the All Hands'
Meeting. Proponents may in general request to make an oral
presentation of their proposals; however, the logistical constraints
of the meeting may preclude some number of talks.
The web site for the All Hands' Meeting is
The requests can be of three types:
A) requests for potentially large amounts of time on USQCD
dedicated resources and/or leadership class computers, to
support calculations of benefit for the whole USQCD
Collaboration and/or addressing critical scientific needs.
There is no minimum size to the request. However, small
requests will not be considered suitable for leadership
B) requests for medium amounts of time on USQCD dedicated
resources intended to support calculations in an early stage of
development which address, or have the potential to address,
scientific needs of the collaboration;
--- No maximum, but requests are encouraged to be 2.5 M
Jpsi-equivalent core-hours or less on clusters, or 100 K
GPU-hours or less on GPU clusters. No suggested size for
BNL BG/Q requests ---
C) requests for exploratory calculations, such as those needed to
develop and/or benchmark code, acquire expertise on the use of
the machines, or to perform investigations of limited scope.
The amount of time used by such projects should not exceed
100 K Jpsi core-hours on clusters or 10 K GPU-hours on the
GPU-clusters. Requests for BG/Q at BNL should be handled on a
Requests of Type A and B must be made in writing to the Scientific
Program Committee and are subject to the policies spelled out below.
These proposals must also specify the amount of disk and tape storage
needed. Projects will be charged for new disks and tapes. How this
will be implemented is discussed in section (iii).
Requests of Type C should be made in an e-mail message to
Paul Mackenzie (email@example.com) for clusters at FNAL,
Robert Mawhinney (firstname.lastname@example.org) for the BG/Q at BNL,
Chip Watson (Chip.Watson@jlab.org) for clusters at JLAB.
Type C requests will be honored up to a total not exceeding 5% of
the available time on USQCD hardware. If the demand exceeds such limits,
the Scientific Program Committee will reconsider the procedures for access.
Collaboration members who wish to perform calculations on USQCD hardware
or on resources awarded to USQCD through the INCITE program can present
requests according to procedures specified below. The Scientific Program
Committee would like to handle requests and awards on leadership class
computers and clusters in the equivalent core-hours for the FNAL "Jpsi"
cluster. Requests on the GPU clusters will be handled in GPU hours, and
requests for the BG/Q will be handled in BG/Q hours. Conversion factors
for clusters and leadership class computers are given below.
As projects usually are not flexible enough to switch between running
on GPUs, BG/Q, and clusters, we refrain at present from introducing
conversion factors for this purpose.
- o -
The rest of this message deals with requests of Types A and B. It is
organized as follows:
i) policy directives regarding the usage of awarded resources;
ii) guidelines for the format of the proposals and deadline for
submission;
iii) procedures that will be followed to reach a consensus on the
research programs and the allocations;
iv) policies for handling awards on leadership-class machines
v) description of USQCD resources at Fermilab and JLAB
i) Policy directives.
1) This Call for Proposals is for calculations that will further the
physics goals of the USQCD Collaboration, as stated in the proposals
for funding submitted to the DOE (see http://www.usqcd.org/), and have
the potential of benefiting additional research projects by members of
the Collaboration. In particular, the scientific goals are described
in the science sections of the recent SciDAC proposals, which are
placed on the same web-site.
2) Proposals of Type A are for investigations of very large scale,
which may require a substantial fraction of the available resources.
Proposals of Type B are for investigations in an early stage of
development. They are of medium to large scale and require a
smaller amount of resources. There is no strict lower limit for
requests within Type A proposals, and there is no upper limit on
Type B proposals. However, Type B requests for significantly more than
2.5 M Jpsi-equivalent core-hours on clusters, or more than 100 K hours
on GPU-clusters, will receive significant scrutiny.
Proposals that request time on the leadership-class computers at Argonne
and Oak Ridge should be of Type A and should demonstrate that they
(i) can efficiently make use of large partitions of leadership class
computers, and (ii) will run more efficiently on leadership class computers
than on clusters.
It is hoped that on USQCD hardware about 80% of the available resources
will be allocated to proposals of Type A and about 15% to proposals of
Type B, with the rest being reserved for small allocations and
contingencies. Because our process is proposal-driven, however, we
cannot guarantee the 80-15-5 split.
3) All Type A and B proposals are expected to address the scientific
needs of the USQCD Collaboration. Proposals of Type A are for
investigations that benefit the whole USQCD Collaboration. Thus it is
expected that the calculations will either produce data, such as
lattice gauge fields or quark propagators, that can be used by the
entire Collaboration, or that the calculations produce physics results
listed among the Collaboration's strategic goals.
Accordingly, proponents planning to generate multi-purpose data must
describe in their proposal what data will be made available to the whole
Collaboration, and how soon, and specify clearly what physics analyses
they would like to perform in an "exclusive manner" on these data (see
below), and the expected time to complete them.
Similarly, proponents planning important physics analyses should explain
how the proposed work meets our strategic goals and how its results
would interest the broader physics community.
Projects generating multi-purpose data are clear candidates to use
USQCD's award(s) on leadership-class computers. Therefore, these
proposals must provide additional information on several fronts:
demonstrate the potential to be of broad benefit, for example
by providing a list of other projects that would use the shared
data, or how the strategic scientific needs of USQCD are addressed;
present a roadmap for future planning, presenting, for example,
criteria for deciding when to stop with one ensemble and start
discuss how they would cope with a substantial increase in
allocated resources, from the portability of the code and
storage needed to the availability of competent personnel
to carry out the running;
Some projects carrying out strategic analyses are candidates for running
on the leadership-class machines. They should provide the same information
4) Proposals of Type B are not required to share data, although if
they do so it is a plus. Type B proposals may also be
scientifically valuable even if not closely aligned with USQCD goals.
In that case the proposal should contain a clear discussion of the
physics motivations. If appropriate, Type B proposals may discuss
data-sharing and strategic importance as in the case of Type A
5) The data that will be made available to the whole Collaboration will
have to be released promptly. "Promptly" should be interpreted with
common sense. Lattice gauge fields and propagators do not have to be
released as they are produced, especially if the group is still testing
the production environment. On the other hand, it is not considered
reasonable to delay release of, say, 444 files, just because the last 56
will not be available for a few months.
After a period during which such data will remain for the exclusive use
of the members of the USQCD Collaboration, and possibly of members of
other collaborations under reciprocal agreements, the data will be made
available worldwide as decided by the Executive Committee.
6) The USQCD Collaboration recognizes that the production of shared data
will generally entail a substantial amount of work by the investigators
generating the data. They should therefore be given priority in
analyzing the data, particularly for their principal physics interests.
Thus, proponents are encouraged to outline a set of physics analyses that
they would like to carry out with these data in an exclusive manner and
the amount of time that they would like to reserve to themselves to
complete such calculations.
When using the shared data, all other members of the USQCD collaboration
agree to respect such exclusivity. Thus, they shall refrain from using
the data to reproduce the reserved or closely similar analyses. In its
evaluation of the proposals, the Scientific Program Committee will in
particular examine the requests for exclusive use of the data and will
ask the proposers to revise any request found to be too broad or
otherwise excessive. Once an accepted proposal has been posted
on the Collaboration website, it should be deemed by all parties that the
request for exclusive use has been accepted by the Scientific Program
Committee. Any dispute that may arise in regard to the usage of such
data will have to be directed to the Scientific Program Committee for
resolution and all members of the Collaboration should abide by the
decisions of this Committee.
7) Usage of the USQCD software, developed under our SciDAC grants, is
recommended, but not required. USQCD software is designed to be
efficient and portable, and its development leverages efforts throughout
the Collaboration. If you use this software, the SPC can be confident
that your project can use USQCD resources efficiently. Software
developed outside the collaboration must be documented to show that it
performs efficiently on its target platform(s). Information on
portability is welcome, but not mandatory.
8) The investigators whose proposals have been selected by the Scientific
Program Committee for a possible award of USQCD resources shall agree to
have their proposals posted on a password protected website, available
only to our Collaboration, for consideration during the All Hands'
Meeting.
9) The investigators receiving a Type A allocation of time following
this Call for Proposals must maintain a public web page that
reasonably documents their plans, progress, and the availability of
data. These pages should contain information that funding agencies
and review panels can use to determine whether USQCD is a well-run
organization. The public web page need not contain unpublished
scientific results, or other sensitive information.
The SPC will not accept new proposals from old projects that still have
no web page. Please communicate the URL to email@example.com
ii) Format of the proposals and deadline for submission.
The proposals should contain a title page with title, abstract and the
listing of all participating investigators. The body, including
bibliography and embedded figures, should not exceed 12 pages in length
for requests of Type A, and 10 pages in length for requests of Type B,
with font size of 11pt or larger. If necessary, further figures, with
captions but without text, can be appended, for a maximum of 8 additional
pages. CVs, publication lists and similar personal information are not
requested and should not be submitted. Title page, proposal body and
optional appended figures should be submitted as a single pdf file, in an
attachment to an e-mail message sent to firstname.lastname@example.org
The deadline for receipt of the proposals is Friday, March 9, 2012.
The last sentence of the abstract must state the total amount of
computer time in Jpsi-equivalent core-hours (see below for
conversions), for GPU-clusters in GPU-hours, and in BG/Q hours for
that system at BNL. Proposals lacking this information will be
returned without review (but will be reviewed if the corrected
proposal is returned quickly and without other changes).
The body of the proposal should contain the following information,
if possible in the order below:
1) The physics goals of the calculation.
2) The computational strategy, including such details as gauge and
fermionic actions, parameters, computational methods.
3) The software used, including a description of the main algorithms
and the code base employed. If you use USQCD software, it is not
necessary to document performance in the proposal. If you use your own
code base, then the proposal should provide enough information to show
that it performs efficiently on its target platform(s). Information on
portability is welcome, but not mandatory. As feedback for the software
development team, proposals may include an explanation of deficiencies
of the USQCD software for carrying out the proposed work.
4) The amount of resources requested in Jpsi-equivalent core-hours or
GPU hours. Here one should also state which machine is most desirable
and why, and whether it is feasible or desirable to run some parts of
the proposed work on one machine, and other parts on another. If
relevant, proposals of Type A should indicate longer-term computing
The Scientific Program Committee will use the following table to convert:
1 J/psi core-hour    = 1    Jpsi core-hour
1 Ds core-hour       = 1.33 Jpsi core-hour
1 7n core-hour       = 0.77 Jpsi core-hour
1 9q core-hour       = 2.2  Jpsi core-hour
1 10q core-hour      = 2.3  Jpsi core-hour
1 9g/FNAL(GPU) hour  = 1    GPU-hour
1 BG/P core-hour     = 0.54 Jpsi core-hour
1 XT5 core-hour      = 0.50 Jpsi core-hour
The above numbers are based on the average of asqtad and DWF fermion
inverters. In the case of XT5 we used the average of asqtad (HISQ) and
clover inverters. See http://lqcd.fnal.gov/performance.html for
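
As an illustration of how the conversion table is applied, the following minimal sketch converts core-hours on a given machine to Jpsi-equivalent core-hours. The factors are copied from the table above; the dictionary and function names are hypothetical and exist only for this example. The two sample values are the CY2012 INCITE awards quoted in section (iv).

```python
# Conversion factors to Jpsi-equivalent core-hours, copied from the table above.
# The names used here are illustrative, not part of any USQCD tool.
JPSI_PER_CORE_HOUR = {
    "Jpsi": 1.0,
    "Ds": 1.33,
    "7n": 0.77,
    "9q": 2.2,
    "10q": 2.3,
    "BG/P": 0.54,
    "XT5": 0.50,
}


def to_jpsi_core_hours(machine, core_hours):
    """Convert core-hours on `machine` to Jpsi-equivalent core-hours."""
    return core_hours * JPSI_PER_CORE_HOUR[machine]


# CY2012 INCITE awards as quoted in section (iv): 45 M XT5 and 50 M BG/P core-hours.
xt5_jpsi = to_jpsi_core_hours("XT5", 45e6)
bgp_jpsi = to_jpsi_core_hours("BG/P", 50e6)
print(f"XT5:  {xt5_jpsi / 1e6:.1f} M Jpsi core-hours")
print(f"BG/P: {bgp_jpsi / 1e6:.1f} M Jpsi core-hours")
```

These two results agree with the 22.5 M and 27.0 M Jpsi-core hours listed for Oak Ridge and ALCF at the top of this call.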
The total request(s) on clusters and GPUs should also be specified
in the last sentence of the proposal's abstract (see above). For example,
the measured anisotropic Clover inverter performance, on
a 24^3 x 128 lattice, multi-GPU running, per GPU:
In addition to CPU, proposals must specify how much mass storage is
needed. The resources section of the proposal should state how much
existing storage is in use, and how much new storage is needed, for disk
and tape, in Tbytes. In addition, please also restate the storage request
in Jpsi-equivalent core-hours, using the following conversion factors, which
reflect the current replacement costs for disk storage and tapes:
1 Tbyte disk = 30 K Jpsi-equivalent core-hour
1 Tbyte tape = 3 K Jpsi-equivalent core-hour
Projects using disk storage will be charged 25% of these costs every
three months. Projects will be charged for tape usage when a file is written
at the full cost of tape storage; when tape files are deleted, they will
receive a 40% refund of the charge.
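
The storage charges above can be sketched as follows. This is one possible reading, under the assumption that the quarterly disk charge applies to the volume held during each three-month period and that the tape refund is 40% of the original per-Tbyte charge. The function names and the sample volumes are hypothetical.

```python
# Replacement costs from the call, in Jpsi-equivalent core-hours per Tbyte.
DISK_COST_PER_TB = 30_000
TAPE_COST_PER_TB = 3_000
DISK_QUARTERLY_FRACTION = 0.25   # disk: 25% of the cost every three months
TAPE_DELETE_REFUND = 0.40        # tape: 40% refund when a file is deleted


def disk_charge(tbytes, quarters):
    """Charge for holding `tbytes` of disk for `quarters` three-month periods."""
    return tbytes * DISK_COST_PER_TB * DISK_QUARTERLY_FRACTION * quarters


def tape_charge(tbytes_written, tbytes_deleted=0.0):
    """Full charge on write, minus the refund for data later deleted."""
    return TAPE_COST_PER_TB * (tbytes_written - TAPE_DELETE_REFUND * tbytes_deleted)


# Hypothetical project: 10 Tbytes of disk for a full year,
# 20 Tbytes written to tape of which 5 Tbytes are later deleted.
disk = disk_charge(10, quarters=4)
tape = tape_charge(20, tbytes_deleted=5)
print(f"disk: {disk / 1e3:.0f} K, tape: {tape / 1e3:.0f} K Jpsi-equivalent core-hours")
```

Under this reading, disk held for a full year costs the full replacement value, which is why re-computing a rarely used file can be cheaper than storing it.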
Proposals should discuss whether these files will be used by one, a few,
or several project(s). The cost for files (e.g., gauge configurations)
that are used by several projects will be borne by USQCD and not a specific
physics project. The charge for files used by a single project will be
deducted from the computing allocation: projects are thus encouraged to
figure out whether it is more cost-effective to store or re-compute a
file. If a few (2-3) projects share a file, they will share the
5) If relevant, what data will be made available to the entire
Collaboration, and the schedule for sharing it.
6) What calculations the investigators would like to perform in an
"exclusive manner" (see above in the section on policy directives),
and for how long they would like to reserve to themselves this
iii) Procedure for the awards.
The Scientific Program Committee will receive proposals until the deadline
of Friday, March 9, 2012.
Proposals not stating the total request in the last sentence of the
abstract will be returned without review (but will be reviewed if the
corrected proposal is returned quickly and without other changes).
Proposals that are considered meritorious and conforming to the goals of
the Collaboration will be posted on the web at http://www.usqcd.org/,
in the Collaboration's password-protected area. Proposals recommended
for awards in previous years can be found there too.
The Scientific Program Committee (SPC) will make a preliminary
assessment of the proposals. On April 16, 2012, the SPC will send
a report to the proponents raising any concerns about the proposal.
The proposals will be presented and discussed at the All Hands' Meeting,
May 4-5, 2012, at FNAL; see, however, section (iv) for special
treatment of INCITE proposals.
Following the All Hands' Meeting the SPC will determine a set of
recommendations on the awards. The quality of the initial proposal, the
proponents' response to concerns raised in the written report, and the
views of the Collaboration expressed at the All Hands' Meeting will all
influence the outcome. The SPC will send its recommendations to the
Executive Committee shortly after the All Hands' Meeting, and inform the
proponents once the recommendations have been accepted by the Executive
Committee. The successful proposals and the size of their awards will be
posted on the web.
The new USQCD allocations will commence July 1, 2012.
Scientific publications describing calculations carried out with these
awards should acknowledge the use of USQCD resources, by including the
following sentence in the Acknowledgments:
"Computations for this work were carried out in part on facilities of
the USQCD Collaboration, which are funded by the Office of Science of
the U.S. Department of Energy."
Projects whose sole source of computing is USQCD should omit the phrase
"in part".
iv) INCITE award CY2012/2013 and zero priority time at Argonne
Since 2007, USQCD policy has been to apply as a Collaboration for time
on the "leadership-class" computers, installed at Argonne and Oak
Ridge National Laboratories, and allocated through the DOE's INCITE
Program (see http://hpc.science.doe.gov/). The first successful
three-year INCITE grant period ended 12/2010. A new three-year grant
proposal has been successful and received funding in CY2011.
For CY2012 USQCD was awarded 50 M BG/P core-hours on the BG/P at
Argonne and 45 M XT5 core-hours on the Cray XT5 at Oak Ridge. We
expect to receive a similar allocation in CY2013. The Oak Ridge
facility does not provide a zero-priority queue.
In accordance with the usage pattern we have seen during the last
years at ORNL, we will allocate the second half of the CY2012
INCITE allocation at ORNL from 06/2012 to 12/2012. We will also
distribute the first half of the expected CY2013 allocation in the
01/2013 to 06/2013 time-frame.
Again, in accordance with observed usage patterns, we will distribute
the entire regular INCITE allocation at ANL that becomes available in
01/13. However, we expect this time to be consumed quickly - in the
first 2 months of the year. Thus, there is no regular INCITE time at
ANL available later in the year.
In addition we expect to receive in CY2012 and CY2013 "zero priority
time" on the BG/P at Argonne. Based upon previous usage and
availability, we will distribute zero priority time starting 07/2012
and ending in 06/2013. Should there turn out to be more zero-priority
time available than has been estimated, the SPC will readjust
allocations during the allocation year (07/2012 to 06/2013).
The usage of the INCITE allocations should be monitored by all PIs of
INCITE projects on the USQCD WEB-page:
v) USQCD computing resources.
The Scientific Program Committee will allocate 7200 hours/year to
Type A and Type B proposals. Of the 8766 hours in an average year the
facilities are supposed to provide 8000 hours of uptime. We then reserve
400 hours (i.e., 5%) for each host laboratory's own use, and another 400
hours for Type C proposals and contingencies.
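
The hour budget described above amounts to the following arithmetic, shown here as a small sketch. The variable names are illustrative only.

```python
HOURS_PER_AVERAGE_YEAR = 8766   # 365.25 days * 24 hours
UPTIME_HOURS = 8000             # uptime the facilities are supposed to provide
HOST_LAB_HOURS = 400            # reserved for each host laboratory's own use
TYPE_C_HOURS = 400              # reserved for Type C proposals and contingencies

# Hours per year left for Type A and Type B proposals.
allocatable_hours = UPTIME_HOURS - HOST_LAB_HOURS - TYPE_C_HOURS

print(allocatable_hours)
print(f"host-lab share of uptime: {HOST_LAB_HOURS / UPTIME_HOURS:.0%}")
```

The result, 7200 hours, is the multiplier used in each of the per-machine totals below.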
10% of a 1024 node BG/Q rack
16 cores/node, up to 4 threads per core
16 GB memory/node
10% of the rack is donated to USQCD.
total: 7200*1024*0.10 = 737 K BG/Q node-hours
856 node cluster ("J/Psi")
856 quad-core, dual-socket 2.1 GHz Opteron nodes
(6848 total cpu cores available)
8 GB memory/node
DDR Infiniband network
88 GB local scratch disk/node
total: 7200*6848*1 = 49.3 M Jpsi-equivalent core-hours
1 J/Psi node-hour = 8 Jpsi-equivalent core-hour
421 node cluster ("Ds")
32 cores per node
64 GB memory/node
QDR Infiniband network
1 Ds core-hour = 1.33 Jpsi-equivalent core-hour
1 Ds node-hour = 43.56 Jpsi-equivalent core-hours
total: 7200*421*43.56= 132 M Jpsi-equivalent core-hours
76 node GPU cluster ("Dsg")
2 quad-core Intel E5630 per node
48 GB memory/node
2 GPUs NVIDIA M2050 (Fermi Tesla) per node
(152 total GPUs available)
Full QDR infiniband (no oversubscription). Suitable for large
GPU memory (ECC on) 2.7 GB / GPU
total: 7200*152= 1094 K GPU-hours
These clusters will share about 632 TBytes of disk space in Lustre
file systems. The cluster will have access to 815 TByte of tape
storage, with up to 200 TBytes of new storage.
For further information see http://www.usqcd.org/fnal/
320 node Infiniband cluster ("9q")
320 quad-core, dual-processor 2.4 GHz Intel Nehalem
24 GB memory/node, QDR IB fabric in partitions of up to 128 nodes
total: 7200*320*8*2.2 = 40.5 M Jpsi-equivalent core-hours
1 9q core-hour = 2.2 Jpsi-equivalent core-hours
192 node Infiniband cluster ("10q")
192 quad-core, dual-processor 2.53 GHz Intel Westmere
24 GB memory/node, QDR IB fabric in partitions of 32 nodes
total: 7200*192*8*2.3 = 25.4 M Jpsi-equivalent core-hours
1 10q core-hour = 2.3 Jpsi-equivalent core-hours
130-200 node Infiniband cluster (preliminary rough guess)
200 Intel Sandy Bridge 8 core dual processor, 32 GB memory OR
130 AMD Interlagos 16 core quad processor, 64 GB memory
QDR network card, with full bi-sectional bandwidth network fabric
(competitive procurement underway at the time of this call,
results will be known by the time of allocation award)
rough estimate using the Intel solution, assuming SB core
is similar to Westmere core = 2.3 Jpsi cores
total: 200*16*2.3*7200 ~= 50M Jpsi core hours (tbd)
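
As a cross-check, the conventional-cluster totals quoted above can be recomputed and summed. This is a sketch only: the dictionary layout is hypothetical, and the new JLab cluster is entered at its rough 50 M estimate because the procurement is still to be decided. The Ds entry uses the quoted per-node figure of 43.56, which reproduces the 132 M total, although 32 cores times the 1.33 core-hour factor would give 42.56.

```python
HOURS = 7200  # allocatable hours per year for Type A and Type B proposals

# Jpsi-equivalent cores per cluster, built from the figures quoted above.
jpsi_equivalent_cores = {
    "J/Psi": 6848 * 1.0,     # 6848 cores, factor 1
    "Ds": 421 * 43.56,       # 421 nodes at the quoted per-node factor
    "9q": 320 * 8 * 2.2,     # 320 nodes, 8 cores each, factor 2.2
    "10q": 192 * 8 * 2.3,    # 192 nodes, 8 cores each, factor 2.3
}

totals = {name: HOURS * cores for name, cores in jpsi_equivalent_cores.items()}
totals["new JLab cluster (rough estimate)"] = 50e6

for name, value in totals.items():
    print(f"{name}: {value / 1e6:.1f} M")

grand_total = sum(totals.values())
print(f"sum: {grand_total / 1e6:.1f} M Jpsi-equivalent core-hours")
```

Up to rounding of the individual entries, the sum matches the 297.2 M Jpsi-core hours announced at the top of this call.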
480 node GPU cluster at JLab:
36 node cluster equipped with 4 GPUs NVIDIA C2050 (Fermi Tesla)
42 node cluster equipped with 4 GPUs GTX-580 (Fermi gaming card)
46 node cluster equipped with 4 GPUs GTX-480 (Fermi gaming card)
32 node cluster equipped with 1 GPU GTX-285 (Older cards, before Fermi)
total: 7200*528 = 3802 K GPU hours
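
The GPU-hour totals can be checked from the node and GPU counts listed above. In this sketch the variable names are illustrative. The four JLab node types add up to 528 GPUs, and that count, multiplied by 7200 hours, is what reproduces the quoted 3802 K GPU-hours.

```python
HOURS = 7200  # allocatable hours per year

# FNAL "Dsg": 76 nodes with 2 GPUs each.
fnal_gpus = 76 * 2

# JLab: (nodes, GPUs per node) for the C2050, GTX-580, GTX-480 and GTX-285 sets.
jlab_node_sets = [(36, 4), (42, 4), (46, 4), (32, 1)]
jlab_gpus = sum(nodes * gpus for nodes, gpus in jlab_node_sets)

fnal_gpu_hours = HOURS * fnal_gpus
jlab_gpu_hours = HOURS * jlab_gpus
total_gpu_hours = fnal_gpu_hours + jlab_gpu_hours

print(f"FNAL: {fnal_gpus} GPUs, {fnal_gpu_hours / 1e3:.0f} K GPU-hours")
print(f"JLab: {jlab_gpus} GPUs, {jlab_gpu_hours / 1e3:.0f} K GPU-hours")
print(f"total: {total_gpu_hours / 1e6:.1f} M GPU-hours")
```

The combined figure corresponds to the 4.9 M GPU-hours announced at the top of this call.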
Further details and comments on the JLab GPU clusters:
1) 144 ECC Fermi Tesla, 2.7 GB memory with ECC on, configured as
36 quad C2050 or M2050, a set of 8 nodes with dual rail QDR, and a set
of 28 nodes with half-QDR
2) 168 GTX580 gaming cards, 1.25 GB memory, no ECC, configured as
42 quad nodes in varying size sets, with half SDR (best used for up to
3) 184 GTX480 gaming cards, 1.25 GB memory, no ECC, configured as
46 quad nodes in varying size sets, with half SDR (best used for up to
4) 32 GTX285 gaming cards, 2 GB memory, no ECC, configured as
single GPU nodes in a single set with full QDR (best used for jobs
needing more CPU cores and/or more host memory -- 24 GB host memory per
node = per GPU). These nodes are a subset of the "10q" nodes above.
Measured anisotropic Clover inverter performance, on
a 24^3 x 128 lattice, multi-GPU running, per GPU:
For further information see also http://lqcd.jlab.org/
At JLAB, the systems will have access to about 500 TBytes of disk
space. Tape access is also available.