
Accessing the Fermilab Tape Storage System

The interface to the Fermilab tape storage system is via Enstore. User documentation includes the Enstore/dCache User's Guide.

Each allocation year, projects are assigned a tape quota set by the USQCD Software Program Committee. The Enstore namespace for your project appears as a UNIX file system mounted at /pnfs/lqcd/projectName, where projectName is the charge name assigned to your project. Enstore and the /pnfs/lqcd area are accessible only from cluster login head nodes such as lq.fnal.gov and from our lqio.fnal.gov data movement nodes; they are not accessible from the worker nodes. Hence, files in the Enstore system must first be staged to disk with the dccp command before they can be copied to the /scratch area of the workers, and vice versa.


NOTE: We strongly encourage users to use our lqio.fnal.gov data movement nodes for moving data between tape, disk and remote locations. lqio.fnal.gov has a 100GbE network interface and provides access to /lustre1 in addition to tape. These nodes have much faster network connections than the cluster login node and therefore give better throughput. Performing your transfers on the data movement nodes also avoids bogging down other users on the cluster login head node.

Managing /pnfs/lqcd Project Area

  • /pnfs/lqcd/projectName looks like a standard directory, but you have to use a special command dccp to copy files in and out of this area. However, you can use file and directory manipulation commands, such as find, stat, mv, rm, mkdir, rmdir, chmod, chgrp, chown, etc. to locate tape files, print an inode's content, rename and delete files, manipulate subdirectories, change permissions, and so forth. Using standard UNIX commands means scripts to manage tape files are almost unchanged from scripts that manage disk files.
  • dccp has the semantics of cp, except that wildcards are not allowed. So, you will need to script the sequence of dccp commands to copy your files into /pnfs/lqcd/projectName.
  • In dccp commands, the source and destination directories will determine whether files are copied to tape or read from tape.
    • Commands like "dccp file /pnfs/lqcd/projectName/subdir/file" will copy a file to tape.
    • Commands like "dccp /pnfs/lqcd/projectName/subdir/file /destdir/file" will copy a file from tape.
  • dccp uses a disk layer (dCache) to cache files on their way to or from tape. When writing to tape, the file is first stored on a dCache disk and is subsequently written to tape as soon as practical. A dccp command writing to tape returns success as soon as the file has been transferred to disk. When reading from tape, dccp does not return until the tape has been mounted by the tape robot and the file has been read to disk.
  • If you are planning to write a lot of small files to tape, we highly recommend bundling and compressing them into a single tarball, since one large file makes much more efficient use of tape than many small ones. If in doubt, please do not hesitate to email us with your questions about writing data to tape.
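
Because dccp does not accept wildcards, a transfer of many files has to be scripted as one dccp call per file, and small files are better bundled into a tarball first. The following is a minimal sketch of both steps, not a supported tool. The function names copy_dir_with_dccp and bundle_and_copy, the DCCP override variable, and the example paths are assumptions made for illustration; only the plain "dccp source destination" form described above is used.

```shell
#!/bin/bash
# Sketch only: helper names, the DCCP variable, and all paths are hypothetical.
# DCCP can be overridden (for example DCCP=cp) to try the logic away from /pnfs.
DCCP="${DCCP:-dccp}"

# Copy every regular file in src_dir to dest_dir, one dccp call per file.
# The shell expands the glob, so dccp itself never sees a wildcard.
copy_dir_with_dccp() {
    local src_dir="$1" dest_dir="$2" f name failed=0
    for f in "$src_dir"/*; do
        [ -f "$f" ] || continue      # skip subdirectories and an unmatched glob
        name=$(basename "$f")
        if ! "$DCCP" "$f" "$dest_dir/$name"; then
            echo "dccp failed for $f" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Pack the contents of src_dir into one compressed tarball and copy
# that single file to dest_file.
bundle_and_copy() {
    local src_dir="$1" dest_file="$2" tarball
    tarball=$(mktemp "${TMPDIR:-/tmp}/bundle.XXXXXX") || return 1
    tar -czf "$tarball" -C "$src_dir" . && "$DCCP" "$tarball" "$dest_file"
    local status=$?
    rm -f "$tarball"                 # remove the temporary tarball either way
    return "$status"
}

# Example calls (hypothetical paths):
# copy_dir_with_dccp /lustre1/myrun/configs /pnfs/lqcd/projectName/configs
# bundle_and_copy /lustre1/myrun/logs /pnfs/lqcd/projectName/logs/myrun-logs.tar.gz
```

Keeping the copy command in a variable lets the loop be rehearsed on ordinary directories before it touches the tape area. Remember that a successful return on a write only means the file has reached the dCache disk layer.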

By default, your files will be grouped together onto tapes that hold only files from the projectName "file family". Enstore file families are explained in the next section. When you delete a file (with rm), only the metadata is removed; the file itself remains on tape. Once all of the files on a tape have been deleted, you may request that the tape be recycled. Your allocation is charged when the first file is written to a tape, and you will get a refund if the tape is recycled. If you would like to keep different groups of files, some archival and never deleted and others that can be deleted so that their tapes can be recycled, email us at lqcd-admin@fnal.gov and we will set you up with multiple file families. File families are tied to subdirectories.

Please visit the Frequently Asked Questions page for our responses to common user questions regarding best practices and access methods to the Fermilab Mass Storage system.

Enstore file families

The Enstore system keeps metadata for tape files beyond the standard UNIX inode information. The command below displays this extra information for the projectName area:

$ enstore pnfs --tags /pnfs/lqcd/projectName
.(tag)(library) = 9940
.(tag)(file_family) = lqcd
.(tag)(file_family_wrapper) = cpio_odc
.(tag)(file_family_width) = 1
.(tag)(storage_group) = lqcd
-rw-rw-r-- 11 11072 9540 4 Feb 11 14:29 /pnfs/lqcd/projectName/.(tag)(library)
-rw-rw-r-- 11 11072 9540 4 Feb 11 14:30 /pnfs/lqcd/projectName/.(tag)(file_family)
-rw-rw-r-- 11 11072 9540 8 Feb 11 14:31 /pnfs/lqcd/projectName/.(tag)(file_family_wrapper)
-rw-rw-r-- 11 11072 9540 1 Feb 11 14:31 /pnfs/lqcd/projectName/.(tag)(file_family_width)
-rw-rw-r-- 11 root root 4 Feb 11 14:31 /pnfs/lqcd/projectName/.(tag)(storage_group)

The file_family tag offers a convenient way for a group to organize its data on tape. The file_family specifies the logical name for a set of tapes. File families may be specified on a per-directory basis and are, by default, inherited from the parent directory when a new directory is created. Enstore keeps files belonging to separate file families on separate sets of tape cartridges. This makes it possible to shelve or remove little-used data sets from the robot without affecting other data sets.
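
When scripting around file families, it can be handy to pull a single tag value out of the "enstore pnfs --tags" listing shown above. The helper below is a small sketch that assumes the output keeps the ".(tag)(name) = value" layout from that listing; the function name get_tag is hypothetical.

```shell
#!/bin/bash
# Sketch only: get_tag is a hypothetical helper, not part of Enstore.
# Reads `enstore pnfs --tags` output on stdin and prints the value of one tag.
get_tag() {
    local tag="$1"
    # Match lines that start with ".(tag)(<name>) = " and print what follows.
    awk -v key=".(tag)($tag) = " \
        'index($0, key) == 1 { print substr($0, length(key) + 1) }'
}

# Usage on a login or data movement node:
# enstore pnfs --tags /pnfs/lqcd/projectName | get_tag file_family
```

The match is anchored to the start of the line, so the trailing "-rw-rw-r--" listing lines and longer tag names such as file_family_wrapper are not picked up by a query for file_family.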

Fermi National Accelerator Laboratory
Managed by Fermi Research Alliance, LLC
for the U.S. Department of Energy Office of Science