The Mass Storage Service (dCache/OSM)
1. Before you use the mass storage service
2. About the software
3. About the hardware
4. When is it reasonable to use the mass storage?
5. Access management
6. The migration directory
6.1 Command restrictions
6.2 This is forbidden
7. dccp - The user interface
7.1 Examples
7.2 Return code
7.3 This does not work
1. Before you use the mass storage service
Please read this page completely. If you have questions about usage, please contact uco-zn@desy.de.
Here you may query the status of dCache: dCache-status
Some dCache manuals
2. About the software
A disk cache system (dCache) optimizes access to the mass storage. Its status can be queried. Apart from the copy application dccp you may use its library libdcap, which allows direct access to files in the disk cache. Part of the system is a special NFS server (pnfs) that provides a network filesystem, mounted on the clients, through which the files are accessed.
An Open Storage Manager (OSM) transparently handles access to the files on tape. It starts one client process per data transfer.
Reading from tape is triggered by accessing a file that is not available in the disk cache.
Writing to tape is triggered asynchronously by writing into a directory that has been configured for tape storage. This configuration also means that files which have already been written to tape may be purged from disk when disk space runs short. Tapes are selected automatically according to a tape group, which is likewise defined per directory.
Data that are unique and difficult to recreate can have a second copy on a different tape; this, too, is defined per directory.
3. About the hardware
The mass storage device is a tape robot with around 4900 slots for LTO cartridges, of which 4000 are used for the dCache tape backend.
The mass storage uses 5 tape drives of type LTO6 and 2 tape drives of type LTO4.
Native LTO4 tape capacity for uncompressed data: 780 GB
Native LTO6 tape capacity for uncompressed data: 2.4 TB
4. When is it reasonable to use the mass storage?
You should use the mass storage only for files of at least 220 MB.
The transfer of small files to tape results in:
* start/stop operation for the tape drives
* dramatic loss of transfer speed
* high stress on tape material and on tape drives
* lower reliability
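The size rule above can be checked in a script before a file is migrated. The following is only a sketch: the helper name big_enough is made up for illustration, and 1 MB is assumed to mean 1024*1024 bytes, which the text does not specify.

```shell
#!/bin/sh
# Minimum recommended file size in MB, taken from the text above.
MIN_MB=220

# big_enough FILE  (hypothetical helper, not part of dCache)
# Succeeds if FILE is a regular file of at least MIN_MB megabytes.
# Assumption: 1 MB = 1024*1024 bytes.
big_enough() {
    [ -f "$1" ] || return 1
    bytes=$(wc -c < "$1")
    # $((bytes)) strips the padding some wc implementations add.
    [ "$((bytes))" -ge "$((MIN_MB * 1024 * 1024))" ]
}
```

A call such as `big_enough longfile || echo "longfile is too small for tape"` could then guard a transfer.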
The maximum file size is 1 TB, for technical reasons. Writing files that large is currently not recommended: the transfer takes about 1000000 MB / 150 MB/s, i.e. almost 2 hours, while the client time-out is around 1 hour, so the client always receives an error even though the file may have been written correctly.
Keep in mind that the read transfer rate is 160 MB/s, locating a file on tape takes about 1-80 s, and mounting the tape takes roughly 30 s. For a file smaller than 1.6 GB (about 10 s of pure transfer), most of the time is therefore tape overhead.
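The overhead estimate can be reproduced with a little shell arithmetic. This is a rough sketch that uses only the figures quoted in this section; the helper name estimate_read_seconds is made up for illustration, and the worst-case positioning time of 80 s is assumed.

```shell
#!/bin/sh
# Figures taken from the text above; real values vary per request.
READ_RATE_MB_S=160   # streaming read rate in MB/s
MOUNT_S=30           # approximate tape mount time in seconds
LOCATE_MAX_S=80      # worst-case time to locate the file on tape

# estimate_read_seconds SIZE_MB  (hypothetical helper)
# Prints "<overhead> <transfer>", both in whole seconds.
estimate_read_seconds() {
    size_mb=$1
    overhead=$((MOUNT_S + LOCATE_MAX_S))
    transfer=$((size_mb / READ_RATE_MB_S))
    echo "$overhead $transfer"
}

estimate_read_seconds 1600     # a 1.6 GB file
estimate_read_seconds 160000   # a 160 GB file
```

For the 1.6 GB file the overhead (up to 110 s) dominates the 10 s of transfer; for the 160 GB file the 1000 s of transfer dominate.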
If possible, start the transfer on the machine that has the file you want to copy to the mass storage on its local file system.
Otherwise unnecessary network traffic is produced.
5. Access management
Access to the mass storage is organized by group:
Users have no influence on the particular tape a file is written to. Tapes are organized in groups that are assigned to subdirectories of the migration directory. The migration directory may be subdivided like a usual UNIX directory, but in general only the main directories have different tape groups. These directories appear under the name /acs/... on any machine on which this service is available.
It may be useful to assign separate tape pools to certain subdirectories of a group, for example for data that belong to different periods of time, for data produced by different software releases, or for data that are simply backup versions of your current data. Keep in mind that with 2.4 TB tapes the amount of data per pool should not be much smaller than one tape; otherwise slots in the robot are wasted. Tapes are easier to reuse when you delete all data belonging to one pool at the same time, and tapes of a certain pool can be temporarily placed outside the robot in case we run out of empty tape slots.
If you need to store data that could not be reproduced in the rare case of tape corruption, we can arrange for a second copy in a different tape group. Please contact uco when this is needed.
On which machine is the service available?
On any machine that has set up a link named /acs, e.g. all farm machines and wgs. For data transfers to or from other (external) sites you should use the machine transfer.zeuthen.desy.de.
Please send requests to set up the service on a certain machine, to set up /acs for your group, or to set up tape pools to uco-zn@desy.de.
6. The migration directory
The migration directory is implemented as a special filesystem (pnfs) by means of the pnfs-server. It shows the names and sizes of the files transferred to the mass storage. It looks like a usual UNIX directory but is simulated by a database. Most of the UNIX commands work in the migration directory as usual (e.g. mkdir, ls, rm).
The migration directory is provided by the automounter on all machines that are configured for dCache as
/pnfs/ifh.de/acs
We provide the link
/acs
pointing to it.
In addition, pnfs maintains a mirror of the migration directory that is invisible to the user. For each entry in the migration directory it contains a short file (stub file) with the information needed to find the transferred file on the mass storage.
For each group there is a subdirectory under /pnfs/ifh.de/acs. It may be subdivided like a usual UNIX directory.
6.1 Command restrictions
These commands are not available:
mv [src-file] [dest-file]
if [src-file] OR [dest-file] is part of the migration directory tree.
Use dccp and rm instead (see the examples in section 7.1).
Note: if [src-file] AND [dest-file] are both part of the migration directory tree, mv is available, but it works safely only if the source and the destination are in the same directory.
cp
use dccp instead
more
The files are not readable; a later version may show the top of the file.
Error messages are not always clear and may be even misleading.
6.2 This is forbidden in the migration directory:
mv [dir-name] ../[any-dir]
Moving a subdirectory into another branch of the migration directory tree may cause problems, and no error message is generated.
Moving a subdirectory within the same directory (a simple rename) is safe.
This has no effect in the migration directory:
sticky bit
The files you copy into the migration directory get your current UNIX group. If you want these files to have a UNIX group other than your login group, you can use the command newgrp (supported since 21.10.99).
7. dccp - The User Interface
It is available for Linux. The command dccp performs the file transfer from a usual UNIX directory into the migration directory (the mass storage) and vice versa. The syntax is (see also here):
dccp [options] file destination
Only one file can be copied per call. The destination file name has to be unique. As with the command cp, the source file remains in place after the transfer.
7.1 Examples:
To simplify the access to the migration directory symbolic links can be used, e.g.:
ln -s /net/pnfs/usr/acs/myGroupdir/mySubdir $HOME/migrated
To move a file to the mass storage:
dccp longfile $HOME/migrated
or
dccp longfile $HOME/migrated/newname
rm longfile
To transfer a file back from the mass storage to your own directory:
cd any_real_unix_dir
dccp $HOME/migrated/longfile .
7.2 You must check the return code
After a successful data transfer dccp returns 0. It's not sufficient to check the filesizes of source and destination to make sure your data transfers finished successfully. It's absolutely necessary to check the return code (e.g. $? in /bin/sh).
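In a script, the return code check can be combined with the removal of the source file, so that the local copy is deleted only after dccp has reported success. This is a minimal sketch, not an official tool: the helper name migrate_file is made up for illustration, and it relies only on dccp returning 0 on success as described above.

```shell
#!/bin/sh
# migrate_file SRC DEST  (hypothetical helper, not part of dCache)
# Copies SRC into the migration directory with dccp and removes SRC
# only if dccp returned 0. Otherwise SRC is kept and the dccp return
# code is passed on to the caller.
migrate_file() {
    src=$1
    dest=$2
    rc=0
    dccp "$src" "$dest" || rc=$?
    if [ "$rc" -eq 0 ]; then
        rm -- "$src"
    else
        echo "dccp returned $rc, keeping $src" >&2
        return "$rc"
    fi
}
```

It could be called as `migrate_file longfile "$HOME/migrated/longfile"`.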
Problems have been seen when transferring very big files (>100GB). While the transfer succeeds, calculating the checksum takes longer than the default time-out, and dccp returns an error.
The file is nevertheless transferred correctly. You can increase the time-out with dccp -C 3600 ...
7.3 This does not work
In the following cases the error messages are not always clear and may be misleading:
dccp *long* /net/pnfs/usr/acs/myGroupdir/mySubdir
does not work if several files match the pattern *long*, because a single dccp command can migrate or reload only one file.
dccp newfile /net/pnfs/usr/acs/myGroupdir/mySubdir/existing_file
You cannot overwrite an existing file in the migration directory. If a file is obsolete, you need to remove it first.
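Both restrictions can be worked around in a script: call dccp once per file, and remove an obsolete file before writing its replacement. The two helpers below are a sketch with made-up names (migrate_each, replace_migrated); they use only dccp and rm as described on this page. Note that replace_migrated deletes the old copy before the new transfer starts, so the old copy is lost if that transfer fails.

```shell
#!/bin/sh
# migrate_each DEST_DIR FILE...  (hypothetical helper)
# Runs one dccp call per file, because dccp copies a single file per call.
# Returns non-zero if at least one transfer failed.
migrate_each() {
    dest_dir=$1
    shift
    failed=0
    for f in "$@"; do
        if ! dccp "$f" "$dest_dir/$(basename "$f")"; then
            echo "transfer of $f failed" >&2
            failed=1
        fi
    done
    return "$failed"
}

# replace_migrated SRC DEST  (hypothetical helper)
# Removes an existing DEST first, since dccp cannot overwrite it,
# then copies SRC to DEST and returns the dccp return code.
replace_migrated() {
    src=$1
    dest=$2
    if [ -e "$dest" ]; then
        rm -- "$dest" || return 1
    fi
    dccp "$src" "$dest"
}
```

For example, `migrate_each /net/pnfs/usr/acs/myGroupdir/mySubdir *long*` transfers every matching file in turn.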