The Enstore system provides access to data by user/client applications distributed across an IP network. It supports tape drives attached locally to the user's computer, as well as those remotely accessible over the network.
The Enstore system provides resource management of the available tape drives so that, for example, logging of data from the data acquisition systems can be given guaranteed access to tape bandwidth regardless of whatever other user accesses are being requested. Enstore is designed to be used by Fermilab experiments' data acquisition, data processing and analysis systems. Well-defined interfaces will be provided to these data handling systems to allow them to easily use the services provided. The writing and reading of tapes must therefore be reliable and efficient, and the system must be robust enough to support this critical application without compromising data taking. Enstore's goal is to provide a system that can be extended as needed for the experiments' actual data taking needs, as well as be easily maintainable for the duration of several data taking runs.
Enstore is based on a client-server model that allows hot swapping of hardware components and dynamic software configuration, is platform independent, runs on heterogeneous environments and is easily extendable. Most of the operations are transparent to the user. System performance is monitored and fine tunable. A great deal of care has been taken to ensure that it is able to prevent or to recover from a worst case scenario. The system has layers around it to customize and address problems as they occur. When possible, these layers are expected to use already existing components (e.g. FTT, pnfs).
The Enstore system is designed to provide for the needed Run II data access throughput requirements within the budget assigned. The system software is layered and accessible to the Run II developers such that needed modifications can be made in a timely manner to meet the needs of commissioning and running of the Run II detectors.
Enstore is designed to support "lights out" operation of the Run II automated tape library systems. To this end, the design is targeted towards requiring operator intervention at no more than 8 hour intervals - for example, import/export requests are queued and need only be handled within the daytime operator shifts. Careful attention is paid to error reporting, handling and recovery in order to require the minimal possible load on the operations and support staff.
To summarize, Enstore provides the following features:
D0, and most specifically SAM, has been very helpful in setting the direction for what is needed from Enstore. We believe a close working collaboration has been developed from which both SAM and Enstore have profited. We appreciate the early, and sometimes tedious and painful, testing the SAM group has done on Enstore. We have been working with D0 to try to provide a storage system that fulfills their needs. We have chosen to first present what Enstore provides and then what D0 requires, describing how Enstore fulfills it. This ordering could have been reversed - there has been great synergy between the two efforts.
The system is written in Python, a scripting language with advanced object-oriented features. Python provides a sound environment for quick turn-around and a seamless integration/migration path to fully compiled languages, such as C and C++, if there is a demand for even better performance.
Enstore has four major kinds of software components:
These software components, as well as hardware components, are shown schematically in the following system context diagram. Hardware components are connected via IP. Great care has been taken to ensure that the system will function well under extreme load conditions. By design, there is no preset limit on the number of concurrent user computers nor on the number of physical media libraries or drives. The system is only limited by the availability of physical resources. We control all of the source code for the system except for that of pnfs (which is a well supported product from DESY).
Like TCP, the system is architected with distributed and peer-to-peer reliability. Each request originating from the encp program is branded with a unique ID. Encp retries under well-defined circumstances, issuing an equivalent request with a new unique ID. The system can instruct encp to retry if it needs to back out of an operation.
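This retry scheme can be sketched as follows. This is a minimal illustration, not Enstore's actual code; the names `unique_id`, `transfer_with_retry`, and the "retry"/"ok" status strings are our assumptions.

```python
import itertools
import os
import socket

_seq = itertools.count()

def unique_id():
    # Hypothetical ID scheme: host, process id, and a monotonically
    # increasing sequence number, so a retry never reuses an ID.
    return "%s-%d-%d" % (socket.gethostname(), os.getpid(), next(_seq))

def transfer_with_retry(request, submit, max_retries=3):
    """Submit a request; on a well-defined retryable failure, back out
    and resubmit an equivalent request under a NEW unique ID."""
    for _ in range(max_retries):
        request["id"] = unique_id()      # brand (or re-brand) the request
        status = submit(request)
        if status == "ok":
            return status
        if status != "retry":            # only retry in defined circumstances
            raise RuntimeError("transfer failed: %s" % status)
    raise RuntimeError("giving up after %d retries" % max_retries)
```

The key point is that the retried request is equivalent in content but distinguishable by ID, so a server can tell a retransmission from a duplicate.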
The DESY pnfs package implements an nfs-v2 daemon and mount daemon. These daemons do not actually serve a file system but instead make a collection of database entries look like a file system, and provide control information for the system. Each file created in pnfs has 8 layers that Enstore uses to store metadata about the file transfers. Normal UNIX permissions and administered export points are used to prevent unauthorized access to the name space.
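The layers are reachable through pnfs's virtual file names: layer N of a file is exposed as `.(use)(N)(filename)` in the same directory (this is the convention behind the `pcmd ls file layer` example later in this document). A minimal sketch, assuming a mounted pnfs namespace; the function names are ours:

```python
import os

def pnfs_layer_path(path, layer):
    """Build the virtual pnfs name for metadata layer `layer` of `path`:
    layer N of file F is exposed as .(use)(N)(F) in F's directory."""
    directory, name = os.path.split(path)
    return os.path.join(directory, ".(use)(%d)(%s)" % (layer, name))

def read_pnfs_layer(path, layer):
    # Requires a mounted pnfs namespace for real pnfs files.
    with open(pnfs_layer_path(path, layer)) as f:
        return f.read()
```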
To inspect files, users mount their portion of the pnfs file system on their own computers, and interact with it using the native operating system utilities. For example, users can ls, stat, mv, rm or touch existing "files", but are given errors on attempts to read or write the content of the files. Users can also mkdir and rmdir, and ln files. Hard links should be used to ensure all the metadata information is linked; symbolic links will not give the user what he naively expects.
There are also some special pnfs files which act as normal UNIX files. Administrators can write data to these files and users can read from them. These files are the exception rather than the rule. Enstore plans on using them to distribute service information that everyone who has pnfs mounted can read.
Enstore uses pnfs for three different kinds of access and information:
*** These tests will have to be repeated *** under the final hardware configuration, but there is no indication of any problems.
$ upd list pnfs
DATABASE=/ftp/upsdb
Product=pnfs   Version=v3_1_3a-f4   Flavor=Linux+2   Qualifiers=""   Chain=current

The UPD version can be decoded as follows: "v3_1_3a" is the DESY version of pnfs, and "-f4" signifies the 4th Fermi "release". None of the DESY code is modified; Fermilab only adds its UPS packaging framework and some local installation instructions. All fixes, changes, or updates to pnfs will always come from DESY. DESY has allowed Fermilab full access to the pnfs source code, so we could, in principle, solve problems if DESY were unable to continue their support. It is expected that there will be only a few pnfs servers at Fermilab. To date, pnfs servers have been installed on 3 Linux nodes without any difficulty. Each time, the set of installation instructions has been improved; however, the pnfs server installation is still not completely automatic. The Fermilab installation instructions are distributed along with the UPD product. On the node that is serving pnfs, pnfs takes over the normal function of exporting nfs. Otherwise the machine is general purpose. To be explicit, the only two processes the pnfs server machine cannot run are rpc.mountd and rpc.nfsd; it runs the pnfs versions of these instead. These processes are only concerned with exporting pnfs. For example, rip6 is the current (Jan 99) Enstore pnfs server. Here is its /etc/fstab:
rip6$ cat /etc/fstab
/dev/sda6           /                   ext2  defaults                          1 1
/dev/sda5           swap                swap  defaults                          0 0
/dev/sdc1           /rip6a              ext2  defaults,grpid                    2 1
/dev/fd0            /mnt/floppy         ext2  noauto                            0 0
none                /proc               proc  defaults                          0 0
rip8:/fnal          /fnal               nfs   soft,rsize=8192,wsize=8192        0 0
rip8:/home          /home               nfs   soft,rsize=8192,wsize=8192        0 0
rip8:/usr/local     /usr/local          nfs   soft,rsize=8192,wsize=8192        0 0
localhost:/fs       /pnfs/fs            nfs   noauto,intr,bg,hard,rw,noac       0 0
rip6:/grau-ait      /pnfs/grau/ait      nfs   noauto,user,intr,bg,hard,rw,noac  0 0
rip6:/grau-dlt      /pnfs/grau/dlt      nfs   noauto,user,intr,bg,hard,rw,noac  0 0
rip6:/grau-mammoth  /pnfs/grau/mammoth  nfs   noauto,user,intr,bg,hard,rw,noac  0 0
rip6:/stk-red20     /pnfs/stk/red20     nfs   noauto,user,intr,bg,hard,rw,noac  0 0
rip6:/stk-red50     /pnfs/stk/red50     nfs   noauto,user,intr,bg,hard,rw,noac  0 0
rip6:/rip6disk1     /pnfs/rip6          nfs   noauto,user,intr,bg,hard,rw,noac  0 0
As you can see, rip6 is nfs mounting 3 disks from rip8 and mounting the pnfs disks it is exporting as well as the local disks. There are also numerous Enstore processes running on rip6, for example:
USER     PID %CPU %MEM  SIZE   RSS TTY STAT START TIME COMMAND
bakken  3280  0.0  7.3 20240  9448  ?  S    00:17 0:14 python /home/bakken/enstore/src/configuration_server.py
bakken  3334  0.0  4.9 20112  6356  ?  S    00:17 0:01 python /home/bakken/enstore/src/log_server.py
bakken  3366  0.0  5.7 37980  7352  ?  S    00:17 0:02 python /home/bakken/enstore/src/volume_clerk.py
bakken  3398  0.0  5.3 37760  6832  ?  S    00:17 0:01 python /home/bakken/enstore/src/file_clerk.py
bakken  3433  0.0  5.1 36472  6560  ?  S    00:17 0:00 python /home/bakken/enstore/src/media_changer.py
bakken  3465  0.0  8.8 33456 11300  ?  S    00:17 0:01 python /home/bakken/enstore/src/mover.new.py
bakken  3515  0.0  0.3  1140   492  ?  S    00:18 0:00 db_checkpoint -h
bakken  3520  0.0  0.3  1612   420  ?  S    00:18 0:00 db_deadlock -h
bakken  3523  0.0  5.3 30508  6808  ?  S    00:18 0:01 python /home/bakken/enstore/src/alarm_server.py
bakken  3673  0.0  8.1 36760 10432  ?  S    00:20 0:34 python /home/bakken/enstore/src/inquisitor.py
bakken 12178  0.0  0.8  1552  1048 p2  S    19:38 0:00 /bin/login -h willow fnal.gov -p bakken
The main point, often confused, is that the pnfs server node remains a general purpose and usable machine.
Permission to mount the pnfs namespace is granted using a mechanism similar to the normal Unix nfs export permission scheme. There is a DESY command (pmount) that makes this entire process very simple.
Pnfs can be started automatically on boot-up. This allows other nodes to easily mount the namespaces after reboots.
Finally, it should be noted that a Run II pnfs server will need a SCSI RAID level 5 disk system for its databases. RAID level 5 is needed for redundancy and reliability. This is the system that DESY uses for their pnfs system.
*** Live Backups of database and recovery procedures *** - to be discussed during March trip to DESY. This has not been a priority yet.
rip6:/grau-ait /pnfs/grau/ait nfs user,intr,bg,hard,rw,noac 0 0

The "intr,bg,hard,rw,noac" mount options should not be changed as they are needed for proper operation.
Pnfs filesystems can be mounted in any way that other NFS filesystems can be mounted:
Pnfs supports automounting as one would expect. There is a general problem with automounting that the pnfs mountpoints exacerbate: automounting works fine if the mountpoint is only 1 level deep, but if one tries to mount deeper in a mounted tree, the automounter will not work properly. To circumvent this difficulty, one needs to employ a link gambit, provided by Ramon Pasetes of the OSS Department. The solution uses an intermediate link scheme: the filesystems are mounted at the 1st level, a series of links makes the file system appear to be mounted as deep as it needs to be, and an export map gets the mountpoints to the client machines.
Here is an example of this solution: the current automounting maps that the OSS department is using for the Run II farm nodes, applied to a test node called airedale.
Here is the auto.master entry for pnfs:
/pnfs /etc/auto.pnfs -hard,intr,noac
And here is the auto.pnfs map:
d0sam         pcfarm9:/d0sam
#
enstore       pcfarm9:/enstore
#
grau          airedale:/Pnfs/grau ro      [OSS uses node fnpca]
grau-ait      rip6:/grau-ait
grau-dlt      rip6:/grau-dlt
grau-mammoth  rip6:/grau-mammoth
#
rip6          rip6:/rip6disk1
#
sam           airedale:/Pnfs/sam ro       [OSS uses node fnpca]
sam-ait       samson:/sam-ait
sam-dlt       samson:/sam-dlt
sam-mammoth   samson:/sam-mammoth
sam-red20     samson:/sam-red20
sam-red50     samson:/sam-red50
samson        samson:/samson
#
stk           airedale:/Pnfs/stk ro       [OSS uses node fnpca]
stk-red20     rip6:/stk-red20
stk-red50     rip6:/stk-red50
Here is the /etc/export file: [OSS exports to the required nodes]
/Pnfs/grau  airedale.fnal.gov
/Pnfs/sam   airedale.fnal.gov
/Pnfs/stk   airedale.fnal.gov
And finally, here are the intermediate links:
airedale# ls -alsFgR /Pnfs
/Pnfs:
total 5
   1 drwxrwxr-x   5 root  root  1024 Feb  8 11:05 ./
   1 drwxr-xr-x  30 root  root  1024 Feb  8 10:38 ../
   1 drwxrwxr-x   2 root  root  1024 Feb  8 10:39 grau/
   1 drwxrwxr-x   2 root  root  1024 Feb  8 11:05 sam/
   1 drwxrwxr-x   2 root  root  1024 Feb  8 11:27 stk/

/Pnfs/grau:
total 2
   1 drwxrwxr-x   2 root  root  1024 Feb  8 10:39 ./
   1 drwxrwxr-x   5 root  root  1024 Feb  8 11:05 ../
   0 lrwxrwxrwx   1 root  root    14 Feb  8 10:39 ait -> /pnfs/grau-ait/
   0 lrwxrwxrwx   1 root  root    14 Feb  8 10:39 dlt -> /pnfs/grau-dlt/
   0 lrwxrwxrwx   1 root  root    18 Feb  8 10:39 mammoth -> /pnfs/grau-mammoth/

/Pnfs/sam:
total 2
   1 drwxrwxr-x   2 root  root  1024 Feb  8 11:05 ./
   1 drwxrwxr-x   5 root  root  1024 Feb  8 11:05 ../
   0 lrwxrwxrwx   1 root  root    13 Feb  8 11:04 ait -> /pnfs/sam-ait/
   0 lrwxrwxrwx   1 root  root    13 Feb  8 11:04 dlt -> /pnfs/sam-dlt/
   0 lrwxrwxrwx   1 root  root    17 Feb  8 11:04 mammoth -> /pnfs/sam-mammoth/
   0 lrwxrwxrwx   1 root  root    15 Feb  8 11:05 red20 -> /pnfs/sam-red20/
   0 lrwxrwxrwx   1 root  root    15 Feb  8 11:05 red50 -> /pnfs/sam-red50/

/Pnfs/stk:
total 2
   1 drwxrwxr-x   2 root  root  1024 Feb  8 11:27 ./
   1 drwxrwxr-x   5 root  root  1024 Feb  8 11:05 ../
   0 lrwxrwxrwx   1 root  root    15 Feb  8 11:26 red20 -> /pnfs/stk-red20/
   0 lrwxrwxrwx   1 root  root    15 Feb  8 11:27 red50 -> /pnfs/stk-red50/
Finally, it should be noted that mounting the pnfs namespace does not restrict the node in any other way - it can import and mount any other file systems and run any tasks as it normally would.
Commonly used commands are:
FUNCTION | COMMAND | OUTPUT |
lists the online help | pcmd help | |
lists "important" info about the file *** Needs work to be fully functional *** |
pcmd info file | $ pcmd info M1 bfid="91184924000000L"; volume="flop309"; location_cookie="68608"; size="1252"; file_family="jon4"; filename="/pnfs/enstore/airedale/jon4/M1"; orig_name="/pnfs/enstore/airedale/jon4/M1"; map_file="/pnfs/enstore/volmap/jon4/flop309/000000068608"; pnfsid_file="00020000000000000050AE88"; pnfsid_map="00020000000000000050AEA0" |
lists the tags in the directory | pcmd tags directory | $ pcmd tags . .(tag)(library) = rip6 .(tag)(file_family) = jon-rip6 .(tag)(file_family_width) = 1 .(tag)(file_family_wrapper) = cpio_custom -rw-rw-r-- 1 bakken g023 4 Nov 16 21:24 /pnfs/rip6/.(tag)(library) -rw-rw-r-- 1 bakken g023 8 Nov 16 21:24 /pnfs/rip6/.(tag)(file_family) -rw-rw-r-- 1 bakken g023 1 Nov 16 21:24 /pnfs/rip6/.(tag)(file_family_width) -rw-rw-r-- 1 bakken g023 11 Jan 27 18:45 /pnfs/rip6/.(tag)(file_family_wrapper) |
sets/lists library tag to value (must have correct cwd) | pcmd library [value] | $ pcmd library ait $ pcmd library xxx $ pcmd library xxx |
sets/lists file family tag to value (must have correct cwd) | pcmd file_family [value] | $ pcmd file_family jon-ait-3 $ pcmd file_family xxx $ pcmd file_family xxx |
sets/lists file family width tag to value (must have correct cwd) | pcmd file_family_width [value] | $ pcmd file_family_width 2 $ pcmd file_family_width 10 $ pcmd file_family_width 10 |
sets/lists file family wrapper tag to value (must have correct cwd) | pcmd file_family_wrapper [value] | $ pcmd file_family_wrapper cpio_custom $ pcmd file_family_wrapper cpio_odc $ pcmd file_family_wrapper cpio_odc |
lists all the files on specified tape in volmap *** Needs work to be fully functional *** | pcmd files volmap-tape | |
lists the volmap-tape for the specified volumename *** Needs work to be fully functional *** | pcmd volume volumename | |
lists the bit file id of the file | pcmd bfid file | $ pcmd bfid testfile 91551931700000L |
lists the last parked location of the file *** parked feature is not implemented *** | pcmd parked file | |
lists the debug info about the file transfer | pcmd debug file | |
lists the cross-reference info about the file | pcmd xref file | $ pcmd xref testfile CA2902 (tape label) '0000_000000000_0000132' (positioning info) 104857600 (file size) jon-ait-1 (file family) /pnfs/grau/ait/jon1/100MB.trand (original name) /pnfs/grau/ait/volmap/jon-ait-1/CA2902/... ... 0000_000000000_0000132 (volume map name) 0001000000000000000928D0 (pnfs id of file) 0001000000000000000928E0 (pnfs id of volume map file) |
does an ls on the named layer in the file | pcmd ls file [layer] | $ pcmd ls testfile 3 4 -rw-rw-r-- bakken g023 3692 Jan 5 00:55 ./.(use)(3)(testfile) |
lists the layer of the file (it is easier to use the pcmd bfid|parked|debug|xref commands) | pcmd {cat|more|less} file layer | |
lists the tag in the directory (it is easier to use the pcmd library|file_family|file_family_width commands) | pcmd {tagcat|tagmore|tagless} tag directory | |
lists whether Enstore is still accepting transfers | pcmd enstore_state | $ pcmd enstore_state Enstore enabled |
lists whether the pnfs mount point is up *** Not fully functional yet *** | pcmd pnfs_state mount-point | $ pcmd pnfs_state /pnfs/grau/ait Pnfs up |
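The tag commands above work through pnfs's virtual file names: each directory tag is exposed as the file `.(tag)(name)` inside that directory, as shown in the `pcmd tags` output (e.g. /pnfs/rip6/.(tag)(library)). A sketch of roughly what `pcmd library` does under the hood, assuming a mounted pnfs namespace; the function names are ours:

```python
import os

def tag_path(directory, tag):
    # pnfs exposes each directory tag as the virtual file .(tag)(name),
    # e.g. /pnfs/rip6/.(tag)(library)
    return os.path.join(directory, ".(tag)(%s)" % tag)

def read_tag(directory, tag):
    """List a tag's current value (cf. `pcmd library` with no argument)."""
    with open(tag_path(directory, tag)) as f:
        return f.read().strip()

def write_tag(directory, tag, value):
    """Set a tag (cf. `pcmd library value`)."""
    with open(tag_path(directory, tag), "w") as f:
        f.write(value)
```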
Don't use these unless you know what you are doing:
FUNCTION | COMMAND | OUTPUT |
echos text to named layer of the file | pcmd echo text file layer | |
deletes (clears) named layer of the file | pcmd rm file layer | |
copies Unix file to named layer of file | pcmd cp unixfile file layer | |
sets the size of the file | pcmd size file size | |
echos text to the named tag | pcmd tagecho text tagname | |
removes the tag (tricky, see DESY documents) | pcmd tagrm tag | |
sets io mode (can't clear it easily) | pcmd io file | |
Don't use these unless you can interpret the results:
FUNCTION | COMMAND | OUTPUT |
shows the pnfs id | pcmd id file | |
shows the showid information | pcmd showid id | |
shows the const information | pcmd const file | |
shows the filename | pcmd nameof id | |
shows the complete file path | pcmd path id | |
shows the parent | pcmd parent id | |
shows the counters | pcmd counters file | |
shows the counters for the specified database | pcmd counterN dbnum (must have cwd in pnfs) | |
shows the cursor | pcmd cursor file | |
shows the directory position | pcmd position file | |
shows the database information | pcmd database file | |
shows the database information | pcmd databaseN dbnum (must have cwd in pnfs) | |
% encp [options] src_file dst_file

Currently there is no wild-carding allowed, but this is a straightforward extension to encp.
FUNCTION | SWITCH | DEFAULTS | |
print short help message about using encp | --help | None | |
perform CRC check on the local user machine | --crc | CRC check is only performed on the mover computers | |
set the base priority = value | --pri=value | 1 | |
change the base priority by value after a period specified by the agetime switch | --delpri=value | 0 | |
specify the time period, in minutes, after which the base priority could change | --agetime=value | 0 (no aging of priority) | |
give the library manager a hint that more work is coming for the volume and it should not dismount the volume "too quickly" when this transfer is completed | --delayed_dismount | None (immediate dismount on completion) | |
turn on special status printing requested by D0 | --data_access_layer | D0 printing is off | |
change the amount of information printed about the transfer | --verbose=value | 0 (no printing) | |
list the active and pending transfers for the specified node | --queue nodename | None | |
create a new file family (width exactly 1) and copy files to this file family | --ephemeral | None | |
copy files to the specified file family | --family value | None |
specifies the hostname where the configuration server is running | --config_host=value | environmental variable, ENSTORE_CONFIG_HOST, set by the UPS setup command | |
specifies the port number that the configuration server responds to | --config_port=value | environmental variable, ENSTORE_CONFIG_PORT, set by the UPS setup command |
The data_access_layer option makes encp produce output in a format required by the D0 SAM system. Example output is below:
$ encp --data_access_layer 1GB.trand /pnfs/grau/ait/jon1/
INFILE=/rip8a/enstore/random/1GB.trand
OUTFILE=/pnfs/grau/ait/jon1/1GB.trand
FILESIZE=1073741824
LABEL=CA2901
DRIVE=/dev/rmt/tps2d2n
TRANSFER_TIME=384.365273
SEEK_TIME=0.004476
MOUNT_TIME=24.688773
QWAIT_TIME=1.581097
TIME2NOW=434.628970
STATUS=ok
1 GB copied to CA2901 at user 2.4 MB/S (2.7 MB/S IO rate)

This output is easily parsable and provides information about the input and output files, file size, volume label, drive used to read/write the data, transfer time, file seek time, volume mount time, wait time in the request queue, operation completion status, and the total time from the invocation of encp until the end of the operation.
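A minimal parser for this KEY=VALUE output might look like the following. This is a sketch; the function name is ours, not part of encp, and it assumes only the format shown above.

```python
def parse_data_access_layer(text):
    """Collect the KEY=VALUE lines into a dictionary.

    Lines without an '=' (such as the trailing human-readable
    summary line) are skipped.
    """
    info = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.strip().split("=", 1)
            info[key] = value
    return info
```

A caller could then check `info["STATUS"] == "ok"` and log `info["LABEL"]` and `info["TRANSFER_TIME"]` for bookkeeping.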
On reads from the HSM, encp scans all specified files and groups them according to which volume they are on, orders them according to location on a tape, and then submits all the file requests that are on one specific volume, reads all the files for that volume from the HSM, and then proceeds to the next volume. Encp processes the volumes in any order it chooses.
On writes to the HSM, encp processes each input file sequentially. Since the user must specify a single output directory, all input files must belong to the same file family, and hence could all go to the same tape (if possible). Encp sets a flag, "don't dismount the volume too quickly - there are more files coming for the same family", that the Mover uses to postpone the dismount and thereby avoid the extra time involved in the volume manipulations. Please note that there is no guarantee that all the files will go to one tape (there might not be room) or that they will be grouped together on the tape (there may be other writes to the same family that get intermixed).
Consider the following example (P=/pnfs/enstore/airedale) reading from the HSM:
The following files are on flop301: ran-1, ran-2, ran-3, ran-4
The following files are on flop302: ran-5, ran-6, ran-7, ran-8
The following files are on flop303: ran-9, ran-10, ran-11, ran-12
Encp submits the requests for all files on flop301 and reads back those files and then does the same for flop302 and flop303.
Here is the output of an actual test:
$ encp $P/ran-1 $P/ran-2 $P/ran-3 $P/ran-4 \
    $P/test2/ran-5 $P/test2/ran-6 $P/test2/ran-7 $P/test2/ran-8 \
    $P/test3/ran-9 $P/test3/ran-10 $P/test3/ran-11 $P/test3/ran-12 .
$P/test2/ran-5  -> ./ran-5  : 102400 bytes copied from flop302 at 0.19 MB/S requestor:bakken cum= 3.5
$P/test2/ran-6  -> ./ran-6  : 102400 bytes copied from flop302 at 0.42 MB/S requestor:bakken cum= 3.7
$P/test2/ran-7  -> ./ran-7  : 102400 bytes copied from flop302 at 0.46 MB/S requestor:bakken cum= 3.9
$P/test2/ran-8  -> ./ran-8  : 102400 bytes copied from flop302 at 0.46 MB/S requestor:bakken cum= 4.2
$P/test3/ran-9  -> ./ran-9  : 102400 bytes copied from flop303 at 0.12 MB/S requestor:bakken cum= 5.4
$P/test3/ran-10 -> ./ran-10 : 102400 bytes copied from flop303 at 0.49 MB/S requestor:bakken cum= 5.6
$P/test3/ran-11 -> ./ran-11 : 102400 bytes copied from flop303 at 0.45 MB/S requestor:bakken cum= 5.8
$P/test3/ran-12 -> ./ran-12 : 102400 bytes copied from flop303 at 0.45 MB/S requestor:bakken cum= 6.1
$P/ran-1        -> ./ran-1  : 102400 bytes copied from flop301 at 0.12 MB/S requestor:bakken cum= 7.4
$P/ran-2        -> ./ran-2  : 102400 bytes copied from flop301 at 0.45 MB/S requestor:bakken cum= 7.6
$P/ran-3        -> ./ran-3  : 102400 bytes copied from flop301 at 0.42 MB/S requestor:bakken cum= 7.8
$P/ran-4        -> ./ran-4  : 102400 bytes copied from flop301 at 0.45 MB/S requestor:bakken cum= 8.0
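The read-side grouping described above can be sketched as follows. This is an illustration, not Enstore's actual code; the `volume` and `location_cookie` fields mirror the file metadata shown in the `pcmd info` example earlier.

```python
def group_reads_by_volume(requests):
    """Group read requests by the volume holding each file, then order
    each group by position on the tape, so every volume is mounted once
    and its files are read sequentially."""
    by_volume = {}
    for req in requests:
        by_volume.setdefault(req["volume"], []).append(req)
    for reqs in by_volume.values():
        reqs.sort(key=lambda r: r["location_cookie"])
    return by_volume
```

The volumes themselves can then be processed in any order, exactly as encp chooses in the transcript above.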
This is a straightforward and easy command, except for 2 complications:
Here's an example of how it is used:
$ encp --queue rip8.fnal.gov
rip8.fnal.gov bakken /raid/1MB.trand /pnfs/grau/ait/jon1/1MB.trand P
rip8.fnal.gov bakken /raid/1GB.trand /pnfs/grau/ait/jon2/1GB.trand M
rip8.fnal.gov bakken /raid/1GB.trand /pnfs/grau/ait/jon1/1GB.trand M
$ encp --queue rip4.fnal.gov
rip4.fnal.gov bakken /raid/1MB.trand /pnfs/grau/ait/jon2/1MB.trand M

The 1st field in the output is the node name, the 2nd is the requester, the 3rd is the input filename, and the 4th is the output filename. The 5th and last field can have 2 values: "P" denotes a pending transfer still in the Library Manager queues and "M" signifies an active transfer at a Mover.
STATUS | DESCRIPTION |
OK | Operation Completed Successfully |
KEYERROR | Not an existing reference key |
DOESNOTEXIST | Object (file name, etc.) does not exist |
NOMOVERS | No Movers to process request |
MOUNTFAILED | Mount of required volume failed |
DISMOUNTFAILED | Dismount of required volume failed |
MEDIA_IN_ANOTHER_DEVICE | Requested Media is in another Device |
MEDIAERROR | Bad Media |
USERERROR | User Error |
DRIVEERROR | Drive Error |
UNKNOWNMEDIATYPE | Unknown media type |
NOVOLUME | Volume does not exist |
NOACCESS | Volume marked as no access |
CONFLICT | Configuration conflict detected |
WRITE_NOTAPE | Requested volume was not found in the library |
WRITE_TAPEBUSY | Requested volume is in another drive |
WRITE_DRIVEBUSY | A volume is already in the drive |
WRITE_BADMOUNT | Mount failure or load operation failed |
WRITE_BADSPACE | EOD cookie does not produce EOD. |
WRITE_ERROR | Error writing data block or file mark |
WRITE_EOT | Hit EOT while writing data block or file mark |
WRITE_NOBLANKS | No more blank volumes |
WRITE_MOVER_CRASH | Mover crash during write operation |
READ_NOTAPE | Requested volume was not found in the library |
READ_TAPEBUSY | Requested volume is in another drive |
READ_DRIVEBUSY | A volume is already in the drive |
READ_BADMOUNT | Mount failure or load operation failed |
READ_BADLOCATE | Failed space or initial CRC's don't match |
READ_ERROR | Error reading data block |
READ_COMP_CRC | CRC mismatch |
READ_EOT | Hit EOT when reading |
READ_EOD | Hit EOD when reading |
READ_UNLOAD | Error unloading volume |
READ_UNMOUNT | Error when unmounting volume |
READ_MOVER_CRASH | Mover crash during read operation |
Typically two encp products are available in UPD: one for general use and one explicitly tailored for D0/SAM. This tailoring is simply for SAM's convenience. There will be only one version in the future.
$ upd list -a encp
DATABASE=/ftp/upsdb
Product=encp   Version=v0_11-sam   Flavor=Linux+2   Qualifiers=""   Chain=current
Product=encp   Version=v0_11       Flavor=Linux+2   Qualifiers=""   Chain=""

The installation procedure is straightforward:
$ upd install -G"-c" encp
informational: beginning install of encp.
informational: transferred /ftp/products/encp/v0_11/Linux+2/encp_v0_11_Linux+2 from fnkits.fnal.gov to /home/products/encp/v0_11
informational: transferred /ftp/products/encp/v0_11/Linux+2/encp_v0_11_Linux+2.table from fnkits.fnal.gov:/ to /home/products/upsdb/encp/v0_11.table.new
informational: ups declare succeeded
informational: ups declare succeeded

The entire product consists of the encp binary, pcmd (a pnfs script described in the pnfs section), and some UPS tables. The encp binary is large since it is statically linked.
$ ls
    179 Nov 24 09:49 .manifest.encp
2976514 Nov 24 09:24 encp*
   1690 Nov 24 09:24 encp.table
      9 Dec  2 15:20 enstore_variables.table -> rip.table
  11313 Nov 24 09:24 pcmd*
    398 Nov 24 09:24 rip.table
    399 Nov 24 09:24 sam.table

Two environmental variables, ENSTORE_CONFIG_HOST and ENSTORE_CONFIG_PORT, control which Enstore system the encp requests go to. In order to allow a user to override the default control environmental variables distributed with the product, the encp product uses the UPS concept of "virtual" products. The basic idea is that everything in the encp table file is general, and everything in the virtual product enstore_variables.table file is user/installation specific.
Finally, when encp is setup, it creates a directory in the /tmp area where it stores debugging information and other non-user information. The user can ignore the files in the /tmp area.
As an aside, it should be noted that since Enstore is still in development, no versions are cut. The complete UPS product structure is finished. For new installations, we typically check the code out of CVS, issue one make command, and the product is ready to be used. We expect to cut versions of Enstore when it is appropriate. These versions will not be frozen.
Virtual Library -- A virtual library contains one and only one kind of media. For example, Enstore divides an STK Powderhorn library holding 50, 20 and 10 GB redwood media into at least three virtual libraries. In common usage, the term "library" in Enstore refers to a virtual library. Writes are directed to a specific (virtual) library, thus selecting the media.
Drives -- Drives are bound to special processes called Mover clients. The drives can be dynamically assigned allowing the number of drives to be less than the number of virtual libraries.
Volumes -- Volumes are uniquely identified by an external label, which is known to the Media Changer.
Each of the servers listed below is discussed in further detail in its own section. Please refer to these sections for information on detailed functionality and specific command line interfaces.
When enstore is used to start/stop servers, the server must be specified using the full ASCII name given in the configuration file. For example:
General Server Options | ||
---|---|---|
FUNCTION | SWITCH | DEFAULTS |
check if the server process exists | --alive | None |
turn on more alarms | --do-alarm levels | None |
turn on more verbose logging (DEBUGLOG) | --do-log levels | None |
turn on more verbose output (stdout) | --do-print levels | None |
turn off more alarms | --dont-alarm levels | None |
turn off more verbose logging (DEBUGLOG file) | --dont-log levels | None |
turn off more verbose output (stdout) | --dont-print levels | None |
print a short help message about using the server | --help | None |
Enstore System Command Line Control | |
---|---|
OPTION | COMMAND |
start the Enstore system on the current node | enstore start |
start only the file_clerk on the current node | enstore start --just file_clerk |
start the Enstore system on the whole Enstore cluster | enstore Estart |
start only the file_clerk on stkensrv0 | enstore Estart stkensrv0 "--just file_clerk" |
stop the Enstore system on the current node | enstore stop |
stop only the log_server on the current node | enstore stop --just log_server |
stop the Enstore system on the whole Enstore cluster | enstore Estop |
stop only the file_clerk on stkensrv0 | enstore Estop stkensrv0 "--just file_clerk" |
stop and then start the Enstore system | enstore restart |
stop and then restart only the Inquisitor on the current node | enstore restart --just inq |
stop and then start the Enstore system on the whole Enstore cluster | enstore Erestart |
stop and then start only the file_clerk on stkensrv0 | enstore Erestart stkensrv0 "--just file_clerk" |
Display enstore related processes on the local host | EPS |
Display enstore related processes for the whole Enstore system | enstore EPS |
Column Name | Type | Comments |
external_label | string [primary_key] | Volume name specified by user on volume creation; is used to display volume metadata. |
file_family | string ("none") | File family name, specified by user on volume creation; only files that belong to this family will be stored on this volume. |
media_type | string | Specified at volume creation; implies the block-size; used for writing. |
library | string | Specified by user on volume declaration; defines which (virtual) library currently holds the volume |
first_access | int (-1) | Unix time when user issues the first write command to copy data to the volume. Set by the Volume Clerk. |
last_access | int (-1) | Unix time when user last accessed the volume. Set by the Volume Clerk. |
declared | int | Unix time when the volume is declared available to the system. Set by the Volume Clerk. |
capacity_bytes | 64-bit int | Specified by user on volume creation; estimate of the number of bytes that would fit on the volume. |
blocksize | int | Set by the Volume Clerk; derived from the media type. |
remaining_bytes | 64-bit int | Specified by the user on volume creation; estimate of the number of bytes that would fit on the volume; updated by the Volume Clerk every time data are written to the media. |
eod_cookie | string ("none") | Tells the driver how to space to the end of the volume; it is driver specific; updated by the Volume Clerk when data are written on the media. |
wrapper | string ("cpio") | Wrapper method; currently specifies the format of the files on the volume. |
sum_rd_err | int (0) | Read error count; Volume Clerk increments this field when the Mover receives an error while reading from the volume. |
sum_rd_access | int (0) | Read access count; Volume Clerk increments this field every time a file is read. |
sum_wr_err | int (0) | Write error count; Volume Clerk increments this field when the Mover receives an error while writing to the volume. |
sum_wr_access | int (0) | Write access count; Volume Clerk increments this field every time a file is written. |
user_inhibit | string (d:"none" or "readonly", "noaccess") | Specified by user at volume creation; access level for this volume, updated by Volume Clerk. |
system_inhibit | string (d:"none" or "writing", "readonly", "full", "noaccess") | Administrator generated limitation on the kind of access permitted to this volume; updated by Volume Clerk when data are written on the volume, an error occurred while data were being written or the file size exceeded the remaining number of bytes on the volume. |
at_mover | tuple. First element is a state string ("unmounted", "mounting", "mounted", "unmounting"). Second is a mover name | Reflects the state of the volume. Used to keep track of volume mount state to avoid illegitimate mount requests. Transitions are as follows: "unmounted"->"mounting"->"mounted"->"unmounting"->"unmounted". All other transitions and their associated requests will be rejected. |
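The at_mover transition rule above can be sketched as a small validator; this is a hypothetical helper for illustration, not actual Volume Clerk code:

```python
# Legal at_mover state transitions; any other transition is rejected.
LEGAL_TRANSITIONS = {
    "unmounted": "mounting",
    "mounting": "mounted",
    "mounted": "unmounting",
    "unmounting": "unmounted",
}

def transition_allowed(current_state, new_state):
    """True only for the cycle unmounted -> mounting -> mounted -> unmounting -> unmounted."""
    return LEGAL_TRANSITIONS.get(current_state) == new_state
```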
The Volume Clerk supports the following operations:
Function | Command |
---|---|
show the name of all the volumes | enstore vcc --vols |
show volume information | enstore vcc --vol volume_name |
add a volume | enstore vcc --addvol library file_family media_type volume_name capacity remaining_capacity |
delete a volume | enstore vcc --delvol volume_name |
restore a volume (do not restore files) | enstore vcc --restorevol volume_name |
restore a volume (restore files) | enstore vcc --all --restorevol volume_name |
find an appropriate volume on which to write the file | enstore vcc --nextvol library_name minimal_remaining_bytes file_family |
put volume into a new library | enstore vcc --newlib volume_name library_name |
clear system inhibits on the volume | enstore vcc --clrvol volume_name |
mark no access to this volume | enstore vcc --noavol volume_name |
set the volume as read only | enstore vcc --rdovol volume_name |
start/stop backup of volume journals | enstore vcc --backup |
Column Name | Type | Comments |
bfid | string [primary_key] | bit file ID; uniquely identifies every file in the system. |
external_label | string | Volume name on which the file has been written; same as the external_label in the volume table. |
bof_space_cookie | string | Driver specific string telling how to space to the file on the media. A lexical sort of all bof_space_cookies for a given volume will yield an optimized traversal of the volume. |
complete_crc | int | crc of all the bits sent by the user. |
sanity_cookie | string ("(0,0)") | Number of bytes used for a sanity crc and the sanity crc itself. The sanity crc is just the normal crc, but only for the first N bytes of the file. This allows the Mover to check early in the transfer process that it probably has the right user file selected; it at least will know if it has the wrong file. |
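The sanity_cookie idea can be illustrated as follows; `zlib.crc32` stands in for Enstore's actual CRC algorithm, which is an assumption here, and 65535 is the sanity size seen in the examples later in this document:

```python
import zlib

def sanity_cookie(data, sanity_size=65535):
    """Compute (N, crc of the first N bytes): a cheap early check on file identity."""
    n = min(len(data), sanity_size)
    return (n, zlib.crc32(data[:n]) & 0xFFFFFFFF)

def quick_check(data, cookie):
    """Mover-style early verification against a stored sanity cookie."""
    n, crc = cookie
    return zlib.crc32(data[:n]) & 0xFFFFFFFF == crc
```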
The File Clerk supports the following requests:
Function | Command | |
---|---|---|
show bfid of all the files | enstore fcc --bfids | |
show file information | enstore fcc --bfid=bit-file-ID | |
start/stop backup of volume journals | enstore fcc --backup | |
declare file deleted/undeleted | enstore fcc --bfid=BFID --deleted={yes/no} | |
restore file by name | enstore fcc --restore="file_name" | |
restore file by name and restore a path | enstore fcc --r --restore="file_name" | |
It can also be accessed from a command-line interface.
Enstore does not limit the number of Library Managers, and the relation between Library Managers and Movers is many to many. That is, one Library Manager may have many Movers associated with it, and one Mover may have many Library Managers associated with it. Information about Library Managers is contained in the Enstore configuration dictionary and is available to clients via the Configuration Server. Each Library Manager is specified in the configuration dictionary as follows:
```python
configdict['cdf.library_manager'] = {
    'host': 'cdfensrv4', 'port': 7515, 'logname': 'CDFLM',
    'norestart': 'INQ',
    'max_encp_retries': 3,
    'max_suspect_movers': 3,
    'max_file_size': (60L*GB) - 1,
    'min_file_size': 300*MB,
    'suspect_volume_expiration_time': 3600*24,
    'legal_encp_version': legal_encp_version,
    'CleanTapeVolumeFamily': '9940ACLN.CleanTapeFileFamily.noWrapper',
}
configdict['CDF-9940B.library_manager'] = {
    'host': 'cdfensrv4', 'port': 7522, 'logname': '9940BLM',
    'norestart': 'INQ',
    'max_encp_retries': 3,
    'max_file_size': (200L*GB) - 1,
    'min_file_size': 300*MB,
    'suspect_volume_expiration_time': 3600*24,
    'legal_encp_version': legal_encp_version,
    'CleanTapeVolumeFamily': '9940BCLN.CleanTapeFileFamily.noWrapper',
}
```

*.library_manager is the name of the Library Manager. Below is a description of all Library Manager (LM) keys:
KEY | DESCRIPTION | DEFAULT |
---|---|---|
host | host name where the server runs | None |
port | command communication port | None |
logname | name identifying the server in the log file | None |
lock | if specified, the LM will start in this state; allowed values: locked, unlocked, ignore, pause, nowrite, noread | unlocked |
max_suspect_movers | if the number of suspect movers on which a given volume failed is >= max_suspect_movers, the volume is set to the NOACCESS state | 3 |
suspect_volume_expiration_time | remove an entry from the suspect volume list after this period of time | None |
min_file_size | minimal file size | 0 |
max_file_size | maximal file size allowed by this library | 2GB-2kB |
blank_error_increment | do not set a volume to NOACCESS on an FTT_EBLANK error until the number of errors exceeds max_suspect_movers + blank_error_increment | 5 |
legal_encp_version | minimal encp version number allowed to access Enstore | None |
CleanTapeVolumeFamily | volume family for cleaning tapes | None |
storage_group_limits | minimal number of drives that a given storage group (fair share) can use when different storage groups compete for tape drives | None |
Each Mover has an entry in the configuration dictionary describing the Library Manager(s) associated with it. This entry can be a single name or a list of names:
```python
# for a single LM
configdict['9940B15.mover'] = {
    'host': 'stkenmvr15a', 'data_ip': 'stkenmvr15a', 'port': 7577,
    'logname': 'DBT15MV',
    'statistics_path': '/tmp/enstore/enstore/DBT15MV.stat',
    'norestart': 'INQ',
    'max_consecutive_failures': mvr_max_consecutive_failures,
    'max_failures': mvr_max_failures,
    'compression': 0,
    'check_written_file': b_mvr_check_f,
    'check_first_written_file': b_mvr_check_1st,
    'max_buffer': 1000*MB, 'max_rate': s9940b_rate,
    'mount_delay': 15, 'update_interval': 5,
    'library': 'CD-9940B.library_manager',
    'device': '/dev/rmt/tps0d0n', 'driver': 'FTTDriver',
    'mc_device': '0,0,10,17', 'media_changer': 'stk.media_changer',
    'do_cleaning': 'No', 'syslog_entry': low_level_diag_pattern,
    'max_time_in_state': 1200, 'send_stats': 1,
}
# for multiple LMs
configdict['9940B16.mover'] = {
    'host': 'stkenmvr16a', 'data_ip': 'stkenmvr16a', 'port': 7578,
    'logname': 'DBT16MV',
    'statistics_path': '/tmp/enstore/enstore/DBT16MV.stat',
    'norestart': 'INQ',
    'max_consecutive_failures': mvr_max_consecutive_failures,
    'max_failures': mvr_max_failures,
    'compression': 0,
    'check_written_file': b_mvr_check_f,
    'check_first_written_file': b_mvr_check_1st,
    'max_buffer': 1000*MB, 'max_rate': s9940b_rate,
    'mount_delay': 15, 'update_interval': 5,
    'library': ['CD-9940B.library_manager', 'test.library_manager'],
    'device': '/dev/rmt/tps0d0n', 'driver': 'FTTDriver',
    'mc_device': '0,0,10,18', 'media_changer': 'stk.media_changer',
    'do_cleaning': 'No', 'syslog_entry': low_level_diag_pattern,
    'max_time_in_state': 1200, 'send_stats': 1,
}
```
All Movers periodically send messages to their Library Managers, notifying them of the Mover's state. If a Mover is in the IDLE or HAVE_BOUND state, the Library Manager can send work to this Mover from its list of pending requests. Work dispatching will be discussed later.
Work can be prioritized; a bigger priority number means higher priority. Currently, write and read are both priority 1 for our test purposes. Any priority mechanism could be developed to replace the existing one. However, once a volume has been mounted, the system will exhaust all work for that volume regardless of priority.
The Library Manager tries to sort read requests according to file location on the tape. If a read request has been already sent to the Mover the next request to this Mover for the same tape will be for the file whose location number is higher than the current one. If the location number is less than the current one, it will be placed at the end of the request list.
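The ordering rule can be sketched as follows; this is a hypothetical helper for illustration, with integer locations standing in for the location cookies, not the actual Library Manager code:

```python
def next_read_request(pending, current_location):
    """Pick the pending read whose tape location is the smallest one ahead of
    current_location; requests behind the head are deferred (placed at the end)."""
    ahead = [r for r in pending if r["location"] > current_location]
    if ahead:
        return min(ahead, key=lambda r: r["location"])
    # nothing ahead of the head: fall back to the earliest location on the tape
    return min(pending, key=lambda r: r["location"]) if pending else None
```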
When a User request arrives, the Library Manager tries to pick the next available (marked as "idle") Mover and send a "summon" message to it. The purpose of this message is to cause a Mover to send a Mover Request to the Library Manager. Mover Requests are described in the next section. The mechanism for selecting a particular Mover allows control of some error conditions and implementation of retry logic. For this purpose there is a dynamic list of volumes on which write or read requests have failed - the Suspected Volumes List. It is keyed by the volume external label and contains sublists of Movers on which requests for that volume failed. This tells the Library Manager not to use the same Mover when the User retries its request. When the Library Manager "summons" a Mover, it changes the Mover's state in the Mover List to "summoned" and puts it into the Summoned Movers List. Every time the Library Manager sends a message, a time out handler is invoked if a response does not arrive before the time out expires. The time out handler retries the "summon" of the Mover whose time out has expired, and eventually removes from the Mover List any Mover whose "summon" retries are exhausted.
The Enstore system keeps unassigned read and write requests in a queue of unallocated (pending) work in the Library Manager. Once a request for the next work comes from a Mover ("idle" or "have bound volume: idle"), the Library Manager tries to change the "at_mover" volume state to "mounting" and, if it succeeds, puts the request in a "work at mover" queue and responds to the Mover with the appropriate ticket. The reason for this is to track the volumes for scheduling: the Library Manager must not submit to a Mover a request for a volume which is already in use by another Mover. It is the Mover, and not the Library Manager, which completes the requests. The two Library Manager request queues are:
It is important to keep these queues consistent. Volume and reading errors are handled in the Mover and partially in the Library Manager.
Movers transport data between media and users over a TCP socket. When "summoned", or having completed work, Movers contact their Library Managers seeking work. If the Library Manager has work, it sends a corresponding ticket to the Mover, which in turn mounts the volume if necessary and transfers the data between user and media. When the Mover completes some work it sends the Library Manager a request for more work; if it gets a reply that there is no more work for it, it dismounts the volume. A Mover may also decide to dismount a volume unilaterally because it ran into trouble, but it actually does so only after receiving a no_work reply from the Library Manager. Library Manager - Mover communications are shown in the tables below.
Library Manager sends | Mover sends |
---|---|
summon | idle - ready to do work; have bound volume:busy - doing work; or have bound volume:idle - volume is mounted but no work |
Mover sends | Library Manager may respond |
idle_mover | if work needs to be done - read/write; or no_work |
have_bound_volume | if reads/writes pending for the volume - read/write; or if no work - unbind_volume |
unilateral_unbind | no work |
Library Manager has just responded | Mover sends | Library Manager presumes |
---|---|---|
read or write | idle_mover | Mover crashed and was re-started |
| have_bound_volume | look for work on that volume; if work, give it; if none, send unbind_volume |
| unilateral_unbind | update Suspected Volumes List and respond with no_work |
acknowledged a unilateral unbind or an idle Mover's no_work | idle_mover | Mover is available for work; if more work is available, bind a volume |
| have_bound_volume | the Mover has restarted and holds a volume from a previous instance; tell it to unbind |
| unilateral_unbind | no work |
$ enstore lmc --getwork sphinxdisk.library_manager [{'callback_addr': ('131.225.81.23', 7600), 'encp': {'adminpri': -1, 'agetime': 0, 'basepri': 1, 'curpri': 1, 'delayed_dismount': 0, 'delpri': 0}, 'fc': {'bfid': '91548494800000L', 'complete_crc': 2048910256, 'external_label': 'flop1', 'location_cookie': '000000063488', 'pnfsid': '000200000000000000514A98', 'sanity_cookie': (9045, 2048910256), 'size': 9045}, 'lm': {'address': ('131.225.81.23', 7503)}, 'retry_cnt': 0, 'status': ('ok', None), 'times': {'t0': 915731751.891, 'job_queued': 915731758.979}, 'unique_id': 'sphinx.fnal.gov-915731758.238293-3793', 'vc': {'blocksize': 512, 'capacity_bytes': 1400000L, 'declared': 915469931.313, 'eod_cookie': '000000108032', 'external_label': 'flop1', 'file_family': 'sphinx', 'first_access': 915469958.425, 'last_access': 915728933.0, 'library': 'sphinxdisk', 'media_type': 'diskfile', 'remaining_bytes': 1291968L, 'status': ('ok', None), 'sum_rd_access': 0, 'sum_rd_err': 0, 'sum_wr_access': 0, 'sum_wr_err': 0, 'system_inhibit': 'none', 'user_inhibit': 'none', 'wrapper': 'cpio'}, 'work': 'read_from_hsm', 'wrapper': {'fullname':'/usr/hppc_home/moibenko/enstore_test/enstore/src/tst/ admin_clerk_client.pyc', 'gid': 5440, 'gname': 'hppc', 'inode': 0, 'machine': ('Linux', 'sphinx.fnal.gov', '2.0.35', '#1 Thu Jul 23 14:01:04 EDT 1998', 'i686'), 'major': 0, 'minor': 5, 'mode': 33268, 'pnfsFilename': '/pnfs/enstore/sphinx/t1/admin_clerk_client.pyc', 'pstat': (33204, 38881944, 5, 1, 6849, 5440, 9045, 915484948, 915484948, 915485267), 'rmajor': 0, 'rminor': 0, 'sanity_size': 65535, 'size_bytes': 9045, 'uid': 6849, 'uname': 'moibenko'}}, {'callback_addr': ('131.225.81.23', 7600), 'encp': {'adminpri': -1, 'agetime': 0, 'basepri': 1, 'curpri': 1, 'delayed_dismount': 0, 'delpri': 0}, 'fc': {'bfid': '91548792000000L', 'complete_crc': -1493930591, 'external_label': 'flop1', 'location_cookie': '000000073216', 'pnfsid': '000200000000000000514B40', 'sanity_cookie': (4538, -1493930591), 'size': 4538}, 
'lm': {'address': ('131.225.81.23', 7503)}, 'retry_cnt': 0, 'status': ('ok', None), 'times': {'t0': 915731751.891, 'job_queued': 915731759.085}, 'unique_id': 'sphinx.fnal.gov-915731758.243841-3793', 'vc': {'blocksize': 512, 'capacity_bytes': 1400000L, 'declared': 915469931.313, 'eod_cookie': '000000108032', 'external_label': 'flop1', 'file_family': 'sphinx', 'first_access': 915469958.425, 'last_access': 915728933.0, 'library': 'sphinxdisk', 'media_type': 'diskfile', 'remaining_bytes': 1291968L, 'status': ('ok', None), 'sum_rd_access': 0, 'sum_rd_err': 0, 'sum_wr_access': 0, 'sum_wr_err': 0, 'system_inhibit': 'none', 'user_inhibit': 'none', 'wrapper': 'cpio'}, 'work': 'read_from_hsm', 'wrapper': {'fullname':'/usr/hppc_home/moibenko/enstore_test/enstore/src/tst/ backup.py', 'gid': 5440, 'gname': 'hppc', 'inode': 0, 'machine': ('Linux', 'sphinx.fnal.gov', '2.0.35', '#1 Thu Jul 23 14:01:04 EDT 1998', 'i686'), 'major': 0, 'minor': 5, 'mode': 33268, 'pnfsFilename': '/pnfs/enstore/sphinx/t1/backup.py', 'pstat': (33204, 38882112, 5, 1, 6849, 5440, 4538, 915487920, 915487920, 915488239), 'rmajor': 0, 'rminor': 0, 'sanity_size': 65535, 'size_bytes': 4538, 'uid': 6849, 'uname': 'moibenko'}},] [{'callback_addr': ('131.225.81.23', 7600), 'encp': {'adminpri': -1, 'agetime': 0, 'basepri': 1, 'curpri': 1, 'delayed_dismount': 0, 'delpri': 0}, 'fc': {'bfid': '91548494100000L', 'complete_crc': 1614017314, 'external_label': 'flop1', 'location_cookie': '000000055296', 'pnfsid': '0002000000000000005149F8', 'sanity_cookie': (7581, 1614017314), 'size': 7581}, 'lm': {'address': ('131.225.81.23', 7503)}, 'mover': 'sphinxdisk.mover', 'retry_cnt': 0, 'status': ('ok', None), 'times': {'in_queue': 1.77947795391, 'lm_dequeued': 915731760.655, 't0': 915731751.891}, 'unique_id': 'sphinx.fnal.gov-915731758.236400-3793', 'vc': {'blocksize': 512, 'capacity_bytes': 1400000L, 'declared': 915469931.313, 'eod_cookie': '000000108032', 'external_label': 'flop1', 'file_family': 'sphinx', 'first_access': 
915469958.425, 'last_access': 915728933.0, 'library': 'sphinxdisk', 'media_type': 'diskfile', 'remaining_bytes': 1291968L, 'status': ('ok', None), 'sum_rd_access': 0, 'sum_rd_err': 0, 'sum_wr_access': 0, 'sum_wr_err': 0, 'system_inhibit': 'none', 'user_inhibit': 'none', 'wrapper': 'cpio'}, 'work': 'read_from_hsm', 'wrapper': {'fullname':'/usr/hppc_home/moibenko/enstore_test/enstore/src/tst/admin_clerk_client.py', 'gid': 5440, 'gname': 'hppc', 'inode': 0, 'machine': ('Linux', 'sphinx.fnal.gov', '2.0.35', '#1 Thu Jul 23 14:01:04 EDT 1998', 'i686'), 'major': 0, 'minor': 5, 'mode': 33268, 'pnfsFilename': '/pnfs/enstore/sphinx/t1/admin_clerk_client.py', 'pstat': (33204, 38881784, 5, 1, 6849, 5440, 7581, 915484942, 915484942, 915485261), 'rmajor': 0, 'rminor': 0, 'sanity_size': 65535, 'size_bytes': 7581, 'uid': 6849, 'uname': 'moibenko'}}]
getmoverlist
$ enstore lmc --getmoverlist sphinxdisk.library_manager [{'address': ('131.225.81.23', 7508), 'last_checked': 915731762.917, 'mover': 'sphinxdisk.mover', 'state': 'idle_mover', 'summon_try_cnt': 0, 'tr_error': 'ok'}]
get_suspect_vols
$ enstore lmc --get_suspect_vols sphinxdisk.library_manager [{'external_label': 'flop1', 'movers': ['sphinxdisk.mover']}]
loadmovers
$ enstore lmc --loadmovers happydisk.library_manager {'movers': [{'address': ('131.225.84.122', 7509), 'external_label': 'flop1', 'file_family': 'happy.cpio_custom', 'last_checked': 922226963.097, 'mover': 'happydisk1.mover', 'state': 'idle_mover', 'summon_try_cnt': 0, 'tr_error': 'ok'}, {'address': ('131.225.84.122', 7511), 'external_label': 'flop1', 'file_family': 'happy.cpio_custom', 'last_checked': 922226968.162, 'mover': 'happydisk3.mover', 'state': 'idle_mover', 'summon_try_cnt': 0, 'tr_error': 'ok'}, {'address': ('131.225.84.122', 7513), 'external_label': 'flop1', 'file_family': 'happy.cpio_custom', 'last_checked': 922226922.662, 'mover': 'happydisk5.mover', 'state': 'idle_mover', 'summon_try_cnt': 0, 'tr_error': 'ok'}, {'address': ('131.225.84.122', 7510), 'external_label': 'flop1', 'file_family': 'happy.cpio_custom', 'last_checked': 922226929.446, 'mover': 'happydisk2.mover', 'state': 'idle_mover', 'summon_try_cnt': 0, 'tr_error': 'ok'}, {'address': ('131.225.84.122', 7508), 'external_label': 'flop1', 'file_family': 'happy.cpio_custom', 'last_checked': 922226947.063, 'mover': 'happydisk.mover', 'state': 'idle_mover', 'summon_try_cnt': 0, 'tr_error': 'ok'}, {'address': ('131.225.84.122', 7512), 'external_label': 'flop1', 'file_family': 'happy.cpio_custom', 'last_checked': 922226954.994, 'mover': 'happydisk4.mover', 'state': 'idle_mover', 'summon_try_cnt': 0, 'tr_error': 'ok'}], 'status': ('ok', None)}
del_work
$ enstore lmc --del_work rip6.library_manager rip8.fnal.gov-922223453.580484-30688 ID rip8.fnal.gov-922223453.580484-30688 {'status': ('ok', 'Work deleted')}
change_priority
no example
get_del_dismount
no example
The Mover is responsible for efficient data movement and as such is an integral part of the system. The architecture allows performance-critical code to be written in C, giving efficient access to fundamental OS features such as forking with minimal to no language overhead.
Although a Mover is bound to a drive, a drive may serve more than one virtual library, i.e., the Mover has a dynamic list of Library Managers that it is supposed to service. This has two benefits. First, since a Library Manager handles only one type of media, a drive which handles multiple types of media (i.e. different capacity media) can be shared without a static partitioning of the system. Second, if we are partitioning resources in a library, we can assign a Library Manager to each type of use. For example, suppose Group A and Group B want to share the capacity of a library, and half the tapes belong to Group A and the other half to Group B. We want to guarantee that Group A has one third of the tape drives, Group B has one third, and the last third is shared. The Movers can be configured to do this easily. And with some slight changes, this is how we can guarantee resources to data acquisition.
There has been a request to duplicate (write to two tapes) critical data. This feature has been discussed but not implemented, as a specific implementation method has not been decided upon. The following are among the possible implementations:
When the Mover starts up, the 'idle_mover' request/command is sent to each Library Manager configured and the responses from the Library Managers are acted upon. After the startup, the Mover waits until it is 'summoned' by a Library Manager.
When a Mover is summoned by a Library Manager, it will send one of three request/commands to the Library Manager that summoned the Mover:
Reads -- Once a volume is bound the Mover may read a volume and send data to a waiting encp program. The steps are:
Writes -- Once a volume is bound the Mover may receive data and write it to the volume. The steps are:
$ enstore mvc --status fndaprdisk.mover
** The mover process must have read and write access to the tape pseudo devices.
Linux tape devices are called /dev/nstX by default, where X is the Xth
serial device found on the system. X can change if devices are added or removed
on the bus. A script in the FTT product etc/mkscsidev.Linux creates
the files /dev/rmt/tpsNdMn where N is the bus number and M is the scsi
id of the device. N and M do not change if the bus changes (unless
scsi ids or controllers are changed) and so the enstore
configuration files do not need to change. $FTT_DIR/etc/mkscsidev.Linux
should be run at boot time; normally via /etc/rc.d/rc.local.
A sample rc.local:
The keys/values used in the above example are typical of a running system.
The blocksizes dictionary element specifies the size of a block on the
different devices known to the system. The database dictionary element
specifies where the Enstore database files are located. The backup
dictionary element specifies the node and directory of where the database backups
will go.
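A minimal sketch of these three dictionary elements is shown below; the key names inside each element and all values are illustrative assumptions, not a verbatim production configuration:

```python
configdict = {}

# Block size, in bytes, for each media type known to the system.
configdict['blocksizes'] = {'diskfile': 512, '9940B': 131072}

# Where the Enstore database files are located.
configdict['database'] = {'db_dir': '/diska/enstore-db'}

# Node and directory that receive the database backups.
configdict['backup'] = {'host': 'stkensrv0', 'dir': '/diskb/enstore-backup'}
```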
Please see the individual server sections for more in depth descriptions of
all the server keywords.
The Media Changer issues multiple simultaneous commands by forking processes
that do the work. A Media Changer parameter, MAXWORK, limits the maximum number of
simultaneous outstanding operations. If the Media Changer receives
mount/dismount requests while there are MAXWORK unfinished operations then
the new operations are ignored, the Mover request will time out, and the
Mover will reissue the mount/dismount request.
The MAXWORK parameter exists because when an EMASS robot operation runs for ten minutes, the robot reports a timeout failure even though it eventually finishes the operation. The MAXWORK parameter can be set to 0 when it is necessary to perform work on a robot.
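The MAXWORK throttling described above can be modeled as follows; the real Media Changer forks a process per operation, while this hypothetical sketch only tracks the count of outstanding operations:

```python
class MediaChangerSketch:
    """Ignore new mount/dismount requests while MAXWORK operations are outstanding;
    the Mover's request then times out and the Mover reissues it later."""

    def __init__(self, maxwork):
        self.maxwork = maxwork
        self.outstanding = set()

    def request(self, op_id):
        if len(self.outstanding) >= self.maxwork:
            return None                  # ignored: the Mover will time out and retry
        self.outstanding.add(op_id)      # the real server forks a worker process here
        return op_id

    def finished(self, op_id):
        self.outstanding.discard(op_id)  # worker reported its final status
```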
The Media Changer returns three status values:
The Media Changer and the Media Changer Client support the following requests:
The Media Changer mounting agents:
Enstore writes detail error statistics to its log when a file is closed. A
separate mount/dismount log can be easily separated from the main log.
In addition to the above reports, the Inquisitor will make available on the
web, the contents of the configuration file, all current Enstore log files and
any additional log files useful to the user.
The Inquisitor will listen for command
line requests sent to it and will periodically check to see if it is time to
update information for any of the servers that it is monitoring. If so, then the
server in question is contacted and the resulting information is formatted for
output to the various reports. The possible information gathered from each of the
servers and which report it ends up in are listed below. In addition to
gathering information from each Enstore server, the Inquisitor will collect
information from the log files on encp commands and report on the blocksizes
set in the Enstore config file.
Since the Inquisitor requests a new config file from the config_server
periodically, it is possible to dynamically change the way information is
displayed and the type of information that is displayed without restarting the
Inquisitor.
In addition to the information listed above, the Inquisitor will look for the
inq_timeout dictionary element in each of the individual server sections.
If present, the value of this dictionary element will be used to specify the
timeout frequency for monitoring this server. This is the same as if the
timeouts dictionary element mentioned
above contained a dictionary element for the particular server. For example,
in order to monitor the file_clerk every 65 seconds, the Enstore config file
must have one of the following in it:
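A sketch of the two equivalent forms follows; the exact section contents around the timeout keys are assumptions for illustration:

```python
configdict = {}

# Form 1: a 'timeouts' dictionary element in the inquisitor section.
configdict['inquisitor'] = {'host': 'stkensrv2', 'port': 7521,
                            'timeouts': {'file_clerk': 65}}

# Form 2: an 'inq_timeout' dictionary element in the monitored server's own section.
configdict['file_clerk'] = {'host': 'stkensrv0', 'port': 7501,
                            'inq_timeout': 65}
```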
To block monitoring of a particular server, set its timeout value to -1.
An example Inquisitor dictionary element is given below:
The ascii alarm file is located in the same directory as the log files and is
called enstore_alarms.txt. The Patrol alarm file is located in the same
directory and is called enstore_patrol.txt.
Resolving an alarm means the following -
Communications between clients and servers is implemented in the python modules udp_client.py
and dispatching_worker.py which contain the classes UDPclient and DispatchingWorker respectively.
For example, a Mover is a client of the media_changer; i.e., it sends mount and dismount requests
to the media_changer and waits for replies. The client and the server may run on the same or
different machines; messages, that is requests and replies, are passed using the UDP network protocol.
UDP is not a guaranteed reliable protocol but the Enstore protocols, described later, implement reliability.
Generally, each server module has a corresponding client module that implements the client interface
to the server. For the media_changer, the Mover imports media_changer_client.py, which
implements load and unload methods.
So, mover.py imports media_changer_client which encapsulates the media_changer interface and media_changer_client
imports udp_client which encapsulates the UDP communications. On the server side, Media Changer
imports dispatching_worker which encapsulates server UDP implementation.
So far, we have mentioned modules that are imported with the python "import" command. Within the
modules, statements instantiate the corresponding classes.
All clients are themselves clients of the configuration server; so, each time they send
a request to their server, they send a request to the configuration server to get the address
of their server. In this way the configuration server is the only server that has a hard coded address.
When each process starts it is given the IP address and port number of its configuration server.
When a client is instantiated it determines a free UDP port on its machine on which it sends requests
to its server.
When the server reads a request it also gets the address (host, port) of the client which sent the request
and uses it to reply.
Client requests are called tickets and they are python dictionaries. The items in the dictionary
are agreed upon between the client and the server. For example the Media Changer ticket
must contain a volume id and a drive id.
One item required in the ticket dictionary is "work". The "work" item in the dictionary
is used in dispatching_worker as a method name and a corresponding method in the server
is called to perform the work that the client requests. For example, the Media Changer
ticket must contain a "work" item with a value "load" or "unload" and the Media Changer
server has methods named load and unload.
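The "work" dispatch mechanism can be sketched with getattr, much as dispatching_worker does; this stub is a simplified illustration, not the actual Media Changer code, which also handles checksums and retries:

```python
class MediaChangerStub:
    """Server whose method names match the 'work' values clients may send."""

    def load(self, ticket):
        return {'status': ('ok', None), 'work': 'load',
                'external_label': ticket['external_label']}

    def unload(self, ticket):
        return {'status': ('ok', None), 'work': 'unload'}

    def process_request(self, ticket):
        # Look up the method named by the ticket's "work" item and call it.
        method = getattr(self, ticket['work'], None)
        if method is None:
            return {'status': ('KEYERROR', 'unknown work')}
        return method(ticket)
```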
When UDPClient sends a request it first prepends a client identification stamp and a request time stamp (which
serve as a unique identification) to the
ticket and stringifies the result. Then it calculates a CRC of the message and appends a stringified
version of the CRC to the message. Finally it sends the message to the server and waits for a response.
The response format is client timestamp, response message, and server time stamp. If
the client receives any response that
does not start with the original client time stamp or if the wait for the response times out then
the request is resent. More about this later.
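The framing just described (client stamps prepended to the ticket, the result stringified, a CRC appended) can be sketched as follows; repr/eval stringification and zlib.crc32 are assumptions standing in for Enstore's own routines:

```python
import zlib

def pack_request(client_id, timestamp, ticket):
    """Prepend the client identification and time stamps, stringify, append a CRC."""
    body = repr(((client_id, timestamp), ticket))
    crc = zlib.crc32(body.encode()) & 0xFFFFFFFF
    return repr((body, crc)).encode()

def unpack_message(raw):
    """Verify the CRC and return ((client_id, timestamp), ticket); raise on corruption."""
    body, crc = eval(raw.decode())   # trusted peers only, as in the system described
    if zlib.crc32(body.encode()) & 0xFFFFFFFF != crc:
        raise ValueError("checksum mismatch: drop request, client will resend")
    return eval(body)

def reply_matches(reply_stamp, sent_stamp):
    """A client keeps only replies that begin with its original time stamp."""
    return reply_stamp == sent_stamp
```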
The server implementation of the protocol in DispatchingWorker does a select on a list of read
file descriptors which includes the socket (host, port) as issued by the configuration
server. The select is repeated if it times out.
When input is detected on the socket, the server reads the request; checks the check sum; unpacks
and saves the client id, the client time stamp, and the ticket;
converts the ticket to a python dictionary;
and calls the method specified by the "work" item in the ticket. If any of these things fail
then the request is ignored presuming the client will resend the request.
The "work" is a text string but python is interpreted and allows runtime evaluation of method
names. In the Media Changer the load method is in the media_changer.py module which has imported
and instantiated a DispatchingWorker class. When the work method is finished it calls the DispatchingWorker
method reply_to_caller with a status result ticket.
reply_to_caller builds a stringified reply with the client time stamp, the status ticket, and
its own time stamp. It sends the reply to the client; saves the client address, the client message id,
and the complete reply in case of errors; and waits for more requests.
We save the complete reply in case of errors because we may get a request resent to the server which
the server has executed but whose response was not reliably returned. Some requests,
for example, mounting a tape are not redoable; so, we save the reply and simply resend the reply.
When DispatchingWorker gets a request it first checks its request dictionary for a request that has the
same client id and time stamp and if it finds a match it resends the reply rather than executing the request.
The request dictionary contains all replies.
If it grows beyond a certain size (currently 1000 entries) then entries older
than 30 minutes are deleted.
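The repeat-request handling above can be sketched as a reply cache; the 1000-entry and 30-minute figures come from the text, while the class shape and method names are illustrative assumptions:

```python
import time

class ReplyCache:
    MAX_ENTRIES = 1000
    MAX_AGE = 30 * 60            # seconds

    def __init__(self):
        self._cache = {}         # (client_id, client_timestamp) -> (reply, saved_at)

    def lookup(self, client_id, client_ts):
        """Return the saved reply for a repeated request, else None (execute the work)."""
        entry = self._cache.get((client_id, client_ts))
        return entry[0] if entry else None

    def save(self, client_id, client_ts, reply, now=None):
        now = time.time() if now is None else now
        self._cache[(client_id, client_ts)] = (reply, now)
        # If the dictionary grows beyond MAX_ENTRIES, purge entries older than MAX_AGE.
        if len(self._cache) > self.MAX_ENTRIES:
            cutoff = now - self.MAX_AGE
            self._cache = {k: v for k, v in self._cache.items() if v[1] >= cutoff}
```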
The scheme described so far requires that servers handle one request at a time and that clients
queue in the servers udp input buffer waiting their turn. This is satisfactory if requests
are guaranteed to finish quickly; however, the Media Changer's operations may take a long time to complete
while other operations might be done simultaneously. To accommodate this, dispatching_worker was extended
to allow forking in servers.
The select in dispatching_worker now watches for input from the client socket and a list of pipe
fds on which the forked servers report their final status. The parent server process then
reports this status back to the client.
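The extended select loop can be sketched as below, where a pipe stands in for a forked worker's status channel; this is a runnable demonstration of the pattern, not the actual dispatching_worker code:

```python
import os
import select
import socket

# The server's request socket plus one pipe read-end per forked worker.
server_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server_sock.bind(('127.0.0.1', 0))

read_fd, write_fd = os.pipe()      # in the real server, created before fork()

# A forked worker would write its final status here when its operation finishes.
os.write(write_fd, b'mount VO1234: ok')

# Watch both the client socket and the worker pipes; handle whichever is ready.
readable, _, _ = select.select([server_sock, read_fd], [], [], 1.0)
for fd in readable:
    if fd is server_sock:
        pass                        # read a new client request from the socket
    else:
        status = os.read(fd, 1024)  # a forked worker finished: relay status to the client
```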
The database used in Enstore must provide the following:
We have examined LibTP and find that it meets the current, modest database
requirements of the project. We have exploited its "freeware" aspect
by putting up many test stands.
We could replace LibTP with a Run II standard
database. However, we have no definite plans to do this, given our experience
with the tool, and the lack of any driving requirement to do this.
In addition to the databases, the File Clerk and Volume Clerk also maintain
separate journal files.
These journal files can be used to recover databases when they can not
be recovered under normal circumstances.
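A sketch of journal-based recovery, assuming each journal record is a (key, value) update with None marking a deletion; the record format here is an assumption, not the clerks' actual on-disk format:

```python
# Sketch: replaying a clerk's journal in order rebuilds the database
# from scratch. Each record is assumed to be (key, value); value None
# is taken to mean the key was deleted.
def replay_journal(journal_records):
    db = {}
    for key, value in journal_records:
        if value is None:
            db.pop(key, None)   # deletion record
        else:
            db[key] = value     # insert/update record
    return db
```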
The base server protocol is the same for all servers. Statefulness is
minimized, not eliminated.
Each transmission has a unique ID, a timeout, and a maximum number of retries
associated with it. The timeout allows for debugging.
For each reception, the message is checked against messages already received to see
if it is a repeat. If the reception is a *repeat request*, a
saved copy of the response is sent; if the reception is a *repeat response*, it is simply
ignored. This takes care of the case when a timeout/retry happens just
before a response is received.
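The client side of this protocol might look like the following sketch; the function and message framing are hypothetical, not Enstore's actual wire format:

```python
# Sketch: each transmission carries a unique ID; the client resends the
# same message on timeout, up to a retry limit, and ignores any reply
# whose ID does not match (e.g. a stale reply to an earlier attempt).
import socket
import uuid

def send_with_retry(addr, payload, timeout=10.0, max_retries=3):
    msg_id = str(uuid.uuid4())
    msg = ("%s %s" % (msg_id, payload)).encode()
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        for attempt in range(max_retries + 1):
            sock.sendto(msg, addr)
            try:
                reply, _ = sock.recvfrom(16384)  # 16K min datagram size
            except socket.timeout:
                continue                         # resend with same ID
            rid, _, body = reply.decode().partition(" ")
            if rid == msg_id:                    # ignore stale replies
                return body
        raise RuntimeError("no reply after %d retries" % max_retries)
    finally:
        sock.close()
```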
Some transfers do not require replies, and using the UDP protocol described
above may even hurt system performance; messages sent to the Log Server are one example. For these, "pure" UDP messaging is used.
NOTE: The communications between the Mover and the Configuration Server happens
approximately every two minutes. It has been added to the following drawing
to show that this communication is important, but it can occur anywhere in the
communications flow before the Mover contacts the Library Manager.
Enstore is a distributed system. For a transfer to succeed, many of the
Enstore processes must be up and running. Therefore, the servers are
robust, and run on reliable computers. Nevertheless, it is good to consider
the intrinsic ability for the system to recover when a process or
system running a process crashes. This is summarized in the table below:
Process | Where is State? | Effect of Crash
  Encp | In the user's encp transfer command | Transfer is canceled/aborted
  Configuration Server | Static configuration file | Wait for restart of server
  File Clerk | Persistent database table | Wait for restart of server
  Volume Clerk | Persistent database table | Wait for restart of server
  Library Manager | In-memory lists of what work is queued, and what work is at what Mover | Recovery of state is not yet implemented; recovery is possible through encp retries
  Mover | If busy, the current transfer + the current volume | Encp retries writes, exits with errors on reads
  pnfs NFS Servers | DBM database "file metadata" | NFS retry mechanisms
  Media Changer | In-memory lists of work given to library micro | State refreshed by Enstore UDP retry protocol mechanism
  Log Server | None | Logs are not written
  Inquisitor | None | Displays are not refreshed until restart
Much of the system state is stored within the user's encp client.
This allows the encp client to retry on a large number of different
errors. A retry is given very high priority when it is received
by the Library Manager, so the user does not have to wait again for
their job. It is the Library Manager's responsibility to ensure the
error is not simply repeated; for example, on a volume read error, the
volume should not go to the same drive on a retry. The Library
Manager gets the retry, volume, and drive information from the ticket.
Many interesting errors are related to cases where the volume
cannot be written or read, or where it is suspected that the volume is
jammed, etc. More experience is needed with the actual hardware
before the correct error control behavior is established. In the
interim, Enstore will make the following working assumptions:
If a drive has several fatal errors on different volumes, the
drive will be marked offline.

The tables below list each error code with its description, the
responsibilities of the administrator and the servers, whether encp
retries, and the resulting volume and drive states.

Volume Write Errors

WRITE_NOTAPE
  Description: Requested volume was not found in the library. The Volume Clerk's database is inconsistent with the library micro's database.
  Administrator: Check volume in morning.
  Action: Mark volume no access.
  encp: Retry.  Retry: Yes.  Volume state: No access.

WRITE_TAPEBUSY
  Description: Requested volume is in another drive. Enstore bug, some other system has mounted the volume, or the library micro put the volume elsewhere.
  Administrator: Check volume in morning.
  Action: Mark volume no access.
  encp: Retry.  Retry: Yes.  Volume state: No access.

WRITE_DRIVEBUSY
  Description: A volume is already in the drive. Enstore bug or misconfiguration. Note: the Mover waits for an automatic cleaning tape to be ejected.
  Administrator: Check drive and configuration in morning.
  Action: Offline the drive.
  encp: Retry.  Retry: Yes.  Drive state: Offline.

WRITE_BADMOUNT
  Description: Mount failure or load operation failed. Must assume a jammed volume.
  Administrator: Check drive and volume in morning.
  Action: Mark volume no access; offline the drive.
  encp: Retry.  Retry: Yes.  Volume state: No access.  Drive state: Offline.

WRITE_BADSPACE
  Description: EOD cookie does not produce EOD. Wrong volume, Enstore bug, or drive space error.
  Administrator: Check drive and volume in morning.
  Action: Mark volume no access; offline the drive.
  encp: Retry.  Retry: Yes.  Volume state: No access.  Drive state: Offline.

WRITE_ERROR
  Description: Error writing data block or file mark.
  Administrator: Check drive in morning; check volume in morning.
  Action: If several errors occur with different volumes, offline the drive. If several errors occur with different drives, mark the volume as read only.
  encp: Retry.  Retry: Yes.  Volume state: Read only.  Drive state: Offline.

WRITE_EOT
  Description: Hit EOT while writing data block or file mark.
  Action: Mark volume as full and read only.
  encp: Retry.  Retry: Yes.  Volume state: Read only.

WRITE_UNLOAD
  Description: Error unloading volume from drive. Must assume a jammed volume.
  Administrator: Check drive and volume in morning.
  Action: Mark volume no access; offline the drive.
  encp: Not involved.  Volume state: No access.  Drive state: Offline.

WRITE_NOBLANKS
  Description: No more blank volumes.
  Action: Administrator should be paged. DAQ should switch to an alternate library.
  Retry: No.

WRITE_MOVER_CRASH
  Description: If the Mover is connected to an encp, encp will notice its sockets being torn down prematurely.
  Administrator: Check drive and volume in morning.
  Action: Mark volume as no access; retry.
  Retry: Yes.  Volume state: No access.  Drive state: Offline.

Volume Read Errors

READ_NOTAPE
  Description: Requested volume was not found in the library. The Volume Clerk's database is inconsistent with the library micro's database.
  Administrator: Check volume in morning.
  Action: Mark volume no access.
  Retry: No.  Volume state: No access.

READ_TAPEBUSY
  Description: Requested volume is in another drive. Enstore bug, some other system has mounted the volume, or the library micro put the volume elsewhere.
  Administrator: Check volume in morning.
  Action: Mark volume no access.
  Retry: No.  Volume state: No access.

READ_DRIVEBUSY
  Description: A volume is already in the drive. Enstore bug or misconfiguration. Note: the Mover waits for an automatic cleaning tape to be ejected.
  Administrator: Check drive and configuration in morning.
  Action: Offline the drive.
  encp: Retry.  Retry: Yes.  Drive state: Offline.

READ_BADMOUNT
  Description: Mount failure or load operation failed. Must assume a jammed volume.
  Administrator: Check drive and volume in morning.
  Action: Mark volume no access; offline the drive.
  Retry: No.  Volume state: No access.  Drive state: Offline.

READ_BADLOCATE
  Description: Failed space or initial CRCs don't match. Either the file location cookie is corrupted, the wrong volume is in the drive, or the drive cannot space properly.
  Administrator: Check drive and volume in morning.
  Action: Mark volume no access; offline the drive.
  Retry: No.  Volume state: No access.  Drive state: Offline.

READ_ERROR
  Description: Error reading data block. Run-of-the-mill read error.
  Administrator: Check drive in morning; check volume in morning.
  Action: If several errors occur with different volumes, offline the drive. If several errors occur with different drives, mark the volume as read only.
  encp: Retry.  Retry: Yes/No.  Volume state: No access.  Drive state: Offline.

READ_COMP_CRC
  Description: CRC mismatch. Drive and volume are suspect: corrupt file location cookie, drive space error, wrong volume in the drive, etc.
  Administrator: Check drive and volume in morning.
  Action: Mark volume as no access; offline the drive.
  Retry: No.  Volume state: No access.  Drive state: Offline.

READ_EOT
  Description: Hit EOT when reading. Corrupt file location cookie, drive space error, or wrong volume in the drive. Should have hit an EOF.
  Administrator: Check drive and volume in morning.
  Action: Mark volume as no access; offline the drive.
  Retry: No.  Volume state: No access.  Drive state: Offline.

READ_EOD
  Description: Hit EOD when reading. Corrupt file location cookie, drive space error, or wrong volume in the drive. Should have hit an EOF.
  Administrator: Check drive and volume in morning.
  Action: Mark volume as no access; offline the drive.
  Retry: No.  Volume state: No access.  Drive state: Offline.

READ_UNLOAD
  Description: Error unloading volume from drive. Must assume a jammed volume.
  Administrator: Check drive and volume in morning.
  Action: Mark volume no access; offline the drive.
  encp: Not involved.  Volume state: No access.  Drive state: Offline.

READ_MOVER_CRASH
  Description: If a Mover is connected to an encp, encp will notice its sockets being torn down prematurely. The volume is tied up at a Mover.
  Administrator: Check volume and drive in the morning.
  Action: Mark volume no access.
  Retry: No.  Volume state: No access.  Drive state: Offline.

Other Errors

ENCP_GONE
  Description: User has gone away while the request is queued.
  Action: Unilateral unbind.
  Retry: No.

TCP_HUNG
  Description: It appears that the data TCP link is hung.
  Administrator: Check with user in morning.
  Mover: Compute an anticipated transfer time for every socket operation and abort the transfer if the actual transfer takes more than three times the expected value.
  Retry: No.

LM_CRASH
  Description: Library Manager crashes and loses its queue of pending work. The encps will never be called back and would wait forever.
  encp: Ping the Library Manager every N minutes (30) to see if its request has gotten lost.

MOVER_CRASH
  Description: Mover is idle. The system degrades.
  Administrator: Check drive in morning.
  Library Manager: Remove the Mover from its list when the Mover fails to respond to a summon.
  Drive state: Offline.

ANY_UNMOUNT
  Description: Error unmounting volume. The volume is hanging in the drive.
  Administrator: Check drive and volume in morning.
  Action: Mark volume no access; offline the drive.
  encp: Not involved.  Volume state: No access.  Drive state: Offline.
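The TCP_HUNG rule above (abort when a transfer takes more than three times its anticipated time) can be sketched as follows; the 5 MB/s rate is illustrative, not an Enstore constant:

```python
# Sketch of the TCP_HUNG detection rule: compute an anticipated time for
# each socket operation and declare the link hung if the actual elapsed
# time exceeds three times the expectation.
def anticipated_time(nbytes, rate_bytes_per_s):
    return float(nbytes) / rate_bytes_per_s

def is_hung(nbytes, elapsed_s, rate_bytes_per_s=5 * 1024 * 1024):
    """True if a transfer of nbytes that has taken elapsed_s seconds
    exceeds three times the expected duration at the given rate."""
    return elapsed_s > 3.0 * anticipated_time(nbytes, rate_bytes_per_s)
```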
What follows are draft design notions; importing and exporting are
not yet fully implemented.
Exportable volumes are built in Enstore using the encp command, with the
command line switch --ephemeral, which specifies a temporary, "ephemeral" file family. An
ephemeral file family is a unique file family name created just for this encp
command, with a file family width of exactly one. Under these conditions, files will
be placed on the tape volume in the order specified by the user. Once the data is written to the
tape, the file family name is changed to the tape_label_name.ephemeral.
An experimenter wishing to build an exportable volume would follow these steps:
As an example, an experimenter can stage an entire CPIO exported volume using
GNU cpio at her home institution. (Since there are many files on an
Enstore tape, special care should be taken to select a no-rewind tape
device.)
On a UNIX system, CPIO tapes can be read with no special infrastructure
other than
GNU cpio. For example, here is a simple script to read an exported tape at
a home institution:
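The script itself did not survive in this copy; the following is a minimal sketch, assuming GNU cpio, the mt utility, and a no-rewind device path such as /dev/nst0 (all illustrative). Tape positioning between archives is driver-dependent, so the fsf step may need adjusting for a given system.

```shell
#!/bin/sh
# Sketch only: read back an exported Enstore CPIO volume at a home
# institution. The device path, file count, and file-mark handling are
# assumptions about the exported tape layout, not Enstore-mandated.
read_exported_tape() {
    tape=$1      # no-rewind tape device, e.g. /dev/nst0
    nfiles=$2    # number of files (cpio archives) on the volume
    mt -f "$tape" rewind || return 1
    i=0
    while [ "$i" -lt "$nfiles" ]; do
        cpio -idv -I "$tape" || return 1   # extract one archive
        mt -f "$tape" fsf 1 || return 1    # step over the file mark
        i=$((i + 1))
    done
    mt -f "$tape" rewind
}

# example: read_exported_tape /dev/nst0 100
```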
Although it is not mandatory, an Enstore tool will very likely have to be produced to get good
results for experimenters at home institutions who wish to make importable tapes.
If the tapes are to be accessed many times, the experiment must take some
time to think about the layout of data on the tapes and how the files ought
to fit into the Enstore name space.
Since the objective is to import a large amount of data, metadata must be
generated for each tape; otherwise the tape will have to be scanned to
determine the metadata, which defeats the purpose of importing volumes!
Metadata for each tape is:
Metadata for each file is:
The procedure the user would follow to create an importable volume is:
The following test system is installed in the Feynman Computing Center
and runs several Enstore systems. One is for HPPP systems development
and testing, and the other is used by the D0/SAM project. D0/SAM has
gigabit Ethernet access to the Enstore environment, uses the
system for testing, and presented it at Super-Computing 98.
The RIP/Enstore hardware test system consists of the following:
One node is connected to two STK redwood tape drives in a STK Powderhorn robot.
The RIP cluster is located physically adjacent to the SAM cluster,
which has similar PC's. For testing throughput to/from clients, the
SAM cluster can be connected to the RIP cluster via 10, 100, and 1000
Mbps network uplinks (though each SAM node has only 10/100 Mbps
capability).
Enstore has been installed on the Test System and is fully operational.
The administration of the system is flexible and can be changed by
modifying a single configuration file.
Currently, the configuration is as follows:
In the past, we have also tested Exabyte drives on AIX machines. We have not
continued with this effort, but rather have concentrated on drives and CPUs
that will most likely be used in conjunction with the EMASS robot.
Function | Person | Software
  Initiate a specific transfer between tape and disk | Experimenter | encp
  Organize names | Experimenter | pnfs
  Choose library to write to | Experimenter | pnfs
  Create file families, administer width | Experimenter | pnfs
  Current status on web | Enstore Administrator | Inquisitor
  Summary status on web | Enstore Administrator | Inquisitor
  Routine periodic monitoring | TBD | Patrol (or TBD?) + Alarm module
  Move volumes between shelf and library | Experimenter | Enstore Administration Utility
  Move volumes between shelf and out-of-system (includes new volumes) | Experimenter | Enstore Administration Utility
  Drain system | Enstore Administrator | Enstore Administration Utility
  Shutdown system | Enstore Administrator | Enstore Administration Utility
  (Re)Start system | Enstore Administrator | Enstore Administration Utility

Interfacing to an experiment means placing Enstore in a larger
system context.
From the point of view of Enstore with network attached tapes, there is an
Enstore system which interfaces to the rest of D0 on the NIC-card cable
connector. In addition to the interfaces to the user and administrators (which
are intrinsic to Enstore software), there are other miscellaneous interface
issues associated with a real instance of the software, since the whole system
must conform to the Experiment's, the Division's, and the Laboratory's system
constraints:
Type | Constraint | Imposed by
  Network | Conforming physical media | System
  Network | Protocol extensions | System
  Network | 16K minimum UDP datagram size | Enstore
  Network | Traffic pattern to machines where one NIC card is not sufficient | System
  Site | Location | System
  Operations | Run II operations software framework (Patrol or TBD) | System
  Operations | Failure planning (i.e. broken tape library, power) | System
  Operations | Upgrade-ability | System
  Operations + Security | Standard administration | System
  Security | Authentication etc. are TBD | System

All production hardware is to come from the D0 budget, requisitioned by
D0. This includes the Enstore system, tape drives, and other peripherals.
D0 has a baseline system design. From the point of view of the storage
management project, the main features of the D0 system are:
Given the discussion above, the main interface issues with D0 are:
To keep transfers efficient, it is important to have a network design
which avoids congestion.
It is important to characterize the rate achievable on a NIC card, and the
amount of CPU required to drive NIC cards. Typically, there is less CPU per byte
when writing to the network than when reading from it. One vendor reports, for
CPUs available in late 1998:
On an analysis server, allocation of streams to NIC cards is most easily
accomplished statistically. For machines with very many potential streams,
this is best effected by fewer, fatter pipes. Ideally, there is little packet
loss and traffic is regulated by the TCP window. If Jumbo frames were
acceptable, not generally but only between the Enstore system and the D0
analysis machine, statistical load balancing over four Gb NICs consuming two
CPUs could easily sustain the (imagined) 150 MB/s peak tape rate
for D0, with very little congestion.
The problem becomes more difficult as the throughput of a NIC card
decreases. The basic unit of transfer is a stream carrying the full tape
rate (i.e. 5 MB/s for AIT-2 plus some allowance for expansion). It is, in fact,
a little questionable whether two such tape streams should be multiplexed onto
a single 100 Mbps NIC card; congestion, slow start, and other rate-inhibiting
mechanisms may be invoked.
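A back-of-the-envelope helper for the stream-allocation question above; the 5 MB/s stream rate is from the text, while the 25% congestion headroom is an assumption:

```python
# Sketch: how many full-rate tape streams fit on one NIC, keeping some
# headroom against congestion and slow start. Numbers are illustrative.
def streams_per_nic(nic_mbps, stream_mb_per_s=5.0, headroom=0.75):
    """nic_mbps is the NIC line rate in megabits/s; returns the number
    of whole tape streams that fit in the usable fraction of it."""
    usable_mb_per_s = (nic_mbps / 8.0) * headroom
    return int(usable_mb_per_s // stream_mb_per_s)
```

With these assumptions, a 100 Mbps NIC comfortably carries only one 5 MB/s stream, consistent with the doubt above about multiplexing two streams onto it.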
Other system design and integration issues
Enstore has two simple internal databases, the volume and file
databases, plus the pnfs databases. These databases are based on LibTP and contain sufficient
information to read each file in Enstore or to write new files to available volumes.
We believe LIBTP to
be reliable - although we could replace it with a commercial database such
as Oracle if required. We have developed backup scripts to recover from
potential database corruption. Finally, we expect the databases to be saved
in a SCSI RAID level 5 system for
redundancy and reliability. We also support live backups without any user impact.
D0's experiment catalog can also contain the basic information in our
databases. The initial loading of this information is done via the return
information from encp. Since we do not expect any movement or compacting of data, this
information should not change; if it does, syncing methods will need to be developed.
Enstore also stores data in pnfs's databases, such as the bit file
id. These pnfs databases have been reliable in our experience and
they are supported by DESY.
Pnfs backup issues are not completely understood and will be addressed during a visit to DESY this February.
If an experimenter moves or deletes files in the pnfs
file system, it is immediately reflected in the pnfs databases. Therefore,
in the case of moved files, Enstore still has the correct pointer information
available to it for the transfers; for deleted files, the user won't
be able to find the file in the namespace and won't be able to start
the transfer.
In addition to the user namespace, Enstore maintains another namespace
that is ordered by file family, tape and position on the tape. The
user can't delete or move items (UNIX permissions) in this namespace
because it represents the physical ordering of files on
tapes. Accidental deletions from the user's namespace can
potentially be recovered by using this volume-based namespace. [Not
yet implemented.]
No changes in the internal Enstore databases are required when a user
moves or deletes files, since nothing has been moved or deleted on the
physical media. Initial tools are available to delete entire volumes;
nothing is planned for compacting data on volumes.
Enstore has implemented a special output format, selected with --data_access_layer, printed at the end of each file
transfer. The status of the current file transfer is available in this
way. The only successful status code is "OK", with an exit code of
0. All other values represent failures.
Another possible return is NOACCESS, which indicates the volume
can not be read. This return happens before submission to the library
manager's queue, so it happens very quickly.
In principle, SAM has the same volume information as Enstore and
could update its tables to reflect the NOACCESS returns. If the
volume is put back into service, a tool to reflect this change would
have to be developed.
Another possibility is for Enstore to flag the affected files from the
NOACCESS volume in pnfs for the user.
A consistent, a priori way of marking all files on a
volume unreadable after the volume has been declared unreadable has not
been fully worked through. (The general idea is that SAM never makes
an encp request for files on unreadable tapes.)
In Enstore, only one format is allowed per tape; it is not possible to mix
formats. Enstore tracks this information in its volume database and errors
out if the format does not match.
Another, more obvious method we are developing is to add the tape format
to the file family name. For example, the file family would be "top.cpio"
and not just "top". In this way, the user can select the format he wants
to use, and tapes automatically have just one format (since different
formats would have different file family names).
Enstore uses the pnfs namespace from DESY and it furnishes all
these items.
The library manager has commands to list all the volumes in the system or
file family as well as the statistics about specific volumes.
Enstore also manages a duplicate namespace, ordered by file family,
volume, and position on tape, that also supplies this information to
the user.
This has not been addressed.
Robot storage locations are controlled by the pnfs library
tag. This is user settable.
Enstore can divide a physical library into many virtual libraries. Mover
computers, and therefore, tape drives, can be assigned to one or more of
these virtual libraries.
Enstore treats 'shelves' as just another library. Tools are available to
change volumes between different libraries. Insert/Eject tools are being
developed for the EMASS robot to transfer volumes from robotic storage to
vault shelves.
Enstore is developing import and export tools that meet these
requirements, as described in section 5.
This has not been addressed.
This has not been addressed.
File families and widths are fundamental design notions of Enstore.
Enstore allows the user to change the file family and width tags in pnfs; regular
UNIX permissions prevent unauthorized changes.
Enstore allows a cp-like syntax where the input files can be a list.
Enstore tries to append to tapes to fill them to their capacity. Tapes can
also be marked "full" at any point.
Enstore has developed a flexible wrappering module. We want to be able to
support any wrappering format an experiment chooses. We promote the use of
CPIO formats since it makes the tapes self-describing and readable on any
UNIX machine.
All volumes are inherently assigned to a file family before they are
written to by Enstore. The concept of File families has no meaning for reads.
Volumes can currently only be in one tape library.
This cannot be done dynamically; the retry behavior is static and will be
made to comply with D0's needs.
Pnfs provides this with normal UNIX file permissions.
Enstore provides this through the mover config file and the normal UNIX
table routing files.
A simple round-robin plan is envisioned for multiple interfaces. This can be solved in any
specific case; we are not addressing the general case.
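The round-robin plan mentioned above could be sketched as follows; the helper is illustrative, not Enstore code:

```python
# Sketch: cycle through the available network interfaces in round-robin
# order, one pick per new transfer.
import itertools

def make_picker(interfaces):
    """Return a zero-argument function that yields interfaces in
    round-robin order."""
    cycler = itertools.cycle(interfaces)
    return lambda: next(cycler)
```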
This has not been fully addressed.
The default values are the logged-in user's default values.
File families are inherited from the parent directory.
Libraries are also inherited from the parent directory.
Enstore's library manager maintains a queue of active work and pending
work.
encp option --priority
encp option --delpri and --agetime
encp option --delayed_dismount
Enstore has spent considerable time developing robust error handling
plans. The queues are cleaned up on errors or canceled requests.
Mover computers can be assigned to a specific library or multiple
libraries. They can be changed dynamically, by an administrator, to reflect
changing load conditions.
Enstore's philosophy is to retry internally and only return to the user a
success code or a fatal error. Much work has been done towards this goal.
encp option --data_access_layer provides this functionality, including retries.
Assuming a sufficient network connection, the only requirements are the encp client and the pnfs
namespace.
This is inherent in Enstore's design.
Assuming a sufficient network connection, data should stream to tape at the
maximum tape speed. We have designed the Mover module to get as much
performance out of the hardware as we can. Operations such as mounting,
spacing, rewinding, etc., slow the overall rate down
and are somewhat beyond the control of Enstore.
Disk buffering has not been excluded, as far as we can tell.
There is a maintenance contract on the robot.
Three Enstore people have attended training in Denver on the EMASS robot.
The ESH Department is involved and helping to specify safe
operations. LOTO has already been instituted.
Enstore keeps track of drive errors and marks drives as "unavailable" when
they exceed some limit. It is expected that an administrator review the
bad drives in the morning.
Extensive 24 hour burn-in tests are planned for all repaired or replaced
drives to ensure that the end-user sees high quality drives.
Enstore plans on queuing work requests for volumes that are not in the
robot into 3 categories:
It is expected that an operator will visit the robot at most once/day and
exchange at most 100 volumes per visit.
This issue has not been addressed fully.
Routine live backups of the Enstore databases are almost finished now.
Pnfs backup is TBD.
A SCSI RAID level 5 system is needed for extra protection.
All logs are kept online for at least 30 days and stored to tape after
that. They are as complete as we can make them.
We are working on recovery procedures and expect to have them completed
shortly.
All volumes are self-describing. Metadata information can be recovered by
scanning the tapes. [This has not been written!]
Hopefully, normal transfers will provide a big enough sample to perform
routine checks.
This has not been fully addressed.
The basic format of this command is one of the following:
encp <input file> <destination directory in pnfs space>
The exact syntax of the above may change somewhat, but is immaterial.
The following enhancements have been requested and (we think) agreed
to by Enstore.
The notational issues in this item have not been addressed.
Enstore provides cp-like list features. The user can launch many encps.
Input wildcarding is allowed and furnished by the user's shell glob capabilities. Output
wildcarding is more problematic; Enstore allows the user to specify an
input list of files and an output directory, in which case the input names
are used in creating the output files in the directory. The notational issues in this item have not been addressed.
Enstore provides cp-like list features; the delimiter in this case is a
space. Notification after each transfer is provided with the --data_access_layer switch.
encp provides this with the --data_access_layer option.
encp provides this with the --priority, --delpri and --agetime
options
encp provides this with the --data_access_layer option
Failed transfers are flushed from the Enstore queues.
SAM has used Enstore to write to both the STK robot and the EMASS robots.
Enstore uses all drives, not just the first ones, to allocate
work.
Enstore logs will be available on the web for users to inspect. We
welcome help!
Enstore will provide this capability, also available on the web.
Enstore logs are all single, sometimes very long, lines.
This is planned.
Enstore agrees but doesn't set this policy.
All SAM traffic is with files, so this request is contrary to the
SAM architecture. It is currently possible to write a list of files to
a tape. It is also possible to query the system to list the files that
are on a tape, and then use that list to copy all the files to the
disk. We believe these methods to be adequate. Otherwise,
this issue will have to be developed and represents new work.
Enstore will provide these.
Enstore will not implement any file set features.
This request is outside the mainline SAM architecture, which has
indicated it wants priorities and optimal traversals of tapes.
Moreover, this request is not straightforward and requires defeating the Enstore
system in many ways; we'd prefer not to do it. More explicit needs can be addressed as they arise.
This can be done with ephemeral file families that have a width
of 1, or if only one user is writing to the file family
at a time.
This is part of the overall architecture, but it is not yet
developed.
This is TBD. We will discuss it with DESY during our upcoming
visit.
This is outside the main SAM architecture and there are no current plans to implement this.
Generally, the project is on track to its original estimate. I believe
Enstore needs its current work force of 6 people, (Bakken, Berman, Huang,
Moibenko, Rechenmacher, Ruthmansdorfer) or their equivalents until May 99.
After May the Run II effort could drop to 4 people. I expect to be able to
deliver a fully Run II functional version of Enstore by the end of July
99. At that point, I expect serious integration to be well underway with
D0. Depending on how well this commissioning goes, and on what further
requirements and enhancements are deemed necessary, the Run II Enstore
effort could drop to 2-3 people.
Beyond the Run II work, Enstore will require effort to fulfill its
Computer Division strategic role in Mass Storage for the Laboratory. It is
expected that this work will begin in the Summer 99 time frame. The scope
and requirements of this effort are not yet fully determined.
The basic design philosophy of Enstore is to use layered products when
possible. There are 2 products which need to be updated or enhanced to
make Enstore fully functional:
Below is the current Storage Management WBS. Instructions were to stop
working on it once it was correct to a factor of 2. I believe the actual
estimate is good to 0.5. I have attempted to fill in the percentage-complete
column in the WBS in steps of 25%. Items listed as 0%
generally mean that no coding has been done for the item; it doesn't
mean that no thought has gone into it. Some thought has gone into each item.
{'wr_bytes': 6398896, 'rd_bytes': 6398896, 'no_xfers': 11, 'mode': 'w',
'bytes_to_xfer': 6398896, 'crc_func': '
where
1.3.4.2 Mover Config File Values

host: node where the Mover runs (example: hppc)
port: UDP port for Mover communication (example: 7516)
logname: ASCII value used for id in messages to the Log Server (example: FMOV)
library: list of libraries that the Mover will contact when it starts up (example: ['fndaprdisk.library_manager'])
media_changer: name of the Media Changer server that will be communicated with in order to load and unload tape cartridges (example: 'fndaprdisk.media_changer')
mc_device: device name or number to include with communications with the Media Changer (example: 1)
do_eject: used when testing a stand-alone tape drive (no robot); default 'yes' (example: 'no')
driver: the HSM driver (example: 'FTTDriver')
device: device name used for driver device access; make sure this is a no-rewind device (example: '/dev/rmt/tps2d2n')
norestart: do not restart this server if it crashes; the default is to do a restart
echo "Making scsi tape devices"
. /usr/local/etc/setups.sh
setup ftt
$FTT_DIR/etc/mkscsidev.Linux
chmod 0666 /dev/rmt/*
chmod 0666 /dev/sc/*
1.3.5 Configuration Server
The Configuration Server maintains and distributes all information about
system configuration, such as the location and parameters of each server.
Upon startup, each server asks the Configuration Server for the information
pertaining to itself (e.g. the location of any other server with which to
communicate). New configurations can be
loaded into the Configuration Server without disturbing the current running
system. Configurations are stored in a file called the Enstore configuration
file in Python dictionary format. An example of this file is given below:
configdict['blocksizes'] = { 'diskfile' : 512, \
'redwood' : 131072, \
'floppy' : 512, \
'cassette' : 512, \
'cartridge' : 512, \
'exabyte' : 131072, \
'8MM' : 131072, \
'DECDLT' : 131072 }
configdict['file_clerk'] = { 'host':'rip6', 'port':7501, 'logname':'FILSRV' }
configdict['volume_clerk'] = { 'host':'rip6', 'port':7502, 'logname':'VOLSRV' }
configdict['alarm_server'] = { 'host':'rip10', 'logname':'ALMSRV', \
'port' : 7503 }
configdict['log_server'] = { 'host':'rip6', 'port':7504, \
'log_file_path':'/rip6a/enstore/log' }
configdict['database'] = { 'db_dir':'/rip6a/enstore/db' }
configdict['backup'] = { 'host':'rip6', 'dir':'/rip6a/enstore/db_backup'}
configdict['inquisitor'] = { 'host':'rip6', 'port':7505, 'logname':'INQSRV', \
'timeout':10, 'alive_rcv_timeout': 5, \
'alive_retries':1, \
'ascii_file':'/rip6a/enstore/inquisitor/', \
'html_file':'/fnal/ups/prd/www_pages/enstore/', \
'default_server_timeout': 15, \
'timeouts' : { 'ait.library_manager': 15} }
configdict['rip6.library_manager'] = { 'host':'rip5', 'port':7506, \
'logname':'RP6LBM' }
configdict['dlt.library_manager'] = { 'host':'rip5', 'port':7509, \
'logname':'DLTLBM' }
configdict['rip6.media_changer'] = { 'host':'rip6', 'port':7512, \
'logname':'R6MC ', \
'type':'RDD_MediaLoader' }
configdict['de13.media_changer'] = { 'host':'rip10', 'port':7517, \
'logname':'DE13MC', \
'type':'EMASS_MediaLoader' }
configdict['rip6.mover'] = { 'host':'rip6', 'port':7525, 'logname':'R6MOV ', \
'library':'rip6.library_manager', \
'device':'/rip6a/rip6/rip6.fake', \
'driver':'RawDiskDriver', \
'mc_device':'-1', \
'media_changer':'rip6.media_changer' }
configdict['DE13DLT.mover'] = { 'host':'rip1', 'port':7526, 'logname':'DE13MV', \
'library':'dlt.library_manager', \
'device':'/dev/rmt/tps2d1n', \
'driver':'FTTDriver', \
'mc_device':'DE13', \
'media_changer':'de13.media_changer' }
1.3.5.1 Command Line Control of the Configuration Server
Configuration Server functionality may be controlled through a command line
interface using
enstore. A summary of the supported commands is given below. In addition
to the following commands, the Configuration Server command line interface
supports the general commands supported by all other servers.
FUNCTION: load the specified Enstore config file into the Configuration Server
COMMAND:  enstore cc --config_file=/path/to/config_file --load

FUNCTION: output the currently loaded Enstore configuration file
COMMAND:  enstore cc --dict
OUTPUT:   (same as the example in the previous section)

FUNCTION: output the keys in the currently loaded Enstore configuration file
COMMAND:  enstore cc --get_keys
OUTPUT:   ['DE13DLT.mover',
           'alarm_server',
           'backup',
           'blocksizes',
           'database',
           'de13.media_changer',
           'dlt.library_manager',
           'file_clerk',
           'inquisitor',
           'log_server',
           'rip6.library_manager',
           'rip6.media_changer',
           'rip6.mover',
           'volume_clerk']
1.3.6 Log Server
The Log Server receives messages from other processes and logs them into
formatted log files.
These messages are essentially transactional records.
Log files are labeled by date: at midnight each day, the current log file is
closed and a new one is opened. Below is an excerpt from a log file:
10:03:42 sphinx.fnal.gov 006849 moibenko I FILC File Clerk (re)starting
10:03:46 sphinx.fnal.gov 006849 moibenko I HLIBM Library Manager sphinxdisk.library_manager (re)starting
10:03:50 sphinx.fnal.gov 006849 moibenko I HMC Media Changer sphinxdisk.media_changer (re)starting
10:03:55 sphinx.fnal.gov 006849 moibenko I HMOV Mover starting - contacting libman
10:03:59 sphinx.fnal.gov 006849 moibenko I ADMC Admin Clerk (re)starting
10:09:34 sphinx.fnal.gov 006849 moibenko I HLIBM read Q'd /pnfs/enstore/sphinx/ut1/mover.py -> ........
10:09:34 sphinx.fnal.gov 006849 moibenko I HLIBM read_from_hsm work on vol=flop1 ..........
10:09:34 sphinx.fnal.gov 006849 moibenko I HMOV READ_FROM_HSM start{'times': ..........
10:09:35 sphinx.fnal.gov 006849 moibenko I HMOV Performing precautionary offline/eject.........
10:09:35 sphinx.fnal.gov 006849 moibenko I HMOV Completed precautionary offline/eject.......
10:09:35 sphinx.fnal.gov 006849 moibenko I HMOV Requesting media changer load {' ............
10:09:35 sphinx.fnal.gov 006849 moibenko I HMOV Media changer load status('ok', None)
10:09:35 sphinx.fnal.gov 006849 moibenko I HMOV Requesting software mount flop1 ........
10:09:35 sphinx.fnal.gov 006849 moibenko I HMOV Software mount complete flop1 ........
10:09:35 sphinx.fnal.gov 006849 moibenko I HMOV WRAPPER.READ........
10:09:35 sphinx.fnal.gov 006849 moibenko I HMOV READ DONE{'unique_id': .............
Fields in a log file are: the time of day, the node name, the process id,
the user name, the severity code, the server's log id, and the message text.
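A log line such as those in the excerpt can be split into its fields with a minimal parser; the field names below are our reading of the excerpt, not an official schema:

```python
def parse_log_line(line):
    # e.g. "10:03:42 sphinx.fnal.gov 006849 moibenko I FILC File Clerk (re)starting"
    # Split on whitespace, keeping everything after the sixth field
    # together as the free-form message text.
    time_of_day, node, pid, user, severity, log_id, message = line.split(None, 6)
    return {'time': time_of_day, 'node': node, 'pid': int(pid),
            'user': user, 'severity': severity, 'id': log_id,
            'message': message}
```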
1.3.7 Media Changer
The Media Changer mounts and dismounts media into and from the drives at the
request of the Mover. One Media Changer can serve multiple drives and libraries.
When the drives are in a robot, the Media Changer is the interface to the robotic software.
Tape Cleaning
The Media Changer is not directly involved with tape cleaning. The EMASS AMU and the
STK ACSLS tape library systems keep tape drive usage statistics and automatically
mount cleaning tapes. The Media Changer will not issue mount requests during the cleaning process.
Tape statistics
The Media Changer does not keep tape drive or cartridge statistics. Summary
statistics are not very useful, and the Media Changer does not run on the
machine connected to the tape drive. The overall repository for tape and drive
statistics is OCS; an Enstore interface to it has not yet been designed.
1.3.8 Inquisitor
The Inquisitor obtains information from the Enstore system and creates the
following reports using this information:
The reports are updated periodically, based on timeout values in the Enstore
config file that direct the Inquisitor to gather each server's information at
a specific frequency. Each Enstore server may have its own timeout value. For
example, the Inquisitor may be instructed to gather information from the
file_clerk every 60 seconds but from the log_server every 135 seconds. The
plots, however, are not updated automatically; they are regenerated by a
user-initiated command or, for example, by a cron job. The information for
plotting is obtained from the log files.
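The per-server gathering frequencies can be sketched as a simple due-time check; the server names and intervals below are illustrative:

```python
# Sketch of per-server polling: each server is gathered on its own
# interval (e.g. file_clerk every 60 s, log_server every 135 s).
def due_servers(last_polled, intervals, now):
    """Return the servers whose gathering interval has elapsed.

    last_polled: {server_name: time of last gather}
    intervals:   {server_name: gathering interval in seconds}
    """
    return sorted(s for s, t in last_polled.items()
                  if now - t >= intervals[s])
```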
SERVER                INFORMATION GATHERED                                        REPORTS AFFECTED
-                     blocksizes                                                  continuous ascii status file and html snapshot file
-                     encp command history                                        continuous ascii status file and encp html snapshot file
Alarm Server          alive status                                                continuous ascii status file and html snapshot file
Configuration Server  alive status                                                continuous ascii status file and html snapshot file
File Clerk            alive status                                                continuous ascii status file and html snapshot file
Inquisitor            alive status; refetch config file from config server        continuous ascii status file and html snapshot file
Library Manager(s)    alive status; suspect volume list; mover list; work queues  continuous ascii status file and html snapshot file
Log Server            alive status                                                continuous ascii status file and html snapshot file
Media Changer(s)      alive status                                                continuous ascii status file and html snapshot file
Mover(s)              alive status; Mover activity status                         continuous ascii status file and html snapshot file
Volume Clerk          alive status                                                continuous ascii status file and html snapshot file
1.3.8.1 Command Line Control of the Inquisitor
Inquisitor functionality may be controlled through a command line interface using
enstore. A summary of the supported commands is given below. In addition
to the following commands, the Inquisitor command line interface supports the
general commands supported by all other servers. Any particular server name
mentioned in the table below may be replaced by any legal server name.
FUNCTION: get the maximum size of the ascii status file
COMMAND:  enstore ic --get_max_ascii_size
OUTPUT:   maximum ascii size

FUNCTION: get the maximum number of encp status lines displayed
COMMAND:  enstore ic --get_max_encp_lines
OUTPUT:   maximum number of encp lines

FUNCTION: get the html status file auto refresh rate
COMMAND:  enstore ic --get_refresh
OUTPUT:   refresh time

FUNCTION: get the frequency for monitoring the Volume Clerk
COMMAND:  enstore ic --get_timeout volume_clerk
OUTPUT:   volume_clerk timeout value

FUNCTION: get the frequency for looking for work
COMMAND:  enstore ic --get_timeout
OUTPUT:   Inquisitor wakeup time

FUNCTION: reset the maximum size of the ascii status file
COMMAND:  enstore ic --max_ascii_size=40000

FUNCTION: reset the maximum number of encp status lines displayed
COMMAND:  enstore ic --max_encp_lines=13

FUNCTION: recreate the Inquisitor plots
COMMAND:  enstore ic --plot

FUNCTION: recreate the Inquisitor plots, keep the data files and put them in /tmp
COMMAND:  enstore ic --plot --keep --keep_dir=/tmp

FUNCTION: recreate the Inquisitor plots and put the plot files in /tmp
COMMAND:  enstore ic --plot --out_dir=/tmp

FUNCTION: recreate the Inquisitor plots and use the log files located in the specified directory
COMMAND:  enstore ic --plot --logfile_dir=/tmp/logs

FUNCTION: recreate the Inquisitor plots and only plot information after the specified start_time
COMMAND:  enstore ic --plot --start_time=1998-12-25

FUNCTION: recreate the Inquisitor plots and only plot information before the specified stop_time
COMMAND:  enstore ic --plot --stop_time=1998-12-31

FUNCTION: recreate the Inquisitor plots and only plot information between the specified times
COMMAND:  enstore ic --plot --start_time=1998-12-01 --stop_time=1998-12-31

FUNCTION: reset the html status file auto refresh rate
COMMAND:  enstore ic --refresh=60

FUNCTION: reset the frequency for monitoring the alarm_server to the value in the config file
COMMAND:  enstore ic --reset_timeout alarm_server

FUNCTION: reset the frequency for looking for work to the value in the config file
COMMAND:  enstore ic --reset_timeout

FUNCTION: reset the frequency for monitoring the file_clerk
COMMAND:  enstore ic --timeout=55 file_clerk

FUNCTION: reset the frequency for looking for work
COMMAND:  enstore ic --timeout=10

FUNCTION: close the current ascii status file and open a new one
COMMAND:  enstore ic --timestamp

FUNCTION: monitor the Log Server now
COMMAND:  enstore ic --update log_server

FUNCTION: monitor all the servers now
COMMAND:  enstore ic --update
1.3.8.2 Inquisitor Config File Values
The Inquisitor looks for the following values in the Inquisitor section of the
Enstore config file. The default value is used if the dictionary element is not
found. Dictionary elements with no default must be specified in the Enstore
config file. All frequencies are specified in seconds.
alive_rcv_timeout - seconds to wait for response to alive request (default: 5)
alive_retries - times to retry alive request (default: 2)
ascii_file - directory for ascii status file(s) (default: ./)
default_server_timeout - frequency to monitor servers not listed in timeouts (default: 60)
host - node where the Inquisitor runs (no default)
html_file - directory for html status files (default: ./)
logname - ascii value used for id in messages to the Log Server (default: INQS)
max_ascii_size - maximum allowed size (bytes) of the ascii status file (no default)
max_encp_lines - maximum number of encp lines to display (default: 50)
port - udp port for Inquisitor communication (no default)
refresh - frequency for auto-refresh of the html status page (default: 120)
robot_adic_log_dir - location of adic log files to point to in the Inquisitor log page (NOTE: replace 'adic' with other text to add a link to a different log directory) (no default)
timeout - frequency with which the Inquisitor looks for work (default: 5)
timeouts - dictionary of frequencies for monitoring each server (no default)
The value in the individual server dictionary element will take precedence over
the value in the Inquisitor dictionary element.
configdict['inquisitor'] = { 'alive_rcv_timeout' : 5,
'alive_retries' : 1,
'ascii_file' : '/tmp',
'default_server_timeout' : 15,
'host' : 'rip7',
'html_file' : '/fnal/ups/prd/www_pages/enstore/',
'http_log_file_path' : '/enstore/log/',
'logname' : 'INQSRV',
'max_ascii_size' : 100000000,
'port' : 7505,
'robot_adic_log_dir' : '/enstore/adiclog/',
'timeout' : 10,
'timeouts' : {'ait.library_manager': 15},
'www_host' : 'http://rip8.fnal.gov:' }
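Resolving the effective monitoring interval from such a configuration can be sketched as follows; the helper name is ours, and the final fallback of 60 is the documented default for default_server_timeout:

```python
def monitoring_interval(inquisitor_cfg, server_name):
    """Per-server 'timeouts' entry takes precedence over the
    Inquisitor-wide 'default_server_timeout' (which defaults to 60)."""
    timeouts = inquisitor_cfg.get('timeouts', {})
    return timeouts.get(server_name,
                        inquisitor_cfg.get('default_server_timeout', 60))
```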
1.3.8.3 Example Inquisitor Reports
These examples reflect a running system on the rip cluster.
1.3.8.3.1 Example Ascii Status File
This file records a continuous history of the status of the Enstore system as
monitored by the Inquisitor. It contains the following information:
ENSTORE SYSTEM STATUS
DC03MAM.mover : timed out on (rip1, 7552) at 1999-May-27 13:50:34
last alive at ----
DC04MAM.mover : timed out on (rip1, 7553) at 1999-May-27 13:50:34
last alive at ----
DC05MAM.mover : timed out on (rip1, 7554) at 1999-May-27 13:50:34
last alive at ----
DC06MAM.mover : timed out on (rip1, 7555) at 1999-May-27 13:50:34
last alive at ----
DM07AIT.mover : timed out on (ripsgi, 7556) at 1999-May-27 13:50:34
last alive at ----
DM08AIT.mover : timed out on (ripsgi, 7557) at 1999-May-27 13:50:34
last alive at ----
DM09AIT.mover : timed out on (ripsgi, 7558) at 1999-May-27 13:50:34
last alive at ----
DM10AIT.mover : timed out on (ripsgi, 7559) at 1999-May-27 13:50:34
last alive at ----
DM11AIT.mover : timed out on (ripsgi, 7560) at 1999-May-27 13:50:34
last alive at ----
DM12AIT.mover : timed out on (ripsgi, 7561) at 1999-May-27 13:50:34
last alive at ----
adicr1.media_changer : alive on (rip10, 7521) at 1999-May-27 13:50:34
adicr1TOM.media_changer : alive on (rip10, 9521) at 1999-May-27 13:50:34
ait.library_manager : alive on (rip5, 7512) at 1999-May-27 13:50:34
SUSPECT VOLUMES : NONE
KNOWN MOVER PORT STATE LAST SUMMONED TRY COUNT
DM12AIT.mover 7561 idle_mover 1999-May-26 00:19:39 0
DM08AIT.mover 7557 idle_mover 1999-May-26 00:19:39 0
DM11AIT.mover 7560 idle_mover 1999-May-26 00:19:39 0
DM07AIT.mover 7556 idle_mover 1999-May-26 00:19:39 0
DM10AIT.mover 7559 idle_mover 1999-May-26 00:19:39 0
DM09AIT.mover 7558 idle_mover 1999-May-26 00:19:39 0
No work at movers
No pending work
alarm server : alive on (rip10, 7503) at 1999-May-27 13:50:34
blocksizes : diskfile : 512, exabyte : 102400, DECDLT : 102400,
floppy : 512, cartridge : 512, redwood : 102400,
cassette : 512, 8MM : 102400
config server : alive on (131.225.164.14, 7500) at 1999-May-27 13:50:34
disk.library_manager : alive on (rip7, 7510) at 1999-May-27 13:50:34
SUSPECT VOLUMES : NONE
KNOWN MOVER PORT STATE LAST SUMMONED TRY COUNT
disk1.mover 7530 idle_mover 1999-May-26 16:23:33 0
disk2.mover 7531 idle_mover 1999-May-26 16:23:32 0
No work at movers
No pending work
disk.media_changer : alive on (rip7, 7520) at 1999-May-27 13:50:34
disk1.mover : alive on (rip7, 7530) at 1999-May-27 13:50:34
Completed Transfers : 0, Current State : idle
Last Transfer : Read 0 bytes, Wrote 0 bytes
disk2.mover : alive on (rip7, 7531) at 1999-May-27 13:50:34
Completed Transfers : 0, Current State : idle
Last Transfer : Read 0 bytes, Wrote 0 bytes
dlt.library_manager : alive on (rip5, 7514) at 1999-May-27 13:50:34
SUSPECT VOLUMES : NONE
No moverlist
No work at movers
No pending work
encp : 15:44:11 on rip4.fnal.gov by bakken (Data Transfer Rate : 2.62 MB/S)
1073741824 bytes copied to CA2252 at a user rate of 2.08 MB/S
15:43:39 on rip4.fnal.gov by bakken (Data Transfer Rate : 2.62 MB/S)
1073741824 bytes copied to CA2258 at a user rate of 1.88 MB/S
15:40:36 on rip4.fnal.gov by bakken (Data Transfer Rate : 2.68 MB/S)
1073741824 bytes copied to CA2257 at a user rate of 1.79 MB/S
15:34:30 on rip4.fnal.gov by bakken (Data Transfer Rate : 2.63 MB/S)
1073741824 bytes copied to CA2252 at a user rate of 2.11 MB/S
15:34:00 on rip4.fnal.gov by bakken (Data Transfer Rate : 2.64 MB/S)
1073741824 bytes copied to CA2258 at a user rate of 2.03 MB/S
15:30:57 on rip4.fnal.gov by bakken (Data Transfer Rate : 2.67 MB/S)
1073741824 bytes copied to CA2257 at a user rate of 1.84 MB/S
file clerk : alive on (rip6, 7501) at 1999-May-27 13:50:34
inquisitor : alive on (rip7, 7505) at 1999-May-27 13:50:34
log server : alive on (rip10, 7504) at 1999-May-27 13:50:34
mam.library_manager : alive on (rip5, 7513) at 1999-May-27 13:50:34
SUSPECT VOLUMES : NONE
KNOWN MOVER PORT STATE LAST SUMMONED TRY COUNT
DC05MAM.mover 7554 idle_mover 1999-May-26 00:19:39 0
DC03MAM.mover 7552 idle_mover 1999-May-26 00:19:39 0
DC04MAM.mover 7553 idle_mover 1999-May-26 00:19:39 0
DC06MAM.mover 7555 idle_mover 1999-May-26 00:19:39 0
No work at movers
No pending work
null.library_manager : alive on (rip7, 7511) at 1999-May-27 13:50:34
SUSPECT VOLUMES : NONE
KNOWN MOVER PORT STATE LAST SUMMONED TRY COUNT
null2.mover 7533 idle_mover 1999-May-26 16:23:33 0
null1.mover 7532 idle_mover 1999-May-26 16:23:33 0
No work at movers
No pending work
null1.mover : alive on (rip7, 7532) at 1999-May-27 13:50:34
Completed Transfers : 0, Current State : idle
Last Transfer : Read 0 bytes, Wrote 0 bytes
null2.mover : alive on (rip7, 7533) at 1999-May-27 13:50:34
Completed Transfers : 0, Current State : idle
Last Transfer : Read 0 bytes, Wrote 0 bytes
volume clerk : alive on (rip6, 7502) at 1999-May-27 13:50:34
1.3.8.3.2 Example Html Status Snapshot File
The html snapshot file contains the last known status of the Enstore system.
As such it will be a repeat of the last set of information in the Ascii status
file, formatted for browsing and minus the encp information.
1.3.8.3.3 Example encp History Snapshot File
Each encp history line contains the following information:
ENSTORE SYSTEM STATUS
History of ENCP Commands

TIME      NODE           USER      BYTES       VOLUME   DATA TRANSFER RATE (MB/S)   USER RATE (MB/S)
15:19:22  rip8.fnal.gov  moibenko  21036       rip6-01  2.47                        0.04
15:18:59  rip8.fnal.gov  moibenko  21036       rip6-01  0.664                       0.0322
12:57:19  rip4.fnal.gov  bakken    1048576     CA2904   0.698                       0.00589
12:57:04  rip8.fnal.gov  bakken    1048576     CA2903   0.703                       0.00589
12:53:57  rip4.fnal.gov  bakken    1073741824  CA2905   2.7                         2.06
12:53:41  rip8.fnal.gov  bakken    1073741824  CA2903   2.7                         1.88
12:52:31  rip8.fnal.gov  bakken    1048576     CA2904   0.711                       0.0058
12:49:51  rip8.fnal.gov  bakken    104857600   CA2902   2.38                        0.496
1.3.8.3.4 Example Individual Transfer Activity Plot
This plot shows the history of individual transfers (and their size) over a
specified time interval. This includes both reads and writes.
1.3.8.3.5 Example Bytes Transferred/Day Plot
This plot shows the number of bytes transferred per day over a specified time
interval. This includes both reads and writes.
1.3.8.3.6 Example Mounts Per Hour Plot
This plot shows the number of mounts per hour for a single day.
1.3.8.3.7 Example Mount Latency Plot
This plot shows mount latencies.
1.3.9 Alarm Server
The Alarm Server maintains a record of alarms raised by other servers.
Since Enstore attempts error recovery whenever possible, any alarm that is
actually raised is expected to require human intervention to correct the
underlying problem.
Currently, alarms are raised when the following conditions are detected -
The Alarm Server compares a newly raised alarm with previously raised ones
so that the same alarm is not raised more than once.
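The duplicate-suppression behavior can be sketched as follows; the matching key (severity, root error, extra info) is our assumption, since the document does not specify exactly which fields the Alarm Server compares:

```python
class AlarmBook:
    """Sketch of the duplicate-suppression rule: an alarm identical to
    one already active is silently ignored. Illustrative only."""
    def __init__(self):
        self._active = {}

    def raise_alarm(self, severity, root_error, info):
        # Build a hashable key from the alarm's identifying fields.
        key = (severity, root_error, tuple(sorted(info.items())))
        if key in self._active:
            return None            # same alarm already raised; suppress
        self._active[key] = dict(info)
        return key

    def resolve(self, key):
        """Cancel an alarm so it may be raised again later."""
        self._active.pop(key, None)
```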
Raising an alarm means the following -
Currently it is only possible to cancel an alarm via the command line.
1.3.9.1 Command Line Control of the Alarm Server
Alarm Server functionality may be controlled through a command line interface
using enstore. A summary of the supported commands is given below. In
addition to the following commands, the Alarm Server command line interface
supports the general commands supported by all other servers.
FUNCTION: raise an alarm with a root error of UNKNOWN and a severity of WARNING
COMMAND:  enstore ac --alarm
OUTPUT:   None

FUNCTION: raise an alarm with the specified root error and a severity of WARNING
COMMAND:  enstore ac --alarm --root_error="root_error"
OUTPUT:   None

FUNCTION: raise an alarm with the specified severity and a root error of UNKNOWN
COMMAND:  enstore ac --severity=severity_value
OUTPUT:   None

FUNCTION: resolve the specified alarm
COMMAND:  enstore ac --resolve=unique_id
OUTPUT:   None

FUNCTION: get the name of the patrol file
COMMAND:  enstore ac --patrol_file
OUTPUT:   patrol file name
1.3.9.2 Alarm Server Config file Values
The Alarm Server looks for the following values in the Alarm Server section of
the
Enstore config file. The default value is used if the dictionary element is not
found. Dictionary elements with no default must be specified in the Enstore
config file.
host - node where the Alarm Server runs (no default)
logname - ascii value used for id in messages to the Log Server (default: ALARM_SERVER)
norestart - do not restart this server if it crashes (default: do a restart)
port - udp port for Alarm Server communication (no default)
1.3.9.3 Ascii Alarm File
The ascii alarm file that the Alarm Server creates stores all of the currently
raised alarms. When the Alarm Server is started, this file is read. Below is
an example file -
[927226812.665, 'rip7.fnal.gov', 13917, 'enstore', 'E', 'INQ_CHILD', 'CANTRESTART', {'server': 'DM12AIT.mover'}]
[927230044.586, 'rip7.fnal.gov', 18409, 'enstore', 'E', 'INQ_CHILD', 'CANTRESTART', {'server': 'DM07AIT.mover'}]
[927255672.586, 'rip7.fnal.gov', 835, 'enstore', 'E', 'INQ_CHILD', 'SERVERDIED', {'server': 'DM07AIT.mover'}]
[927255677.704, 'rip7.fnal.gov', 836, 'enstore', 'E', 'INQ_CHILD', 'SERVERDIED', {'server': 'DM08AIT.mover'}]
1.3.9.4 Patrol Alarm File
The Patrol alarm file that the Alarm Server creates stores all of the currently
raised alarms in a format that Patrol can parse. Below is an example file -
rip7 Enstore 'E' INQ_CHILD on rip7.fnal.gov - CANTRESTART
rip7 Enstore 'E' INQ_CHILD on rip7.fnal.gov - CANTRESTART
rip7 Enstore 'E' INQ_CHILD on rip7.fnal.gov - SERVERDIED
rip7 Enstore 'E' INQ_CHILD on rip7.fnal.gov - SERVERDIED
Patrol was developed at SLAC and enhanced and modified at DESY. It is
currently in use at these institutions and at Fermilab. We have begun
investigating its use in association with the Enstore system.
1.4 Server Protocols
1.5 Trace
Trace is a utility for tracing code execution via information saved in a
circular buffer that resides in shared memory and is accessible via special
commands. It was adapted from previous work, where it was used in real-time
environments, and is designed to have minimal impact on the performance of
individual components as well as on overall system performance.
Trace is widely used in all of the Enstore modules.
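The circular-buffer idea can be illustrated with a small in-process ring; the real Trace buffer lives in shared memory, so this sketch shows only the overwrite-the-oldest mechanics:

```python
class TraceRing:
    """Fixed-size ring that overwrites the oldest entries, so the cost
    of tracing stays constant. In-process illustration only; the real
    Trace buffer resides in shared memory."""
    def __init__(self, size):
        self._buf = [None] * size
        self._next = 0       # slot for the next entry
        self._count = 0      # total entries ever written

    def trace(self, level, message):
        self._buf[self._next] = (level, message)
        self._next = (self._next + 1) % len(self._buf)
        self._count += 1

    def dump(self):
        """Return entries from oldest to newest."""
        if self._count < len(self._buf):
            return self._buf[:self._next]
        return self._buf[self._next:] + self._buf[:self._next]
```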
2 Databases in Enstore
Enstore uses databases to store persistent information.
Aside from the databases associated with pnfs, there are two databases,
"file" and "volume", used by the File Clerk and the Volume Clerk respectively.
The directory that contains all database-related files is called the
"database directory" and is defined in the configuration.
2.1 Current Underlying Database Implemented in Enstore
The current Enstore implementation uses LIBTP (BSD DB v2.3,
http://www.sleepycat.com) as the underlying database product.
LIBTP is free for non-profit organizations such as Fermilab, and has the
following features:
A LIBTP-Python shelve-like interface was developed. It provides access to:
LIBTP was chosen based on the following considerations:
2.2 Backup and Recovery Procedures
2.2.1 Backup
Backup is a stand-alone procedure which can be performed manually at
any time or routinely via a cron job.
Currently, the files that are backed up are the database files, which contain
the persistent data; the log files, which record the transactions; and the
journal files, which are secondary transaction records kept to further aid
recovery of the database should the need arise.
Enstore does live backups: it copies these files to a designated directory
on a remote host.
The remote host and directory are defined in the Configuration Server.
The backup procedure will perform the following actions:
Libtp database
Volume journal files
File journal file
Archives creation
Archives cleanup
2.2.2 Recovery
Recovery (restore.py) is a job initiated manually in case of database
corruption.
restore.py
2.3 Administrative Tools
Administrative tools will exist in a layer on top of the current Enstore system
and as such will not require any redesign or reimplementation of existing code.
Administrative tools will provide the following operations:
In addition, tools will be provided to implement the administrative functions
mentioned in the D0 Functional Specification section of this document.
3 Communication Protocols
The base protocol for Enstore is UDP for "brief" messages and TCP for data
transfers.
UDP messages are all smaller than the maximum UDP packet size, so the
protocol is very simple.
3.1 Read Protocol
The communications performed during a read operation are illustrated in the
diagram below and described more fully in the following text.
3.2 Write Protocol
The communications performed during a write operation are illustrated in the
diagram below and described more fully in the following text.
4 Error Control
4.1 Assumptions about Errors
Enstore conforms to the (oral) statements made about Run II operating conditions:
4.2 Error Overview
4.3 Detailed Error Discussion
5 Volume Import and Export
It is certainly possible to import and export information by copying disk resident data
from Enstore using the encp command.
However, given the need to move large amounts of data, and the widespread
use of compatible tape drives and
media, it is usually more efficient to interchange tape volumes: that is, to write
tapes outside of Enstore and import them into the system, and to write tapes
inside of Enstore and remove them from the system. In this way, for example,
Enstore can be used as a kind of tape copy facility.
5.1 Volume Export
An experiment can make tapes in the Enstore system at Fermilab and
give the volumes to an experimenter, who can read the tapes anywhere.
Experimenters can optionally generate a metadata file which provides a
map of the exported tape. However,
some tape formats, such as CPIO, are sufficiently self-describing that the
tape can be dumped to disk with standard utilities.
#!/bin/sh
# en_dump_tape  A shell script to dump an Enstore CPIO-format tape.
# Illustrative sketch, not yet a supported utility.
tape=$1
test ! -f "$tape" || exit 1  ## Make sure we were not given a regular file instead of a tape device
while true ; do
    mt -f "$tape" fsf 1 || exit 1           # skip to the next file mark
    ( dd if="$tape" | cpio -idm ) || exit 1 # extract the next cpio archive
done
5.2 Volume Import
Volume import is suitable for repeated and sizeable transfers of data. It is
not as easy as export and is therefore not a good choice for small, occasional
transfers.
Planning is important.
Tapes in a robot require special labels that can be scanned automatically
when the tapes are placed in the robot. The labels must be unique within a
robot and within an Enstore system. You will need to procure tape volumes
with labels meeting these requirements. Imported tapes are read-only in
Enstore. (Details of these requirements are TBD, pending the serial media
working group decision.)
The procedure for importing a volume is:
6 Test System
The Enstore hardware test system was designed to test data movement at
rates comparable to the requirements for Run II data logging, to evaluate
gigabit networking technologies, and to determine scaling for the larger
amount of hardware required to support all of Run II data handling. The
ability to sustain data rates comparable to Run II data-logging requirements
allows the test system to double as the RIP (Reconstruction Input Pipeline)
test platform.
Two nodes are connected to tape drives on the EMASS robot. One node,
via a single wide differential SCSI bus, is connected to four Sony AIT
tape drives. The other is connected via a single bus to four Quantum
DLT 7000 drives, and two Exabyte Mammoth drives.
NODE   SERVERS
rip1   4-AIT Media Changers; 2-AIT Media Changers; 4-DLT Movers
rip2   4-AIT Movers
rip3   cluster console
rip4   General Use
rip5   Disk Library Manager; AIT Library Manager; DLT Library Manager; Mammoth Library Manager; Redwood-50 Library Manager; Redwood-20 Library Manager
rip6   Configuration Server; File Clerk; Volume Clerk; Admin Clerk; Log Server; Inquisitor; Alarm Server; Disk Mover; Disk Media Changer
rip7   General Use
rip8   General Use; Serves Home areas
rip9   2-STK Movers
rip10  4-DLT Media Changers; STK Media Changer
6.1 Test System Results
Test System Configuration:
2 AITs      (15 of 84 tapes allocated)
1 Mammoth   (15 of 42 tapes allocated)
2 DLTs      (15 of 84 tapes allocated)
2 Redwoods  (5 of 200 tapes allocated)
Preliminary Rate Measurements:

Device   Writing (MB/S)   Reading (MB/S)   Network (Mbits/S)   "Mem->tape" (MB/S)
AIT      2.7              2.7              94                  2.7
Mam      2.8              2.7              94                  2.8
DLT      4.9              4.8              94                  4.9
STK      8.8*             7.5*             83                  9.8
Preliminary CPU Utilization for AIT transfers:

          Mover     User's Encp
no crc    5-10%     5-10%
crc       10-20%    10-20%
7 Interfaces and Integration
Below is an (incomplete) summary of interfaces that are intrinsic to
the Enstore software. They are specified and coded as
part of the software project.
10-11 MB/S on a 100 mbps ethernet NIC
30-35 MB/S on a Gbit ethernet card (standard MTU)
70-80 MB/S on a Gbit ethernet card (jumbo MTU)
30-35 MB/S/CPU standard ethernet frames
78 MB/S/CPU "jumbo" (~9000 byte) frames.
(n.b. GB ethernet here...)
8 D0 Requirements
As stated in the introduction, D0, and most specifically, SAM, has been very helpful in setting the
direction for what is needed from Enstore. We believe a close and working
collaboration has been developed in which both SAM and Enstore have
profited.
In the following subsections, we present the requirements we have received
from D0 and indicate how we fulfill them. We want to work with D0 to satisfy
them all. This should be possible since we control the source code.
8.1 Summary of D0 Functional Specifications
The Functionality D0 expects of a Storage Management Layer
There are 6 major functional areas. They are described in more detail
and broken down further below.
1) Cataloging and Database Functions
1.1) Maintenance of the primary "database" of file
to tape volume information
1.2) Provide access to File namespace
and Volume Information
2) Specification and control of Tape storage locations
3) Control of parameters which govern the functional
behavior of the system
3.1) Control of parameters which govern allocation
and use of tape drives
3.2) Control of parameters which govern how files
are written to tape
3.3) Control of parameters which govern how files
are read from tape
3.4) Control of parameters which govern access
to files and volumes
3.5) Control of parameters which govern network
routing between storage system Movers and client machines
3.6) Ability to set defaults for many/most of
the above parameters
4) Management of the robot resources (including error
recovery and tracking)
5) Movement of Files
between users machine/local disk and tape in robot.
6) Operational procedures to run and manage the robot,
tape drives and "databases"
6.1) Robot and Tape Drive Hardware
6.2) Operator procedures for import/export of batches
of tapes
The policy for which one is chosen is TBD and will be up to the experiment.
6.3) Quality assurance procedures to assure integrity
of the data and metadata
8.2 Sam/Enstore Interface Notes.
Below are notes on the SAM/Enstore interface provided by D0.
For the most part, Enstore complies with these in its software; exceptions
are noted.
SAM/Enstore Interface
The SAM data access layer uses the command/executable provided by Enstore
to issue file commands.
encp <file in pnfs file space> <output file>
Request to Enstore
Implementation proposed
Rationale
Allow wild cards in input or output file spec. As each file
arrives some notification should be provided.
Enstore will implement notification by writing a message to stdout.
Permits a number of files to be supplied or dispatched serially with
one encp.
Allow list of comma delimited files in input or output file spec
Notification of each file arrival (or dispatch) as for wild cards.
Permits a number of files to be supplied or dispatched serially with
one encp.
At the end of each file transaction provide information about the physical
location of the file, its position on the tape, error/retries, which tape
drive it was written on.
This was originally discussed as being written to stdout along with
informational messages about the state of the copy job. Latest thoughts
appear to be to write all metadata related to the physical location of
the file and how it got there into a separate, but parallel pnfs file system,
into a file of the same name (we think?)
It is very convenient when doing queries in order to gather information
on files to optimize access patterns and when making reports, to
have all of the physical information on the files in the SAM Oracle file
and event catalog. Multiple pnfs query calls would be awkward and unsymmetric
with respect to files managed by SAM, but not stored in the Enstore Robot
space.
Allow additional parameters on the Enstore 'copy' command to control
the positioning of the job in the Enstore job queue. Initial priority,
Aging Delta Time and Priority Increment would be sufficient.
Exact implementation of the desired effect left to Enstore. Whether
at a certain priority a job becomes pre-emptive of a job already in progress
left for later stages of the project, after some experience with resource
allocation.
Need some degree of control over the ordering and priority of jobs
already submitted to the Enstore queue, in order to balance the flows of
data and minimize job latency where necessary, but without rigid allocation
of resources to particular access modes or projects
At the end of each file transaction provide information about the job
which copied the file - dwell time in queue, final priority, robot arm
wait time, file seek time, file transfer time and MBs, etc.
This is now going to be available in the parallel pnfs file metadata
file system
This information is needed by the Global Resource Manager in order
to feed into the algorithm which adjusts the rate of flow of jobs by access
mode.
When an Enstore job fails because of a tape error or a failure of the
receiving encp (or the network, or similar), the Enstore job queue should
be cleaned up appropriately.
We could live without this in the first implementation, but it would be
useful to determine the appropriate behavior for each of the possible
failure modes. We expect automatic retries when a tape cannot be read or
written in a particular drive, with the tape marked unreadable only after
it has been tried in n drives.
SAM does not wish to handle tape errors, tape statistics or retries;
it merely needs to note relevant information on the state of the media
and record the drive used in the File and Event Catalog.
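The expected retry policy (retry in other drives, blame the media only after it fails in n distinct drives) can be sketched as follows. Every name here is a hypothetical stand-in; Enstore's actual retry code is not shown in this document.

```python
# Sketch of the retry policy described above: retry a failed read in
# other drives, and declare the volume unreadable only after failures
# in n distinct drives (so a single bad drive is not blamed on the tape).
# All names are illustrative, not Enstore code.

def read_with_retries(volume, drives, try_read, max_drives=3):
    """try_read(volume, drive) returns data or raises IOError.
    Returns (data, failed_drives) on success."""
    failed = []
    for drive in drives[:max_drives]:
        try:
            return try_read(volume, drive), failed
        except IOError:
            failed.append(drive)
    # Failed in max_drives distinct drives: suspect the media itself.
    raise IOError("volume %s unreadable after drives %s" % (volume, failed))

# Example: the first drive fails, the second succeeds.
def flaky(volume, drive):
    if drive == "drive0":
        raise IOError("read error")
    return b"payload"

data, tried = read_with_retries("VOL001", ["drive0", "drive1"], flaky)
print(data, tried)  # b'payload' ['drive0']
```

Under this policy the caller (SAM) never sees the intermediate errors, matching its wish to record only the final media state and the drive used.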
If the STK robot and a couple of drives cannot be hooked up to an
Enstore test system by October 1, then Enstore needs to emulate the delays
of a robot (tape mount, file seek time, and file transfer time) in order
to test the Global Resource Manager.
Part of this is already implemented as a 'simple' model. Whether it is
adequate is unknown: it is not installed yet, and SAM has not tried it.
It is essential to simulate queuing for scarce resources: the tape drive
and the network bandwidth.
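A 'simple' delay model of the kind mentioned might look like the sketch below: constant stand-in delays for mount, per-file seek, and per-MB transfer. The constants and function names are placeholder assumptions, not measured STK robot figures or the model Enstore actually implemented.

```python
# Minimal emulation of robot/tape delays for testing a resource manager
# without real hardware. The delay constants are placeholder assumptions.

import time

MOUNT_SEC = 0.02             # stand-in for tape mount time
SEEK_SEC_PER_FILE = 0.01     # stand-in for positioning past one file
TRANSFER_SEC_PER_MB = 0.001  # stand-in for streaming rate

def emulated_copy(file_position, size_mb, sleep=time.sleep):
    """Sleep for the modeled mount + seek + transfer time and return
    the total delay; callers queue on the emulated 'drive' realistically."""
    delay = (MOUNT_SEC
             + SEEK_SEC_PER_FILE * file_position
             + TRANSFER_SEC_PER_MB * size_mb)
    sleep(delay)
    return delay

# Third file on the tape, 100 MB: 0.02 + 0.03 + 0.1 seconds.
print(round(emulated_copy(3, 100), 3))  # -> 0.15
```

Because the emulated drive actually blocks for the modeled time, contention for scarce resources (the drive, the network) emerges naturally when several test jobs run at once, which is what the Global Resource Manager test needs.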
8.3 Other D0, non-SAM Requests to Enstore
Besides the preceding requests from SAM, other D0 experimenters have
suggested features, based on their experience, that would make Enstore more
usable. We consider these requests valid and will try to implement
them; however, since some of the requests are outside the main
SAM framework, we assign them a lower priority, and, in the cases where they
conflict with the mainline SAM architecture, we will start them only after all
SAM requests have been satisfied and extensively tested.
9 WBS and Effort Estimates
The original plan for Enstore was to start with a working prototype,
representing about 6-8 months of effort, and evolve the code to the
initial release of the Enstore product for Run II. This approach has been
followed. There has been substantial overlap and simultaneous development
in all phases of the WBS plan. One of Enstore's original goals was to have a
working system during all phases of the development. This, too, has been
achieved, though it has slightly lengthened and sometimes constrained
overall development. (For example, incompatibilities between the original
file/volume database design and the current one led to extra effort to
allow simultaneous operation of both designs.) Overall, however, D0's
ability to test the Enstore software as it needed to provided valuable
feedback, allowing us to allocate effort to problem areas.
WBS  Task                                                         Resource    % Complete
  1  Storage Management                                                           24%
  2  Management - ongoing 50% of JAB                               JPP[50%]        0%
  3  D0 Liaison - ongoing                                          JPP[20%]        0%
  4  Hardware and Interface Problem Resolution - ongoing (DJH)     JPP[20%]       25%
  5  Working Operations                                                            0%
  6  Working OCS for operator mounts                                               0%
  7  Working Interface to Tape/Drive Repository                                    0%
  8  Working FTT with drive chosen by serial media working group                   0%
  9  Working Drives and Media in Robot for D0                                      0%
 10  Enstore V1                                                    JPP[450%]      51%
 11  Organization and Methods of Working                                          69%
 12  Packaging methods                                                            75%
 13  Coding standards                                                            100%
 14  Development tools                                                           100%
 15  Bug reporting and tracking procedure (GNATS)                                  0%
 16  Requirements                                                                 69%
 17  Input and specification from experiments                                     75%
 18  Input and specification from mss groups and operators                        75%
 19  Understand commonality of requirements and iterate                           75%
 20  Understand interfaces to other Run II Projects                               75%
 21  Understand testing dates and scope                                           75%
 22  Understand hardware constraints                                             100%
 23  Agree on change control mechanisms                                            0%
 24  Evolution of Prototype to Run II Product                                     47%
 25  Client server framework                                                     100%
 26  Communications protocol and errors                                          100%
 27  Robustness                                                                  100%
 28  Error handling philosophy                                                    75%
 29  Component Retries                                                            75%
 30  End-to-end recovery                                                          75%
 31  Fault tolerance and availability                                             75%
 32  Reliability                                                                  75%
 33  Encp framework                                                               83%
 34  Design evaluation                                                            75%
 35  Options and switch analysis                                                  75%
 36  Optimization                                                                 75%
 37  Binary distribution studies                                                 100%
 38  Improvements to Servers/Clients and Clerks Design                            71%
 39  Configuration server and clients                                             75%
 40  Library manager and clients                                                  75%
 41  Media Changer and clients                                                    50%
 42  Volume clerk and clients                                                     75%
 43  File clerk and clients                                                       75%
 44  Log server and clients                                                      100%
 45  Mover Modifications                                                          38%
 46  File wrappering - self describing, different types, etc                      50%
 47  Optimization                                                                 75%
 48  Read/Write Entire Volumes                                                     0%
 49  FTT - new drives to support                                                   0%
 50  Testing Framework                                                            63%
 51  Debug and integration framework                                              50%
 52  Configure Test Hardware Platform                                            100%
 53  Database framework                                                           45%
 54  Evaluation of underlying database choice                                     25%
 55  User Queries                                                                 25%
 56  Fault tolerance                                                              75%
 57  Backup                                                                       75%
 58  Admin tools                                                                  26%
 59  Pnfs                                                                         75%
 60  Web status                                                                   25%
 61  User queries and reports                                                     25%
 62  System and Tape Monitoring and Statistics (Patrol)                            0%
 63  Volume Import/Export                                                         0%
 64  Facility to Export/Eject Tapes from EMASS Robot                               0%
 65  Facility to Import Foreign Tapes to EMASS Robot                               0%
 66  Security, with respect to Fermilab Policy                                     0%
 67  Data protection, Authentication, Access                                       0%
 68  Accidents                                                                     0%
 69  Documentation                                                 JPP[600%]      50%
 70  Integration                                                  JPP              9%
 71  Integration with Experiment RIP and Production Farms                         25%
 72  Integration with Experiment Data Handling and Analysis                        0%
 73  Commissioning                                                                 0%
 74  Tuning                                                                        0%
 75  Enstore V2                                                    JPP[250%]       0%
 76  Support for Commissioning of D0 before run starts                             0%
 77  Addition of new features as discovered                                        0%
 78  Enstore V3                                                    JPP[250%]       0%
 79  Support for run                                                               0%
 80  New features discovered when there is beam                                    0%
 81  Ongoing Support                                               JPP             0%
10 Year 2000 Issues
Problems associated with two-digit year differences fall into three broad
categories for the Enstore project: