The basic format of this command is one of the following:
encp <input file> <destination directory in pnfs space>
encp <file in pnfs file space> <output file>
The exact syntax of the above may still change somewhat, but the details are immaterial here.
The following enhancements have been requested and (we think) agreed
to by Enstore.
Request to Enstore | Implementation proposed | Rationale |
Allow wild cards in input or output file spec. As each file arrives some notification should be provided. | Enstore will implement notification by writing a message to stdout. | Permits a number of files to be supplied or dispatched serially with one encmd. |
Allow a comma-delimited list of files in input or output file spec. | Notification of each file arrival (or dispatch) as for wild cards. | Permits a number of files to be supplied or dispatched serially with one encmd. |
At the end of each file transaction provide information about the physical location of the file, its position on the tape, errors/retries, and which tape drive it was written on. | This was originally discussed as being written to stdout along with informational messages about the state of the copy job. The latest thinking appears to be to write all metadata related to the physical location of the file and how it got there into a separate, but parallel, pnfs file system, into a file of the same name (we think?). | When running queries to gather information on files, in order to optimize access patterns and to make reports, it is very convenient to have all of the physical information on the files in the SAM Oracle file and event catalog. Multiple pnfs query calls would be awkward, and asymmetric with respect to files managed by SAM but not stored in the Enstore Robot space. |
Allow additional parameters on the enstore 'copy' command to control the positioning of the job in the enstore job queue. Initial Priority, Aging Delta Time and Priority Increment would be sufficient. | Exact implementation of the desired effect is left to Enstore. Whether a job at a certain priority pre-empts a job already in progress is left for later stages of the project, after some experience with resource allocation. | Need some degree of control over the ordering and priority of jobs already submitted to the enstore queue, in order to balance the flows of data and minimize job latency where necessary, but without rigid allocation of resources to particular access modes or projects. |
At the end of each file transaction provide information about the job which copied the file: dwell time in queue, final priority, robot arm wait time, file seek time, file transfer time and MBs, etc. | This is now going to be available in the parallel pnfs metadata file system. | This information is needed by the Global Resource Manager in order to feed into the algorithm which adjusts the rate of flow of jobs by access mode. |
When an enstore job fails because of a tape error or failure of the receiving encmd (or network or whatever), the job queue of enstore should be cleaned up appropriately. | Could live without this in the first implementation, but it would be nice to determine what is appropriate behavior in each of the possible failure modes. We are expecting automatic retries when a tape cannot be read or written in a particular drive, with the tape marked as unreadable only if tried in n drives. | SAM does not wish to handle tape errors, tape statistics or retries, merely to note relevant information on the state of the media and record the drive used in the File and Event Catalog. |
If the STK robot and a couple of drives cannot be hooked up with an enstore test system by October 1, then Enstore needs to emulate the delays of a robot for Tape mount, File seek time, and File transfer time, in order to test the Global Resource Manager. | Part of this is already implemented as a 'simple' model. Is this adequate? It is not installed yet, and SAM has not tried it. | Essential to simulate queuing for scarce resources: the tape drive and the network bandwidth. |
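
The queue-positioning request above names three parameters (Initial Priority, Aging Delta Time, Priority Increment) but leaves the exact implementation to Enstore. As an illustration only, the sketch below shows one way those parameters could combine, assuming simple linear aging: a job's priority rises by the increment once per elapsed aging interval. The names QueuedJob, effective_priority and next_job are hypothetical and are not part of Enstore or encp.

```python
from dataclasses import dataclass


@dataclass
class QueuedJob:
    """Hypothetical queue entry; field names mirror the requested parameters."""
    name: str
    initial_priority: int     # priority assigned at submission
    aging_delta_time: float   # seconds that must elapse per priority bump
    priority_increment: int   # amount added to the priority per bump
    submitted_at: float       # submission time, in seconds

    def effective_priority(self, now: float) -> int:
        """Assumed linear aging: one increment per full aging interval waited."""
        if self.aging_delta_time <= 0:
            return self.initial_priority
        waited = max(0.0, now - self.submitted_at)
        bumps = int(waited // self.aging_delta_time)
        return self.initial_priority + bumps * self.priority_increment


def next_job(queue, now):
    """Pick the job with the highest effective priority; ties go to the oldest."""
    return max(queue, key=lambda job: (job.effective_priority(now), -job.submitted_at))


if __name__ == "__main__":
    # A low-priority job that ages versus a higher-priority job that does not.
    aging = QueuedJob("aging", 5, 60.0, 1, 0.0)
    fixed = QueuedJob("fixed", 7, 60.0, 0, 0.0)
    for now in (0.0, 180.0):
        print(now, next_job([aging, fixed], now).name)
```

Under this assumed scheme a low-priority job cannot wait indefinitely behind higher-priority work, which is the kind of latency control the rationale asks for without rigidly allocating resources. Pre-emption of running jobs is deliberately not modeled, since the text defers that decision.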