DRMAA C binding library for Grid Engine

Overview

The purpose of this library is to provide an implementation of the DRMAA C binding standard based on the Grid Engine DRM system. The best way to describe DRMAA is to cite the DRMAA-WG Charter:

For more details on the Global Grid Forum standards body working group, refer to the page of the GGF DRMAA working group.

Documentation

The following DRMAA-related documentation is recommended: the DRMAA 1.0 language-independent specification, the DRMAA C language binding specification, the Grid Engine/DRMAA job state mapping table, and the HTML version of the Grid Engine DRMAA C binding man pages (only those in section 3 with the drmaa_* prefix).

Availability

DRMAA is not, and will not be, available for the current 5.3 release. For 6.0 it is planned to deliver everything needed to integrate applications with Grid Engine via DRMAA, including a DRMAA library that can be linked dynamically. At this stage DRMAA is available as a source snapshot from the "maintrunk" of Grid Engine. To get this source snapshot, use the cvs option "-r VMAIN_STABLE_3_TAG" when you check out the sources as described under Download Source.

In this source snapshot the following Grid Engine binary architectures are known to be working:

There are three build targets that are related to DRMAA:

Building

The steps described on the Grid Engine build page must also be followed to build the DRMAA build targets. For Grid Engine binary architectures where the port is already done, the command

compiles all regular Grid Engine binaries plus DRMAA build targets. To enforce compilation of DRMAA build targets for not yet ported binary architectures

can be used. To compile all Grid Engine binaries except the DRMAA build targets

can be used.

Upon successful build, the DRMAA targets are found in the "*_MT" directories. Build targets in these directories are compiled so that they can be used in multi-threaded applications. Because of the JAPI service thread (see implementation of JAPI library), this is necessary for major parts of the code that implements DRMAA. For the same reason a good amount of Grid Engine source code has been reworked to make it reentrant (where possible) or at least thread safe. As a result of this effort, MT safe functions are marked with comments such as

It is important to understand that all these comments rest on major assumptions. The current major assumptions are

If these assumptions are not met, the DRMAA library cannot be expected to work properly and will most probably crash! To overcome these constraints it is necessary, first, to link Grid Engine against the corresponding MT safe libraries (zlib, openssl, ...). Second, those parts of the Grid Engine source code that make use of those libraries must be reviewed and changed where necessary.
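The kind of rework meant by making code "reentrant" can be illustrated with a minimal sketch. The functions below are hypothetical and are not taken from the Grid Engine sources; they only contrast a helper that returns a pointer to a shared static buffer with a variant that writes into a buffer owned by the caller. The sketch covers reentrancy only; making shared state "at least thread safe" would typically involve a lock instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT from the Grid Engine sources.
 * Non-reentrant variant: the result lives in one static buffer shared
 * by all callers, so every call overwrites the previous result. */
const char *job_label_static(int job_id)
{
    static char buf[32];
    snprintf(buf, sizeof(buf), "job-%d", job_id);
    return buf;
}

/* Reentrant variant: the caller supplies the buffer, so the function
 * keeps no state between calls and concurrent callers do not interfere. */
const char *job_label_r(int job_id, char *buf, size_t len)
{
    snprintf(buf, len, "job-%d", job_id);
    return buf;
}

/* Small self-check that demonstrates the difference; returns 0 on success. */
int label_demo(void)
{
    char a[32], b[32];
    const char *first = job_label_static(1);
    const char *second = job_label_static(2);

    /* Both pointers refer to the same static storage, which now holds
     * the result of the second call. */
    assert(first == second);
    assert(strcmp(first, "job-2") == 0);

    /* Caller-owned buffers keep independent results. */
    job_label_r(1, a, sizeof(a));
    job_label_r(2, b, sizeof(b));
    assert(strcmp(a, "job-1") == 0);
    assert(strcmp(b, "job-2") == 0);
    return 0;
}
```

Calling label_demo() from a test program exercises both variants. The caller-supplied buffer pattern is the same one used by standard functions such as strtok_r(), as opposed to strtok().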

Example application

The C test program example.c is a good example of an application that uses the DRMAA C binding interface. It illustrates submission of both single and bulk remote jobs. After submission, the drmaa_synchronize() call is used to synchronize with the execution of the remote jobs; it returns after all the jobs have finished executing. Finally, the drmaa_wait() call is used to retrieve and print out the execution information of the remote jobs.

The full path of the remote command is passed as the first argument to the test program. That value is used directly as the "drmaa_remote_command" job template attribute. The C binding example uses the value "5" as the first argument in the job template vector attribute "drmaa_v_argv". Passing "/bin/sleep" as the first argument to the test program will, for example, cause 32 sleep jobs to be run, each sleeping for "5" seconds before finishing execution. Note that the "/bin/sleep" command is expected to be found on all of the remote nodes.

To build this example run

This is necessary because "example" is not part of the standard build procedure.

Bugs

You can file bugs against the "drmaa" subcomponent in Grid Engine Issuezilla. Before doing so, please have a look at the TODO list below and check for already submitted "drmaa" bugs. If you are not sure whether you have found a bug, sending mail to the dev@gridengine.sunsource.net mailing list is recommended.

Notes on implementation

The Grid Engine implementation of DRMAA is based on the Grid Engine Job API library (JAPI). As a result, the DRMAA library inherits MT safety from the JAPI library.

The interface operations of the Grid Engine JAPI library and the DRMAA library have a lot in common. The big difference between them is that JAPI allows any characteristic of a Grid Engine job submitted through JAPI to be influenced, whereas DRMAA allows only those characteristics to be influenced that are exposed by the DRMAA interface. Note that this apparent drawback of the DRMAA library is less significant than it seems. First, the DRMAA job template attributes "drmaa_job_category" and "drmaa_native_specification" allow nearly all of the qsub(1) job characteristics to be influenced. Second, by building an integration upon the DRMAA library you do not need to change your application code each time a new Grid Engine version is released. When using the JAPI library interface you must be prepared to change your application code in this case, because JAPI directly exports Grid Engine internal data structures that are usually subject to change with new releases.

Grid Engine DRMAA library QA

The DRMAA application "test_drmaa" is used by the Grid Engine testsuite QA tool to ensure the quality of the DRMAA library. It covers a growing number of tests. Some of them are so-called standalone tests that the test_drmaa application can carry out on its own. To run all these tests standalone, "test_drmaa" can be started with the "ALL_AUTOMATED" option plus some option arguments. To run all DRMAA-related tests implemented so far, choose "api/drmaa" in testsuite and have testsuite do the work for you.

Copyright 2003 Sun Microsystems, Inc. All rights reserved.