QRSH - Queue Remote Shell

Overview

qrsh starts a job much as qsub does, with the following differences:

stdin, stdout and stderr are not redirected to files but connected to the caller's I/O streams, usually the current terminal
binaries can be started
the return code of the executed command is propagated to the caller by qrsh
if no command is specified, an rlogin session is started
if the special command-line option -inherit is specified, qrsh starts a subtask in an existing Grid Engine parallel job
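
Because qrsh propagates the return code of the executed command, a calling script can treat it like any local command. The following is a minimal Python sketch of that idea, not part of Grid Engine itself. The helper name run_and_propagate is hypothetical, and the Python interpreter stands in for qrsh so that the example runs without a cluster; on a real installation the argument list would start with qrsh.

```python
import subprocess
import sys
from typing import List


def run_and_propagate(argv: List[str]) -> int:
    """Run a command with inherited stdin/stdout/stderr and return its exit code.

    Hypothetical helper: on a cluster, argv would be something like
    ["qrsh", "my_command"]; here any local command serves as a stand-in.
    """
    # No redirection is requested, so the child shares the caller's streams,
    # which mirrors how qrsh connects the job to the current terminal.
    completed = subprocess.run(argv)
    return completed.returncode


# Stand-in for a remote command that fails with exit code 3.
STAND_IN = [sys.executable, "-c", "import sys; sys.exit(3)"]
```

A caller would typically finish with sys.exit(run_and_propagate(argv)) so that its own exit status matches that of the remote command.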

qrsh uses the rsh/rshd mechanism (or a similar tool such as ssh) to start the remote process and redirect I/O.

If nothing else is configured, qrsh starts $SGE_ROOT/utilbin/$ARCH/rsh, and the daemon used is $SGE_ROOT/utilbin/$ARCH/rshd. This rshd is derived from NetBSD code and extended to allow process control and the collection of usage information (see also 3rdparty/remote).

To use the system rsh/rshd or another mechanism, set the values rsh_client and rsh_daemon (for remote execution) or rlogin_client and rlogin_daemon (for remote login) in the cluster configuration.

Control flow

Remote execution

If a user submits a job with qrsh, the following actions are taken:

The command line is parsed and split into Grid Engine options and the command line to be executed.
A job object is created.
The job is submitted (communicated to qmaster).
qrsh waits for the job to be started; at regular intervals it requests the job status from qmaster to detect whether the job has been deleted in the meantime.
qmaster sends the order to start the job to execd.
execd starts a shepherd.
The shepherd contacts qrsh over a socket connection and passes the execution host and the port on which an rshd will be started.
qrsh forks and executes an rsh command that connects to the specified host and port on the execution host, then waits for the command to exit.
On the execution side, rshd starts a qrsh_starter command.
qrsh_starter sets up the job's environment, starts the user's login shell and executes the specified command line.
After the command exits, qrsh_starter writes the command's exit code to a file and exits; rshd then exits.
The shepherd collects job information such as usage and exit code, and communicates the exit code to qrsh.
qrsh exits with the exit code of the command, or with an error if an error occurred in the mechanism.
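
The step in which the shepherd reports the execution host and the daemon port back to qrsh can be illustrated with plain sockets. This is a sketch under assumptions: the "host:port" message format, the function names and the sample values node17 and 40123 are invented for illustration and do not reflect the actual Grid Engine wire protocol.

```python
import socket
import threading
from typing import Tuple


def qrsh_wait_for_shepherd(server: socket.socket) -> Tuple[str, int]:
    """Client side: accept one connection and parse a 'host:port' line.

    The line-based message format is hypothetical.
    """
    conn, _ = server.accept()
    with conn:
        data = b""
        while not data.endswith(b"\n"):
            chunk = conn.recv(1024)
            if not chunk:  # peer closed the connection
                break
            data += chunk
    host, port = data.decode().strip().rsplit(":", 1)
    return host, int(port)


def shepherd_report(qrsh_addr: Tuple[str, int], exec_host: str, port: int) -> None:
    """Execution side: tell the waiting client where the daemon listens."""
    with socket.create_connection(qrsh_addr) as sock:
        sock.sendall(f"{exec_host}:{port}\n".encode())


def demo_handshake() -> Tuple[str, int]:
    """Run both roles on localhost and return what the client learned."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    addr = server.getsockname()
    reporter = threading.Thread(
        target=shepherd_report, args=(addr, "node17", 40123)
    )
    reporter.start()
    try:
        return qrsh_wait_for_shepherd(server)
    finally:
        reporter.join()
        server.close()
```

With the returned host and port, the client would then fork and execute the rsh command described in the steps above.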

Remote login

If a user submits a login session with qrsh (rlogin), the following actions are taken:

The command line is parsed (Grid Engine options).
A job object is created.
The job is submitted (communicated to qmaster).
qrsh waits for the job to be started; at regular intervals it requests the job status from qmaster to detect whether the job has been deleted in the meantime.
qmaster sends the order to start the job to execd.
execd starts a shepherd.
The shepherd contacts qrsh over a socket connection and passes the execution host and the port on which an rlogind will be started.
qrsh forks and executes an rlogin command that connects to the specified host and port on the execution host, then waits for the command to exit.
On the execution side, rlogind spawns a login process, which creates a login shell.
After the login shell exits, rlogind exits.
The shepherd collects job information such as usage and communicates the end of the job to qrsh.
qrsh exits with exit code 0, or with an error if an error occurred in the mechanism.
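
Both flows share the waiting step, in which qrsh asks qmaster for the job status at regular intervals so that it notices a deleted job instead of waiting forever. A minimal sketch of such a loop follows. The status strings "pending", "running" and "deleted" and the get_status callable are hypothetical placeholders for the real qmaster request.

```python
import time
from typing import Callable


def wait_for_job_start(
    get_status: Callable[[], str],
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the job runs (True) or turns out to be deleted (False).

    get_status stands in for the status request sent to qmaster; the
    status strings are assumptions made for this sketch.
    """
    while True:
        status = get_status()
        if status == "running":
            return True
        if status == "deleted":
            return False
        # Job still pending: wait one interval before asking again.
        sleep(interval)
```

Passing the sleep function as a parameter keeps the loop easy to exercise without real delays.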

Process hierarchy

Client side

qrsh forks an rsh or an rlogin command. If rsh is to handle stdin (that is, the -nostdin option was not passed to qrsh), rsh forks another child process that handles stdin.

qrsh -> rsh -> rsh

qrsh -> rlogin

Execution side

Standard

In the standard case, the command is executed in the user's login shell:

execd -> shepherd -> rshd -> qrsh_starter -> loginshell -> command

Without login shell

If the option -noshell is passed to qrsh, the command will be executed directly without a wrapping login shell.

execd -> shepherd -> rshd -> qrsh_starter -> command

With wrapper script

A wrapper script can be specified that, for example, sets up a special environment such as a ClearCase view.

The wrapper script is defined through the environment variable QRSH_WRAPPER.

execd -> shepherd -> rshd -> qrsh_starter -> wrapper -> command
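
A wrapper of this kind could look roughly like the following Python sketch. It assumes that qrsh_starter invokes the wrapper with the command and its arguments as the wrapper's own arguments; the text above does not specify the calling convention, so treat that as an assumption. The variable MY_TOOL_HOME and its value are purely illustrative.

```python
import os
from typing import Dict, List, Mapping


def wrapper_environment(base_env: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the environment with the special settings added."""
    env = dict(base_env)
    # Hypothetical setting; a real wrapper would prepare whatever the
    # job needs, for example the environment for a ClearCase view.
    env["MY_TOOL_HOME"] = "/opt/mytool"
    return env


def main(argv: List[str]) -> int:
    """Replace the wrapper process with the wrapped command.

    Assumes argv[1:] holds the command and its arguments.
    """
    if len(argv) < 2:
        return 2  # nothing to execute
    # execvpe does not return on success, so the command takes the
    # wrapper's place directly below qrsh_starter in the process chain.
    os.execvpe(argv[1], argv[1:], wrapper_environment(os.environ))
    return 1  # not reached; execvpe raises OSError on failure
```

To use the sketch as an actual script, it would end with sys.exit(main(sys.argv)) after importing sys, and QRSH_WRAPPER would hold the path to that script.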

rlogin case

execd -> shepherd -> rlogind -> login shell