Functional Specification Document for SMF support for SGE

Version 1.0   26/03/2008   Lubomir Petrik

1 Introduction

This document describes the Sun Grid Engine's SMF support. SMF (Service Management Facility) is a new feature in Solaris 10. Its purpose is to speed up boot time, provide permanent services, and ease the administration of services and their dependencies.

2 Project Overview

2.1 Project Aim

After the project is completed, SGE should:

  1. Be controllable via SMF administrative commands as well as the old qconf interfaces (-km, -ke) on Solaris 10+. MANDATORY
  2. Support all existing commands, such as migrating the qmaster. MANDATORY
  3. Provide a -nosmf option in the installation scripts to disable the use of SMF. MANDATORY
  4. We might provide the RBAC roles solaris.smf.sge.access and solaris.smf.sge.modify and appropriate profiles SGE manager/SGE operator, so that the SGE manager and operator can obtain them during installation and are later allowed to start/stop SGE over SMF or modify the SGE SMF service manifest. This also means that the service manifest must already support and check for these roles. OPTIONAL
  5. In addition to 4, we might want to start each SGE process with only the necessary privileges, to increase security. OPTIONAL

2.2 Project Benefit

SMF support can improve boot time, SGE availability, and ease of administration for administrators who are used to working with SMF.

2.3 Project Duration

All estimates are net times, that is, full-time work on the project without interruption.

Core development:

Task        Duration in Man Weeks   Possible engineer   State
Research    1                       LP                  closed
Prepare     1                       LP                  closed
Integrate   1                       LP                  closed
Test        1                       LP                  closed

Additionally:

Task          Duration in Man Weeks   Possible engineer   State
Doc support   1/3                     LP                  open

2.4 Project Dependencies

Available   Supplier           Product/Project/Interface Dependency
now         Sun Microsystems   Solaris 10 OS

2.5 Milestones

  1. Research the SMF and SGE startup procedures
  2. Prepare all required information from SGE that SMF needs
  3. Integrate the SMF support to SGE start/stop/install/uninstall process
  4. Test and document

3 System Architecture

3.1 Enhancement Functions

3.2 Overall Block Diagram

4 Functional Definition

4.1 Performance

No impact on performance is expected. Booting the machine might be a bit faster.

4.2 Reliability, Availability, Serviceability (RAS)

Administrators will be able to control SGE startup via SMF.

4.3 Diagnostics

The new util/sgeSMF/sge_smf.sh command is added. It can be used to detect whether SMF is available. We might consider not documenting it, though. Administrators should use SMF commands to check whether SGE SMF services are present on the system.

4.4 User Experience

SMF will be used automatically on Solaris 10 hosts unless the -nosmf option is provided. The user will additionally have to name the cluster being installed. This cluster name will become part of the service name.

4.5 Manufacturing

4.6 Quality Assurance

4.6.1 Testsuite adjustments

The testsuite must know about a new installation option and should handle it correctly and test the correct behavior.

4.6.2 Testsuite tests

Testsuite tests will verify that a new sge_smf.sh wrapper script works. The new tests also include positive and negative tests for all options of this new command.

4.6.3 Auto installation test

The auto installation should be tested. The install template now contains the cluster name.

4.6.4 Backup the SMF files test?

No backup needed. We will not support backing up the SMF repository. That can be done by the administrator. SMF also supports snapshots of the service manifests.

4.7 Security & Privacy

4.8 Migration Path

After upgrading to version 6.2, the administrator will not be able to enable SMF support. The reason is that the hosts need to be reinstalled in order to register them as SMF services.

OPTIONAL: We could provide a script that connects to all existing hosts, removes the RC scripts (any customizations would be lost) and enables SMF from the new templates after the migration has finished.

4.9 Documentation

No new man page will be added; the sge_smf.sh command is a helper script and as such should not be used by users. Only the installation/user/administration guides will explain the SMF support.

4.10 Installation

The administrator can disable SMF support by adding the -nosmf option to the install script.

4.11 Packaging

New files in the distribution will be at $SGE_ROOT/util/sgeSMF:
sge_smf.sh - script for importing/deleting SGE services to/from the repository
sge_smf_support.sh - helper script for sge_smf
bdb_template.xml
qmaster_template.xml 
shadowd_template.xml
execd_template.xml 

At $SGE_ROOT/dbwriter/util/sgeSMF:

dbwriter_template.xml

4.12 Issues/Risks and Proposed Mitigation

  1. SMF needs a unique service name - this means we need to ask the user how to name the cluster
  2. Upgrade (migration) from RC to SMF will not be done automatically. See 4.8
  3. Getting the correct service dependencies might be difficult
  4. TBD

Category   Risk   Impact (L/M/H)   Probability (L/M/H)   Mitigation Plan   Owner

5 Component Descriptions

5.1 Component Service Management Facility (SMF)

5.1.1 Overview

SMF is a new feature in Solaris 10 that provides a unified model for controlling services. It replaces RC scripts, handles service dependencies, provides better service availability and speeds up the boot process.

5.1.2 Functionality

Installation of each daemon should import the appropriate service manifest into the SMF repository. Services will then be controlled by the SMF framework instead of RC scripts and startup scripts. Users who want to use SMF can no longer use the startup scripts.

Since SMF can define multiple instances of the same service we do the following:
Define a unique name for a service and ask the user to provide a cluster name during the installation which will become the service instance.

OPTIONAL: When we release an incompatible SGE version (increased CULL version) we should provide a new service name, since these services are no longer compatible. This also needs to be done when any changes between updates break functionality.

Qmaster service name example:
First release:
application/sge/qmaster:test
application/sge/qmaster:production
We release incompatible update:
application/sge_v62u1/qmaster:test

The reason why we should do this is that other service providers can depend on our service and might expect a certain version of the software or functionality to work. If we provide just the application/sge/qmaster service, any qmaster instance present on the system would satisfy such a dependency.

After a discussion we decided not to do this, as we don't expect anyone to depend on our services. If that happens we can still do it; for now we don't add SGE_VERSION to the service name.
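
As a small illustration of the naming scheme, the following sketch composes a service FMRI from a daemon name and a cluster name. The helper name sge_fmri and the cluster-name check are assumptions made for this example; the check is deliberately conservative and is not taken from the SMF documentation.

```shell
#!/bin/sh
# Sketch only: build svc:/application/sge/<daemon>:<cluster name>.
# sge_fmri is a hypothetical helper, not part of SGE or SMF.
sge_fmri() {
    daemon=$1
    cluster=$2
    case "$daemon" in
        qmaster|shadowd|execd|bdb|dbwriter) ;;
        *) echo "unknown daemon: $daemon" >&2; return 1 ;;
    esac
    # Assumed rule: start with a letter, then letters, digits, '_' or '-'.
    case "$cluster" in
        ''|[!A-Za-z]*|*[!A-Za-z0-9_-]*)
            echo "invalid cluster name: $cluster" >&2; return 1 ;;
    esac
    echo "svc:/application/sge/$daemon:$cluster"
}

sge_fmri qmaster test
sge_fmri qmaster production
```

Because the version is not part of the service name, two clusters on one host differ only in the instance part after the colon.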

Service manifests are stored in SMF repository database in XML. Qmaster service manifest template example:

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='export'>
  <service name='application/sge/qmaster' type='service' version='0'>
    <dependency name='network' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/milestone/network'/>
    </dependency>
    <dependency name='fs-autofs' grouping='optional_all' restart_on='none' type='service'>
      <service_fmri value='svc:/system/filesystem/autofs'/>
    </dependency>
    <instance name='test' enabled='false'>
      <exec_method name='start' type='method' exec='/grid/sge/default/common/sgemaster -qmaster %m' timeout_seconds='30'>
        <method_context>
          <method_environment>
            <envvar name='SGE_ROOT' value='/grid/sge'/>
            <envvar name='SGE_QMASTER_PORT' value='21636'/>
            <envvar name='SGE_CELL' value='default'/>
          </method_environment>
        </method_context>
      </exec_method>
      <exec_method name='stop' type='method' exec=':kill' timeout_seconds='60'/>
      <property_group name='startd' type='framework'>
        <propval name='ignore_error' type='astring' value='signal'/>
      </property_group>
    </instance>
    <stability value='Unstable'/>
    <template>
      <common_name>
        <loctext xml:lang='C'>Sun Grid Engine - QMaster service</loctext>
      </common_name>
      <documentation>
        <manpage title='sge_qmaster' section='8M' manpath='/grid/sge/man'/>
      </documentation>
    </template>
  </service>
</service_bundle>

NOTE: The service does not depend on the bootstrap file due to outstanding bugs in SMF. Instead, the startup script might check its presence and exit with $SMF_EXIT_ERR_CONFIG.
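
A minimal sketch of such a check in a start method, assuming the bootstrap file lives at $SGE_ROOT/$SGE_CELL/common/bootstrap. The function name check_bootstrap is hypothetical. On Solaris the exit codes are defined in /lib/svc/share/smf_include.sh; the fallback values here are only used where that file is missing.

```shell
#!/bin/sh
# Sketch only: pre-check a start method could run before starting qmaster.
# check_bootstrap is a hypothetical name, not an existing SGE function.
if [ -r /lib/svc/share/smf_include.sh ]; then
    . /lib/svc/share/smf_include.sh
fi
: "${SMF_EXIT_OK:=0}"
: "${SMF_EXIT_ERR_CONFIG:=96}"

# $1 = SGE_ROOT, $2 = SGE_CELL
check_bootstrap() {
    bootstrap="$1/$2/common/bootstrap"
    if [ ! -f "$bootstrap" ]; then
        echo "bootstrap file $bootstrap not found" >&2
        return "$SMF_EXIT_ERR_CONFIG"
    fi
    return "$SMF_EXIT_OK"
}

# A start method would typically do something like:
#   check_bootstrap "$SGE_ROOT" "$SGE_CELL" || exit $?
```

Exiting with the configuration error code tells SMF that restarting the service will not help until the administrator fixes the setup.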

5.1.3 Interfaces

Commands the administrator can use to control, query or customize services:
svccfg(1M)
svcadm(1M)
svcs(1)
svcprop(1)
To import the service manifests during installation, either root has to do it or the user has to possess an appropriate profile (Solaris Management/Operator or a custom profile, see item 4 in 2.1). The same applies to managing the services with svcadm. For SMF to be enabled in the SGE installation we will require root.
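
For orientation, the sketch below prints typical invocations of these commands for one qmaster instance. It only prints the commands and does not run them. The manifest path is a placeholder chosen for this example, and the instance name test is taken from the manifest example.

```shell
#!/bin/sh
# Sketch only: print (not execute) typical SMF commands for one SGE service.
# MANIFEST is a placeholder path chosen for this example.
FMRI="svc:/application/sge/qmaster:test"
MANIFEST="/tmp/qmaster_test.xml"

show_smf_commands() {
    echo "svccfg import $MANIFEST"                  # import the manifest into the repository
    echo "svcadm enable $FMRI"                      # start the service
    echo "svcs -l $FMRI"                            # show state and dependencies
    echo "svcprop -p startd/ignore_error $FMRI"     # query a single property
    echo "svcadm disable -t $FMRI"                  # stop until the next reboot
    echo "svccfg delete $FMRI"                      # remove the instance from the repository
}

show_smf_commands
```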

5.1.4 Other Requirements

None

5.2 Component Sun Grid Engine

5.2.1 Overview

Sun Grid Engine consists of several daemons that will now be controlled by SMF framework on Solaris 10+.

5.2.2 Functionality

The inst_sge script will always prompt for the cluster name, and this name will be used as the instance name for the service that is imported into the SMF repository. If the installation is done on another OS or OS version, or if -nosmf is passed to the script, SMF will not be used and the installation will remain the same as in the previous version (except for questions about new features). The installation will check whether such a service instance already exists and, if so, will require a new name.
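
The check for an existing service instance could look roughly like the following sketch. It relies on svcs returning a non-zero exit status when no instance matches the FMRI. The names instance_exists and pick_cluster_name are hypothetical, and the loop over candidate names is a non-interactive stand-in for the installer prompt.

```shell
#!/bin/sh
# Sketch only: reject cluster names that already have an SMF instance.
# SVCS_CMD can be overridden for testing; both function names are hypothetical.
SVCS_CMD=${SVCS_CMD:-svcs}

# $1 = daemon (e.g. qmaster), $2 = cluster name
instance_exists() {
    # svcs exits non-zero when the FMRI matches no service instance
    $SVCS_CMD "svc:/application/sge/$1:$2" >/dev/null 2>&1
}

# $1 = daemon, remaining arguments = candidate cluster names in order
pick_cluster_name() {
    daemon=$1
    shift
    for name in "$@"; do
        if instance_exists "$daemon" "$name"; then
            echo "cluster name '$name' is already in use" >&2
        else
            echo "$name"
            return 0
        fi
    done
    return 1
}
```

On a host without svcs the lookup simply fails, so every name is treated as free; a real installer would first check that SMF is available.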

During installation the SMF support script sge_smf.sh is called. It is located in the util/sgeSMF directory, wraps the SMF interfaces, and exposes just a simple client interface.

sge_smf.sh register|unregister|supported|help

The options of this command are described in the Interfaces section of this component.

The sge_smf.sh command is designed to be callable inside the inst_sge script as well as a standalone command.
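
A minimal sketch of how a client with these four commands might dispatch them. The function names, the printed messages and the way SMF availability is detected are assumptions for illustration; the real sge_smf.sh may differ.

```shell
#!/bin/sh
# Sketch only: command dispatch for the four documented commands.
# All function names and messages are hypothetical.
smf_client_usage() {
    echo "usage: sge_smf.sh register|unregister|supported|help"
}

smf_supported() {
    # Assumption: SMF counts as available when svcadm can be found.
    command -v svcadm >/dev/null 2>&1 || [ -x /usr/sbin/svcadm ]
}

smf_client() {
    case "$1" in
        register)   echo "would import the service manifest" ;;
        unregister) echo "would delete the service from the repository" ;;
        supported)
            if smf_supported; then
                echo "yes"
            else
                echo "no"
                return 1
            fi ;;
        help)       smf_client_usage ;;
        *)          smf_client_usage >&2; return 2 ;;
    esac
}

smf_client help
```

Keeping the logic in functions is one way to make the same code callable from inst_sge and from the command line.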

SGE

The inst_sge changes
inst_sge will be extended to include the "Enter unique cluster name" dialog and will call SMF functions for registering the service when an SMF-supported system is detected.

The new util/sgeSMF/sge_smf.sh client interface added
The sge_smf command script will be added. Please see the functionality section.

The new util/sgeSMF/qmaster_template.xml file
Template for qmaster service manifest.
The new util/sgeSMF/shadowd_template.xml file
Template for shadowd service manifest.
The new util/sgeSMF/bdb_template.xml file
Template for BDB server service manifest.

The new util/sgeSMF/execd_template.xml file
Template for execd service manifest.

In addition, auto_install must handle the cluster name.
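
The manifest templates have to be filled with installation-specific values before they are imported. A minimal sketch of such a substitution step follows; the @@...@@ placeholder syntax and the function name render_template are assumptions, and the shipped templates may use a different convention.

```shell
#!/bin/sh
# Sketch only: fill a manifest template read from stdin.
# The @@...@@ placeholders are an assumed convention, not the shipped one.
# $1 = SGE_ROOT, $2 = SGE_CELL, $3 = cluster name, $4 = qmaster port
render_template() {
    # '|' is the sed delimiter because SGE_ROOT contains '/';
    # values containing '|' or '&' would need escaping first.
    sed -e "s|@@SGE_ROOT@@|$1|g" \
        -e "s|@@SGE_CELL@@|$2|g" \
        -e "s|@@SGE_CLUSTER_NAME@@|$3|g" \
        -e "s|@@SGE_QMASTER_PORT@@|$4|g"
}

render_template /grid/sge default test 21636 <<'EOF'
<instance name='@@SGE_CLUSTER_NAME@@' enabled='false'>
  <envvar name='SGE_ROOT' value='@@SGE_ROOT@@'/>
  <envvar name='SGE_CELL' value='@@SGE_CELL@@'/>
  <envvar name='SGE_QMASTER_PORT' value='@@SGE_QMASTER_PORT@@'/>
</instance>
EOF
```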

QMASTER

svc:/application/sge/qmaster:<SGE_CLUSTER_NAME>

Since HA is by default ensured by configuring an optional shadowd that takes over the qmaster's functionality, we should not automatically restart the QMASTER service. In real HA scenarios (deployment in Sun Cluster) restarting is not desired either; in this case SC is responsible for restarting the service.

OLD (read 6.1) BEHAVIOUR with RC scripts:

sgemaster stop, qconf -km, kill -15    Correct shutdown, service does not start
kill -9                                Incorrect shutdown, service does not start
reboot                                 Service starts

SMF BEHAVIOUR:

NOTE: kill -9 will no longer shut down qmaster; SMF will restart it.

svcadm disable -t qmaster:<SGE_CLUSTER_NAME>    The correct way in SMF to stop the service without turning off automatic startup after reboot. The service is stopped correctly.

The old interfaces can still be used, as they simulate the old behaviour:

sgemaster stop, qconf -km, kill -15    Correct shutdown, qmaster under SMF handles the SIGTERM and temporarily disables the service instance (SAME)
kill -9                                Incorrect shutdown, SMF detects the interrupted service and restarts it (DIFFERENT)
reboot                                 Service starts

SHADOWD

svc:/application/sge/shadowd:<SGE_CLUSTER_NAME>

Uses the same scripts and functionality as QMASTER. The logic is unchanged: shadowd takes over if it detects that no qmaster is alive.

EXECD

svc:/application/sge/execd:<SGE_CLUSTER_NAME>

OLD BEHAVIOUR with RC scripts:

sgeexecd stop, qconf -kej    Correct shutdown, service does not start
qconf -ke, kill -15          Correct shutdown, service does not start
kill -9                      Incorrect shutdown, service does not start
reboot                       Service starts

SMF BEHAVIOUR:

Because in 6.1 the shepherds are part of the execd contract, the service remains online after execd is killed until the last job has finished. We need to implement a new behavior (see below). To do this we need to use both libcontract, to start shepherds in a new contract on SMF-supported systems, and libscf, to distinguish between the kill -15 and qconf -ke <host> scenarios. Such behavior is the desired and correct one from the SMF point of view.

NOTE: kill -9 will no longer shut down execd; SMF will restart it.

Once sgeexecd is the only service in the contract, we can have this desired behavior:

svcadm disable -t execd     Correct shutdown, jobs are NOT terminated
sgeexecd stop               Correct shutdown, detects if SMF is in use and calls svcadm disable -ts, job shepherds are terminated, jobs NOT terminated
kill -9                     Incorrect shutdown, SMF detects the interrupted service and restarts it, DIFFERENT (NEW behavior)
kill -15, qconf -ke/-kej    Correct shutdown, we need to use libscf to directly change the service state to temporarily disabled
reboot                      Service starts
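
The "sgeexecd stop" case can be sketched as follows: the script first checks whether an execd instance for the cluster is registered with SMF, and if so delegates the shutdown to svcadm. The function name execd_stop, the overridable command variables and the return code 3 for the non-SMF path are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: how "sgeexecd stop" might choose its shutdown path.
# SVCS_CMD/SVCADM_CMD can be overridden; execd_stop is a hypothetical name.
SVCS_CMD=${SVCS_CMD:-svcs}
SVCADM_CMD=${SVCADM_CMD:-svcadm}

# $1 = cluster name
execd_stop() {
    fmri="svc:/application/sge/execd:$1"
    if $SVCS_CMD "$fmri" >/dev/null 2>&1; then
        # -t: temporary, startup after reboot stays enabled
        # -s: wait until the instance is actually disabled
        $SVCADM_CMD disable -ts "$fmri"
    else
        # Arbitrary code telling the caller to use the pre-SMF shutdown.
        echo "execd is not under SMF control" >&2
        return 3
    fi
}
```

Going through svcadm keeps SMF's view of the service state consistent, so SMF does not restart an execd that was stopped on purpose.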

DBWriter

svc:/application/sge/dbwriter:<SGE_CLUSTER_NAME>

After installation DBWriter will now always be started, and the Java process ReportingDBWriter will be the only process running in the contract after startup.

SMF will now restart DBWriter if it detects that it is not running, unless sgedbwriter stop or svcadm disable -t dbwriter:<SGE_CLUSTER_NAME> was issued.

The inst_dbwriter changes
inst_dbwriter will be extended to call SMF functions for registering the service when an SMF-supported system is detected. The cluster name from the SGE installation will be reused. The user will not be able to select it; it will always be read from the $SGE_ROOT/$SGE_CELL/common/cluster_name file.
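
Reading the cluster name from that file could be sketched as below. The function name read_cluster_name is hypothetical, and the sketch assumes the file holds the name on its first line.

```shell
#!/bin/sh
# Sketch only: read the cluster name written by the SGE installation.
# read_cluster_name is a hypothetical helper; one name on line 1 is assumed.
# $1 = SGE_ROOT, $2 = SGE_CELL
read_cluster_name() {
    file="$1/$2/common/cluster_name"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    name=
    # read strips surrounding whitespace; it returns non-zero when the
    # last line has no trailing newline, which is not an error here.
    read -r name < "$file" || true
    if [ -z "$name" ]; then
        echo "$file is empty" >&2
        return 1
    fi
    echo "$name"
}
```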

The new dbwriter/util/sgeSMF/dbwriter_template.xml file
Template for dbwriter service manifest.

Berkeley RPC server

svc:/application/sge/bdb:<SGE_CLUSTER_NAME>

After installation BDB will now always be started.

SMF will now restart BDB if it detects that it is not running, unless sgebdb stop or svcadm disable -t bdb:<SGE_CLUSTER_NAME> was issued.

The new util/sgeSMF/bdb_template.xml file
Template for bdb service manifest

5.2.3 Interfaces

SMF will be automatically used on Solaris 10+ systems, unless -nosmf is provided as an argument to the installation scripts. If SMF is used svcadm, svcs, etc. commands should be used to enable/disable/customize SGE services.
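
The decision whether to use SMF can be sketched as a small function over the uname output and the -nosmf flag. The function name should_use_smf and its argument convention are assumptions; the sketch relies only on the fact that Solaris 10 reports SunOS release 5.10.

```shell
#!/bin/sh
# Sketch only: decide whether the installer should use SMF.
# $1 = `uname -s` output, $2 = `uname -r` output, $3 = "true" if -nosmf was given
should_use_smf() {
    if [ "$3" = "true" ]; then
        return 1
    fi
    [ "$1" = "SunOS" ] || return 1
    case "$2" in
        5.*) minor=${2#5.} ;;
        *)   return 1 ;;
    esac
    case "$minor" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$minor" -ge 10 ]     # Solaris 10 reports release 5.10
}

if should_use_smf "$(uname -s)" "$(uname -r)" false; then
    echo "SMF will be used"
else
    echo "SMF will not be used"
fi
```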

DBWriter

DBWriter startup will be changed so that only the Java process remains running. The startup script sgedbwriter will incorporate the dbwriter.sh options, and dbwriter.sh will become just a wrapper around sgedbwriter. Both will now have the same interface:

usage: sgedbwriter [-debug] [-debug_port <port>] [print <setting>] [-h] [start|stop]
    start   start the dbwriter as background process (default)
    stop    stop the dbwriter

    -debug              start the dbwriter in debug mode
    -debug_port <port>  port for debugging (default 8000)

    print   dbwriter setting is printed to stdout.
            The following settings are available:

               pid_file   print the default pid file
               log_file   print the default log file
               spool_dir  print the default spool directory
    -h      this help text is printed

All other functionality and user experience should be unchanged.

DBWriter startup logic should be further improved so that we can check that the service is actually running before returning from the SMF start method.
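
One possible way to make a start method wait until the service is really up is to poll the pid file (as reported by "sgedbwriter print pid_file") until it names a live process. This is a sketch under that assumption; the function name wait_for_service and the one-second polling interval are hypothetical.

```shell
#!/bin/sh
# Sketch only: wait until a pid file names a live process, or give up.
# wait_for_service is a hypothetical helper.
# $1 = pid file, $2 = number of one-second attempts
wait_for_service() {
    pid_file=$1
    tries=$2
    while [ "$tries" -gt 0 ]; do
        if [ -s "$pid_file" ]; then
            pid=
            read -r pid < "$pid_file" || true
            # kill -0 sends no signal, it only checks that the process exists
            if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
                return 0
            fi
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

The attempt count should stay below the start method's timeout_seconds so that the method can report the failure itself.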

The util/sgeSMF/sge_smf.sh client

The sge_smf.sh command might offer enable/disable for the qmaster|execd|... services as an alternative to the svcadm enable/disable commands.

It will provide register/unregister to import/delete a service manifest to/from the repository, and supported to query whether the system is capable of using SMF.

5.2.4 Other Requirements

None