Spooling framework

Idea

Spooling is done through a spooling framework that can have different implementations, e.g. spooling to ascii files or to a database.

In a first step, spooling for monitoring and accounting is done in a separate event client that subscribes to a certain number of object types and simply spools them through the spooling framework.

Qmaster still spools its own ascii files. If the spooling framework proves to be stable, switch qmaster to use the spooling framework and let the Grid Engine admin decide which spooling type to use.

If qmaster is set to spool into a database and a common production and reporting database is to be used, the event client is not needed.



Spooled Objects – current implementation

There is one implementation for each object type. For reading, most objects use a common function call, read_object.

| Object | Implementation | Structure | Comment |
|---|---|---|---|
| Accounting | daemons/qmaster/job_exit.c, clients/qacct/qacct.c | Ascii file, one line per record, fixed delimiter | Nothing to do. The same information can come from spooling with history. |
| Calendar | common/read_write_cal.c | Ascii file per object, one whitespace separated name/value per line | |
| Checkpoint Environment | common/read_write_ckpt.c | Ascii file per object, one whitespace separated name/value per line | sublist: queues, only names, could be stored as string |
| Cluster configuration | common/rw_configuration.c | Ascii file per object, one whitespace separated name/value per line | Probably merge with host objects |
| Complex | common/sge_complex.c | Ascii file per complex, one line per complex attribute, whitespace separated fields | Need rules for spooling of complex attributes. On/Off. Min, Max, Avg in a certain interval. |
| History | common/complex_history.c | Directory for hosts and queues, one file per timestamp, complex file format | Nothing to do. The same information can come from spooling with history. |
| Host | common/read_write_host.c | Ascii file per object, one whitespace separated name/value per line | Admin and submit hosts only contain one attribute, the name. Admin-/Exec-/Submit- hosts are different objects. Should be merged into one object. |
| Hostgroup | common/read_write_host_group.c | | Not active |
| Job | daemons/common/read_write_job.c | Directory structure, multiple binary files (cull packing buffer). Job script is stored separately. | |
| Manager, Operator | daemons/qmaster/read_write_manop.c | Ascii files, one line per user name | Should better be attribute of a user object |
| Messages | | Ascii files, one line per record, fixed delimiter | No real objects at the moment. But each message has a structure well suited for storage in database tables. |
| Parallel Environment | common/read_write_pe.c | Ascii file per object, one whitespace separated name/value per line | sublist: queues, only names, could be stored as string |
| Project | common/read_write_userprj.c | Ascii file per object, one whitespace separated name/value per line | Usage and longterm usage are sublists. Stored as name/value pairs: cpu, mem, io, finished jobs. Could also be stored as single attributes. |
| Queue | common/read_write_queue.c | Ascii file per object, one whitespace separated name/value per line | Qtype is stored as bitfield, spooled as list of type identifiers. sublists: thresholds (name/value pairs), owner (string list), user (string list), xuser (string list), subordinates (string list), complexes (string list), complex_values (name/value pairs), projects (string list), xprojects (string list) |
| Sharetree | common/sge_sharetree.c | One ascii file, references by node ids within the file | |
| User | common/read_write_userprj.c | Ascii file per object, one whitespace separated name/value per line, special format for project related data | |
| Usermapping | common/read_write_ume.c | | Not active |
| Userset | common/read_write_userset.c | Ascii file per object, one whitespace separated name/value per line | |
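Most of the object types listed above share the "one whitespace separated name/value per line" layout. As an illustration only, the following sketch shows how a single line of such a file could be split. The function name parse_name_value and the sample attribute names are assumptions made for this example and are not taken from the existing read_write_*.c sources.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: splits "name   value text\n" into name and value.
 * Returns 1 on success, 0 for blank lines, lines without a value,
 * or when one of the output buffers is too small. */
static int parse_name_value(const char *line, char *name, size_t nsize,
                            char *value, size_t vsize)
{
    size_t n = 0, len;

    assert(line && name && value && nsize > 0 && vsize > 0);

    /* skip leading whitespace, then copy the name up to the next blank */
    while (isspace((unsigned char)*line))
        line++;
    while (*line && !isspace((unsigned char)*line)) {
        if (n + 1 >= nsize)
            return 0;
        name[n++] = *line++;
    }
    name[n] = '\0';

    /* the rest of the line, without surrounding whitespace, is the value */
    while (isspace((unsigned char)*line))
        line++;
    len = strlen(line);
    while (len > 0 && isspace((unsigned char)line[len - 1]))
        len--;
    if (n == 0 || len == 0 || len >= vsize)
        return 0;
    memcpy(value, line, len);
    value[len] = '\0';
    return 1;
}
```

A reader for a whole object file would call this once per line and hand each pair to the code that sets the corresponding attribute.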




Implementation

Types of spooling

Spooling is done in a certain spooling context.

A spooling context defines how objects are spooled.

Multiple spooling contexts can be used within one process.
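One possible way to model a spooling context is a small struct that carries a function pointer for the actual write operation. The sketch below is an assumption, not the framework's API: the type spool_context, the functions write_to_file, write_to_db and spool_write, and the strings they produce are all made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: a context bundles the function and state that decide
 * how an object is written. */
typedef struct spool_context {
    const char *name;                 /* e.g. "files" or "database" */
    int (*write_object)(struct spool_context *ctx,
                        const char *type, const char *key);
    char last[128];                   /* records the last write (demo only) */
} spool_context;

/* Stand-in for a file based implementation. */
static int write_to_file(spool_context *ctx, const char *type, const char *key)
{
    snprintf(ctx->last, sizeof ctx->last, "file:%s/%s", type, key);
    return 0;
}

/* Stand-in for a database implementation. */
static int write_to_db(spool_context *ctx, const char *type, const char *key)
{
    snprintf(ctx->last, sizeof ctx->last, "db:INSERT %s %s", type, key);
    return 0;
}

/* Callers only use this function; the context decides the destination. */
static int spool_write(spool_context *ctx, const char *type, const char *key)
{
    assert(ctx != NULL && ctx->write_object != NULL);
    return ctx->write_object(ctx, type, key);
}
```

Because all state lives in the context, one process can hold a file context and a database context side by side and write the same object through both.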

Examples for spooling types/destinations:

Further information stored in a spooling context:

Spooling of sublists

Many Grid Engine object types contain sublists.

In the current implementation, these hierarchical data structures are stored in different ways:



For the new implementation, we'll have to differentiate between file based formats and database storage.

For file based storage, we should use the following strategies:

For database storage, we should use the following strategies:

| reference type | current implementation | new filebased | new database |
|---|---|---|---|
| referencing objects | object id from cull | object id from cull | object id, either from cull or database internal serial number |
| list of references | string list or cull sublist | string list | mapping table |
| name/value pairs | string list or cull sublist | string list | mapping table with value |
| subordinate objects | special format or spool in cull binary format | break up such hierarchies (e.g. possible in the user object) or store data in additional files or directory structure and reference these files | store them in additional tables and make them reference their superior object |
| job hierarchy | directory hierarchy | directory hierarchy | subordinate objects reference superior objects |
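To make the "list of references" case concrete, the following sketch contrasts the two proposed targets: a string list for file based spooling and one mapping table row per reference for database spooling. The function names, the table name pe_queue_map and its columns are assumptions for this example only. The SQL text is built without escaping, which a real implementation would have to add.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* File based: join referenced object names into one comma separated value.
 * Returns 0 on success, -1 if the buffer is too small. */
static int refs_to_string(const char *const *names, size_t n,
                          char *buf, size_t size)
{
    size_t used = 0;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        int len = snprintf(buf + used, size - used, "%s%s",
                           i ? "," : "", names[i]);
        if (len < 0 || (size_t)len >= size - used)
            return -1;
        used += (size_t)len;
    }
    return 0;
}

/* Database: one mapping table row per reference.
 * Table and column names are made up; values are NOT escaped here. */
static int ref_to_insert(const char *owner, const char *ref,
                         char *buf, size_t size)
{
    int len = snprintf(buf, size,
                       "INSERT INTO pe_queue_map (pe, queue) VALUES ('%s', '%s')",
                       owner, ref);
    return (len < 0 || (size_t)len >= size) ? -1 : 0;
}
```

Name/value pairs would follow the same pattern, with an additional value column in the mapping table.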


Spooling policies dependent on component

Current implementation

In the current implementation we have different spooling policies dependent on the component that does spooling.

Main spooling component is the qmaster.

Execd also spools jobs and related information, e.g. queue or parallel environment information.

The related information reflects the status of the spooled object at the time the job was delivered to execd.

It is also possible that execd spools different job attributes than qmaster does.

Suggestions for a new implementation

Different approaches are possible to address this issue. The following will discuss some ideas.

Multiple writing instances to one global database

All daemons use a common database. The execds can write directly to the database. Qmaster is notified about changes by the database.

Pros:

Cons:

Probably not an option for the near future.

Restrict to file spooling in execd

Each execd has its own area for spooling, usually file based, either on a local disk (recommended) or via NFS mount.

Use formats that allow the spooling of hierarchical data, i.e. either cull binary format or XML format.

As execd spools information in a different way than qmaster (not all attributes or other attributes, a different strategy for sublists), the spooling implementation has to provide means to override the default spooling strategies defined for certain object types, or two spooling strategies have to be defined per object type.
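A minimal sketch of the first variant, a default strategy per object type that a component can replace, could look as follows. All identifiers (spool_format, object_type, component_policy, effective_format) and the default formats chosen here are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical formats; SPOOL_USE_DEFAULT marks "no component setting". */
typedef enum {
    SPOOL_USE_DEFAULT = 0,
    SPOOL_TEXT,
    SPOOL_CULL_BINARY,
    SPOOL_XML
} spool_format;

typedef enum { OBJ_JOB, OBJ_QUEUE, OBJ_PE, OBJ_COUNT } object_type;

/* Defaults per object type (values chosen for the example only). */
static const spool_format default_format[OBJ_COUNT] = {
    SPOOL_CULL_BINARY,   /* OBJ_JOB */
    SPOOL_TEXT,          /* OBJ_QUEUE */
    SPOOL_TEXT           /* OBJ_PE */
};

/* Per component settings, e.g. one for qmaster and one for execd. */
typedef struct {
    const char  *component;
    spool_format format[OBJ_COUNT];
} component_policy;

/* Returns the component's format if it sets one, else the default. */
static spool_format effective_format(const component_policy *policy,
                                     object_type type)
{
    assert(type >= 0 && type < OBJ_COUNT);
    if (policy != NULL && policy->format[type] != SPOOL_USE_DEFAULT)
        return policy->format[type];
    return default_format[type];
}
```

With this layout, execd only lists the object types it treats differently; everything else falls through to the shared defaults.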

Pros:

Cons:

Cull enhancements

Definition of attributes

The cull definition will have to contain information about which fields have to be spooled and how sublists are spooled.

Replace the many similar definitions for the same data type by a combination of flags. Example:

There are currently 14 definitions for the string datatype (SGE_STRING, SGE_STRINGH, SGE_STRING_HU, SGE_KSTRING, ...)



A list element definition like

SGE_KULONGH(JB_job_number)

could be replaced by

SGE_ULONG(JB_job_number, HASH | UNIQUE | SPOOL | QIDL_K)

or

SGE_LIST_ELEMENT(JB_job_number, ULONG | HASH | UNIQUE | SPOOL | SHOW | QIDL_K)



A keyword DEFAULT could be used if no special settings are made for a type.



The descriptor field mt has lots of free space (it currently uses only 4 bits of a 32 bit integer for the data types) that could hold the following additional information:

Probably we should use a prefix like CULL_ or SGE_ to ensure uniqueness, e.g. CULL_HASH instead of HASH.
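The following sketch shows how type and flags could share one 32 bit mt value, with the data type in the low 4 bits and CULL_ prefixed flags above them. The bit positions, the numeric type values and the helper names mt_make, mt_type and mt_has are assumptions of this example, not the existing cull definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Low 4 bits hold the data type; everything above is free for flags. */
#define CULL_TYPE_MASK 0x0000000Fu

/* Hypothetical numeric values for two data types. */
enum { CULL_T_ULONG = 3, CULL_T_STRING = 8 };

/* Hypothetical flag bits, placed directly above the type bits. */
#define CULL_DEFAULT 0u
#define CULL_HASH    (1u << 4)
#define CULL_UNIQUE  (1u << 5)
#define CULL_SPOOL   (1u << 6)
#define CULL_SHOW    (1u << 7)

/* Combine a data type and a set of flags into one mt value. */
static uint32_t mt_make(unsigned type, uint32_t flags)
{
    assert((type & ~CULL_TYPE_MASK) == 0);    /* type fits into 4 bits */
    assert((flags & CULL_TYPE_MASK) == 0);    /* flags do not touch type */
    return type | flags;
}

/* Extract the data type again. */
static unsigned mt_type(uint32_t mt)
{
    return mt & CULL_TYPE_MASK;
}

/* Test a single flag. */
static int mt_has(uint32_t mt, uint32_t flag)
{
    return (mt & flag) != 0;
}
```

Existing code that only reads the data type keeps working as long as it masks mt with the type mask before comparing.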

Tracking of changed attributes

To be able to interface a database using mechanisms like SQL, each object must know which attributes have changed. Otherwise, the whole object has to be spooled on each spooling function call, even if only a few attributes have changed or the object hasn't changed at all.

This could be achieved by wrapping the lMultiType enum type in a struct and reserving one bit for the changed state of the attribute.

Alternatively, a bitfield containing this information could be added to the lListElem data type; this would consume less memory.

Attribute names

A set of attribute names is generated using the NAMEDEF macros for each object type.

These attribute names have very limited use in the current implementation: they are only used for debugging purposes (lWrite* function calls).

For spooling, information output and configuration changes we also need attribute names. At the moment these names are hardcoded in the spooling, output and parsing functions.

It would be better to extend the existing NAMEDEF macros to create struct objects containing both the internal attribute name and an attribute name to be used for the other purposes.
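A sketch of such a name table is shown below. The macro ATTR_NAME, the struct attr_name, the lookup functions, the numeric attribute values and the external names are assumptions for illustration; the existing NAMEDEF macros are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry: one record per attribute. */
typedef struct {
    int         nm;         /* numeric attribute id */
    const char *internal;   /* name used for debugging output */
    const char *external;   /* name used for spooling, output and parsing */
} attr_name;

/* Stringifies the identifier, so the internal name cannot drift. */
#define ATTR_NAME(id, ext) { id, #id, ext }

/* Numeric values are made up for this example. */
enum { JB_job_number = 1, JB_job_name, JB_owner };

static const attr_name job_names[] = {
    ATTR_NAME(JB_job_number, "job_number"),
    ATTR_NAME(JB_job_name,   "job_name"),
    ATTR_NAME(JB_owner,      "owner")
};

/* Returns the external name for an attribute id, or NULL if unknown. */
static const char *attr_external_name(const attr_name *table, size_t n, int nm)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].nm == nm)
            return table[i].external;
    return NULL;
}

/* Returns the attribute id for an external name, or -1 if unknown. */
static int attr_find_by_external(const attr_name *table, size_t n,
                                 const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(table[i].external, name) == 0)
            return table[i].nm;
    return -1;
}
```

Spooling, output and parsing code could then share this one table instead of each carrying its own hardcoded strings.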

Functions

create_spooling_context

free_spooling_context



spool_prepare

spool_commit



spool_object

spool_attribute
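The function names above do not come with signatures or semantics. Purely as a guess at how they might fit together, the sketch below treats spool_prepare and spool_commit as the start and end of a batch, with spool_object queuing work in between. The struct layout, the parameters and the return values are assumptions; spool_attribute is left out because its intended granularity is not described.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context state; a real one would hold file or db handles. */
typedef struct {
    int in_batch;    /* non-zero between spool_prepare and spool_commit */
    int pending;     /* objects queued in the current batch */
    int committed;   /* objects written so far */
} spooling_context;

static spooling_context *create_spooling_context(void)
{
    return calloc(1, sizeof(spooling_context));
}

static void free_spooling_context(spooling_context **ctx)
{
    if (ctx != NULL && *ctx != NULL) {
        free(*ctx);
        *ctx = NULL;
    }
}

/* Starts a batch; fails if one is already open. */
static int spool_prepare(spooling_context *ctx)
{
    if (ctx == NULL || ctx->in_batch)
        return -1;
    ctx->in_batch = 1;
    ctx->pending = 0;
    return 0;
}

/* Queues one object; only valid inside a batch. */
static int spool_object(spooling_context *ctx, const void *object)
{
    if (ctx == NULL || object == NULL || !ctx->in_batch)
        return -1;
    ctx->pending++;
    return 0;
}

/* Ends the batch and makes all queued objects count as written. */
static int spool_commit(spooling_context *ctx)
{
    if (ctx == NULL || !ctx->in_batch)
        return -1;
    assert(ctx->pending >= 0);
    ctx->committed += ctx->pending;
    ctx->pending = 0;
    ctx->in_batch = 0;
    return 0;
}
```

A database implementation could map the batch to a transaction, while a file implementation could write to temporary files and rename them on commit.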



Installation issues

First step:

Provide an install_monitoring script to set up the event client and its spooling configuration.

Second step:

In the qmaster installation, decide which spooling type to use, with type specific further actions (for an SQL database, query the user for parameters and test the database).


Implementation proposal

The implementation can be done in separate steps, each of which can be tested thoroughly. Time estimates are net times and include documentation and testing.

| task | est. time [weeks] |
|---|---|
| implement the suggested cull object definition changes | 2 |
| implement tracking of attribute changes | 2 |
| implement file based spooling. Restrict to the following text file formats: one record per file, name/value pairs per line; fixed delimiters for objects and attribute values; XML | 3 |
| make a compile time switch that will make the new spooling functions used by qmaster for some selected object types. Only for test purposes. | 1 |
| implement database storage | 8 |
| create an event client that subscribes all events for all object types and spools them to a database | 2 |
| do extensive tests with qmaster using some of the new spooling functions to files and the event client attached, continue tests during the next phases. | 2 |
| Sum essential steps | 20 |
| make qmaster and execd use the new spooling framework (compile time option), test different spooling strategies | 4 |
| make new spooling framework the default, create means to configure spooling strategies during the installation process | 2 |
| create install_monitoring that will install the event client separately | 1 |
| create means to update the database structure, backup and purging of outdated information | 2 |
| build clients that use the database as source of information instead of qmaster (qhost, qstat, qacct) | 2 |
| change qconf and qalter to use the new spooling framework for reading information and for creating and processing the data to be configured. | 2 |
| Sum additional steps | 13 |