Compatibility is an important aspect of the License under which the Grid Engine source code is made available. In this context, you may have to certify cross compatibility between a version of the Grid Engine software that you have enhanced and one of Grid Engine's Reference Builds, from which your modification deviates. Note that over time there can be multiple Reference Builds, representing different stable software release levels. You may want to test compatibility with one or with several of those Reference Builds. A list of the currently available Reference Builds with all pertinent information can be found here.
The following describes how to test compatibility between two builds. Before you start the compatibility check, create a binary distribution package for both builds and complete all preparations needed to run the Grid Engine Testsuite. Use the Testsuite level defined in the Reference Build definition.
The compatibility test consists of a preparation step, a validation run of the Standard Version, and multiple compatibility checks. To be considered compatible, your changed version has to pass all tests without error and has to deliver the same results as the validation run.
The Testsuite documentation describes how to set up the testsuite. Use the Reference Build against which you want to test compatibility and run:
    expect check.exp install
The testsuite will generate a default setup file "defaults.sav" in the testsuite directory. It will then start vi so that you can edit the testsuite settings. You will be asked on which hosts you want to install a testsuite cluster; use at least 2 hosts for the compatibility test. Enable error mails by providing your e-mail address when setting up the testsuite, because the testsuite reports errors by e-mail.
Run the testsuite with the following command (Testsuite start output):
Do not remove the test results of the validation run. Every testsuite
run modifies the results directory, so copy your validation results
before running another test. You will need to compare your validation run
results with the subsequent compatibility runs. No errors may occur during
the validation run. If you encounter errors, they might be caused by network
setup problems in your cluster or similar issues. Fix those first before
you proceed. Report your problem if it persists. You cannot test compatibility
against a validation run that contains errors.
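The copy-then-compare advice above can be sketched as two small shell helpers. This is only an illustration: the function names `snapshot_results` and `compare_results` are hypothetical, and the location and layout of the testsuite results directory are assumptions, so adapt the paths to your own installation.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the testsuite.

# snapshot_results SRC DEST
# Copy a results directory to DEST and refuse to overwrite an
# existing snapshot, so validation results cannot be lost by accident.
snapshot_results() {
    src=$1
    dest=$2
    if [ ! -d "$src" ]; then
        echo "no such results directory: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "snapshot already exists: $dest" >&2
        return 1
    fi
    cp -R "$src" "$dest"
}

# compare_results VALIDATION_DIR COMPAT_DIR
# Recursively list differences between two saved result trees.
# Returns 0 when the trees are identical, non-zero otherwise.
compare_results() {
    diff -r "$1" "$2"
}
```

A call such as `snapshot_results results results.validation` (both names are placeholders) would be made once after the validation run. A raw `diff -r` may also flag harmless differences such as timestamps, so treat its output as a starting point for a manual comparison rather than as a verdict.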
Use the testsuite to shut down the cluster:
(If you are absolutely sure that your modification did not change sge_commd, you may skip this step. Be aware, however, that changes in some libraries, such as zlib, may also modify sge_commd. Carry out the test if you are not 100% sure.)
Use the testsuite to shut down the cluster:
(If you are absolutely sure that your modification did not change any Grid Engine client binary, you may skip this step. Be aware, however, that changes in some libraries, such as the GDI library, may also modify the client binaries. Carry out the test if you are not 100% sure.)
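One way to support the "absolutely sure" decision is a byte-for-byte comparison of a binary from the Reference Build with the same binary from the modified build. The sketch below is an assumption-laden illustration: the helper name `binary_unchanged` is hypothetical, and the paths in the comment are placeholders, not actual Grid Engine distribution paths.

```shell
#!/bin/sh
# Hypothetical helper, not part of Grid Engine or the testsuite.

# binary_unchanged REF_FILE MOD_FILE
# Returns 0 only if both files exist and are byte-for-byte identical.
binary_unchanged() {
    ref=$1
    mod=$2
    [ -f "$ref" ] && [ -f "$mod" ] && cmp -s "$ref" "$mod"
}

# Example call with placeholder paths:
#   if binary_unchanged /path/to/reference/sge_commd /path/to/modified/sge_commd; then
#       echo "sge_commd is identical in both builds"
#   else
#       echo "sge_commd differs or is missing: run the test"
#   fi
```

Identical files indicate that the binary was not changed. Differing files do not prove a behavioural change, since a rebuild alone can alter a binary, but in that case the safe choice is to carry out the test.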
Use the testsuite to shut down the cluster:
The Testsuite documentation describes how to set up the testsuite.
Use the modified build.
The testsuite will generate a default setup file "defaults.sav"
in the testsuite directory. It will then start vi so that you can
edit the testsuite settings. You will be asked on which hosts you want
to install a testsuite cluster; use at least 2 hosts for the
compatibility test.
Run the testsuite with the following command on your modified build:
After starting the testsuite with the command `expect check.exp all 2 category COMPATIBILITY`, the testsuite should produce the following output:
===============================================================================
system version : SGE 5.3 (1) / feature: none
current dir : [testsuite_root_directory]/checktree
max. runlevel : day long medium short week
selected runlevels : long medium short
categories : COMPATIBILITY PERFORMANCE SYSTEM
selected categories: COMPATIBILITY
est. run time : 6 h 40 m
===============================================================================
2 test(s) available in subdir: functional
1 test(s) available in subdir: install_core_system
0 test(s) available in subdir: performance
19 test(s) available in subdir: system_tests
===============================================================================
run all tests ...
you have no ssh access and no root password
test in directory [testsuite_root_directory]/checktree/functional/access_lists needs root access ...
root access needed, please enter root password: |
After entering the root password the testsuite will start with the compatibility tests.
The following tests may cause trouble. If one of the check functions reports an error described below, the error can be ignored:

Check name: submit_del
Check function: submit_del_test
Remarks: A job deleted immediately after submission may stay in the delete state. A second qdel call will delete the job. This is a known problem. The testsuite provokes this behaviour and reports errors. The error message is "timeout waiting for end of all jobs".

Check name: qdel
Check function: qdel_submit_delete_when_transfered
Remarks: See remarks for submit_del. Error message is `timeout waiting for job "X" "leeper""`

Check name: qrsh
Check function: qrsh_trap
Remarks: This test notifies the user with "test not completely implemented"; this is only a warning. The result is listed as "unsupported tests". No other error should appear.
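When reviewing error mails, the list of ignorable errors above can be encoded as a small lookup. The following is a minimal sketch: the function name `is_ignorable` is hypothetical, and it assumes that you extract the check function name and the error text from a report yourself, since the report format is not described here. Messages are matched as substrings taken from the remarks above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the testsuite.

# is_ignorable CHECK_FUNCTION MESSAGE
# Returns 0 if the combination matches one of the known, ignorable
# problems listed above; returns 1 for anything else.
is_ignorable() {
    check_function=$1
    message=$2
    case "$check_function" in
        submit_del_test)
            case "$message" in
                *"timeout waiting for end of all jobs"*) return 0 ;;
            esac ;;
        qdel_submit_delete_when_transfered)
            case "$message" in
                *"timeout waiting for job"*) return 0 ;;
            esac ;;
        qrsh_trap)
            case "$message" in
                *"test not completely implemented"*) return 0 ;;
            esac ;;
    esac
    return 1
}
```

Any error for which `is_ignorable` returns non-zero should be treated as a real failure of the compatibility run.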