\endhtmlonly

\section interface Programming Interface

The basic interface is available in hwloc.h. It essentially offers low-level routines for advanced programmers who want to manually manipulate objects and follow links between them. Documentation for everything in hwloc.h is provided later in this document. Developers should also look at hwloc/helper.h (also documented later in this document), which provides good higher-level topology traversal examples.

To precisely define the vocabulary used by hwloc, a \ref termsanddefs section is available and should probably be read first.

Each hwloc object contains a cpuset describing the list of processing units that it contains. These bitmaps may be used for \ref hwlocality_cpubinding and \ref hwlocality_membinding. hwloc offers an extensive bitmap manipulation interface in hwloc/bitmap.h.

Moreover, hwloc also comes with additional helpers for interoperability with several commonly used environments. See the \ref interoperability section for details.

The complete API documentation is available in a full set of HTML pages, man pages, and self-contained PDF files (formatted for both US letter and A4 formats) in the source tarball in doc/doxygen-doc/.

NOTE: If you are building the documentation from a Subversion checkout, you will need to have Doxygen and pdflatex installed -- the documentation will be built during the normal "make" process. The documentation is installed during "make install" to $prefix/share/doc/hwloc/ and your system's default man page tree (under $prefix, of course).

\subsection portability Portability

As shown in \ref cli_examples, hwloc can obtain information on a wide variety of hardware topologies. However, some platforms and/or operating system versions will only report a subset of this information.
For example, on a PPC64-based system with 32 cores (each with 2 hardware threads) running a default 2.6.18-based kernel from RHEL 5.4, hwloc is only able to glean information about NUMA nodes and processing units (PUs). No information about caches, sockets, or cores is available.

Similarly, operating systems have varying support for CPU and memory binding. Some operating systems provide interfaces for all kinds of CPU and memory binding, others provide interfaces for only a limited number of kinds, and some do not provide any binding interface at all. When an interface is missing, the corresponding hwloc binding functions simply return the ENOSYS error (Function not implemented), meaning that the underlying operating system does not provide any interface for them. \ref hwlocality_cpubinding and \ref hwlocality_membinding provide more information on which hwloc binding functions should be preferred because interfaces for them are usually available on the supported operating systems.

Here's the graphical output from lstopo on the PPC64 platform described above when Simultaneous Multi-Threading (SMT) is enabled:

\image html ppc64-with-smt.png
\image latex ppc64-with-smt.pdf "" width=\textwidth

And here's the graphical output from lstopo on this platform when SMT is disabled:

\image html ppc64-without-smt.png
\image latex ppc64-without-smt.pdf "" width=\textwidth

Notice that hwloc only sees half the PUs when SMT is disabled. PU #15, for example, seems to change location from NUMA node #0 to #1. In reality, no PUs "moved" -- they were simply re-numbered when hwloc only saw half as many. Hence, PU #15 in the SMT-disabled picture probably corresponds to PU #30 in the SMT-enabled picture.

This same "PUs have disappeared" effect can be seen on other platforms -- even platforms and operating systems that provide much more information than the above PPC64 system. This is an unfortunate side-effect of how operating systems report information to hwloc.
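
The PU renumbering described above can be made concrete with a little arithmetic. The following is only a sketch under stated assumptions: \c smt_enabled_index is a hypothetical helper (it is not part of the hwloc API), and it assumes that the operating system numbers the hardware threads of each core consecutively and that disabling SMT leaves only the first thread of each core visible. Real systems may number PUs differently.

```c
#include <assert.h>

/* Hypothetical helper, NOT part of hwloc.
 *
 * Maps the index of a PU as seen with SMT disabled to the index the same
 * PU would have with SMT enabled.
 *
 * Assumptions (not guaranteed by any operating system):
 *   - the hardware threads of a core are numbered consecutively;
 *   - with SMT disabled, only the first thread of each core remains.
 */
static unsigned smt_enabled_index(unsigned smt_disabled_index,
                                  unsigned threads_per_core)
{
    /* A core always has at least one hardware thread. */
    assert(threads_per_core > 0);

    /* Core number i owns threads [i * n, i * n + n - 1] when each core
     * has n threads; its first thread is therefore i * n. */
    return smt_disabled_index * threads_per_core;
}
```

With the 2 hardware threads per core of the example system, calling \c smt_enabled_index(15, 2) yields 30, which matches the PU #15 / PU #30 correspondence suggested above. Portable applications should not rely on such a mapping; it only illustrates why the indexes shift.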
Note that after upgrading the Linux kernel on the same PPC64 system mentioned above to 2.6.34, hwloc is able to discover all the topology information. The following picture shows the entire topology layout when SMT is enabled:

\image html ppc64-full-with-smt.png
\image latex ppc64-full-with-smt.pdf "" width=\textwidth

Developers using the hwloc API or XML output for portable applications should therefore be extremely careful not to make any assumptions about the structure of the data that is returned. For example, per the above reported PPC topology, it is not safe to assume that PUs will always be descendants of cores.

Additionally, future hardware may insert new topology elements that are not available in this version of hwloc. Long-lived applications that are meant to span multiple hardware platforms should also be careful about making structure assumptions. For example, there may someday be an element "lower" than a PU, or perhaps a new element may exist between a core and a PU.

\subsection interface_example API Example

The following small C example (named ``hwloc-hello.c'') prints the topology of the machine and binds the process to the first logical processor of the second core of the machine.

\include hwloc-hello.c

hwloc provides a \c pkg-config executable to obtain relevant compiler and linker flags.
For example, it can be used as follows to compile applications that utilize the hwloc library (assuming GNU Make; the recipe line must start with a tab):

\verbatim
CFLAGS += $(shell pkg-config --cflags hwloc)
LDLIBS += $(shell pkg-config --libs hwloc)

hwloc-hello: hwloc-hello.c
	$(CC) hwloc-hello.c $(CFLAGS) -o hwloc-hello $(LDLIBS)
\endverbatim

On a machine with 4 GB of RAM and 2 processor sockets -- each socket of which has two processing cores -- the output from running \c hwloc-hello could be something like the following:

\verbatim
shell$ ./hwloc-hello
*** Objects at level 0
Index 0: Machine(3938MB)
*** Objects at level 1
Index 0: Socket#0
Index 1: Socket#1
*** Objects at level 2
Index 0: Core#0
Index 1: Core#1
Index 2: Core#3
Index 3: Core#2
*** Objects at level 3
Index 0: PU#0
Index 1: PU#1
Index 2: PU#2
Index 3: PU#3
*** Printing overall tree
Machine(3938MB)
  Socket#0
    Core#0
      PU#0
    Core#1
      PU#1
  Socket#1
    Core#3
      PU#2
    Core#2
      PU#3
*** 2 socket(s)
shell$
\endverbatim

\htmlonly

Portable abstraction of hierarchical architectures for high-performance computing