Blue Gene/P - NCF Facilities
- RUG-CIT Blue Gene/P facility
- Acceptable use
- ASTRON LOFAR/CIT Blue Gene system description
- Blue Gene system usage
- Program development and compilers
5.1 Compiling programs
5.2 Compilation example
- Job submission: Running Jobs
- Programming utilities
- Documentation and References
- Local contact person
RUG-CIT Blue Gene P facility
Large-scale parallel computer facilities play an important role in exploring new frontiers in scientific research.
For this purpose, the Center for Information Technology (CIT) of the University of Groningen offers a unique facility on the ASTRON-LOFAR Blue Gene/P.
The main task of the Blue Gene is the data reduction for the ASTRON LOFAR radio telescope. A separate part of the Blue Gene, distinct from the LOFAR processing racks, provides the scientific programming environment.
With funding from the Dutch National Computer Facilities organization (NCF), the LOFAR Blue Gene/P has been extended with an additional block of 2048 CPU cores intended for scientific use by the Dutch research community.
Requests for compute time on the Blue Gene can be submitted via the NCF in the form of a written request that explains the research activity and motivates the use of the Blue Gene for this type of research. Requests are evaluated, and possibly granted, by both the NCF and the CIT. The limitations imposed by IBM at the time of installation of the Blue Gene/P apply to each request.
The user is also bound to the “Acceptable Use Policy of the University of Groningen for University Computer Systems”.
ASTRON LOFAR/CIT Blue Gene system description
The Blue Gene/P system consists of 3072 compute nodes, each with 2 GB of memory. Each compute node has 4 cores, for a total of 12288 cores. A number of “I/O nodes” handle the communication with the compute nodes and perform the I/O operations between compute nodes and data servers.
Although the system can be used as one massive compute block, it has been divided into logical “partitions”. These partitions communicate with the frontend of the system via their I/O nodes and can be used separately. Jobs submitted to disjoint partitions can run at the same time.
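The figures above can be checked with a few lines of shell arithmetic. This is only a sketch: the variable names are illustrative, and the per-core memory figure assumes that all four cores of a node each run one process and share the node's memory equally.

```shell
#!/bin/sh
# Figures taken from the system description above.
compute_nodes=3072
cores_per_node=4
mem_per_node_gb=2

# Totals for the whole machine.
total_cores=$((compute_nodes * cores_per_node))
total_mem_gb=$((compute_nodes * mem_per_node_gb))

# Assumption: memory is shared equally when every core runs one process.
mem_per_core_mb=$((mem_per_node_gb * 1024 / cores_per_node))

echo "total cores:     $total_cores"
echo "total memory:    $total_mem_gb GB"
echo "memory per core: $mem_per_core_mb MB"
```

With one process per core, each process therefore has roughly 512 MB available, which is worth keeping in mind when sizing a problem.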
The LOFAR data processing operates almost continuously and does not take part in any scheduling or job submission of the scientific part.
The part available to scientific users operates as a batch processing system: jobs are submitted to a queue and are matched with a compute block when a suitable partition becomes available.
The scheduling system tries to ensure efficient use of the machine. Note that the Blue Gene is a true batch processing system, which implies that a job can only start after the jobs occupying its partition have finished. The compute nodes do not run a multitasking operating system; in other words, only one process can run per processor at a time. The CIT DNA (Dynamic Node Allocation) queueing system on the Blue Gene/P controls job submission and scheduling. It controls access to the partitions via a number of “queues”.
Each queue has a predefined maximum number of cores and a maximum compute time. Users may be allowed to submit jobs to more than one queue. Once a job has been submitted to a queue, the DNA system schedules it and runs it on a partition that can supply the requested number of CPUs as soon as they become available.
Blue Gene system usage
Users log in via a so-called “Frontend server”. Access is protected by a secure authentication mechanism. The number of compute hours granted to a user by the NCF is administered by the DNA system. Users who exceed their quota can no longer submit jobs to the system, although they can still log in.
Program development and compilers
5.1 Compiling programs
For parallel programming applications, the IBM xl* compilers can be used. Although the gcc compilers are also available on the system, the xl* compilers are preferred because they generate code optimized for the BG. The BG XL compilers are available for the C, C++ and Fortran90 programming languages.
For example, to use the XL compiler automatically for C programs, put the line ’export CC=mpixlc’ in the .bashrc file in your home directory. After this, the IBM MPI C compiler is used whenever “CC” is called in a Makefile.
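As a sketch of what this setting does, the snippet below sets CC as described and reports whether the compiler can be found. Only the export line comes from the text above; the check around it is an illustrative addition and not part of the facility's documented setup.

```shell
# Line to put in ~/.bashrc, as described above.
export CC=mpixlc

# Illustrative check: is the compiler named in CC on the PATH here?
if command -v "$CC" >/dev/null 2>&1; then
    echo "CC is set to $CC ($(command -v "$CC"))"
else
    echo "CC is set to $CC, but it was not found on the PATH of this machine"
fi
```

make imports environment variables as make variables, so a rule that uses $(CC) picks up mpixlc unless the Makefile assigns CC itself.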
5.2 Compilation example
The following short example shows how to compile a C program and how to submit it as a job to the BG.
Suppose a program called ’bgptest.c’ is located in a directory on the frontend and has to be compiled there.
Log in on the frontend, change to that directory, and compile the program ’bgptest’:
mkdir -p ~/mpi/bin # Create a bin directory for BG programs
mpixlc -o ~/mpi/bin/bgptest bgptest.c # Compile
After compilation, the executable ’bgptest’ is located in the directory ~/mpi/bin.
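The contents of ’bgptest.c’ are not given here. For a first test, a minimal MPI program along the following lines could serve. The program is a hypothetical stand-in, not the actual bgptest source, and the shell snippet simply writes it to the current directory.

```shell
# Write a hypothetical minimal MPI test program to bgptest.c.
# Each process reports its rank and the total number of processes.
cat > bgptest.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                /* start the MPI runtime   */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* index of this process   */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                        /* shut down the MPI runtime */
    return 0;
}
EOF
```

The file can then be compiled with the mpixlc command shown above.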
Job submission: Running jobs
Scientific users submit jobs by means of the “DNA” job submission system.
The DNA system consists of several programs for job submission and inspection. A spooling and scheduling system schedules the jobs for execution at a later time.
Almost all commands start with the two letters “bg”.
The program “bgsubmit” is used to submit jobs to the BG system and the DNA spooler. For an overview of all options, type: man bgsubmit
Example: run the program ’bgptest’ on 32 compute nodes in virtual node mode (using all 4 cores of each node), with a maximum execution time of 10 minutes. Results are stored in the directory ~/output:
bgsubmit -t 10 -m VN -od ~/output -exe ~/mpi/bin/bgptest -a "arg1 arg2 ..."
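For the example above, the number of MPI processes and an upper bound on the consumed core time can be worked out as follows. This is a sketch with illustrative variable names. How the DNA system actually charges a job against the quota is not described here, so the core-minute figure is only an assumed worst case.

```shell
#!/bin/sh
# Values from the example above.
nodes=32
procs_per_node=4   # virtual node mode: one process on each of the 4 cores
max_minutes=10     # maximum execution time of the job

# Number of MPI processes started for the job.
mpi_processes=$((nodes * procs_per_node))

# Assumed worst case: every core is occupied for the full time limit.
max_core_minutes=$((mpi_processes * max_minutes))

echo "MPI processes:    $mpi_processes"
echo "max core-minutes: $max_core_minutes"
```

The job thus runs 128 processes and can occupy at most 1280 core-minutes, a little over 21 core-hours.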
Commands for inspecting jobs are:
gbgjobinfo < JobID > : Show the status of the job with id. JobID
bgjobs : Show all jobs in the system
bgbusy : Show running jobs and the system load
For an overview of the DNA system and its commands type :
or : man bgsubmit
The command bgcommands lists all DNA commands.
Deleting a job is done by means of:
bgkilljob < JobID > : Delete job with id. JobID
Programming utilities
A number of programming utilities are available for program development. The Blue Gene is a distributed-memory system, so the MPI message passing interface is used. Other message passing libraries are not available.
For documentation of the MPI interface see: http://www.mpi-forum.org/docs.
Documentation and References
The command bgpinfo on the frontend provides the most recent information about the state of the system.
Online manual pages are available for the DNA system, compilers and programming tools.
MPI documentation : http://www.mpi-forum.org/docs
Specific IBM software and programming documentation is available in the “IBM Redbooks for the Blue Gene”.
These can be found at: http://www.redbooks.ibm.com/cgi-bin/searchsite.cgi?query=blue+gene
The DNA job submission and queuing system was designed and
implemented by Arnold Meijster and Harm Paas.
Last modified: 23 January 2017 14:46