Version 10 (modified by norris, 12 years ago)
Python Infrastructure for Managing Performance Experiments
Performance experiments often involve many execution runs in which the execution platform, measurement tools, measurement methods, application parameters, and analysis techniques vary. Managing the complexity of experimental setup, execution, and post-analysis requires some automation at each stage, along with a layer of abstraction that hides the details of setting up and running experiments for different parameter sets. This software suite provides an integrated, component-based environment that automates parameter selection and the running of multiple performance experiments for parallel scientific applications. The toolkit is intended to let application scientists easily modify experimental parameters across multiple execution runs and selectively retrieve the data for analysis and for generating performance models.
The features and implementation are described in more detail in:
Van Bui, Boyana Norris, and Lois Curfman McInnes. An automated component-based performance experiment environment. In Proceedings of the 2009 Workshop on Component-Based High Performance Computing (CBHPC 2009), Nov. 2009. Also available as Preprint ANL/MCS-P1666-0809.
Download
(Open-source license)
The development version (unstable) can be checked out anonymously with:
svn co https://svn.mcs.anl.gov/repos/performance/perfexp perfexp
Contact Van Bui or Boyana Norris if you wish to contribute to this project.
Software Requirements
- Python 2.5 or newer
- Matplotlib
Usage
Instructions for Running Performance Scripts
Phase: Running the Performance Experiments
- Go to the ~/perfexp/src/examples/drivers directory and specify the measurement environment and the performance collector in ExperimentDriver.py.
Supported measurement environments include: aix, cobalt, xeon
Supported performance collectors include: gprof, ltimer, notimer, perfsuite, tau
For example, to run measurements on a xeon system using tau for performance collection, add the following lines to ExperimentDriver.py:
from me.platforms.xeon import Generic
from me.tools.tau import Collector as TAUCollector
measurementEnvironment = Generic()
dataCollector = TAUCollector()
- Go to the ~/perfexp/src/me/platforms directory and modify the measurement environment file to specify the performance collector in the moveData routine.
For example, if collecting data using TAU on a xeon system, add the following line to xeon.py:
DataCollector = TAUCollector()
- Go to the directory ~/perfexp/src/examples and specify parameters in the params.txt file.
workdir -- the working directory (the location where generated files, such as batch scripts or performance profiles, are written)
mpidir -- the full path to the MPI launcher (mpirun or mpiexec), including the launcher itself and not only its directory (e.g., mpidir = /usr/bin/mpirun)
cmdline -- the command line to run the application
threads -- the number of threads
processes -- the number of processes
nodes -- the number of nodes
tasks_per_node -- the number of tasks per node
pmodel -- the programming model (supported models include serial, mpi, omp, mpi:omp)
instrumentation -- type of performance instrumentation (supported instrumentation types include source and runtime)
exemode -- execution mode (supported modes include interactive and batch)
batchcmd -- the command to submit a batch job
jobname -- the job name to add to the batch script
walltime -- the walltime in hh:mm:ss to add to the batch script
maxprocessor -- the maximum number of processors
account name -- the name of the account to charge the batch job to
buffersize -- the MPI buffer size in bytes
msgsize -- the MPI message size in bytes
stacksize -- the stack size
counters -- the performance counter names (e.g., counters = PAPI_TOT_CYC P_WALL_CLOCK_TIME)
commode -- the network communication mode
memorysize -- the amount of memory to request for a batch job
queue -- the queue to run the batch job in
datadir -- the directory where the performance data is stored
appname -- the application name
expname -- the experiment name
trialname -- the trial name
- Run the script that launches the performance experiment driver (located in ~/perfexp/scripts):
$perfexp examples.drivers.ExperimentDriver
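Before launching a run, it can help to sanity-check params.txt. The following is a minimal sketch and not part of perfexp: it assumes the file holds one "name = value" pair per line (as the mpidir and counters examples above suggest), and the helper names parse_params and check_params, along with the sample values, are hypothetical. It checks only the three parameters whose supported values are listed above.

```python
# Hypothetical helper, not part of perfexp. Assumes params.txt contains
# one "name = value" pair per line, as in the examples above.

# Supported values as listed in the parameter descriptions.
SUPPORTED = {
    "pmodel": {"serial", "mpi", "omp", "mpi:omp"},
    "instrumentation": {"source", "runtime"},
    "exemode": {"interactive", "batch"},
}


def parse_params(text):
    """Return a dict mapping parameter names to their string values."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and lines without an assignment
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def check_params(params):
    """Return messages for values outside the documented choices."""
    problems = []
    for name in sorted(SUPPORTED):
        value = params.get(name)
        if value is not None and value not in SUPPORTED[name]:
            allowed = ", ".join(sorted(SUPPORTED[name]))
            problems.append("%s = %s is not one of: %s" % (name, value, allowed))
    return problems


# Illustrative contents only; real values depend on the system in use.
sample = """\
mpidir = /usr/bin/mpirun
processes = 4
pmodel = mpi
exemode = interactive
counters = PAPI_TOT_CYC P_WALL_CLOCK_TIME
"""

params = parse_params(sample)
print(params["counters"].split())  # counter names are space-separated
print(check_params(params))        # empty list when all values are supported
```

To check a real file, read it first, for example with open("params.txt").read(), and pass the resulting string to parse_params.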
Developer Notes
Performance Experiment CCA Components
A component version of the performance experiment scripts: http://trac.mcs.anl.gov/projects/cca/wiki/performance