
Version 10 (modified by norris, 12 years ago)


Python Infrastructure for Managing Performance Experiments

Performance experiments can involve multiple execution runs in which the execution platform, measurement tools, measurement methods, application parameters, and analysis techniques all vary. Managing the complexity of experimental setup, execution, and post-analysis requires a degree of automation at each stage, along with a layer of abstraction that hides the details of setting up and running experiments for varying sets of parameters. This software suite provides an integrated, component-based environment that automates parameter selection and the running of multiple performance experiments for parallel scientific applications. The toolkit is intended to let application scientists easily modify the experimental parameters across multiple execution runs and selectively retrieve the data for analysis and for generating performance models.

The features and implementation are described in more detail in:

Van Bui, Boyana Norris, and Lois Curfman McInnes. An automated component-based performance experiment environment. In Proceedings of the 2009 Workshop on Component-Based High Performance Computing (CBHPC 2009), Nov. 2009. Also available as Preprint ANL/MCS-P1666-0809.

Download

(Open-source license)

The development version (unstable) can be checked out anonymously with:

svn co https://svn.mcs.anl.gov/repos/performance/perfexp perfexp

Contact Van Bui or Boyana Norris if you wish to contribute to this project.

Software Requirements

Usage

Instructions for Running Performance Scripts

Phase: Running the Performance Experiments

  • Go to the ~/perfexp/src/examples/drivers directory and specify the measurement environment and the performance collector in ExperimentDriver.py.

Supported measurement environments include: aix, cobalt, xeon

Supported performance collectors include: gprof, ltimer, notimer, perfsuite, tau

For example, to run measurements on a xeon system using tau for performance collection, add the following lines to ExperimentDriver.py:

  from me.platforms.xeon import Generic

  from me.tools.tau import Collector as TAUCollector

  measurementEnvironment = Generic()

  dataCollector = TAUCollector()
  • Go to the ~/perfexp/src/me/platforms directory and modify the measurement environment file to specify the performance collector in the moveData routine.

For example, if collecting data using TAU on a xeon system, add the following line to xeon.py:

DataCollector = TAUCollector()

  • Go to the directory ~/perfexp/src/examples and specify parameters in the params.txt file.

workdir -- the working directory (the location where generated files, such as batch scripts and performance profiles, are written)

mpidir -- the directory where mpirun or mpiexec is located; be sure to include the MPI launch script itself (e.g., mpidir = /usr/bin/mpirun)

cmdline -- the command line to run the application

threads -- the number of threads

processes -- the number of processes

nodes -- the number of nodes

tasks_per_node -- the number of tasks per node

pmodel -- the programming model (supported models include serial, mpi, omp, mpi:omp)

instrumentation -- type of performance instrumentation (supported instrumentation types include source and runtime)

exemode -- execution mode (supported modes include interactive and batch)

batchcmd -- the command to submit a batch job

jobname -- the job name to add to the batch script

walltime -- the walltime in hh:mm:ss to add to the batch script

maxprocessor -- the maximum number of processors

account name -- the name of the account to charge the batch job to

buffersize -- the MPI buffer size in bytes

msgsize -- the MPI message size in bytes

stacksize -- the stack size

counters -- the performance counter names (e.g., counters = PAPI_TOT_CYC P_WALL_CLOCK_TIME)

commode -- the network communication mode

memorysize -- the amount of memory to request for a batch job

queue -- the queue to run the batch job in

datadir -- the directory where the performance data is stored

appname -- the application name

expname -- the experiment name

trialname -- the trial name
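As a rough illustration of the parameter list above, the following Python sketch reads a params.txt-style file and checks the three parameters whose supported values are listed on this page (pmodel, instrumentation, exemode). The "key = value" line format is an assumption inferred from the two examples given (mpidir and counters); the functions parse_params and check_params and the sample paths are hypothetical and are not part of perfexp.

```python
# Sketch only: the "key = value" format is inferred from the examples on this
# page; parse_params and check_params are hypothetical helpers, not perfexp APIs.

# Supported values as documented in the parameter list.
SUPPORTED = {
    "pmodel": {"serial", "mpi", "omp", "mpi:omp"},
    "instrumentation": {"source", "runtime"},
    "exemode": {"interactive", "batch"},
}


def parse_params(text):
    """Parse 'key = value' lines into a dict, skipping blank lines."""
    params = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"line {lineno}: expected 'key = value', got {line!r}")
        params[key.strip()] = value.strip()
    return params


def check_params(params):
    """Return a list of messages for values outside the documented sets."""
    problems = []
    for key, allowed in SUPPORTED.items():
        if key in params and params[key] not in allowed:
            problems.append(f"{key} = {params[key]!r} is not one of {sorted(allowed)}")
    return problems


# Illustrative file contents; the workdir path is made up for this example.
example = """
workdir = /tmp/perfexp-work
mpidir = /usr/bin/mpirun
pmodel = mpi
exemode = batch
walltime = 00:30:00
counters = PAPI_TOT_CYC P_WALL_CLOCK_TIME
"""

params = parse_params(example)
problems = check_params(params)
# Counter names are space-separated in the example on this page.
counters = params["counters"].split()
print(params["pmodel"], counters, problems)
```

A check like this can catch a misspelled programming model or execution mode before a batch job is submitted; how perfexp itself reads and validates params.txt may differ.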

  • Run the performance experiment driver using the script located in ~/perfexp/scripts:

$ perfexp examples.drivers.ExperimentDriver
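Before editing ExperimentDriver.py in the first step, it can help to confirm that the chosen measurement environment and performance collector are among the supported names. The sketch below does that check in plain Python. The helper module_paths is hypothetical, and the me.platforms.<name> and me.tools.<name> module pattern is extrapolated from the xeon and tau example on this page; it is an assumption for the other names.

```python
# Sketch only: module_paths is a hypothetical helper, not part of perfexp.
# The supported names are copied from the lists on this page.
SUPPORTED_ENVIRONMENTS = ("aix", "cobalt", "xeon")
SUPPORTED_COLLECTORS = ("gprof", "ltimer", "notimer", "perfsuite", "tau")


def module_paths(environment, collector):
    """Validate the choice and return the (platform, collector) module names.

    The naming pattern follows the xeon/tau import example; whether every
    supported name maps to a module in the same way is an assumption.
    """
    if environment not in SUPPORTED_ENVIRONMENTS:
        raise ValueError(f"unsupported measurement environment: {environment!r}")
    if collector not in SUPPORTED_COLLECTORS:
        raise ValueError(f"unsupported performance collector: {collector!r}")
    return f"me.platforms.{environment}", f"me.tools.{collector}"


platform_module, collector_module = module_paths("xeon", "tau")
print(platform_module, collector_module)
```

The returned names correspond to the modules imported in the ExperimentDriver.py example (me.platforms.xeon and me.tools.tau); the class names to import from each module still need to be taken from the perfexp sources.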

Developer Notes

Performance Experiment CCA Components

A component version of the performance experiment scripts: http://trac.mcs.anl.gov/projects/cca/wiki/performance