


Orio: An Annotation-Based Empirical Performance Tuning Framework


Orio is an extensible annotation system, implemented in Python, that aims to improve both performance and productivity. Software developers insert annotations into their C or Fortran source code, and these annotations trigger low-level performance optimizations on a specified code fragment. The tool generates many tuned versions of the same operation using different optimization parameters and performs an empirical search to select the best-performing variant.


Orio 0.2.1 (alpha) (Open-source license)

Development version (unstable) can be checked out anonymously with:

 svn co

Contact Boyana Norris if you wish to contribute to Orio development.


Structure of Orio Framework

The picture shown below depicts at a high level the structure and the optimization process of the Orio framework.

In the simplest case, Orio can be used to improve code performance by applying source-to-source transformations such as loop unrolling and memory alignment optimizations. First, Orio takes C code as input, which contains syntactically structured comments that express various performance-tuning directives. Orio scans the annotated input code and extracts all annotated regions. Each annotated region is then passed to the code transformation modules for potential optimizations. Finally, the code generator produces the final code with the selected optimizations applied.
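The scanning step can be sketched in Python. The sketch assumes annotations are C comments of the form `/*@ begin Module (directives) @*/ ... /*@ end @*/`; these delimiters, the `extract_regions` helper, and the `axpy` input are illustrative assumptions, not a statement of Orio's actual grammar or parser. Nested regions are not handled.

```python
import re

# Assumed comment delimiters, for illustration only; consult the Orio
# documentation for the real annotation grammar.
ANNOTATION = re.compile(
    r"/\*@\s*begin\s+(?P<module>\w+)\s*\((?P<directives>.*?)\)\s*@\*/"
    r"(?P<body>.*?)"
    r"/\*@\s*end\s*@\*/",
    re.DOTALL,
)


def extract_regions(source):
    """Return a (module, directives, body) tuple for each annotated region."""
    return [
        (m.group("module"), m.group("directives").strip(), m.group("body").strip())
        for m in ANNOTATION.finditer(source)
    ]


# Hypothetical annotated C input: one loop marked for unrolling.
C_SOURCE = """
void axpy(int n, double a, double *x, double *y) {
  int i;
  /*@ begin Loop (transform Unroll(ufactor=4)) @*/
  for (i = 0; i < n; i++)
    y[i] = y[i] + a * x[i];
  /*@ end @*/
}
"""

regions = extract_regions(C_SOURCE)
for module, directives, body in regions:
    print(module, "|", directives)
    print(body)
```

Each extracted tuple is what a transformation module would receive: the module name, the directive text, and the code fragment to rewrite.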

Orio can also be used as an automatic performance tuning tool. The system uses its code transformation modules and code generator to produce an optimized code version for each distinct combination of performance parameters. Each version is then executed, and its measured performance is compared with that of the previously tested variants. After iteratively evaluating all code variants under consideration, Orio outputs the best-performing one as the final code.
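The generate, measure, and compare loop described above can be sketched as follows. The `tune` function, the parameter names `unroll` and `tile`, and the cost model are hypothetical stand-ins: a real tuner would compile and time each variant, whereas this sketch uses a made-up cost so that it runs deterministically.

```python
import itertools


def tune(param_space, generate, measure):
    """Try every parameter combination and return the cheapest variant found.

    param_space: dict mapping parameter name -> list of candidate values
    generate:    callable(params) -> code variant (any object)
    measure:     callable(variant) -> cost, lower is better (e.g. seconds)
    """
    names = sorted(param_space)
    best = None
    for values in itertools.product(*(param_space[n] for n in names)):
        params = dict(zip(names, values))
        variant = generate(params)
        cost = measure(variant)
        # Keep the variant only if it beats every variant tested so far.
        if best is None or cost < best[0]:
            best = (cost, params, variant)
    return best


def generate(params):
    # Stand-in for code generation: a variant is just a descriptive string.
    return "unroll={unroll} tile={tile}".format(**params)


def measure(variant):
    # Stand-in for an empirical timing run: an invented cost model.
    fields = dict(item.split("=") for item in variant.split())
    return abs(int(fields["unroll"]) - 4) + abs(int(fields["tile"]) - 32) / 16.0


space = {"unroll": [1, 2, 4, 8], "tile": [16, 32, 64]}
cost, params, variant = tune(space, generate, measure)
print(params, cost)
```

Passing `generate` and `measure` as callables keeps the loop independent of how variants are built and timed, which mirrors the separation between code generation and empirical evaluation in the description.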

Because the space of all possible optimized code versions can be exponentially large, exhaustively exploring it is often impractical. Therefore, several search heuristics are implemented in the search engine component to find a code variant with near-optimal performance.
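To make the size problem concrete, the sketch below counts the points in a small parameter space and then samples only a fixed budget of them. Random sampling is used here purely as one simple example of a search heuristic; it is not claimed to be one of the heuristics Orio implements, and `space_size`, `random_search`, and `model_cost` are hypothetical names.

```python
import random


def space_size(param_space):
    """Number of distinct parameter combinations in the space."""
    size = 1
    for values in param_space.values():
        size *= len(set(values))
    return size


def random_search(param_space, cost, budget, seed=0):
    """Evaluate up to `budget` distinct random points and return the best one."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    rng = random.Random(seed)
    names = sorted(param_space)
    limit = min(budget, space_size(param_space))
    seen = {}
    while len(seen) < limit:
        point = tuple(rng.choice(param_space[n]) for n in names)
        if point not in seen:  # never pay for the same variant twice
            seen[point] = cost(dict(zip(names, point)))
    best = min(seen, key=seen.get)
    return dict(zip(names, best)), seen[best], len(seen)


def model_cost(params):
    # Invented cost model standing in for a measured run time.
    return sum((v - 4) ** 2 for v in params.values())


# Six parameters with eight candidate values each.
space = {"p%d" % i: [1, 2, 3, 4, 5, 6, 7, 8] for i in range(6)}
best_params, best_cost, evaluated = random_search(space, model_cost, budget=200)
print(space_size(space), evaluated, best_cost)
```

With six parameters of eight values each the space already holds 8^6 = 262,144 combinations, while the search above evaluates only 200 of them; the result is the best point seen, not a guaranteed optimum.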

The tuning specifications, written by users in the form of annotations, are parsed and used by Orio to guide the search and tuning process. These specifications include information such as the compilers to use, the search strategy, the program transformation parameters, and the input data sizes.
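As a rough illustration of the kinds of information listed above, a parsed specification might be held in a structure like the one below. The `TuningSpec` class, its field names, and the example values are all hypothetical; they do not reflect Orio's internal representation or its annotation syntax.

```python
from dataclasses import dataclass, field


@dataclass
class TuningSpec:
    """Hypothetical container for a parsed tuning specification."""
    compiler: str                                          # e.g. "gcc"
    compiler_flags: list = field(default_factory=list)     # e.g. ["-O3"]
    search_strategy: str = "exhaustive"
    transform_params: dict = field(default_factory=dict)   # name -> candidate values
    input_sizes: dict = field(default_factory=dict)        # name -> candidate values

    def validate(self):
        """Return a list of problems; an empty list means the spec is usable."""
        problems = []
        if not self.compiler:
            problems.append("no compiler given")
        for group in (self.transform_params, self.input_sizes):
            for name, values in group.items():
                if not values:
                    problems.append("no candidate values for %r" % name)
        return problems


spec = TuningSpec(
    compiler="gcc",
    compiler_flags=["-O3"],
    search_strategy="random",
    transform_params={"unroll": [1, 2, 4, 8]},
    input_sizes={"N": [1000, 10000]},
)
print(spec.validate())
```

Validating the specification before any variant is generated catches an empty parameter list early, before time is spent compiling and running code.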

Tools Using Orio

  • Pluto -- An automatic parallelizer and locality optimizer for multicores
  • PrimeTile -- A parametric multi-level tiler for imperfect loop nests


  • Albert Hartono, Boyana Norris, and Ponnuswamy Sadayappan. Annotation-based empirical performance tuning using Orio. In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium, Rome, Italy, May 25-29, 2009. (Preprint ANL/MCS-P1556-1008, bib)
  • Albert Hartono, Muthu Manikandan Baskaran, Cédric Bastoul, Albert Cohen, Sriram Krishnamoorthy, Boyana Norris, J. Ramanujam, and P. Sadayappan. PrimeTile: A parametric multi-level tiler for imperfect loop nests. In Proceedings of the 23rd International Conference on Supercomputing, IBM T. J. Watson Research Center, Yorktown Heights, NY, USA, June 8-12, 2009. Also available as Tech. Report OSU-CISRC-2/09-TR04. (bib)
  • Boyana Norris, Albert Hartono, Elizabeth Jessup, and Jeremy Siek. Generating empirically optimized composed matrix kernels from MATLAB prototypes. In Proceedings of the International Conference on Computational Science 2009, 2009. Also available as Preprint ANL/MCS-P1581-0209. (bib)
  • Boyana Norris, Albert Hartono, and William Gropp. Annotations for productivity and performance portability. In Petascale Computing: Algorithms and Applications. Computational Science. Chapman & Hall / CRC Press, Taylor and Francis Group, 2007. (Preprint ANL/MCS-P1392-0107, bib)