Changes between Version 31 and Version 32 of Orio

Timestamp:
06/11/08 19:21:28 (15 years ago)

== User Guide ==

As previously discussed, Orio has two main functions: a ''source-to-source transformation tool'' and an ''automatic performance tuning tool''. In the following subsections, simple examples are provided to offer users the quickest way to begin using Orio.

=== Using Orio as a Source-to-Source Code Transformation Tool ===

Orio has several code transformation modules that have already been implemented and are ready to use. One of these modules is ''loop unrolling'', a loop optimization that aims to increase register reuse and to reduce the number of branch instructions by combining the instructions executed in multiple loop iterations into a single iteration. The sample code below demonstrates how to annotate application code with a simple, portable loop unrolling optimization, using an unroll factor of four. The original code to be optimized in this example is commonly known as AXPY-4, an extended version of the AXPY routine from the Basic Linear Algebra Subprograms (BLAS).

{{{
/*@ begin Loop (
  transform Unroll(ufactor=4)
  for (i=0; i<=N-1; i++)
    y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i];
) @*/
for (i=0; i<=N-1; i++)
  y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i];
/*@ end @*/
}}}

To apply loop unrolling to the above code, run the following Orio command (assuming that the code is stored in the file `axpy4.c`).

{{{
% orcc axpy4.c
}}}

By default, the transformed output code is written to the file `_axpy4.c`. Users can specify a different name for the output file with the `-o` command option. The output code for this example is reproduced at the end of this subsection.
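To make the effect of an unroll factor of four concrete, the following is a minimal hand-written sketch in C, not code produced by or shipped with Orio. It places the AXPY-4 loop next to a manually unrolled variant (a main loop that advances by four, plus a cleanup loop for the leftover iterations) and checks that both give the same result. The function names `axpy4_ref`, `axpy4_unroll4`, and `axpy4_agree`, the test data, and the `MAX_N` limit are assumptions made for illustration only.

```c
/* Reference AXPY-4 loop, written as in the annotated source. */
void axpy4_ref(int N, double *y,
               double a1, const double *x1, double a2, const double *x2,
               double a3, const double *x3, double a4, const double *x4)
{
    int i;
    for (i = 0; i <= N - 1; i++)
        y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i];
}

/* Hand-unrolled variant with an unroll factor of four (hypothetical helper). */
void axpy4_unroll4(int N, double *y,
                   double a1, const double *x1, double a2, const double *x2,
                   double a3, const double *x3, double a4, const double *x4)
{
    int i;
    /* Main loop: four consecutive iterations per trip. */
    for (i = 0; i <= N - 4; i = i + 4) {
        y[i]   = y[i]   + a1*x1[i]   + a2*x2[i]   + a3*x3[i]   + a4*x4[i];
        y[i+1] = y[i+1] + a1*x1[i+1] + a2*x2[i+1] + a3*x3[i+1] + a4*x4[i+1];
        y[i+2] = y[i+2] + a1*x1[i+2] + a2*x2[i+2] + a3*x3[i+2] + a4*x4[i+2];
        y[i+3] = y[i+3] + a1*x1[i+3] + a2*x2[i+3] + a3*x3[i+3] + a4*x4[i+3];
    }
    /* Cleanup loop: the 0 to 3 iterations left when N is not a multiple of 4. */
    for (; i <= N - 1; i = i + 1)
        y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i];
}

/* Returns 1 if both variants agree for vectors of length n, 0 otherwise.
   Small integer-valued test data keeps every operation exact in double
   precision, so comparing with == is safe here. */
int axpy4_agree(int n)
{
    enum { MAX_N = 64 };  /* illustrative upper bound on the test length */
    double x1[MAX_N], x2[MAX_N], x3[MAX_N], x4[MAX_N];
    double y_ref[MAX_N], y_unr[MAX_N];
    int i;

    if (n < 0 || n > MAX_N)
        return 0;
    for (i = 0; i < n; i++) {
        x1[i] = i + 1;
        x2[i] = 2 * i;
        x3[i] = 3 - i;
        x4[i] = i * i;
        y_ref[i] = 1.0;
        y_unr[i] = 1.0;
    }
    axpy4_ref(n, y_ref, 2.0, x1, -1.0, x2, 3.0, x3, 4.0, x4);
    axpy4_unroll4(n, y_unr, 2.0, x1, -1.0, x2, 3.0, x3, 4.0, x4);
    for (i = 0; i < n; i++)
        if (y_ref[i] != y_unr[i])
            return 0;
    return 1;
}
```

Calling `axpy4_agree(n)` for lengths such as 0, 3, 4, and 7 exercises both the case where only the cleanup loop runs and the case where the main loop is followed by leftover iterations. Orio's own output for the annotated `axpy4.c` follows.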
{{{
/*@ begin Loop (
  transform Unroll(ufactor=4)
  for (i=0; i<=N-1; i++)
    y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i];
) @*/
#if ORIGCODE
for (i=0; i<=N-1; i++)
  y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i];
#else
for (i=0; i<=N-4; i=i+4) {
  y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i];
  y[i+1] = y[i+1] + a1*x1[i+1] + a2*x2[i+1] + a3*x3[i+1] + a4*x4[i+1];
  y[i+2] = y[i+2] + a1*x1[i+2] + a2*x2[i+2] + a3*x3[i+2] + a4*x4[i+2];
  y[i+3] = y[i+3] + a1*x1[i+3] + a2*x2[i+3] + a3*x3[i+3] + a4*x4[i+3];
}
for (; i<=N-1; i=i+1)
  y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i];
#endif
/*@ end @*/
}}}

=== Using Orio as an Automatic Performance Tool ===