Changes between Version 31 and Version 32 of Orio


Timestamp:
06/11/08 19:21:28 (15 years ago)
Author:
hartono
Comment:

--

  • Orio

    v31 v32  
    8282== User Guide == 
    8383 
    84 As previously discussed , Orio has two main functions: a ''source-to-source transformation tool'' and an ''automatic performance tuning tool''. In this section, simple examples are provided to offer users the quickest steps to begin using Orio. 
     84As previously discussed, Orio has two main functions: a ''source-to-source transformation tool'' and an ''automatic performance tuning tool''. The following subsections provide simple examples that show users the quickest way to begin using Orio. 
    8585 
    8686=== Using Orio as a Source-to-Source Code Transformation Tool === 
     87 
     88Orio has several code transformation modules that have already been implemented and are ready to use. One of these modules is ''loop unrolling'', a loop optimization that aims to increase register reuse and to reduce branching instructions by combining the instructions executed in multiple loop iterations into a single iteration. The sample code below demonstrates how to annotate application code with a simple, portable loop unrolling optimization, using an unroll factor of four. The original code to be optimized in this example is commonly known as AXPY-4, which is an extended version of the AXPY Basic Linear Algebra Subprograms (BLAS) routine. 
     89 
     90{{{ 
     91/*@ begin Loop (  
     92    transform Unroll(ufactor=4)  
     93    for (i=0; i<=N-1; i++) 
     94      y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i]; 
     95) @*/ 
     96for (i=0; i<=N-1; i++) 
     97   y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i]; 
     98/*@ end @*/ 
     99}}} 
     100 
     101To apply loop unrolling to the above code, run the following Orio command (assuming that the code is stored in the file `axpy4.c`). 
     102 
     103{{{ 
     104% orcc axpy4.c 
     105}}} 
     106 
     107By default, the transformed output code is written to the file `_axpy4.c`. Users can specify the name of the output file using the command option '`-o <file>`'. The output code looks like the following. 
     108 
     109{{{ 
     110/*@ begin Loop (  
     111    transform Unroll(ufactor=4)  
     112    for (i=0; i<=N-1; i++) 
     113      y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i]; 
     114) @*/ 
     115#if ORIGCODE 
     116  for (i=0; i<=N-1; i++) 
     117    y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i]; 
     118#else 
     119  for (i=0; i<=N-4; i=i+4) { 
     120    y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i]; 
     121    y[i+1] = y[i+1] + a1*x1[i+1] + a2*x2[i+1] + a3*x3[i+1] + a4*x4[i+1]; 
     122    y[i+2] = y[i+2] + a1*x1[i+2] + a2*x2[i+2] + a3*x3[i+2] + a4*x4[i+2]; 
     123    y[i+3] = y[i+3] + a1*x1[i+3] + a2*x2[i+3] + a3*x3[i+3] + a4*x4[i+3]; 
     124  } 
     125  for (; i<=N-1; i=i+1)  
     126    y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i]; 
     127#endif 
     128/*@ end @*/ 
     129}}} 
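As a rough check of the transformation shown above, the following C sketch wraps the original loop and the unrolled loop (including its cleanup loop) in two functions and compares their results. The function names `axpy4_orig`, `axpy4_unrolled`, and `axpy4_matches`, the `MAX_N` limit, and the test data are hypothetical and not part of Orio; only the loop bodies are taken from the listing above.

```c
/* Original AXPY-4 loop, as in the ORIGCODE branch of the generated file. */
void axpy4_orig(int N, double *y,
                double a1, const double *x1, double a2, const double *x2,
                double a3, const double *x3, double a4, const double *x4)
{
    int i;
    for (i = 0; i <= N-1; i++)
        y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i];
}

/* Unrolled by four, followed by a cleanup loop for the remaining
   N % 4 iterations, as in the #else branch of the generated file. */
void axpy4_unrolled(int N, double *y,
                    double a1, const double *x1, double a2, const double *x2,
                    double a3, const double *x3, double a4, const double *x4)
{
    int i;
    for (i = 0; i <= N-4; i = i+4) {
        y[i]   = y[i]   + a1*x1[i]   + a2*x2[i]   + a3*x3[i]   + a4*x4[i];
        y[i+1] = y[i+1] + a1*x1[i+1] + a2*x2[i+1] + a3*x3[i+1] + a4*x4[i+1];
        y[i+2] = y[i+2] + a1*x1[i+2] + a2*x2[i+2] + a3*x3[i+2] + a4*x4[i+2];
        y[i+3] = y[i+3] + a1*x1[i+3] + a2*x2[i+3] + a3*x3[i+3] + a4*x4[i+3];
    }
    for (; i <= N-1; i = i+1)
        y[i] = y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i];
}

/* Hypothetical helper: returns 1 if both versions produce identical y
   for vectors of length n (0 <= n <= MAX_N), and 0 otherwise. */
int axpy4_matches(int n)
{
    enum { MAX_N = 64 };
    double x1[MAX_N], x2[MAX_N], x3[MAX_N], x4[MAX_N];
    double y_ref[MAX_N], y_unr[MAX_N];
    int i;

    if (n < 0 || n > MAX_N)
        return 0;

    /* Small, exactly representable values so that == comparison is safe. */
    for (i = 0; i < n; i++) {
        x1[i] = i + 1;
        x2[i] = 2 * i;
        x3[i] = i - 3;
        x4[i] = 0.5 * i;
        y_ref[i] = 1.0;
        y_unr[i] = 1.0;
    }

    axpy4_orig(n, y_ref, 1.5, x1, 2.0, x2, -1.0, x3, 0.25, x4);
    axpy4_unrolled(n, y_unr, 1.5, x1, 2.0, x2, -1.0, x3, 0.25, x4);

    for (i = 0; i < n; i++)
        if (y_ref[i] != y_unr[i])
            return 0;
    return 1;
}
```

Lengths that are not multiples of four (for example 7) exercise the cleanup loop, and a length of zero skips both loops. Note also that in the generated file the C preprocessor treats an undefined `ORIGCODE` as 0, so the transformed branch is compiled unless the macro is defined to a nonzero value (for example with a compiler flag such as `-DORIGCODE=1`).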
    87130 
    88131=== Using Orio as an Automatic Performance Tool ===