Changes between Version 39 and Version 40 of Orio


Timestamp:
06/12/08 01:49:15 (15 years ago)
Author:
hartono
Comment:

--

  • Orio

    v39 v40  
    151151 def build {                                                                                            
    152152   arg command = 'gcc';                                                                                  
    153    arg options = '-O0';                                                                                  
     153   arg options = '-O3';                                                                                  
    154154 }                                                                                                      
    155155 def performance_params {                                                                               
     
    183183}}} 
    184184 
    185 The tuned application in the given example is the same AXPY-4 used in the earlier subsection. In this example, the goal of the tuning process is to determine the optimal value of the unroll factor parameter (represented as variable UF) for two distinct problem sizes: N=1K and N=10M. The unroll factor values under consideration range over the integers from 1 to 32, inclusive. 
     185The tuned application in the given example is the same AXPY-4 used in the earlier subsection. The goal of the tuning process is to determine the optimal value of the unroll factor parameter for different problem sizes. The code located in the `PerfTuning` module body section defines the ''tuning specifications'', which consist of the following four definitions: 
     186 
     187 * ''build'': to specify all information needed for compiling and executing the optimized code 
     188 * ''performance_params'': to specify values of parameters used in the program transformations 
     189 * ''input_params'': to specify sizes of the input problem 
     190 * ''input_vars'': to specify both the declarations and the initializations of the input variables 
     191 
     192In this example, the transformed AXPY-4 code is compiled with GCC using the -O3 option to enable aggressive compiler optimizations. The unroll factor values under consideration range over the integers from 1 to 32, inclusive. The AXPY-4 computation is tuned for two distinct problem sizes: N=1K and N=10M. In addition, all scalars and arrays involved in the computation must be declared and initialized in the tuning specifications so that the performance testing driver can execute the optimized code empirically. Note that the ''static'' and ''dynamic'' keywords tell the performance testing driver how to allocate memory for the declared arrays. 
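To make the role of the unroll factor concrete, here is a minimal Python sketch of an AXPY-4 style update (y = y + a1*x1 + a2*x2 + a3*x3 + a4*x4) processed in blocks of `uf` elements with a cleanup loop. It is an illustration only, not code that Orio generates: the function names are hypothetical, and reading N=1K as 1000 and N=10M as 10,000,000 is an assumption.

```python
def _axpy4_at(y, a, xs, i):
    """One AXPY-4 element update: y[i] + a1*x1[i] + a2*x2[i] + a3*x3[i] + a4*x4[i]."""
    return y[i] + a[0] * xs[0][i] + a[1] * xs[1][i] + a[2] * xs[2][i] + a[3] * xs[3][i]


def axpy4_reference(y, a, xs):
    """Straightforward loop with no unrolling (hypothetical helper)."""
    return [_axpy4_at(y, a, xs, i) for i in range(len(y))]


def axpy4_unrolled(y, a, xs, uf):
    """Same computation, processed in blocks of `uf` elements.

    The inner `for j` loop stands in for `uf` copies of the statement that
    an unrolling transformation would write out explicitly.
    """
    n = len(y)
    out = list(y)
    i = 0
    # Main loop: full blocks of uf consecutive elements.
    while i + uf <= n:
        for j in range(uf):
            out[i + j] = _axpy4_at(y, a, xs, i + j)
        i += uf
    # Cleanup loop: the remaining n % uf elements.
    while i < n:
        out[i] = _axpy4_at(y, a, xs, i)
        i += 1
    return out


# Tuning space described in the text: UF in 1..32 for two problem sizes.
UNROLL_FACTORS = list(range(1, 33))
PROBLEM_SIZES = [1_000, 10_000_000]  # assumed reading of N=1K and N=10M
```

Every unroll factor must produce the same result as the reference loop; only the running time differs, which is what the empirical tuning process measures.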
    186193 
    187194Because the search space is huge, a proper search heuristic is a critical component of an empirical tuning system. Hence, in addition to an exhaustive search and a random search, two effective and practical search heuristics have been developed and integrated into Orio's search engine: the ''Nelder-Mead Simplex'' method and the ''Simulated Annealing'' method.
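The difference between these strategies can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Orio's search engine: `synthetic_cost` is a made-up stand-in for a measured running time, all function names and the cooling schedule are hypothetical, and the one-dimensional space of 32 unroll factors is far smaller than the multi-parameter spaces where heuristics pay off.

```python
import math
import random


def synthetic_cost(uf):
    """Hypothetical stand-in for a measured runtime (minimum at uf == 8)."""
    return (uf - 8) ** 2 + 5.0


def exhaustive_search(candidates, cost):
    """Evaluate every candidate and return the cheapest one."""
    return min(candidates, key=cost)


def random_search(candidates, cost, trials, rng):
    """Evaluate `trials` randomly chosen candidates and return the cheapest."""
    sample = [rng.choice(candidates) for _ in range(trials)]
    return min(sample, key=cost)


def simulated_annealing(candidates, cost, steps, rng, t0=10.0, cooling=0.95):
    """Walk between neighboring candidates, sometimes accepting worse ones.

    A worse neighbor is accepted with probability exp(-delta / temp), so the
    walk explores freely at first and settles down as the temperature drops.
    """
    idx = rng.randrange(len(candidates))
    best = idx
    temp = t0
    for _ in range(steps):
        # Neighbor: one position to the left or right, clamped to the range.
        nxt = min(len(candidates) - 1, max(0, idx + rng.choice((-1, 1))))
        delta = cost(candidates[nxt]) - cost(candidates[idx])
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            idx = nxt
        if cost(candidates[idx]) < cost(candidates[best]):
            best = idx
        temp *= cooling
    return candidates[best]
```

Exhaustive search always finds the best candidate but needs one evaluation per point in the space. The random and annealing variants cap the number of evaluations, at the price of possibly returning a suboptimal configuration.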