Version 10 (modified by hartono, 15 years ago)


Performance Tuning Specifications of Orio

In addition to the quick start guide on Orio's main webpage, this document describes the tuning specifications in more detail so that users can take full advantage of Orio's automatic tuning feature.


Below is a concrete example of what Orio tuning specifications look like.

def build {
  arg command = 'icc';
  arg options = '-fast -parallel';
}

let NUM_REGS = 128;
let L1_CACHE_SIZE = 64*(2**20);

def performance_params {
  param TileSize1[] = [1,32,64,128,256,512];
  param TileSize2[] = [1,32,64,128,256,512];
  param UnrollFactor1[] = range(1,32);
  param UnrollFactor2[] = range(1,32);
  constraint RegisterCapacity = UnrollFactor1 * UnrollFactor2 * 9 <= NUM_REGS;
  constraint L1Tiling = TileSize1 * TileSize2 <= L1_CACHE_SIZE;
}

def input_params {
  let SIZES = [100,1000,2000,4000,8000];
  param M[] = SIZES;
  param N[] = SIZES;
  constraint SquareShape = M == N;
}

def input_vars {
  decl dynamic double X[M] = 0;
  decl dynamic double Y[N] = random;
  decl static double A[M][N] = random;
  decl double C = random;
}

Structure of Tuning Specifications

Orio tuning specifications consist of a sequence of definition statements. Each definition statement contains a series of auxiliary statements, which fall into the following five types.

  1. A let statement stores a temporary value in a variable that subsequent statements can reuse. Note that a let statement need not appear inside the body of a definition statement, as the example above shows.
  2. An argument statement collects information from the Orio user about the relevant tuning components. One example above is the command and options arguments in the build definition, which tell Orio how to compile and execute the optimized code.
  3. A parameter statement assigns a range of values to a tuning parameter, which can be either a performance parameter or an input problem parameter. The symbol [] must follow the parameter name to indicate that the parameter has multiple values to be considered.
  4. A constraint statement primarily prunes uninteresting portions of the space of parameter values so that the search concentrates on the regions most likely to yield high-quality solutions. Examples are the RegisterCapacity and L1Tiling constraints. A constraint statement also lets users define the shape of the input arrays, as the SquareShape constraint in the earlier example does.
  5. A declaration statement tells the performance testing driver which input scalars and arrays must be declared and initialized. The static and dynamic keywords tell the driver how to allocate memory for the declared arrays.
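
To give a feel for how much the constraints prune the search space, the following C sketch enumerates the performance parameters of the example above and counts the combinations that satisfy both RegisterCapacity and L1Tiling. It is only an illustration, not Orio's search code: the function name count_valid_points is hypothetical, and range(1,32) is assumed to follow Python semantics (the values 1 through 31).

```c
#include <stddef.h>

#define NUM_REGS 128L
#define L1_CACHE_SIZE (64L * (1L << 20))   /* 64*(2**20) in the specification */

/* Tile sizes listed for TileSize1 and TileSize2 in the example. */
static const long TILE_SIZES[] = {1, 32, 64, 128, 256, 512};
#define NUM_TILE_SIZES (sizeof(TILE_SIZES) / sizeof(TILE_SIZES[0]))

/* Hypothetical helper: returns the number of parameter combinations that
   satisfy both constraints, and stores the size of the unconstrained
   space in *total. */
long count_valid_points(long *total) {
  long valid = 0;
  size_t t1, t2;
  long u1, u2;
  *total = 0;
  for (t1 = 0; t1 < NUM_TILE_SIZES; t1++)
    for (t2 = 0; t2 < NUM_TILE_SIZES; t2++)
      /* Assumption: range(1,32) yields 1..31, as in Python. */
      for (u1 = 1; u1 < 32; u1++)
        for (u2 = 1; u2 < 32; u2++) {
          int register_capacity = u1 * u2 * 9 <= NUM_REGS;
          int l1_tiling = TILE_SIZES[t1] * TILE_SIZES[t2] <= L1_CACHE_SIZE;
          (*total)++;
          if (register_capacity && l1_tiling)
            valid++;
        }
  return valid;
}
```

Under these assumptions, 1476 of the 34596 combinations remain. All of the pruning comes from RegisterCapacity: the largest tile product (512*512) is far below L1_CACHE_SIZE, so L1Tiling excludes nothing for these particular values.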

Declarations and Initializations of Input Variables

As mentioned above, all input variables involved in the core computation must be specified in the input_vars definition statement so that the performance testing driver can generate code for both their declarations and their initializations. However, declaring and initializing input variables can become complicated, especially for multidimensional arrays with special properties such as upper/lower triangular matrices and anti-symmetric matrices. Orio therefore offers its users three alternatives for declaring and initializing input variables.

  1. Both declarations and initializations are generated by the driver.
    def input_vars {
      decl static double X[N][N] = 0;
    }
  2. Declarations are generated by the driver, whereas initializations are written by the user. Note that in this case the declaration statements must not assign initial values.
    def input_vars {
      decl static double X[N][N];
      arg init_file = 'init_code.c';
    }
  3. Both declarations and initializations are written by the user.
    def input_vars {
      arg decl_file = 'decl_code.h';
      arg init_file = 'init_code.c';
    }

The following is the content of the decl_code.h file.

double X[N][N];

The code of the init_code.c file is shown below.

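/* Initializes X as an upper triangular matrix with a unit diagonal. */
void init_input_vars() {
  int i,j;
  for (i=0; i<=N-1; i++)
    for (j=0; j<=N-1; j++)
      if (i < j)
        X[i][j] = (i+j)%10 + 1;   /* above the diagonal */
      else if (i == j)
        X[i][j] = 1;              /* on the diagonal */
      else
        X[i][j] = 0;              /* below the diagonal */
}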

A user-provided initialization file must enclose the initializations of the input variables in a function named init_input_vars; otherwise, Orio reports an error.
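
The same pattern extends to the other matrix shapes mentioned earlier. As a minimal sketch, a user-provided init_code.c for an anti-symmetric matrix might look as follows. The fill values are arbitrary, and N and X are defined here only to make the fragment self-contained; in an actual run they are assumed to come from the tuning specification and the declaration code.

```c
/* Assumption: stand-ins for the size and the array that the driver
   (or decl_code.h) would normally provide. */
#ifndef N
#define N 4
#endif
double X[N][N];

/* Initializes X as an anti-symmetric matrix: X[j][i] == -X[i][j],
   which forces a zero diagonal. */
void init_input_vars() {
  int i, j;
  for (i = 0; i < N; i++) {
    X[i][i] = 0;
    for (j = i + 1; j < N; j++) {
      X[i][j] = (i + j) % 10 + 1;  /* arbitrary nonzero fill value */
      X[j][i] = -X[i][j];          /* mirror with the opposite sign */
    }
  }
}
```

Filling only the upper triangle and mirroring each entry keeps the anti-symmetry property true by construction, rather than relying on two separate formulas agreeing.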

Overriding the Performance Testing Code

Under construction