80 | | The tuning specifications, written by users in the form of annotations, are parsed and used by Orio to guide the search and tuning process. These specifications include important information such as the underlying machine characteristics, the compilers used, the search strategy, the transformation parameters, the input data size, and so on. |
| 80 | The tuning specifications, written by users in the form of annotations, are parsed and used by Orio to guide the search and tuning process. These specifications include important information such as the compilers used, the search strategy, the program transformation parameters, the input data sizes, and so on. |
192 | | So in this example, the transformed AXPY-4 code is compiled using the GCC compiler with the -O3 option to activate all its optimizations. The unroll factor values under consideration extend over the integers from 1 to 32, inclusive. The AXPY-4 computation is tuned for two distinct problem sizes: N=1K and N=10M. Also, all scalars and arrays involved in the computation must be declared and initialized in the tuning specifications to enable the performance testing driver to empirically execute the optimized code. Note that the ''static'' and ''dynamic'' keywords tell the performance testing driver how it should allocate memory space for the declared arrays. |
193 | | |
194 | | Provided the fact that performance tuning is performed for each different problem size, the number of generated programs is therefore equivalent to the number of distinct combinations of input problem sizes. So, there are two generated program outputs for the AXPY-4 example. Using the default file naming convention, the `_axpy_N_1000.c` and `_axpy_N_10000000.c` output files represent the outcomes of the Orio optimization process for input sizes N=1K and N=10M, respectively. |
195 | | |
196 | | === Parameter Space Exploration Strategies === |
| 192 | So in this example, the transformed AXPY-4 code is compiled using the GCC compiler with the -O3 option to activate all its optimizations. The unroll factor values under consideration extend over the integers from 1 to 32, inclusive. The AXPY-4 computation is tuned for two distinct problem sizes: N=1K and N=10M. Also, all scalars and arrays involved in the computation are declared and initialized in the tuning specifications to enable the performance testing driver to empirically execute the optimized code. Note that the ''static'' and ''dynamic'' keywords tell the performance testing driver how it should allocate memory space for the declared arrays. |
| 193 | |
| 194 | As discussed before, Orio performance tuning is performed for each different problem size. The number of generated programs is therefore equivalent to the number of distinct combinations of input problem sizes. So, there are two generated program outputs in the AXPY-4 example. Using the default file naming convention, the `_axpy_N_1000.c` and `_axpy_N_10000000.c` output files represent the outcomes of the Orio optimization process for input sizes N=1K and N=10M, respectively. |
| 195 | |
| 196 | === Selecting Parameter Space Exploration Strategy === |