| 5 | |
| 6 | == Optimizations used == |
| 7 | We used PLuTo (an automatic parallelization and locality optimization tool based on the polyhedral model) as the polyhedral code transformer. We also extended ancc with additional modules that perform syntactic transformations. The polyhedral and syntactic optimizations used in this experiment are listed below. |
| 8 | |
| 9 | ''Polyhedral'' transformations (from PLuTo): |
| 10 | * Loop tiling for L1 and L2 caches |
| 11 | * Loop fusion |
| 12 | * Parallelization for multicore machines |
| 13 | * Register tiling (for rectangular iteration spaces) |
| 14 | |
| 15 | ''Syntactic'' transformations (from ancc modules): |
| 16 | * Register tiling (for both rectangular and non-rectangular iteration spaces) |
| 17 | * Loop permutation/interchange |
| 18 | * Scalar replacement (to enhance register reuse) |
| 19 | |
| 20 | Note that PLuTo's register tiling is limited to rectangular loops. To further improve performance, we implemented our own register tiling as one of ancc's transformation modules. Our approach handles both rectangular and non-rectangular loops. |
20 | | === Optimizations used === |
21 | | We used PLuTo (an automatic parallelization and locality optimization tool based on the polyhedral model) as the polyhedral code transformer. We also extended ancc with additional modules that perform syntactic transformations. The polyhedral and syntactic optimizations used in this experiment are listed below. |
22 | | |
23 | | ''Polyhedral'' transformations (from PLuTo): |
24 | | * Loop tiling for L1 and L2 caches |
25 | | * Parallelization for multicore machines |
26 | | * Register tiling (for rectangular iteration spaces) |
27 | | |
28 | | ''Syntactic'' transformations (from ancc modules): |
29 | | * Register tiling (for both rectangular and non-rectangular iteration spaces) |
30 | | * Loop permutation/interchange |
31 | | * Scalar replacement (to enhance register reuse) |
32 | | |
33 | | Note that PLuTo's register tiling is limited to rectangular loops. To further improve performance, we implemented our own register tiling as one of ancc's transformation modules. Our approach handles both rectangular and non-rectangular loops. |
34 | | |
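To illustrate what register tiling of a non-rectangular loop involves, here is a minimal hand-written sketch in C. It is not output from ancc or PLuTo: the kernel (a lower-triangular matrix-vector product), the function names `trmv_ref`, `trmv_regtiled` and `check_trmv`, and the tile size of 2 are all assumptions chosen for illustration. The outer loop is unrolled by 2 and the copies are jammed into one inner loop over the iterations both rows share; the one extra iteration of the longer row is peeled off, and scalar replacement keeps the accumulators and `x[j]` in registers.

```c
#include <assert.h>

/* Reference kernel: y[i] = sum over j <= i of A[i][j] * x[j].
 * A is an n-by-n matrix stored row-major in a flat array.
 * The inner bound depends on i, so the iteration space is triangular. */
void trmv_ref(int n, const double *A, const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        y[i] = 0.0;
        for (int j = 0; j <= i; j++)
            y[i] += A[i * n + j] * x[j];
    }
}

/* Register-tiled version (unroll-and-jam by 2 on i), illustrative only. */
void trmv_regtiled(int n, const double *A, const double *x, double *y)
{
    int i;
    for (i = 0; i + 1 < n; i += 2) {
        /* Scalar replacement: accumulate in locals instead of y[]. */
        double y0 = 0.0, y1 = 0.0;

        /* Full tiles: j values shared by rows i and i+1. */
        for (int j = 0; j <= i; j++) {
            double xj = x[j];              /* loaded once, used twice */
            y0 += A[i * n + j] * xj;
            y1 += A[(i + 1) * n + j] * xj;
        }

        /* Partial tile: row i+1 has one extra iteration, j = i+1. */
        y1 += A[(i + 1) * n + (i + 1)] * x[i + 1];

        y[i] = y0;
        y[i + 1] = y1;
    }

    /* Cleanup: the last row when n is odd. */
    for (; i < n; i++) {
        double y0 = 0.0;
        for (int j = 0; j <= i; j++)
            y0 += A[i * n + j] * x[j];
        y[i] = y0;
    }
}

/* Self-check on small integer-valued data, so the comparison is exact. */
void check_trmv(int n)
{
    double A[64], x[8], y_ref[8], y_til[8];
    assert(n >= 0 && n <= 8);
    for (int i = 0; i < n; i++) {
        x[i] = (double)(i + 1);
        for (int j = 0; j < n; j++)
            A[i * n + j] = (double)(i + 2 * j + 1);
    }
    trmv_ref(n, A, x, y_ref);
    trmv_regtiled(n, A, x, y_til);
    for (int i = 0; i < n; i++)
        assert(y_ref[i] == y_til[i]);
}
```

For a rectangular loop nest the peeled partial tile disappears, because every unrolled row has the same inner bounds. The non-rectangular case needs the extra step of separating full tiles from partial ones, as the sketch does by hand.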