Last modified on 09/10/10 08:56:48

FLASH Tuning Notes

Test problems

  • To run any of these test problems on a "uniform grid," edit the runtime parameter file so that lrefine_min = lrefine_max and nrefs = 999999. Since nrefs is the number of time steps between refinement tests, this makes the refinement uniform everywhere and ensures that refinement tests are carried out only during initialization, not during the run (as long as the run takes fewer than 999999 steps).
  • Another way to get a uniform grid is to use the UG unit instead of Paramesh. To get this, run setup using the +ug option. I don't know whether all test problems are compatible with UG, but they should be if the UG unit is implemented correctly.
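
As a rough sketch of the parameter-file edit described above, the following Python helper rewrites (or appends) the three settings in the text of a runtime parameter file. The function name make_uniform, the refinement level 4, and the sample file contents are hypothetical; the helper assumes the usual "name = value" layout with "#" comments and is not part of FLASH.

```python
import re

# Hypothetical helper, not part of FLASH: force a uniform grid by setting
# lrefine_min = lrefine_max = level and nrefs = 999999 in parameter-file text.
def make_uniform(par_text: str, level: int) -> str:
    wanted = {
        "lrefine_min": str(level),
        "lrefine_max": str(level),
        "nrefs": "999999",
    }
    seen = set()
    out = []
    for line in par_text.splitlines():
        # Match "name = ..." lines; comment lines starting with "#" do not match.
        m = re.match(r"\s*(\w+)\s*=", line)
        name = m.group(1).lower() if m else None
        if name in wanted:
            out.append(f"{name} = {wanted[name]}")
            seen.add(name)
        else:
            out.append(line)
    # Append any of the three parameters that the file did not set.
    for name, value in wanted.items():
        if name not in seen:
            out.append(f"{name} = {value}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Made-up sample contents, for illustration only.
    sample = "lrefine_min = 1\nlrefine_max = 6\n# a comment\n"
    print(make_uniform(sample, 4), end="")
```

The helper works on text rather than on a file path so that it can be tried without touching a real parameter file; reading and writing the actual file is left to the caller.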

Sedov (hydrodynamics only)

  • Set up on cookie using ./setup Sedov -auto -maxblocks=10000 +noio -with-unit=monitors/Timers/TimersMain/Tau -objdir=obj_sedov_tau -tau=/disks/soft/tau-dev/x86_64/lib/Makefile.tau-phase-papiwallclock-papivirtual-papi-mpi-pdt
  • This builds FLASH for a 2D Sedov explosion problem. This is a spherically symmetric explosion propagating outward from a pressure perturbation at a single point into a uniform-density region with low pressure. Because the explosion expands with time, the portion of the grid that is refined increases with time.
  • Attached: example runtime parameter file, test script to be run from FLASH3 root directory.
  • The maximum numbers of blocks that can be used on cookie are 28308 (with all monitors disabled) and 28296 (with built-in timers). [BN]

Maclaurin (hydro plus multigrid self-gravity)

  • Set up on cookie using ./setup MacLaurin -auto -3d -maxblocks=1000 +noio +gravmgrid -with-unit=monitors/Timers/TimersMain/Tau -objdir=obj_maclaurin_tau -tau=/disks/soft/tau-dev/x86_64/lib/Makefile.tau-phase-papiwallclock-papivirtual-papi-mpi-pdt
  • This builds FLASH for a 3D Maclaurin spheroid problem. Maclaurin spheroids are constant-density, uniformly rotating, self-gravitating spheroids with analytic potential solutions. This problem sets up a Maclaurin spheroid in hydrostatic equilibrium and evolves it forward in time. Ideally, nothing should happen (the code should maintain the equilibrium), so the refinement pattern shouldn't vary much with time.
  • Note that this is a problem with "isolated boundaries," i.e., it is assumed that there is no matter outside the simulation box when computing the gravitational potential. This requires two multigrid calls per Poisson solve: the first solves the Poisson equation with homogeneous (zero) boundary values, then a multipole expansion is used to compute inhomogeneous boundary values up to some limiting multipole order, and the second solves the Laplace equation with these inhomogeneous boundary values. The two multigrid solutions are combined to give the final solution.
  • Attached: example runtime parameter file, test script to be run from FLASH3 root directory.
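
The two-solve procedure for isolated boundaries described above can be written schematically as follows. The notation is introduced here for illustration and is not taken from the FLASH source: \Omega is the simulation box, \phi_0 and \phi_1 are the two multigrid solutions, \Phi_B is the boundary potential obtained from the multipole expansion truncated at order \ell_{\max}, and the factor 4\pi G assumes the standard Newtonian Poisson equation. The sign conventions and the exact quantity that FLASH expands in multipoles may differ from this sketch.

```latex
\begin{align}
\nabla^2 \phi_0 &= 4\pi G \rho \quad \text{in } \Omega,
  & \phi_0 &= 0 \quad \text{on } \partial\Omega, \\
\nabla^2 \phi_1 &= 0 \quad \text{in } \Omega,
  & \phi_1 &= \Phi_B \quad \text{on } \partial\Omega, \\
\phi &= \phi_0 + \phi_1 .
\end{align}
```

By linearity, \phi satisfies the Poisson equation in \Omega and takes the value \Phi_B on the boundary, which is why two multigrid calls suffice per Poisson solve.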

Zel'dovich pancake (hydro plus particles plus cosmology plus multigrid self-gravity)

  • Set up on cookie using ./setup Pancake -auto -3d -maxblocks=1000 +noio +gravmgrid -with-unit=monitors/Timers/TimersMain/Tau -objdir=obj_pancake_tau -tau=/disks/soft/tau-dev/x86_64/lib/Makefile.tau-phase-papiwallclock-papivirtual-papi-mpi-pdt
  • This builds FLASH for a 3D Zel'dovich pancake problem. A Zel'dovich pancake is a plane-parallel, gravitationally unstable perturbation in an expanding universe. For cosmological problems, FLASH works in comoving coordinates (the coordinate system stretches with the expansion of the universe), which leads to some additional damping terms in the momentum and energy equations along with a time-dependent rescaling of the Poisson equation.
  • The perturbation starts off with a sinusoidal form and a very small amplitude, eventually develops a density and pressure caustic at each positive perturbation maximum, and produces shock waves that expand outward from each caustic. Hence the refinement is initially uniform, then develops a refined area at the midplane of each perturbation that grows as the shock waves expand outward.
  • The boundary conditions here are periodic, so only one multigrid call is needed per Poisson solve.
  • There appear to be some debugging print statements that confuse the output; I'll see if I can fix these. Also, it may be necessary to run this problem for more steps if we want to see effects due to the changing refinement pattern.
  • Attached: example runtime parameter file, test script to be run from FLASH3 root directory.

Autotuning requirements

  • Fortran support
  • Inlining of small subroutines, especially those that have several consecutive invocations
  • Loop fusion
  • Loop unrolling
  • Different levels of tiling -- employing polyhedral techniques
  • Eliminating array copies