| 1 | \documentclass[dvipdfm,11pt]{article} |
|---|
| 2 | \usepackage[dvipdfm]{hyperref} % Upgraded url package |
|---|
| 3 | \parskip=.1in |
|---|
| 4 | |
|---|
| 5 | % Formatting conventions for contributors |
|---|
| 6 | % |
|---|
| 7 | % A quoting mechanism is needed to set off things like file names, command |
|---|
| 8 | % names, code fragments, and other strings that would confuse the flow of |
|---|
| 9 | % text if left undistinguished from preceding and following text. In this |
|---|
| 10 | % document we use the LaTeX macro '\texttt' to indicate such text in the |
|---|
| 11 | % source, which normally produces, when used as in '\texttt{special text}', |
|---|
| 12 | % the typewriter font. |
|---|
| 13 | |
|---|
| 14 | % It is particularly easy to use this convention if one is using emacs as |
|---|
| 15 | % the editor and LaTeX mode within emacs for editing LaTeX documents. In |
|---|
| 16 | % such a case the key sequence ^C^F^T (hold down the control key and type |
|---|
| 17 | % 'cft') produces '\texttt{}' with the cursor positioned between the |
|---|
| 18 | % braces, ready for the special text to be typed. The closing brace can |
|---|
| 19 | % be skipped over by typing ^e (go to the end of the line) if entering |
|---|
| 20 | % text or ^C-} to just move the cursor past the brace. |
|---|
| 21 | |
|---|
| 22 | % LaTeX mode is usually loaded automatically. At Argonne, one way to |
|---|
| 23 | % get several useful emacs tools working for you automatically is to put |
|---|
| 24 | % the following in your .emacs file. |
|---|
| 25 | |
|---|
| 26 | % (require 'tex-site) |
|---|
| 27 | % (setq LaTeX-mode-hook '(lambda () |
|---|
| 28 | % (auto-fill-mode 1) |
|---|
| 29 | % (flyspell-mode 1) |
|---|
| 30 | % (reftex-mode 1) |
|---|
| 31 | % (setq TeX-command "latex"))) |
|---|
| 32 | |
|---|
| 33 | |
|---|
| 34 | \begin{document} |
|---|
| 35 | \markright{MPICH2 Windows Development Guide} |
|---|
| 36 | \title{{\bf MPICH2 Windows Development Guide}\thanks{This work was supported by the |
|---|
| 37 | Mathematical, Information, and Computational Sciences Division |
|---|
| 38 | subprogram of the Office of Advanced Scientific Computing Research, |
|---|
| 39 | SciDAC Program, Office of Science, U.S. Department of Energy, under |
|---|
| 40 | Contract DE-AC02-06CH11357.}\\ |
|---|
| 41 | Version %VERSION%\\ |
|---|
| 42 | Mathematics and Computer Science Division\\ |
|---|
| 43 | Argonne National Laboratory} |
|---|
| 44 | |
|---|
| 45 | \author{David Ashton\\ |
|---|
| 46 | Jayesh Krishna} |
|---|
| 47 | |
|---|
| 48 | \maketitle |
|---|
| 49 | \cleardoublepage |
|---|
| 50 | |
|---|
| 51 | \pagenumbering{roman} |
|---|
| 52 | \tableofcontents |
|---|
| 53 | \clearpage |
|---|
| 54 | |
|---|
| 55 | \pagenumbering{arabic} |
|---|
| 56 | \pagestyle{headings} |
|---|
| 57 | |
|---|
| 58 | \section{Introduction} |
|---|
| 59 | \label{sec:intro} |
|---|
| 60 | This manual describes how to set up a Windows machine for building and testing MPICH2. |
|---|
| 61 | |
|---|
| 62 | \section{Build machine} |
|---|
| 63 | \label{sec:machine} |
|---|
| 64 | |
|---|
| 65 | Build a Windows XP or Windows Server 2003 machine. This machine should have access to |
|---|
| 66 | the internet to be able to download the MPICH2 source code. |
|---|
| 67 | |
|---|
| 68 | \section{Test machine} |
|---|
| 69 | \label{sec:test_machine} |
|---|
| 70 | |
|---|
| 71 | Build a Windows XP or Windows Server 2003 machine on a 32-bit CPU. |
|---|
| 72 | Also build a Windows Server 2003 X64 machine to test the Win64 distribution. |
|---|
| 73 | |
|---|
| 74 | \section{Software} |
|---|
| 75 | |
|---|
| 76 | This section describes the software necessary to build MPICH2. |
|---|
| 77 | |
|---|
| 78 | \subsection{Packages} |
|---|
| 79 | \label{sec:packages} |
|---|
| 80 | |
|---|
| 81 | To build MPICH2 you will need: |
|---|
| 82 | \begin{enumerate} |
|---|
| 83 | \item Microsoft Visual Studio 2005 |
|---|
| 84 | \item The latest version of Microsoft .NET framework |
|---|
| 85 | \item Microsoft Platform SDK |
|---|
| 86 | \item Cygwin - full installation |
|---|
| 87 | \item Intel Fortran compiler IA32 |
|---|
| 88 | \item Intel Fortran compiler EM64T |
|---|
| 89 | \item Java SDK |
|---|
| 90 | \end{enumerate} |
|---|
| 91 | |
|---|
| 92 | Microsoft Visual Studio 2005 can be found on the CDs from an MSDN subscription. |
|---|
| 93 | |
|---|
| 94 | The Platform SDK can also be found on the MSDN CDs or downloaded from Microsoft.com. The |
|---|
| 95 | latest version as of the writing of this document was Platform SDK - Windows Server 2003 SP1. |
|---|
| 96 | The platform SDK usually has an up-to-date version of headers and libraries. |
|---|
| 97 | |
|---|
| 98 | The Intel Fortran compilers need to be installed after Visual Studio and the PSDK because |
|---|
| 99 | they integrate themselves into those two products. Both the regular IA32 compiler and the |
|---|
| 100 | EM64T compiler need to be installed. They are two separate packages and |
|---|
| 101 | they require a license file to use. The license file is for a single user on a single |
|---|
| 102 | machine. |
|---|
| 103 | |
|---|
| 104 | Cygwin needs to be installed to get svn, perl and ssh. By default the Cygwin installer might |
|---|
| 105 | not install all the required packages, so make sure that the required packages are selected |
|---|
| 106 | during the install. MPICH2 also requires autoconf version 2.62 or above. The OpenPA library |
|---|
| 107 | used by MPICH2 requires the automake package. Select to use the DOS file format when installing |
|---|
| 108 | Cygwin. |
|---|
| 109 | |
|---|
| 110 | Assuming you installed Cygwin to the default \texttt{c:$\backslash$cygwin} directory, add |
|---|
| 111 | \texttt{c:$\backslash$cygwin$\backslash$bin} to your PATH environment variable. This is |
|---|
| 112 | required so the automated scripts can run tools like ssh and perl without specifying the |
|---|
| 113 | full path. |
|---|
| 114 | |
|---|
| 115 | The Java SDK needs to be installed so the logging library can be compiled. After installing |
|---|
| 116 | the SDK set the JAVA\_HOME environment variable to point to the installation directory. |
|---|
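The two environment requirements above (the Cygwin \texttt{bin} directory on PATH and a JAVA\_HOME setting) can be checked before starting a build. The following is a minimal Python sketch, not part of the MPICH2 tools; the function name \texttt{check\_build\_env} is hypothetical, and the default Cygwin location is the one assumed in the text.

```python
import os


def check_build_env(env, cygwin_bin=r"c:\cygwin\bin"):
    """Return a list of problems found in the given environment mapping."""
    problems = []
    # Windows separates PATH entries with ';' and ignores case in paths.
    entries = [p.strip().rstrip("\\").lower()
               for p in env.get("PATH", "").split(";")]
    if cygwin_bin.lower() not in entries:
        problems.append("PATH does not contain " + cygwin_bin)
    # JAVA_HOME must be set and non-empty for the logging library build.
    if not env.get("JAVA_HOME"):
        problems.append("JAVA_HOME is not set")
    return problems


if __name__ == "__main__":
    for problem in check_build_env(dict(os.environ)):
        print(problem)
```

An empty result means both settings were found; it does not verify that the individual Cygwin packages are installed.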
| 117 | |
|---|
| 118 | Run the following command from a command prompt to change the Windows script engine from |
|---|
| 119 | GUI mode to console mode: |
|---|
| 120 | \begin{verbatim} |
|---|
| 121 | cscript //H:cscript |
|---|
| 122 | \end{verbatim} |
|---|
| 123 | |
|---|
| 124 | \section{Building MPICH2} |
|---|
| 125 | \label{sec:building} |
|---|
| 126 | |
|---|
| 127 | This section describes how to make various packages once you have a working build machine. |
|---|
| 128 | |
|---|
| 129 | \subsection{Visual Studio automated 32bit build} |
|---|
| 130 | \label{sec:vsbuild} |
|---|
| 131 | |
|---|
| 132 | The easiest way to build an MPICH2 distribution is to use the Visual Studio environment |
|---|
| 133 | and the makewindist.bat script from the top level of the mpich2 source tree. You can check |
|---|
| 134 | out mpich2 from SVN or you can simply copy this batch file from the distribution. The batch |
|---|
| 135 | file knows how to check out mpich2, so it is the only file required to make a distribution. |
|---|
| 136 | |
|---|
| 137 | The product GUIDs need to be changed when a new release is created. To do this run |
|---|
| 138 | ``\texttt{perl update\_windows\_version $<$new\_version$>$}''. Run this script with mpich2/maint |
|---|
| 139 | as the current directory so the project files can be found. Example: |
|---|
| 140 | \begin{verbatim} |
|---|
| 141 | perl update_windows_version 1.0.8 |
|---|
| 142 | \end{verbatim} |
|---|
| 143 | |
|---|
| 144 | Or you can modify the project files by hand. Edit mpich2/maint/mpich2i.vdproj. The ProductCode |
|---|
| 145 | and PackageCode entries need to be changed to use new GUIDs. Under Unix or Windows, uuidgen can |
|---|
| 146 | be used to generate a new GUID. The ProductVersion entry needs to be changed to match the |
|---|
| 147 | version of MPICH2. Once the version and GUIDs have been updated, commit the changes to |
|---|
| 148 | mpich2i.vdproj to SVN. Now you can build a distribution. |
|---|
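Where uuidgen is not at hand, new GUIDs can be produced in other ways. The following is a minimal Python sketch that prints one fresh GUID for each of the two entries. It assumes the uppercase, brace-wrapped form commonly used for GUIDs in Windows installer projects; compare with the existing entries in mpich2i.vdproj before pasting. The function name \texttt{make\_guid} is hypothetical.

```python
import uuid


def make_guid() -> str:
    """Return a random GUID as an uppercase string wrapped in braces."""
    # uuid4() yields a random (version 4) UUID.
    return "{" + str(uuid.uuid4()).upper() + "}"


if __name__ == "__main__":
    # One new GUID per entry that must change for a release.
    for entry in ("ProductCode", "PackageCode"):
        print(entry, make_guid())
```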
| 149 | |
|---|
| 150 | Bring up a build command prompt by selecting Start$\to$Programs$\to$Microsoft Visual Studio |
|---|
| 151 | 2005$\to$Visual Studio 2005 Tools$\to$Visual Studio 2005 Command Prompt. |
|---|
| 152 | |
|---|
| 153 | Change directories to wherever you want to create the distribution. mpich2 will be checked |
|---|
| 154 | out under the current directory. Run the makewindist batch file: |
|---|
| 155 | |
|---|
| 156 | \begin{verbatim} |
|---|
| 157 | makewindist.bat --with-checkout |
|---|
| 158 | \end{verbatim} |
|---|
| 159 | |
|---|
| 160 | The batch file executes the following steps: |
|---|
| 161 | \begin{enumerate} |
|---|
| 162 | \item Check out trunk from the MPICH2 svn repository. |
|---|
| 163 | \item Run \texttt{maint/updatefiles} to generate the autogenerated files |
|---|
| 164 | \item Run ``\texttt{winconfigure.wsf --cleancode}'' to configure mpich2 for Windows and output |
|---|
| 165 | all the generated files like mpi.h and the fortran interface files, etc. |
|---|
| 166 | \item Run the Visual Studio command line tool to build all the components of MPICH2. This |
|---|
| 167 | includes each of the channels - sock, nemesis, ssm, shm, and the multi-threaded sock |
|---|
| 168 | channel. Two versions of each channel are built, the regular release build and the rlog |
|---|
| 169 | profiled version. The mpi wrapper channel selector dll is built and three Fortran interfaces |
|---|
| 170 | are built, one for each set of common symbol types and calling conventions. mpiexec and |
|---|
| 171 | smpd are built along with the Windows GUI tools and the Cygwin libraries. (These are the Cygwin |
|---|
| 172 | link libraries to use the Windows native build of MPICH2, not a Unix-style build of MPICH2 |
|---|
| 173 | under Cygwin.) |
|---|
| 174 | \item Package up everything into \texttt{maint$\backslash$ReleaseMSI$\backslash$mpich2.msi}. |
|---|
| 175 | \end{enumerate} |
|---|
| 176 | |
|---|
| 177 | When the batch file is finished you will be left with an mpich2.msi file that can be used to |
|---|
| 178 | install MPICH2 on any Win32 machine. This file can be renamed to match the release naming |
|---|
| 179 | conventions. |
|---|
| 180 | |
|---|
| 181 | \subsubsection{Automated build from the source distribution} |
|---|
| 182 | Follow the steps mentioned below to build MPICH2 from a source tarball. |
|---|
| 183 | \begin{enumerate} |
|---|
| 184 | \item unzip/untar the source distribution |
|---|
| 185 | \item Open a Visual Studio Command Prompt |
|---|
| 186 | \item cd into the mpich2xxx directory |
|---|
| 187 | \item execute ``\texttt{winconfigure.wsf --cleancode}'' |
|---|
| 188 | \item execute ``\texttt{makewindist.bat --with-curdir}'' |
|---|
| 189 | \end{enumerate} |
|---|
| 190 | |
|---|
| 191 | \subsubsection{Building without Fortran} |
|---|
| 192 | If you don't have a Fortran compiler you can use winconfigure.wsf to remove the |
|---|
| 193 | Fortran projects. Execute \texttt{winconfigure.wsf --remove-fortran --cleancode}. |
|---|
| 194 | Then you can build the projects without Fortran support. If you want to use the |
|---|
| 195 | \texttt{makewindist.bat} script you will need to remove the Fortran lines from |
|---|
| 196 | it before executing it. |
|---|
| 197 | |
|---|
| 198 | \subsection{Platform SDK builds} |
|---|
| 199 | \label{sec:psdk_build} |
|---|
| 200 | |
|---|
| 201 | The makefile in the \texttt{mpich2$\backslash$winbuild} directory builds a distribution based |
|---|
| 202 | on the compilers specified in the environment. The following targets can all be built with |
|---|
| 203 | this mechanism: |
|---|
| 204 | \begin{itemize} |
|---|
| 205 | \item Win64 X64 |
|---|
| 206 | \item Win64 IA64 |
|---|
| 207 | \item Win32 x86 |
|---|
| 208 | \end{itemize} |
|---|
| 209 | |
|---|
| 210 | Follow the steps below to build MPICH2. |
|---|
| 211 | \begin{enumerate} |
|---|
| 212 | \item Open a Cygwin bash shell and check out mpich2: |
|---|
| 213 | |
|---|
| 214 | \texttt{svn checkout https://svn.mcs.anl.gov/repos/mpi/mpich2/trunk mpich2}. |
|---|
| 215 | \item cd into mpich2 directory |
|---|
| 216 | \item run \texttt{maint/updatefiles} |
|---|
| 217 | \item Open a Visual Studio command prompt |
|---|
| 218 | \item From within the Visual Studio command prompt run |
|---|
| 219 | \texttt{winconfigure.wsf --cleancode} |
|---|
| 220 | \end{enumerate} |
|---|
| 221 | |
|---|
| 222 | To build the Win64 X64 distribution do the following: |
|---|
| 223 | \begin{enumerate} |
|---|
| 224 | \item Bring up a build command prompt from the PSDK. It can be found here: Start$\to$Programs |
|---|
| 225 | $\to$Microsoft Platform SDK for Windows Server 2003 SP1$\to$Open Build Environment Window$\to$ |
|---|
| 226 | Windows Server 2003 64-bit Build Environment$\to$Set Win Svr 2003 x64 Build Env (Retail) |
|---|
| 227 | \item Run \texttt{$\backslash$Program Files$\backslash$Intel$\backslash$Fortran$\backslash$compiler80$\backslash$Ia32e$\backslash$Bin$\backslash$ifortvars.bat} |
|---|
| 228 | \item cd into \texttt{mpich2$\backslash$winbuild} |
|---|
| 229 | \item run \texttt{build.bat 2>\&1 | tee build.x64.out} |
|---|
| 230 | \end{enumerate} |
|---|
| 231 | |
|---|
| 232 | To build the installer for Win64 x64, open the mpich2 solution file, mpich2.sln, using |
|---|
| 233 | Visual Studio 2005 and build the Installerx64 solution. The installer, \texttt{mpich2.msi}, |
|---|
| 234 | will be available in the \texttt{mpich2$\backslash$maint$\backslash$ReleaseMSIx64} directory. |
|---|
| 235 | |
|---|
| 236 | |
|---|
| 237 | The Visual Studio 2005 compiler provides a Cross tools command prompt for building |
|---|
| 238 | X64 applications. However the current makefile depends on environment variables not available |
|---|
| 239 | with the Cross tools command prompt. |
|---|
| 240 | |
|---|
| 241 | To build the Win64 IA64 distribution do the following: |
|---|
| 242 | \begin{enumerate} |
|---|
| 243 | \item Bring up a build command prompt from the PSDK. It can be found here: Start$\to$Programs |
|---|
| 244 | $\to$Microsoft Platform SDK for Windows Server 2003 SP1$\to$Open Build Environment Window$\to$ |
|---|
| 245 | Windows Server 2003 64-bit Build Environment$\to$Set Win Svr 2003 IA64 Build Env (Retail) |
|---|
| 246 | \item Run \texttt{$\backslash$Program Files$\backslash$Intel$\backslash$Fortran$\backslash$compiler80$\backslash$Itanium$\backslash$Bin$\backslash$ifortvars.bat} |
|---|
| 247 | \item cd into \texttt{mpich2$\backslash$winbuild} |
|---|
| 248 | \item run \texttt{build.bat 2>\&1 | tee build.ia64.out} |
|---|
| 249 | \end{enumerate} |
|---|
| 250 | |
|---|
| 251 | To build the Win32 x86 distribution do the following: |
|---|
| 252 | \begin{enumerate} |
|---|
| 253 | \item Bring up a build command prompt from the PSDK. It can be found here: Start$\to$Programs |
|---|
| 254 | $\to$Microsoft Platform SDK for Windows Server 2003 SP1$\to$Open Build Environment Window$\to$ |
|---|
| 255 | Windows 2000 Build Environment$\to$Set Windows 2000 Build Environment (Retail) |
|---|
| 256 | \item Run \texttt{$\backslash$Program Files$\backslash$Intel$\backslash$Fortran$\backslash$compiler80$\backslash$Ia32$\backslash$Bin$\backslash$ifortvars.bat} |
|---|
| 257 | \item cd into \texttt{mpich2$\backslash$winbuild} |
|---|
| 258 | \item run \texttt{build.bat 2>\&1 | tee build.x86.out} |
|---|
| 259 | \end{enumerate} |
|---|
| 260 | |
|---|
| 261 | \section{Distributing MPICH2 builds} |
|---|
| 262 | \label{sec:distribute} |
|---|
| 263 | |
|---|
| 264 | If you built an .msi file using the Visual Studio build process (Section~\ref{sec:vsbuild}), then |
|---|
| 265 | all you have to do is rename the \texttt{mpich2.msi} file to something appropriate like |
|---|
| 266 | \texttt{mpich2-1.0.3-1-win32-ia32.msi}. |
|---|
| 267 | |
|---|
| 268 | If you built using the Platform SDK build process (Section~\ref{sec:psdk_build}), then the output files |
|---|
| 269 | are left in their build locations and need to be collected and put in a zip file for |
|---|
| 270 | distributing. This process should be automated with a script. |
|---|
| 271 | |
|---|
| 272 | \section{Testing MPICH2} |
|---|
| 273 | \label{sec:testing} |
|---|
| 274 | |
|---|
| 275 | Run the \texttt{testmpich2.wsf} script to check out mpich2, build it, install it, check out |
|---|
| 276 | the test suites, build them, run the test suites, and collect the results in a web page. |
|---|
| 277 | |
|---|
| 278 | \subsection{Testing from scratch} |
|---|
| 279 | Explain the use of testmpich2.wsf. |
|---|
| 280 | |
|---|
| 281 | Run ``\texttt{testmpich2.wsf}'' without any parameters and it will create a \texttt{testmpich2} |
|---|
| 282 | subdirectory and check out into that directory mpich2 and the test suites - c++, mpich, intel |
|---|
| 283 | and mpich2. It will then build mpich2 and all the tests from the test suites. Then it will run |
|---|
| 284 | the tests and place a summary in \texttt{testmpich2$\backslash$summary$\backslash$index.html}. |
|---|
| 285 | |
|---|
| 286 | \subsection{Testing a built mpich2 directory} |
|---|
| 287 | Explain how to run \texttt{testmpich2.wsf} if you have the mpich2 source tree on a machine and you |
|---|
| 288 | have already built all of mpich2. |
|---|
| 289 | |
|---|
| 290 | Here is a sample batch file to test mpich2 that has already been built in c:$\backslash$mpich2: |
|---|
| 291 | \begin{verbatim} |
|---|
| 292 | testmpich2.wsf /mpich2:c:\mpich2 /make- /configure- /buildbatch |
|---|
| 293 | pushd testmpich2\buildMPICH |
|---|
| 294 | call mpich_cmds.bat |
|---|
| 295 | popd |
|---|
| 296 | pushd testmpich2\buildCPP |
|---|
| 297 | call cpp_cmds.bat |
|---|
| 298 | popd |
|---|
| 299 | pushd testmpich2\buildINTEL |
|---|
| 300 | call intel_cmds.bat |
|---|
| 301 | popd |
|---|
| 302 | pushd testmpich2\buildMPICH2 |
|---|
| 303 | call mpich2_cmds.bat |
|---|
| 304 | popd |
|---|
| 305 | testmpich2.wsf /mpich2:c:\mpich2 /make- /configure- /summarize |
|---|
| 306 | \end{verbatim} |
|---|
| 307 | |
|---|
| 308 | \subsection{Testing an existing installation} |
|---|
| 309 | Explain the use of testmpich2.wsf to test an existing installation, one that was installed |
|---|
| 310 | with the .msi distribution. |
|---|
| 311 | |
|---|
| 312 | \section{Development issues} |
|---|
| 313 | This section describes development issues that are particular to the Windows build. |
|---|
| 314 | |
|---|
| 315 | Whenever a .h.in file is created on the Unix side, winconfigure.wsf needs to be updated to |
|---|
| 316 | create the .h file from the .h.in file. Copy and paste an existing section in |
|---|
| 317 | winconfigure.wsf that already does this and rename the file names. |
|---|
| 318 | |
|---|
| 319 | When new definitions are added to the .h.in files these definitions, usually in the form HAVE\_FOO |
|---|
| 320 | or USE\_FOO, need to be added to the AddDefinitions function in winconfigure.wsf. Simply add |
|---|
| 321 | new cases to the big case statement as needed. winconfigure.wsf warns you of definitions that |
|---|
| 322 | are not in the case statement. |
|---|
| 323 | |
|---|
| 324 | Whenever a @FOO@ substitution is added on the Unix side, winconfigure.wsf needs to be updated |
|---|
| 325 | to handle the substitution. Find the ReplaceAts function in winconfigure.wsf and add the |
|---|
| 326 | substitution to the big case statement. winconfigure.wsf warns you of new substitutions that |
|---|
| 327 | have not been added to the case statement. |
|---|
| 328 | |
|---|
| 329 | \section{Runtime environment} |
|---|
| 330 | |
|---|
| 331 | This section describes the MPICH2 environment that is particular to Windows. |
|---|
| 332 | |
|---|
| 333 | \subsection{User credentials} |
|---|
| 334 | mpiexec must have the user name and password to launch MPI applications in the context of |
|---|
| 335 | that user. This information can be stored in a secure encrypted manner for each user on a |
|---|
| 336 | machine. Run \texttt{mpiexec -register} to save your username and password. Then mpiexec |
|---|
| 337 | will not prompt you for this information. |
|---|
| 338 | |
|---|
| 339 | This is also true for a nightly build script. The user context under which the script is |
|---|
| 340 | run must have saved credentials so mpiexec doesn't prompt for them. So scripts won't hang, |
|---|
| 341 | mpiexec provides a flag, \texttt{-noprompt}, that will cause mpiexec to print out errors in |
|---|
| 342 | cases when it normally would prompt for user input. This can also be specified in the |
|---|
| 343 | environment with the variable MPIEXEC\_NOPROMPT. |
|---|
| 344 | |
|---|
| 345 | You can also save more than one set of user credentials. Add the option \texttt{-user n} |
|---|
| 346 | to the \texttt{-register}, \texttt{-remove}, \texttt{-validate}, and \texttt{mpiexec} |
|---|
| 347 | commands to specify a saved user credential other than the default. The parameter \texttt{n} |
|---|
| 348 | is a non-zero positive number. For example this will save credentials in slot 1: |
|---|
| 349 | \begin{verbatim} |
|---|
| 350 | mpiexec -register -user 1 |
|---|
| 351 | \end{verbatim} |
|---|
| 352 | And this command uses the credentials saved in slot 3 to launch a job: |
|---|
| 353 | \begin{verbatim} |
|---|
| 354 | mpiexec -user 3 -n 4 cpi.exe |
|---|
| 355 | \end{verbatim} |
|---|
| 356 | |
|---|
| 357 | User credentials can also be specified in a file using the \texttt{-pwdfile filename} |
|---|
| 358 | option to mpiexec. Put the username on the first line of the file and the password |
|---|
| 359 | on the second line. If you choose this option you should make sure the file is |
|---|
| 360 | only readable by the current user. |
|---|
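The file layout described above is simple enough to produce from a script. The following is a minimal Python sketch, assuming plain UTF-8 text is acceptable to mpiexec (the text does not state an encoding); the function name \texttt{write\_pwdfile} is hypothetical. It does not restrict access to the file, so permissions still have to be tightened separately.

```python
from pathlib import Path


def write_pwdfile(path: str, username: str, password: str) -> None:
    """Write the username on the first line and the password on the second."""
    # Layout follows the description of the -pwdfile option:
    # line 1 is the username, line 2 is the password.
    Path(path).write_text(f"{username}\n{password}\n", encoding="utf-8")


# Example (hypothetical file name and credentials):
# write_pwdfile("cred.txt", "DOMAIN\\alice", "secret")
# mpiexec -pwdfile cred.txt -n 4 cpi.exe
```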
| 361 | |
|---|
| 362 | \subsection{MPICH2 channel selection} |
|---|
| 363 | MPICH2 for Windows comes with multiple complete implementations of MPI. These are called |
|---|
| 364 | channels and each build represents a different transport mechanism used to move MPI messages. |
|---|
| 365 | The default channel (sock) uses sockets for communication. There is a channel that uses only |
|---|
| 366 | shared memory (shm). There are two channels that use both sockets and shared memory |
|---|
| 367 | (nemesis, ssm). And there is a thread-safe version of the sockets channel (mt). We recommend |
|---|
| 368 | that users use the sock, mt, or nemesis channels. The shm and ssm channels will soon be deprecated. |
|---|
| 369 | |
|---|
| 370 | The short names for the channels are: sock, nemesis, shm, ssm, mt. |
|---|
| 371 | |
|---|
| 372 | These channels can be selected at runtime with an environment variable: MPICH2\_CHANNEL. |
|---|
| 373 | The following is an example that uses the nemesis channel instead of the default sockets |
|---|
| 374 | channel: |
|---|
| 375 | |
|---|
| 376 | \begin{verbatim} |
|---|
| 377 | mpiexec -env MPICH2_CHANNEL nemesis -n 4 myapp.exe |
|---|
| 378 | or |
|---|
| 379 | mpiexec -channel nemesis -n 4 myapp.exe |
|---|
| 380 | \end{verbatim} |
|---|
| 381 | |
|---|
| 382 | If you specify \texttt{auto} for the channel then mpiexec will automatically choose a |
|---|
| 383 | channel for you. |
|---|
| 384 | \begin{verbatim} |
|---|
| 385 | mpiexec -channel auto -n 4 myapp.exe |
|---|
| 386 | \end{verbatim} |
|---|
| 387 | The rules are: |
|---|
| 388 | \begin{enumerate} |
|---|
| 389 | \item If numprocs is less than 8 on one machine, use the shm channel |
|---|
| 390 | \item If running on multiple machines, use the ssm channel. This channel can be changed |
|---|
| 391 | using winconfigure. |
|---|
| 392 | \end{enumerate} |
|---|
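The two rules above can be written as a small decision function. The following Python sketch is only a reading of the rules as listed, not the code mpiexec actually runs; the function name \texttt{auto\_channel} and its parameters are hypothetical. The case of eight or more processes on a single machine is not covered by the rules, so the sketch returns \texttt{None} for it rather than guessing.

```python
from typing import Optional


def auto_channel(numprocs: int, num_machines: int) -> Optional[str]:
    """Pick a channel name following the two rules listed for 'auto'."""
    if num_machines > 1:
        # Rule 2: jobs spanning multiple machines use ssm.
        return "ssm"
    if numprocs < 8:
        # Rule 1: fewer than 8 processes on one machine use shm.
        return "shm"
    # Not covered by the listed rules (assumption: leave undecided).
    return None


if __name__ == "__main__":
    print(auto_channel(4, 1))
    print(auto_channel(4, 2))
```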
| 393 | |
|---|
| 394 | \subsection{MPI apps with GUI} |
|---|
| 395 | Many users on Windows machines want to build GUI apps that are also MPI applications. This is |
|---|
| 396 | completely acceptable as long as the application follows the rules of MPI. MPI\_Init must be |
|---|
| 397 | called before any other MPI function and it needs to be called soon after each process starts. |
|---|
| 398 | The processes must be started with mpiexec but they are not required to be console applications. |
|---|
| 399 | |
|---|
| 400 | The one catch is that MPI applications are hidden from view, so any windows that a user |
|---|
| 401 | application brings up will not be visible. mpiexec has an option to allow the MPI |
|---|
| 402 | processes on the local machine to bring up GUIs. Add \texttt{-localroot} to the mpiexec |
|---|
| 403 | command to enable this capability. But even with this option, all GUIs from processes on |
|---|
| 404 | remote machines will be hidden. |
|---|
| 405 | |
|---|
| 406 | So the only GUI application that MPICH2 cannot handle by default would be a video-wall type |
|---|
| 407 | application. But this can be done by running smpd.exe by hand on each machine instead of |
|---|
| 408 | installing it as a service. Log on to each machine and run ``\texttt{smpd.exe -stop}'' |
|---|
| 409 | to stop the service and then run ``\texttt{smpd.exe -d 0}'' to start up the smpd again. |
|---|
| 410 | As long as this process is running you will be able to run applications where every process |
|---|
| 411 | is allowed to bring up GUIs. |
|---|
| 412 | |
|---|
| 413 | \subsection{Security} |
|---|
| 414 | MPICH2 can use Microsoft's SSPI interface to launch processes without using any user |
|---|
| 415 | passwords. This is the most secure way to launch MPI jobs but it requires the machines to be |
|---|
| 416 | configured in a certain way. |
|---|
| 417 | \begin{itemize} |
|---|
| 418 | \item All machines must be part of a Windows domain. |
|---|
| 419 | \item Each machine must have delegation enabled. |
|---|
| 420 | \item Each user that will run jobs must be allowed to use delegation. |
|---|
| 421 | \end{itemize} |
|---|
| 422 | |
|---|
| 423 | If the machines are set up this way then an administrator can set up MPICH2 for passwordless |
|---|
| 424 | authentication. On each node, a domain administrator needs to execute the following: |
|---|
| 425 | ``\texttt{smpd -register\_spn}''. |
|---|
| 426 | |
|---|
| 427 | Then a user can add the \texttt{-delegate} flag to their mpiexec commands and the job startup |
|---|
| 428 | will be done without any passwords. Example: |
|---|
| 429 | \begin{verbatim} |
|---|
| 430 | mpiexec -delegate -n 3 cpi.exe |
|---|
| 431 | \end{verbatim} |
|---|
| 432 | |
|---|
| 433 | With SSPI enabled you can also control access to nodes with job objects. |
|---|
| 434 | |
|---|
| 435 | First the nodes need to be set up so that only SSPI authentication is allowed. An administrator |
|---|
| 436 | can run the following on each node: |
|---|
| 437 | \begin{enumerate} |
|---|
| 438 | \item \texttt{smpd.exe -set sspi\_protect yes} |
|---|
| 439 | \item \texttt{smpd.exe -set jobs\_only yes} |
|---|
| 440 | \item \texttt{smpd.exe -restart} |
|---|
| 441 | \end{enumerate} |
|---|
| 442 | |
|---|
| 443 | These settings mean that authentication must be done through SSPI and mpiexec commands will only be |
|---|
| 444 | accepted for registered jobs. |
|---|
| 445 | |
|---|
| 446 | To register jobs an administrator or a scheduler running with administrator privileges can execute |
|---|
| 447 | the following command: |
|---|
| 448 | \begin{verbatim} |
|---|
| 449 | mpiexec.exe -add_job <name> <domain\username> [-host <hostname>] |
|---|
| 450 | \end{verbatim} |
|---|
| 451 | This adds a job called ``name'' for the specified user on either the local or specified host. Any |
|---|
| 452 | name can be used but it must not collide with another job with the same name on the same host. The |
|---|
| 453 | command must be executed for each host that is to be allocated to the user. |
|---|
| 454 | |
|---|
| 455 | Then when the job has finished or the allotted time has expired for the user to use the nodes the |
|---|
| 456 | following command can be executed: |
|---|
| 457 | \begin{verbatim} |
|---|
| 458 | mpiexec.exe -remove_job <name> [-host <hostname>] |
|---|
| 459 | \end{verbatim} |
|---|
| 460 | This command removes the job from the local or specified host. Any processes running on the host |
|---|
| 461 | under the specified job name will be terminated by this command. |
|---|
| 462 | |
|---|
| 463 | So \texttt{-add\_job} and \texttt{-remove\_job} can be used by a scheduler to create a window during which a user is allowed to |
|---|
| 464 | start jobs on a set of nodes. |
|---|
| 465 | |
|---|
| 466 | When the window is open the user can run jobs using the job name. First the user must run: |
|---|
| 467 | \begin{verbatim} |
|---|
| 468 | mpiexec.exe -associate_job <name> [-host <hostname>] |
|---|
| 469 | \end{verbatim} |
|---|
| 470 | This will associate the user's token with the job object on the local or |
|---|
| 471 | specified host. This must be done for all of the hosts allocated to the user. Then the user can issue |
|---|
| 472 | mpiexec commands. The mpiexec commands are of the usual format except they must contain one extra option - |
|---|
| 473 | ``\texttt{-job <name>}''. This job name must match the job allocated by the \texttt{-add\_job} command. So a typical command |
|---|
| 474 | would look like this: |
|---|
| 475 | \begin{verbatim} |
|---|
| 476 | mpiexec.exe -job foo -machinefile hosts.txt -n 4 myapp.exe |
|---|
| 477 | \end{verbatim} |
|---|
| 478 | Multiple mpiexec commands can be issued until the \texttt{-remove\_job} command is issued. |
|---|
| 479 | This allows the users to issue multiple |
|---|
| 480 | mpiexec commands and multiple MPI\_Comm\_spawn commands all using the same job name until the job is removed |
|---|
| 481 | from the nodes. |
|---|
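The sequence described above (an administrator adds the job on every host, the user associates with it on every host, and the administrator later removes it) can be sketched as a helper that a scheduler might use to build the command lines. This is a minimal Python sketch that only assembles strings from the options shown above; the function name \texttt{job\_window\_commands} and the dictionary keys are hypothetical, and nothing is executed.

```python
from typing import Dict, List


def job_window_commands(job: str, user: str, hosts: List[str]) -> Dict[str, List[str]]:
    """Build the per-host mpiexec command lines for one job window."""
    return {
        # Run by an administrator (or scheduler) to open the window.
        "admin_open": [
            f"mpiexec.exe -add_job {job} {user} -host {h}" for h in hosts
        ],
        # Run by the user, once per allocated host, before launching.
        "user_associate": [
            f"mpiexec.exe -associate_job {job} -host {h}" for h in hosts
        ],
        # Run by an administrator to close the window; this terminates
        # any processes still running under the job on that host.
        "admin_close": [
            f"mpiexec.exe -remove_job {job} -host {h}" for h in hosts
        ],
    }


if __name__ == "__main__":
    cmds = job_window_commands("foo", "DOMAIN\\alice", ["node1", "node2"])
    for phase, lines in cmds.items():
        for line in lines:
            print(phase, line)
```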
| 482 | |
|---|
| 483 | The rationale for the design where an administrator can create and destroy jobs but the user must first associate |
|---|
| 484 | the job with his own token before running jobs is so that the administrator does not need to know the user's |
|---|
| 485 | password. In order for an administrator to do both the job allocation and association he would have to call |
|---|
| 486 | LogonUser with the user name and password for each user that submits a job request. |
|---|
| 487 | |
|---|
| 488 | \subsection{Firewalls} |
|---|
| 489 | Windows comes with a built-in firewall that is usually turned on by default. |
|---|
| 490 | Firewalls block all TCP ports by default, which renders MPICH2 applications inoperable |
|---|
| 491 | because the default communication mechanism used by MPICH2 is sockets on arbitrary |
|---|
| 492 | ports assigned by the operating system. This can be solved in several ways: |
|---|
| 493 | |
|---|
| 494 | \begin{itemize} |
|---|
| 495 | \item Turn off the firewall completely. |
|---|
| 496 | \item MPICH2 applications can be limited to a range of TCP ports using the |
|---|
| 497 | MPICH\_PORT\_RANGE environment variable. If you set your firewall to allow the |
|---|
| 498 | same port range then MPICH2 applications will run. |
|---|
| 499 | \item Leave the Windows firewall on and allow exceptions for your MPICH2 applications. |
|---|
| 500 | This can be done through the Security Center module of the Windows Control |
|---|
| 501 | Panel. Click the Windows Firewall option in the Security Center to bring up |
|---|
| 502 | the properties page and select the Exceptions tab. Here you can add each |
|---|
| 503 | MPICH2 application to exempt. Note that this exception includes the path to |
|---|
| 504 | the executable so if you move the executable you will have to exempt the new |
|---|
| 505 | location. This solution will obviously work only for a small number of |
|---|
| 506 | applications since managing a large list would be difficult. Make sure you add |
|---|
| 507 | mpiexec.exe and the smpd.exe process manager to this exception list. |
|---|
| 508 | \end{itemize} |
|---|
| 509 | |
|---|
| 510 | \subsection{MPIEXEC options} |
|---|
| 511 | This section describes all the options to mpiexec.exe. |
|---|
| 512 | |
|---|
| 513 | \begin{itemize} |
|---|
| 514 | |
|---|
| 515 | \item \texttt{-add\_job job\_name domain$\backslash$user [-host hostname]} |
|---|
| 516 | Create a job object on the local or specified host for the specified user. |
|---|
| 517 | Administrator privileges are required to execute this command. |
|---|
| 518 | |
|---|
| 519 | \item \texttt{-associate\_job job\_name [-host hostname]} |
|---|
| 520 | Associate the current user token with the specified job on the local or specified |
|---|
| 521 | host. The current user must match the user specified by the \texttt{-add\_job job\_name username} |
|---|
| 522 | command. |
|---|
| 523 | |
|---|
| 524 | \item \texttt{-binding process\_binding\_scheme} |
|---|
| 525 | Specify a process binding scheme for the MPI processes. Currently \texttt{auto} |
|---|
| 526 | is the only supported binding scheme. With \texttt{auto}, |
|---|
| 527 | the process manager chooses the process binding automatically, taking into |
|---|
| 528 | account the load on system resources such as caches. |
|---|
| 529 | |
|---|
| 530 | \item \texttt{-channel channel\_name} |
|---|
| 531 | This option is only available under Windows and allows the user to select which |
|---|
| 532 | channel implementation of MPICH2 to use at runtime. The channels currently |
|---|
| 533 | supported are sock, mt, ssm, and shm. These represent the sockets, |
|---|
| 534 | multi-threaded sockets, sockets plus shared memory, and shared memory channels. |
|---|
| 535 | The shared memory channels only work on one node. The sockets, multi-threaded |
|---|
| 536 | sockets, and sockets plus shared memory channels work on multiple nodes. There |
|---|
| 537 | are also profiled versions of the channels that produce RLOG files for each |
|---|
| 538 | process when selected. They are named p, mtp, ssmp, and shmp. See the |
|---|
| 539 | section on channel selection for additional information. |
|---|
| 540 | |
|---|
| 541 | \item \texttt{-configfile filename} |
|---|
| 542 | Use the specified job configuration file to launch the job. Each line in the |
|---|
| 543 | file represents a set of options just like you would enter them on the \texttt{mpiexec} |
|---|
| 544 | command line. The one difference is that there are no colons in the file. The |
|---|
| 545 | colons are replaced by new-lines. |
|---|
| 546 | |
|---|
| 547 | \item \texttt{-delegate} |
|---|
| 548 | Specify that you want to use passwordless SSPI delegation to launch processes. |
|---|
| 549 | The machines must be configured to use SSPI as described in the section on |
|---|
| 550 | security. |
|---|
| 551 | |
|---|
| 552 | \item \texttt{-dir drive:$\backslash$my$\backslash$working$\backslash$directory} |
|---|
| 553 | Specify the working directory for the processes. |
|---|
| 554 | |
|---|
| 555 | \item \texttt{-env variable value} |
|---|
| 556 | Specify an environment variable and its value to set in the processes' environments. |
|---|
| 557 | This option can be specified multiple times. |
|---|
| 558 | |
|---|
| 559 | \item \texttt{-exitcodes} |
|---|
| 560 | Specify that the exit code of each process should be printed to stdout as each |
|---|
| 561 | process exits. |
|---|
| 562 | |
|---|
| 563 | \item \texttt{-file filename} |
|---|
| 564 | Use the specified implementation-specific job configuration file. For Windows |
|---|
| 565 | this option is used to specify the old MPICH 1.2.5 configuration file format. |
|---|
| 566 | This is useful for users who have existing configuration files and want to upgrade |
|---|
| 567 | to MPICH2. |
|---|
| 568 | |
|---|
| 569 | \item \texttt{-genvlist a,b,c,d...} |
|---|
| 570 | Specify a list of environment variables to be taken from the environment local to mpiexec and propagated to the launched processes. |
|---|
| 571 | |
|---|
| 572 | \item \texttt{-hide\_console} |
|---|
| 573 | Detach from the console so that no command prompt window will appear and consequently |
|---|
| 574 | no output will be seen. |
|---|
| 575 | |
|---|
| 576 | \item \texttt{-host hostname} |
|---|
| 577 | Specify that the processes should be launched on a specific host. |
|---|
| 578 | |
|---|
| 579 | \item \texttt{-hosts n host1 host2 host3 ...} |
|---|
| 580 | Specify that the processes should be launched on a list of hosts. This option |
|---|
| 581 | replaces the \texttt{-n x} option. |
|---|
| 582 | |
|---|
| 583 | \item \texttt{-hosts n host1 m1 host2 m2 host3 m3 ...} |
|---|
| 584 | Specify that the processes should be launched on a list of hosts and how many |
|---|
| 585 | processes should be launched on each host. The total number of processes |
|---|
| 586 | launched is m1 + m2 + m3 + ... + mn. |
|---|
| 587 | |
|---|
| 588 | \item \texttt{-impersonate} |
|---|
| 589 | Specify that you want to use passwordless SSPI impersonation to launch processes. |
|---|
| 590 | This will create processes on the remote machines with limited access tokens. |
|---|
| 591 | They will not be able to open files on remote machines or access mapped network |
|---|
| 592 | drives. |
|---|
| 593 | |
|---|
| 594 | \item \texttt{-job job\_name} |
|---|
| 595 | Specify that the processes should be launched under the specified job object. |
|---|
| 596 | This can only be used after successful calls to \texttt{-add\_job} and \texttt{-associate\_job}. |
|---|
| 597 | |
|---|
| 598 | \item \texttt{-l} |
|---|
| 599 | This flag causes mpiexec to prefix output to stdout and stderr with the rank of |
|---|
| 600 | the process that produced the output. (This option is the lower-case letter L, not the |
|---|
| 601 | number one.) |
|---|
| 602 | |
|---|
| 603 | \item \texttt{-localonly x} or \texttt{-localonly} |
|---|
| 604 | Specify that the processes should only be launched on the local host. This |
|---|
| 605 | option can replace the \texttt{-n x} option or be used in conjunction with it |
|---|
| 606 | when it is only a flag. |
|---|
| 607 | |
|---|
| 608 | \item \texttt{-localroot} |
|---|
| 609 | Specify that the root process should be launched on the local machine directly |
|---|
| 610 | from mpiexec bypassing the smpd process manager. This is useful for applications |
|---|
| 611 | that want to create windows from the root process that are visible to the interactive |
|---|
| 612 | user. The smpd process manager creates processes in a hidden service desktop |
|---|
| 613 | where you cannot interact with any GUI. |
|---|
| 614 | |
|---|
| 615 | \item \texttt{-log} |
|---|
| 616 | This option is a shortcut for selecting the MPE wrapper library to log the MPI |
|---|
| 617 | application. When the job finishes there will be a .clog2 file created that |
|---|
| 618 | can be viewed in Jumpshot. |
|---|
| 619 | |
|---|
| 620 | \item \texttt{-logon} |
|---|
| 621 | Prompt for user credentials to launch the job under. |
|---|
| 622 | |
|---|
| 623 | \item \texttt{-machinefile filename} |
|---|
| 624 | Use the specified file to get host names to launch processes on. Hosts are |
|---|
| 625 | selected from this file in a round robin fashion. One host is specified per |
|---|
| 626 | line. Extra options can be specified. The number of desired processes to |
|---|
| 627 | launch on a specific host can be specified with a colon followed by a number |
|---|
| 628 | after the host name: \texttt{hostname:n}. This is useful for multi-CPU hosts. |
|---|
| 629 | If you want to specify the interface that should be used for MPI communication |
|---|
| 630 | to the host you can add the \texttt{-ifhn} flag. A sample machinefile is provided |
|---|
| 631 | below for reference. |
|---|
| 632 | \begin{verbatim} |
|---|
| 633 | # Comment line |
|---|
| 634 | # Run two procs on hostname1 |
|---|
| 635 | hostname1:2 |
|---|
| 636 | # Run four procs on hostname2 but use 192.168.1.100 |
|---|
| 637 | # as the interface |
|---|
| 638 | hostname2:4 -ifhn 192.168.1.100 |
|---|
| 639 | \end{verbatim} |
|---|
| 640 | The interface can also be specified using the \texttt{ifhn=} option. The following |
|---|
| 641 | line is valid in a machinefile. |
|---|
| 642 | \begin{verbatim} |
|---|
| 643 | #Using ifhn= option to specify the interface |
|---|
| 644 | hostname1:2 ifhn=192.168.1.101 |
|---|
| 645 | \end{verbatim} |
|---|
| 646 | |
|---|
| 647 | \item \texttt{-map drive:$\backslash$$\backslash$host$\backslash$share} |
|---|
| 648 | Specify a network mapped drive to create on the hosts before launching the |
|---|
| 649 | processes. The mapping will be removed when the processes exit. This option |
|---|
| 650 | can be specified multiple times. |
|---|
| 651 | |
|---|
| 652 | \item \texttt{-mapall} |
|---|
| 653 | Specify that all network mapped drives created by the user executing the mpiexec |
|---|
| 654 | command will be created on hosts before launching the processes. The mappings |
|---|
| 655 | will be removed when the processes exit. |
|---|
| 656 | |
|---|
| 657 | \item \texttt{-n x} or \texttt{-np x} |
|---|
| 658 | Specify the number of processes to launch. |
|---|
| 659 | |
|---|
| 660 | \item \texttt{-nopm} |
|---|
| 661 | This flag is used in conjunction with the \texttt{-rsh} flag. With this flag |
|---|
| 662 | specified there need not be any smpd process manager running on any of the nodes |
|---|
| 663 | used in the job. \texttt{mpiexec} provides the PMI interface and the remote |
|---|
| 664 | shell command is used to start the processes. Using these flags allows jobs to |
|---|
| 665 | be started without any process managers running but the MPI-2 dynamic process |
|---|
| 666 | functions like MPI\_Comm\_spawn are consequently not available. |
|---|
| 667 | |
|---|
| 668 | \item \texttt{-noprompt} |
|---|
| 669 | Prevent mpiexec from prompting for information. If user credentials are needed |
|---|
| 670 | to launch the processes mpiexec usually prompts for this information but this |
|---|
| 671 | flag causes an error to be printed out instead. |
|---|
| 672 | |
|---|
| 673 | \item \texttt{-p port} |
|---|
| 674 | Short version of the \texttt{-port} option. |
|---|
| 675 | |
|---|
| 676 | \item \texttt{-path search\_path} |
|---|
| 677 | Specify the search path used to locate executables. Separate multiple paths with semicolons. |
|---|
| 678 | The path can be mixed when using both Windows and Linux machines. For example: |
|---|
| 679 | \texttt{-path c:$\backslash$temp;/home/user} is a valid search path. |
|---|
| 680 | |
|---|
| 681 | \item \texttt{-phrase passphrase} |
|---|
| 682 | Specify the passphrase used to authenticate with the smpd process managers. |
|---|
| 683 | |
|---|
| 684 | \item \texttt{-plaintext} |
|---|
| 685 | Specify that user credentials should go over the wire un-encrypted. This is |
|---|
| 686 | required if both Linux and Windows machines are used in the same job because |
|---|
| 687 | the Linux machines cannot encrypt and decrypt the data created by the Windows |
|---|
| 688 | machines. |
|---|
| 689 | |
|---|
| 690 | \item \texttt{-pmi\_server num\_processes or -pmiserver num\_processes} |
|---|
| 691 | This option specified by itself connects to the local smpd process manager and |
|---|
| 692 | starts a PMI service. This service is used by MPICH2 processes to communicate |
|---|
| 693 | connection information to each other. This option is only good for a single |
|---|
| 694 | MPICH2 job. The input parameter is the number of processes in the job. |
|---|
| 695 | \texttt{mpiexec} immediately outputs three lines of data. The first line is the |
|---|
| 696 | host name. The second line is the port it is listening on and the third line |
|---|
| 697 | is the name of the PMI KVS. A process manager that can set environment variables |
|---|
| 698 | and launch processes but does not implement the PMI service can use this option |
|---|
| 699 | to start jobs. Along with the other PMI environment variables the process |
|---|
| 700 | manager must set PMI\_HOST to the host name provided, PMI\_PORT to the port |
|---|
| 701 | provided and PMI\_KVS and PMI\_DOMAIN to the KVS name provided. It is the |
|---|
| 702 | responsibility of the process manager to set the other environment variables |
|---|
| 703 | correctly like PMI\_RANK and PMI\_SIZE. See the document on the smpd PMI implementation |
|---|
| 704 | for a complete list of the environment variables. When the job is finished the |
|---|
| 705 | PMI server will exit. This option can be executed in separate commands simultaneously |
|---|
| 706 | so that multiple jobs can be executed at the same time. |
|---|
| 707 | |
|---|
| 708 | \item \texttt{-port port} |
|---|
| 709 | Specify the port where the smpd process manager is listening. |
|---|
| 710 | |
|---|
| 711 | \item \texttt{-priority class[:level]} |
|---|
| 712 | Specify the priority class and optionally the thread priority of the processes |
|---|
| 713 | to be launched. The class can be 0,1,2,3, or 4 corresponding to idle, below, |
|---|
| 714 | normal, above, and high. The level can be 0,1,2,3,4, or 5 corresponding to |
|---|
| 715 | idle, lowest, below, normal, above, highest. The default is 2:3. |
|---|
| 716 | |
|---|
| 717 | \item \texttt{-pwdfile filename} |
|---|
| 718 | Specify a file to read the user name and password from. The user name should be |
|---|
| 719 | on the first line and the password on the second line. |
|---|
| 720 | |
|---|
| 721 | \item \texttt{-quiet\_abort} |
|---|
| 722 | Use this flag to prevent extensive abort messages from appearing. Instead the job |
|---|
| 723 | will simply exit with minimal error output. |
|---|
| 724 | |
|---|
| 725 | \item \texttt{-register [-user n]} |
|---|
| 726 | Encrypt a user name and password into the Windows registry so that it can be |
|---|
| 727 | automatically retrieved by mpiexec to launch processes with. If you specify |
|---|
| 728 | a user index then you can save more than one set of credentials. The index |
|---|
| 729 | should be a positive non-zero number and does not need to be consecutive. |
|---|
| 730 | |
|---|
| 731 | \item \texttt{-remove [-user n]} |
|---|
| 732 | Remove the encrypted credential data from the Registry. If multiple entries are |
|---|
| 733 | saved then use the \texttt{-user} option to specify which entry to remove. |
|---|
| 734 | \texttt{-user all} can be specified to delete all entries. |
|---|
| 735 | |
|---|
| 736 | \item \texttt{-remove\_job job\_name [-host hostname]} |
|---|
| 737 | Remove a job object on the local or specified host. Any processes running under |
|---|
| 738 | this job will be terminated. Administrator privileges are required to execute |
|---|
| 739 | this command. |
|---|
| 740 | |
|---|
| 741 | \item \texttt{-rsh or -ssh} |
|---|
| 742 | Use the remote shell command to execute the processes in the job instead of |
|---|
| 743 | using the smpd process manager. The default command is ``\texttt{ssh -x}'' no |
|---|
| 744 | matter whether \texttt{-rsh} or \texttt{-ssh} is used. If this is the only |
|---|
| 745 | flag specified then an smpd process manager must be running on the local host |
|---|
| 746 | where \texttt{mpiexec} is executed. \texttt{mpiexec} contacts the local smpd process to start |
|---|
| 747 | a PMI service required by the MPI job and then starts the processes using the |
|---|
| 748 | remote shell command. On the target machines the application ``\texttt{env}'' |
|---|
| 749 | must be available since it is used to set the appropriate environment variables |
|---|
| 750 | and then start the application. The remote shell command can be changed using |
|---|
| 751 | the environment variable MPIEXEC\_RSH. Any command can be used that takes a |
|---|
| 752 | host name and then everything after that as the user command to be launched. |
|---|
| 753 | Note that you need to specify a fully qualified file name of the executable when running |
|---|
| 754 | your job with the \texttt{-rsh} option. If you would like to use relative paths, set the working |
|---|
| 755 | directory for the job using the \texttt{-wdir} option of mpiexec. |
|---|
| 756 | |
|---|
| 757 | \item \texttt{-smpdfile filename} |
|---|
| 758 | Specify the location of the smpd configuration file. The default is \texttt{\~{}/.smpd}. |
|---|
| 759 | This is a Unix only option. Under Windows the settings are stored in the Windows Registry. |
|---|
| 760 | |
|---|
| 761 | \item \texttt{-timeout seconds} |
|---|
| 762 | Specify the maximum number of seconds the job is allowed to run. At the end of |
|---|
| 763 | the timeout period, if the job has not already exited then all processes will |
|---|
| 764 | be killed. |
|---|
| 765 | |
|---|
| 766 | \item \texttt{-user n} |
|---|
| 767 | Specify which encrypted credentials to retrieve from the Registry. The corresponding |
|---|
| 768 | entry must have been previously saved using the \texttt{-register -user n} option. |
|---|
| 769 | |
|---|
| 770 | \item \texttt{-validate [-user n] [-host hostname]} |
|---|
| 771 | Validate that the saved credentials can be used to launch a process on the local |
|---|
| 772 | or specified host. If more than one set of credentials has been saved then the |
|---|
| 773 | \texttt{-user} option can be used to select which user credentials to use. |
|---|
| 774 | |
|---|
| 775 | \item \texttt{-verbose} |
|---|
| 776 | Output trace data for mpiexec. Only useful for debugging. |
|---|
| 777 | |
|---|
| 778 | \item \texttt{-wdir drive:$\backslash$my$\backslash$working$\backslash$directory} |
|---|
| 779 | \texttt{-wdir} and \texttt{-dir} are synonyms. |
|---|
| 780 | |
|---|
| 781 | \item \texttt{-whoami} |
|---|
| 782 | Print out the current user name in the format that mpiexec and smpd expect it to be. |
|---|
| 783 | This is useful for users who use a screen name that is different from their user |
|---|
| 784 | name. |
|---|
| 785 | |
|---|
| 786 | \end{itemize} |
|---|
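The following command lines sketch how several of these options combine. The host names \texttt{hostA} and \texttt{hostB}, the application \texttt{myapp.exe}, the job name \texttt{myjob}, the account \texttt{MYDOMAIN$\backslash$user1}, and the file \texttt{job.txt} are placeholders chosen for illustration and are not part of MPICH2.
\begin{verbatim}
REM Launch 4 processes on the local host, prefixing output with ranks
mpiexec.exe -l -localonly 4 myapp.exe

REM Launch 2 processes on hostA and 4 on hostB (6 in total)
mpiexec.exe -hosts 2 hostA 2 hostB 4 myapp.exe

REM The same layout written as two option sets separated by a colon
mpiexec.exe -n 2 -host hostA myapp.exe : -n 4 -host hostB myapp.exe

REM The same layout read from job.txt, which contains the two lines
REM     -n 2 -host hostA myapp.exe
REM     -n 4 -host hostB myapp.exe
mpiexec.exe -configfile job.txt

REM Job objects: an administrator creates the job for a user ...
mpiexec.exe -add_job myjob MYDOMAIN\user1
REM ... the user associates with it and launches under it ...
mpiexec.exe -associate_job myjob
mpiexec.exe -job myjob -n 4 myapp.exe
REM ... and the administrator removes it when it is no longer needed
mpiexec.exe -remove_job myjob
\end{verbatim}
The order of the options shown here is one plausible arrangement; consult the output of \texttt{mpiexec} itself if a particular combination is rejected.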
| 787 | |
|---|
| 788 | \subsection{SMPD process manager options} |
|---|
| 789 | This section describes some of the options for the smpd process manager. |
|---|
| 790 | |
|---|
| 791 | smpd.exe runs as a service under Windows. This is required so that it can start |
|---|
| 792 | processes under multiple user credentials. Only services have the privileges |
|---|
| 793 | necessary to log on users and start processes for them. Since this is a privileged |
|---|
| 794 | operation administrator rights are required to install the smpd service. This is |
|---|
| 795 | what the default installer package does. |
|---|
| 796 | |
|---|
| 797 | However, smpd can also be run in other ways, for debugging or single user use. |
|---|
| 798 | |
|---|
| 799 | If you have smpd.exe installed, first execute \texttt{smpd.exe -stop} to stop the |
|---|
| 800 | service. |
|---|
| 801 | |
|---|
| 802 | Then you can run it by hand for single user mode or for debugging. The flag for |
|---|
| 803 | debugging or single user mode is \texttt{-d debug\_output\_level}. |
|---|
| 804 | |
|---|
| 805 | If you run it like this you will get full trace output: |
|---|
| 806 | \begin{verbatim} |
|---|
| 807 | smpd.exe -d |
|---|
| 808 | \end{verbatim} |
|---|
| 809 | |
|---|
| 810 | If you run it like this you will get no output except for errors: |
|---|
| 811 | \begin{verbatim} |
|---|
| 812 | smpd.exe -d 0 |
|---|
| 813 | \end{verbatim} |
|---|
| 814 | |
|---|
| 815 | Here are all the options to smpd.exe: |
|---|
| 816 | \begin{itemize} |
|---|
| 817 | \item \texttt{-install or -regserver} |
|---|
| 818 | Install the smpd service. Requires administrator privileges. |
|---|
| 819 | \item \texttt{-remove or -uninstall or -unregserver} |
|---|
| 820 | Uninstall the smpd service. Requires administrator privileges. |
|---|
| 821 | \item \texttt{-start} |
|---|
| 822 | Start the smpd service. Requires administrator privileges. |
|---|
| 823 | \item \texttt{-stop} |
|---|
| 824 | Stop the smpd service. Requires administrator privileges. |
|---|
| 825 | \item \texttt{-restart} |
|---|
| 826 | Stop and restart the smpd service. Requires administrator privileges. |
|---|
| 827 | \item \texttt{-register\_spn} |
|---|
| 828 | Register the Service Principal Name for the smpd service of the local machine |
|---|
| 829 | on the domain controller. Requires DOMAIN administrator privileges. This is |
|---|
| 830 | used in conjunction with passwordless SSPI authentication described in the |
|---|
| 831 | section on security. |
|---|
| 832 | \item \texttt{-remove\_spn} |
|---|
| 833 | Remove the Service Principal Name from the domain controller for the smpd service |
|---|
| 834 | of the local machine. Requires DOMAIN administrator privileges. |
|---|
| 835 | \item \texttt{-traceon filename [hostA hostB ...]} |
|---|
| 836 | Turn on the trace logging of the smpd service on the local or specified hosts |
|---|
| 837 | and set the output to the specified file. The file location must be available |
|---|
| 838 | on the local drive of each of the hosts. It cannot be located on a remote |
|---|
| 839 | machine. |
|---|
| 840 | \item \texttt{-traceoff [hostA hostB ...]} |
|---|
| 841 | Turn off the trace logging of the smpd service on the local or specified hosts. |
|---|
| 842 | \item \texttt{-port n} |
|---|
| 843 | Listen on the specified port number. If this option is not specified then smpd |
|---|
| 844 | listens on the default port (8676). |
|---|
| 845 | \item \texttt{-anyport} |
|---|
| 846 | Listen on any port assigned by the OS. smpd immediately prints out the port that |
|---|
| 847 | it has been assigned. |
|---|
| 848 | \item \texttt{-phrase passphrase} |
|---|
| 849 | Use the specified passphrase to authenticate connections to the smpd either by |
|---|
| 850 | mpiexec or another smpd process. |
|---|
| 851 | \item \texttt{-getphrase} |
|---|
| 852 | Prompt the user to input the passphrase. This is useful if you don't want to |
|---|
| 853 | specify the phrase on the command line. |
|---|
| 854 | \item \texttt{-noprompt} |
|---|
| 855 | Don't prompt the user for input. If there is missing information, print an error |
|---|
| 856 | and exit. |
|---|
| 857 | \item \texttt{-set option value} |
|---|
| 858 | Set the smpd option to the specified value. For example, \texttt{smpd -set logfile c:$\backslash$temp$\backslash$smpd.log} will set the log file to the |
|---|
| 859 | specified file name. \texttt{smpd -set log yes} will turn trace logging on and |
|---|
| 860 | \texttt{smpd -set log no} will turn it off. |
|---|
| 861 | \item \texttt{-get option} |
|---|
| 862 | Print out the value of the specified smpd option. |
|---|
| 863 | \item \texttt{-hosts} |
|---|
| 864 | Print the hosts that mpiexec and this smpd will use to launch processes on. |
|---|
| 865 | If the list is empty then processes will be launched on the local host only. |
|---|
| 866 | \item \texttt{-sethosts hostA hostB ...} |
|---|
| 867 | Set the hosts option to a list of hosts that mpiexec and smpd will use to launch |
|---|
| 868 | processes on. |
|---|
| 869 | \item \texttt{-d [level] or -debug [level]} |
|---|
| 870 | Start the smpd in debug or single user mode with the optionally specified amount |
|---|
| 871 | of output. For example, \texttt{smpd -d} will start the smpd with lots of trace |
|---|
| 872 | output and \texttt{smpd -d 0} will start the smpd with no output except for errors. |
|---|
| 873 | \item \texttt{-s} |
|---|
| 874 | Only available on Unix systems. This option starts the smpd in single user daemon |
|---|
| 875 | mode for the current user. |
|---|
| 876 | \item \texttt{-smpdfile filename} |
|---|
| 877 | On Unix systems the smpd options are stored in a file that is readable only by |
|---|
| 878 | the current user (chmod 600). This file stores the same information that would |
|---|
| 879 | be stored in the Windows registry like the port and passphrase. The default |
|---|
| 880 | file is named \texttt{\~{}/.smpd} if this option is not specified. |
|---|
| 881 | \item \texttt{-shutdown} |
|---|
| 882 | Shut down a running smpd that was started by \texttt{smpd -s} or \texttt{smpd -d}. |
|---|
| 883 | \item \texttt{-printprocs} |
|---|
| 884 | On a Windows machine you can run \texttt{smpd -printprocs} and it will print out |
|---|
| 885 | the processes started and stopped by smpd on the current host. The format of the |
|---|
| 886 | output is \texttt{+/-pid cmd}. Plus means a process was started and minus means |
|---|
| 887 | the process has exited. The process id is specified next and then the rest of the line |
|---|
| 888 | is the command that was launched. |
|---|
| 889 | \item \texttt{-enum or -enumerate} |
|---|
| 890 | Print the smpd options set on the local host. |
|---|
| 891 | \item \texttt{-version} |
|---|
| 892 | Print the smpd version and exit. |
|---|
| 893 | \item \texttt{-status [-host hostname]} |
|---|
| 894 | Print the status of the smpd on the local or specified host. |
|---|
| 895 | \item \texttt{-help} |
|---|
| 896 | Print a brief summary of the options to smpd. |
|---|
| 897 | \end{itemize} |
|---|
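As a sketch of a typical single user debugging session built only from the options listed above, the commands below stop the service, run smpd by hand, and adjust a few settings. The host names \texttt{hostA} and \texttt{hostB} and the log file path are placeholders. It is assumed here that \texttt{smpd.exe -d 0} stays in the foreground, so the remaining commands are issued from a second command prompt.
\begin{verbatim}
REM Prompt 1: stop the installed service and run smpd by hand,
REM printing errors only
smpd.exe -stop
smpd.exe -d 0

REM Prompt 2: inspect and change settings
smpd.exe -status
smpd.exe -set logfile c:\temp\smpd.log
smpd.exe -set log yes
smpd.exe -get logfile
smpd.exe -sethosts hostA hostB
smpd.exe -hosts

REM Prompt 2: shut down the hand-started smpd and restart the service
smpd.exe -shutdown
smpd.exe -start
\end{verbatim}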
| 898 | |
|---|
| 899 | \subsection{Debugging jobs by starting them manually} |
|---|
| 900 | This section describes how to start a job by hand without the use of a process |
|---|
| 901 | manager so the job can be stepped through with a debugger. |
|---|
| 902 | |
|---|
| 903 | You can launch an MPICH2 job by hand if you set the minimum required environment |
|---|
| 904 | variables for each process and then start the processes yourself (or in a debugger). |
|---|
| 905 | |
|---|
| 906 | Here is a script, called \texttt{setmpi2.bat}, that sets the environment variables |
|---|
| 907 | so that a job can be started on the local machine: |
|---|
| 909 | \begin{verbatim} |
|---|
| 910 | if "%1" == "" goto HELP |
|---|
| 911 | if "%2" == "" goto HELP |
|---|
| 912 | set PMI_ROOT_HOST=%COMPUTERNAME% |
|---|
| 913 | set PMI_ROOT_PORT=9222 |
|---|
| 914 | set PMI_ROOT_LOCAL=1 |
|---|
| 915 | set PMI_RANK=%1 |
|---|
| 916 | set PMI_SIZE=%2 |
|---|
| 917 | set PMI_KVS=mpich2 |
|---|
| 918 | goto DONE |
|---|
| 919 | :HELP |
|---|
| 920 | REM usage: setmpi2 rank size |
|---|
| 921 | :DONE |
|---|
| 922 | \end{verbatim} |
|---|
| 923 | |
|---|
| 924 | For example, to debug a two process job bring up two separate command prompts. |
|---|
| 925 | In the first prompt execute \texttt{setmpi2.bat 0 2} and in the second prompt |
|---|
| 926 | execute \texttt{setmpi2.bat 1 2}. Then run your application always starting |
|---|
| 927 | the root process first. The root process must call MPI\_Init before any of the |
|---|
| 928 | other processes because it is the process that listens on the port specified by |
|---|
| 929 | the environment variable PMI\_ROOT\_PORT. Simply execute \texttt{myapp.exe} from |
|---|
| 930 | each command prompt to run your job. Better yet, run each process in a debugger. |
|---|
| 931 | If you have the Microsoft developer studio installed you can run the following |
|---|
| 932 | from each command prompt: \texttt{devenv.exe myapp.exe}. This will bring up a |
|---|
| 933 | debugger for each process. Then you can step through each process and debug it. |
|---|
| 934 | Remember that the first process must call MPI\_Init before any of the rest of the |
|---|
| 935 | processes do. You can restart the processes at any time as long as you restart |
|---|
| 936 | all of them. |
|---|
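The two process example described above can be sketched as the following pair of sessions, where \texttt{myapp.exe} stands for your application. Each block is typed in its own command prompt, and the root process (rank 0) is started first.
\begin{verbatim}
REM Command prompt 1: rank 0 of 2 (the root process, start it first)
setmpi2.bat 0 2
myapp.exe

REM Command prompt 2: rank 1 of 2
setmpi2.bat 1 2
myapp.exe
\end{verbatim}
Because the script sets the variables in the command prompt where it runs, each process must be started from the same prompt in which \texttt{setmpi2.bat} was executed.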
| 937 | |
|---|
| 938 | The script can be modified to launch on multiple hosts by changing the line: |
|---|
| 939 | \begin{verbatim} |
|---|
| 940 | set PMI_ROOT_HOST=%COMPUTERNAME% |
|---|
| 941 | \end{verbatim} |
|---|
| 942 | to set the variable to the hostname where the root process will be started instead |
|---|
| 943 | of the local host name. |
|---|
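For instance, assuming the root process will run on a machine named \texttt{roothost} (a placeholder host name), the modified line would read:
\begin{verbatim}
REM Every process, on every host, points at the host of the root process
set PMI_ROOT_HOST=roothost
\end{verbatim}
The modified script is then run on each host with that host's rank and the total size, exactly as in the single machine case.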
| 944 | |
|---|
| 945 | The limitation of this method of starting processes is that MPI-2 spawning operations |
|---|
| 946 | are not supported. If your application calls MPI\_Comm\_spawn it will produce |
|---|
| 947 | an error. |
|---|
| 948 | |
|---|
| 949 | \subsection{Environment variables} |
|---|
| 950 | This section describes the environment variables used by MPICH2 and smpd. |
|---|
| 951 | |
|---|
| 952 | \begin{itemize} |
|---|
| 953 | \item \texttt{MPICH\_ABORT\_ON\_ERROR} |
|---|
| 954 | Call abort() when an error happens instead of returning an error and calling MPID\_Abort. Useful on Unix, where calling abort() creates a core file. |
|---|
| 955 | \item \texttt{MPICH\_PRINT\_ERROR\_STACK} |
|---|
| 956 | Print the entire error stack when an error occurs (currently this is the default) |
|---|
| 957 | \item \texttt{MPICH\_CHOP\_ERROR\_STACK} |
|---|
| 958 | Split the error stack output at the character position specified. A value of 79 |
|---|
| 959 | would cause carriage returns to be inserted after the 79th character. |
|---|
| 960 | \item \texttt{MPICH\_WARNINGS} |
|---|
| 961 | Print runtime warnings (unmatched messages at MPI\_Finalize, unreleased resources, etc.) |
|---|
| 962 | \item \texttt{MPICH\_SOCKET\_BUFFER\_SIZE} |
|---|
| 963 | socket buffer size |
|---|
| 964 | \item \texttt{MPICH\_SOCKET\_RBUFFER\_SIZE} |
|---|
| 965 | socket receive buffer size |
|---|
| 966 | \item \texttt{MPICH\_SOCKET\_SBUFFER\_SIZE} |
|---|
| 967 | socket send buffer size |
|---|
| 968 | \item \texttt{MPICH\_SOCKET\_NUM\_PREPOSTED\_ACCEPTS} |
|---|
| 969 | number of accepts posted for MPIDU\_Sock\_listen |
|---|
| 970 | \item \texttt{MPICH\_PORT\_RANGE} |
|---|
| 971 | Range of ports to use for sockets: min..max or min,max |
|---|
| 972 | \item \texttt{MPICH\_INTERFACE\_HOSTNAME} |
|---|
| 973 | hostname to use to connect sockets |
|---|
| 974 | \item \texttt{MPICH\_NETMASK} |
|---|
| 975 | bitmask to select an IP subnet: ip/numbits, e.g. 192.0.0.0/8 |
|---|
| 976 | \item \texttt{MPIEXEC\_TIMEOUT} |
|---|
| 977 | job timeout in seconds |
|---|
| 978 | \item \texttt{MPIEXEC\_LOCALONLY} |
|---|
| 979 | launch job processes on the local machine only |
|---|
| 980 | \item \texttt{MPIEXEC\_NOPROMPT} |
|---|
| 981 | Don't prompt for user input for missing information, print an error instead. |
|---|
| 982 | \item \texttt{MPIEXEC\_SMPD\_PORT} |
|---|
| 983 | Connect to smpd on the specified port. |
|---|
| 984 | |
|---|
| 985 | The following two only affect mpiexec for smpd if -rsh is on the command line: |
|---|
| 986 | \item \texttt{MPIEXEC\_RSH} |
|---|
| 987 | rsh command to use, default is ``ssh -x'' |
|---|
| 988 | \item \texttt{MPIEXEC\_RSH\_NO\_ESCAPE} |
|---|
| 989 | create an rsh command compatible with Cygwin's ssh |
|---|
| 990 | \item \texttt{MPICH\_SPN} |
|---|
| 991 | Service Principal Name used for passwordless authentication |
|---|
| 992 | \item \texttt{SMPD\_DBG\_OUTPUT} |
|---|
| 993 | Print debugging output |
|---|
| 994 | \item \texttt{SMPD\_DBG\_LOG\_FILENAME} |
|---|
| 995 | name of logfile to send output to |
|---|
| 996 | \item \texttt{SMPD\_MAX\_LOG\_FILE\_SIZE} |
|---|
| 997 | maximum number of bytes the logfile can grow to before it is truncated |
|---|
| 998 | \item \texttt{MPICH\_DBG\_OUTPUT} |
|---|
| 999 | Determines where debugging output goes: stdout, memlog, or file. |
|---|
| 1000 | \item \texttt{MPI\_DLL\_NAME} |
|---|
| 1001 | name of the dll that contains the MPI and PMPI interfaces |
|---|
| 1002 | \item \texttt{MPICH2\_CHANNEL} |
|---|
| 1003 | short name of the channel used to create the full name of the MPI dll (e.g. ib becomes mpich2ib.dll) |
|---|
| 1004 | \item \texttt{MPI\_WRAP\_DLL\_NAME} |
|---|
| 1005 | name of the dll that contains only the MPI interface, not the PMPI interface |
|---|
| 1006 | \item \texttt{MPICH\_TRMEM\_INITZERO} |
|---|
| 1007 | used by the memory tracing package |
|---|
| 1008 | \item \texttt{MPICH\_TRMEM\_VALIDATE} |
|---|
| 1009 | used by the memory tracing package |
|---|
| 1010 | \item \texttt{MPITEST\_DEBUG} |
|---|
| 1011 | used by the test suite |
|---|
| 1012 | \item \texttt{MPITEST\_VERBOSE} |
|---|
| 1013 | used by the test suite |
|---|
| 1014 | \item \texttt{PATH} |
|---|
| 1015 | used by smpd to search for executables under Unix. |
|---|
| 1016 | \end{itemize} |
|---|
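A minimal sketch of how these variables might be supplied follows. It assumes that the \texttt{MPIEXEC\_} variables are read by \texttt{mpiexec} from the environment in which it is started, and that the \texttt{MPICH\_} variables are read by the MPI processes and can therefore be passed with the \texttt{-env} option. The values and \texttt{myapp.exe} are illustrative only.
\begin{verbatim}
REM Read by mpiexec: limit the job to 60 seconds
set MPIEXEC_TIMEOUT=60

REM Read by the MPI processes: wrap error stack output at column 79
mpiexec.exe -n 2 -env MPICH_CHOP_ERROR_STACK 79 myapp.exe
\end{verbatim}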
| 1017 | |
|---|
| 1018 | SMPD options specified on the command line can also be specified in the environment |
|---|
| 1019 | by prefixing \texttt{SMPD\_OPTION\_} to the option name and saving it as an |
|---|
| 1020 | environment variable. |
|---|
| 1021 | \begin{itemize} |
|---|
| 1022 | \item \texttt{SMPD\_OPTION\_APP\_PATH} |
|---|
| 1023 | \item \texttt{SMPD\_OPTION\_LOGFILE} |
|---|
| 1024 | \item \texttt{SMPD\_OPTION\_NOCACHE} |
|---|
| 1025 | \item \texttt{SMPD\_OPTION\_PHRASE} |
|---|
| 1026 | \item \texttt{SMPD\_OPTION\_SSPI\_PROTECT} |
|---|
| 1027 | \item \texttt{SMPD\_OPTION\_MAX\_LOGFILE\_SIZE} |
|---|
| 1028 | \item \texttt{SMPD\_OPTION\_PLAINTEXT} |
|---|
| 1029 | \item \texttt{SMPD\_OPTION\_PORT} |
|---|
| 1030 | \item \texttt{SMPD\_OPTION\_TIMEOUT} |
|---|
| 1031 | \item \texttt{SMPD\_OPTION\_EXITCODES} |
|---|
| 1032 | \item \texttt{SMPD\_OPTION\_PRIORITY} |
|---|
| 1033 | \item \texttt{SMPD\_OPTION\_LOCALONLY} |
|---|
| 1034 | \item \texttt{SMPD\_OPTION\_NOPROMPT} |
|---|
| 1035 | \item \texttt{SMPD\_OPTION\_CHANNEL} |
|---|
| 1036 | \item \texttt{SMPD\_OPTION\_HOSTS} |
|---|
| 1037 | \item \texttt{SMPD\_OPTION\_DELEGATE} |
|---|
| 1038 | \item \texttt{SMPD\_OPTION\_INTERNODE\_CHANNEL} |
|---|
| 1039 | \item \texttt{SMPD\_OPTION\_LOG} |
|---|
| 1040 | \item \texttt{SMPD\_OPTION\_NO\_DYNAMIC\_HOSTS} |
|---|
| 1041 | \end{itemize} |
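The naming rule above can be sketched in C. This is only an illustration: \texttt{smpd\_option\_env\_name} is a hypothetical helper, not an smpd function, and the conversion to uppercase is an assumption inferred from the variable names listed above.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define SMPD_OPTION_PREFIX "SMPD_OPTION_"

/* Hypothetical helper (not part of smpd): builds the environment variable
 * name for an smpd option by prefixing SMPD_OPTION_ to the option name.
 * Uppercasing is an assumption based on the names in the list above.
 * Returns 0 on success, -1 if the option is empty or out is too small. */
int smpd_option_env_name(const char *option, char *out, size_t out_size)
{
    size_t prefix_len = strlen(SMPD_OPTION_PREFIX);
    size_t len, i;

    assert(option != NULL && out != NULL);
    len = strlen(option);
    if (len == 0 || prefix_len + len + 1 > out_size)
        return -1;

    /* Copy the fixed prefix, then the uppercased option name. */
    memcpy(out, SMPD_OPTION_PREFIX, prefix_len);
    for (i = 0; i < len; i++)
        out[prefix_len + i] = (char)toupper((unsigned char)option[i]);
    out[prefix_len + len] = '\0';
    return 0;
}
```

Under these assumptions, passing \texttt{"port"} yields \texttt{SMPD\_OPTION\_PORT}; the resulting name could then be handed to \texttt{getenv} to look up the value.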
|---|
| 1042 | |
|---|
| 1043 | Variables to control debugging output when enabled: |
|---|
| 1044 | \begin{itemize} |
|---|
| 1045 | \item \texttt{MPICH\_DBG} |
|---|
| 1046 | \item \texttt{MPICH\_DBG\_CLASS} |
|---|
| 1047 | \item \texttt{MPICH\_DBG\_FILENAME} |
|---|
| 1048 | \item \texttt{MPICH\_DBG\_LEVEL} |
|---|
| 1049 | \item \texttt{MPICH\_DBG\_OUTPUT} |
|---|
| 1050 | \item \texttt{MPICH\_DBG\_RANK} |
|---|
| 1051 | \item \texttt{MPICH\_DEBUG\_ITEM} |
|---|
| 1052 | \end{itemize} |
|---|
| 1053 | |
|---|
| 1054 | The following variables affect the MPE logging library: |
|---|
| 1055 | \begin{itemize} |
|---|
| 1056 | \item \texttt{MPE\_LOGFILE\_PREFIX} |
|---|
| 1057 | name of the clog file to create without the extension |
|---|
| 1058 | \item \texttt{MPE\_DELETE\_LOCALFILE} |
|---|
| 1059 | true or false; whether to delete the local clog file |
|---|
| 1060 | \item \texttt{MPE\_LOG\_OVERHEAD} |
|---|
| 1061 | appears to add an event to the clog files representing the time it takes to write a clog buffer to disk (this behavior has not been verified) |
|---|
| 1062 | \item \texttt{CLOG\_BLOCK\_SIZE} |
|---|
| 1063 | number of bytes in a clog block |
|---|
| 1064 | \item \texttt{CLOG\_BUFFERED\_BLOCKS} |
|---|
| 1065 | number of blocks |
|---|
| 1066 | \item \texttt{MPE\_CLOCKS\_SYNC} |
|---|
| 1067 | yes/no - synchronize clocks |
|---|
| 1068 | |
|---|
| 1069 | directories to store temporary files: |
|---|
| 1070 | \item \texttt{MPE\_TMPDIR} |
|---|
| 1071 | \item \texttt{TMPDIR} |
|---|
| 1072 | \item \texttt{TMP} |
|---|
| 1073 | \item \texttt{TEMP} |
|---|
| 1074 | \end{itemize} |
|---|
| 1075 | |
|---|
| 1076 | PMI environment variables created by smpd are described in the smpd documentation: |
|---|
| 1077 | \begin{itemize} |
|---|
| 1078 | \item \texttt{PMI\_DLL\_NAME} |
|---|
| 1079 | name of the PMI dll to load (replaces the default smpd functions) |
|---|
| 1080 | \item \texttt{PMI\_NAMEPUB\_KVS} |
|---|
| 1081 | name of the key-val-space where MPI service names are stored for MPI\_Lookup\_name() |
|---|
| 1082 | \item \texttt{PMI\_ROOT\_HOST} |
|---|
| 1083 | \item \texttt{PMI\_ROOT\_PORT} |
|---|
| 1084 | \item \texttt{PMI\_ROOT\_LOCAL} |
|---|
| 1085 | \item \texttt{PMI\_SPAWN} |
|---|
| 1086 | \item \texttt{PMI\_KVS} |
|---|
| 1087 | \item \texttt{PMI\_DOMAIN} |
|---|
| 1088 | \item \texttt{PMI\_RANK} |
|---|
| 1089 | \item \texttt{PMI\_SIZE} |
|---|
| 1090 | \item \texttt{PMI\_CLIQUE} |
|---|
| 1091 | \item \texttt{PMI\_APPNUM} |
|---|
| 1092 | \item \texttt{PMI\_SMPD\_ID} |
|---|
| 1093 | \item \texttt{PMI\_SMPD\_KEY} |
|---|
| 1094 | \item \texttt{PMI\_SMPD\_FD} |
|---|
| 1095 | \item \texttt{PMI\_HOST} |
|---|
| 1096 | \item \texttt{PMI\_PORT} |
|---|
| 1098 | \end{itemize} |
|---|
| 1099 | |
|---|
| 1100 | Used by the process managers other than smpd: |
|---|
| 1101 | \begin{itemize} |
|---|
| 1102 | \item \texttt{MPIEXEC\_DEBUG} |
|---|
| 1103 | \item \texttt{MPIEXEC\_MACHINES\_PATH} |
|---|
| 1104 | \item \texttt{MPIEXEC\_PORTRANGE} |
|---|
| 1105 | \item \texttt{MPIEXEC\_PREFIX\_STDERR} |
|---|
| 1106 | \item \texttt{MPIEXEC\_PREFIX\_STDOUT} |
|---|
| 1107 | \item \texttt{MPIEXEC\_REMSHELL} |
|---|
| 1108 | \item \texttt{MPIEXEC\_USE\_PORT} |
|---|
| 1109 | \end{itemize} |
|---|
| 1110 | |
|---|
| 1111 | \subsection{Compiling} |
|---|
| 1112 | This section describes how to set up a project to compile an MPICH2 application |
|---|
| 1113 | using Visual Studio 2005 and Visual Studio 6.0. |
|---|
| 1114 | |
|---|
| 1115 | \subsubsection{Visual Studio 6.0} |
|---|
| 1116 | Visual C++ 6.0 cannot handle multiple functions with the same type signature |
|---|
| 1117 | that only differ in their return type. So you must define \texttt{HAVE\_NO\_VARIABLE\_RETURN\_TYPE\_SUPPORT} |
|---|
| 1118 | in your project. |
|---|
| 1119 | |
|---|
| 1120 | \begin{enumerate} |
|---|
| 1121 | \item Create a project and add your source files. |
|---|
| 1122 | |
|---|
| 1123 | \item Bring up the settings for the project by pressing Alt+F7. Select the Preprocessor |
|---|
| 1124 | Category from the C/C++ tab. Enter \texttt{HAVE\_NO\_VARIABLE\_RETURN\_TYPE\_SUPPORT} into |
|---|
| 1125 | the Preprocessor box. Enter \texttt{C:$\backslash$Program Files$\backslash$MPICH2$\backslash$include} |
|---|
| 1126 | into the ``Additional include directories'' box. |
|---|
| 1127 | |
|---|
| 1128 | \item Select the Input Category from the Link tab. Add \texttt{cxx.lib} and \texttt{mpi.lib} to |
|---|
| 1129 | the Object/library modules box. Add \texttt{C:$\backslash$Program Files$\backslash$MPICH2$\backslash$lib} |
|---|
| 1130 | to the ``Additional library path'' box. |
|---|
| 1131 | |
|---|
| 1132 | \item Compile your application. |
|---|
| 1133 | \end{enumerate} |
|---|
| 1134 | |
|---|
| 1135 | \subsubsection{Visual Studio 2005} |
|---|
| 1136 | You can use the example projects provided with Visual Studio 2005 |
|---|
| 1137 | as a guide for creating your own projects. |
|---|
| 1138 | |
|---|
| 1139 | \begin{enumerate} |
|---|
| 1140 | \item Create a project and add your source files. |
|---|
| 1141 | |
|---|
| 1142 | \item Bring up the properties dialog for your project by right clicking the project |
|---|
| 1143 | name and selecting Properties. |
|---|
| 1144 | |
|---|
| 1145 | \item Navigate to Configuration Properties::C/C++::General |
|---|
| 1146 | \item Add \texttt{C:$\backslash$Program Files$\backslash$MPICH2$\backslash$include} |
|---|
| 1147 | to the ``Additional Include Directories'' box. |
|---|
| 1148 | |
|---|
| 1149 | \item Navigate to Configuration Properties::Linker::General |
|---|
| 1150 | \item Add \texttt{C:$\backslash$Program Files$\backslash$MPICH2$\backslash$lib} |
|---|
| 1151 | to the ``Additional Library Directories'' box. |
|---|
| 1152 | |
|---|
| 1153 | \item Navigate to Configuration Properties::Linker::Input |
|---|
| 1154 | \item Add the libraries your application needs to the ``Additional Dependencies'' box. If your |
|---|
| 1155 | application is a C application then it only needs \texttt{mpi.lib}. If it is a C++ application then it |
|---|
| 1156 | needs both \texttt{cxx.lib} and \texttt{mpi.lib}. If it is a Fortran application then it only needs one of the \texttt{fmpich2[s,g].lib} libraries. |
|---|
| 1157 | The Fortran library comes in three flavors: \texttt{fmpich2.lib}, \texttt{fmpich2s.lib}, and \texttt{fmpich2g.lib}. \texttt{fmpich2.lib} |
|---|
| 1158 | contains all uppercase symbols and uses the C calling convention, like this: \texttt{MPI\_INIT}. \texttt{fmpich2s.lib} |
|---|
| 1159 | contains all uppercase symbols and uses the stdcall calling convention, like this: \texttt{MPI\_INIT@4}. |
|---|
| 1160 | \texttt{fmpich2g.lib} contains all lowercase symbols with double underscores and uses the C calling convention, |
|---|
| 1161 | like this: \texttt{mpi\_init\_\_}. Add the library that matches your Fortran compiler. |
|---|
| 1162 | |
|---|
| 1163 | \item Compile your application. |
|---|
| 1164 | \end{enumerate} |
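To make the three Fortran naming conventions concrete, the following C sketch builds the symbol name that each library flavor would export for a given routine. It is an illustration only: \texttt{fortran\_symbol} and the \texttt{fortran\_flavor} enumeration are hypothetical names, not part of MPICH2, and the code follows the descriptions above literally (for example, it always appends two underscores for \texttt{fmpich2g.lib}).

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names for the three library flavors described above. */
enum fortran_flavor {
    FMPICH2,   /* fmpich2.lib:  uppercase, C calling convention       */
    FMPICH2S,  /* fmpich2s.lib: uppercase, stdcall (adds @<bytes>)    */
    FMPICH2G   /* fmpich2g.lib: lowercase, double underscore suffix   */
};

/* Hypothetical helper (not part of MPICH2): writes the decorated symbol
 * for routine `name` into `out`. `arg_bytes` is the total size of the
 * arguments and is used only for the stdcall flavor.
 * Returns 0 on success, -1 if a buffer is too small. */
int fortran_symbol(enum fortran_flavor flavor, const char *name,
                   unsigned arg_bytes, char *out, size_t out_size)
{
    char base[128];
    size_t len, i;
    int written;

    assert(name != NULL && out != NULL);
    len = strlen(name);
    if (len >= sizeof base)
        return -1;

    /* Normalize the case: lowercase for fmpich2g, uppercase otherwise. */
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];
        base[i] = (char)(flavor == FMPICH2G ? tolower(c) : toupper(c));
    }
    base[len] = '\0';

    switch (flavor) {
    case FMPICH2:
        written = snprintf(out, out_size, "%s", base);
        break;
    case FMPICH2S:
        written = snprintf(out, out_size, "%s@%u", base, arg_bytes);
        break;
    case FMPICH2G:
        written = snprintf(out, out_size, "%s__", base);
        break;
    default:
        return -1;
    }
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

Comparing these names with the unresolved symbols reported by the linker is one way to tell which library matches the Fortran compiler in use.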
|---|
| 1165 | |
|---|
| 1166 | \subsubsection{Cygwin gcc} |
|---|
| 1167 | You can compile your MPI programs using gcc/g++ from Cygwin and the MPICH2 header files and libraries installed with MPICH2 on Windows. Compile using the header files in \texttt{C:$\backslash$Program Files$\backslash$MPICH2$\backslash$include} |
|---|
| 1168 | and link against the \texttt{lib*.a} libraries in \texttt{C:$\backslash$Program Files$\backslash$MPICH2$\backslash$lib}. Note that you should use the ``-localroot'' option when running programs compiled using gcc/g++ from Cygwin. |
|---|
| 1169 | |
|---|
| 1170 | \subsection{Performance Analysis} |
|---|
| 1171 | MPICH2 includes the Multi-Processing Environment (MPE), which is a |
|---|
| 1172 | suite of performance analysis tools comprising profiling libraries, |
|---|
| 1173 | utility programs, a set of graphical tools, and a collective checking library. |
|---|
| 1174 | |
|---|
| 1175 | The first set of tools to be used with user MPI programs is the profiling libraries, |
|---|
| 1176 | which provide a collection of routines that create log files. These log files |
|---|
| 1177 | can be created manually by inserting MPE calls in the MPI program, or |
|---|
| 1178 | automatically by linking with the appropriate MPE libraries, or by combining |
|---|
| 1179 | the above two methods. Currently, MPE offers the following four profiling |
|---|
| 1180 | libraries. |
|---|
| 1181 | |
|---|
| 1182 | \begin{enumerate} |
|---|
| 1183 | \item Tracing Library: Traces all MPI calls. Each MPI call is preceded by a |
|---|
| 1184 | line that contains the rank in \texttt{MPI\_COMM\_WORLD} of the calling process, |
|---|
| 1185 | and followed by another line indicating that the call has completed. |
|---|
| 1186 | Most send and receive routines also indicate the values of count, tag, |
|---|
| 1187 | and partner (destination for sends, source for receives). Output is to |
|---|
| 1188 | standard output. |
|---|
| 1189 | |
|---|
| 1190 | \item Animation Libraries: A simple form of real-time program animation |
|---|
| 1191 | that requires X window routines (currently not available on Windows). |
|---|
| 1192 | |
|---|
| 1193 | \item Logging Libraries: The most useful and widely used profiling libraries |
|---|
| 1194 | in MPE. These libraries form the basis for generating log files from |
|---|
| 1195 | user MPI programs. Several different log file formats are |
|---|
| 1196 | available in MPE. The default log file format is CLOG2. It is a low |
|---|
| 1197 | overhead logging format, a simple collection of single timestamp events. |
|---|
| 1198 | The old format ALOG, which has not been developed for years, is not |
|---|
| 1199 | distributed here. The powerful visualization format is SLOG-2, which stands |
|---|
| 1200 | for Scalable LOGfile format version II, which is a total redesign of the |
|---|
| 1201 | original SLOG format. SLOG-2 allows for much improved scalability for |
|---|
| 1202 | visualization purposes. A CLOG2 file can be easily converted to |
|---|
| 1203 | an SLOG-2 file through the new SLOG-2 viewer, Jumpshot-4. |
|---|
| 1204 | |
|---|
| 1205 | \item Collective and datatype checking library: An argument consistency |
|---|
| 1206 | checking library for MPI collective calls. It checks for datatype, root, |
|---|
| 1207 | and various argument consistency in MPI collective calls (currently not |
|---|
| 1208 | available on Windows). |
|---|
| 1209 | \end{enumerate} |
|---|
| 1210 | |
|---|
| 1211 | The set of utility programs in MPE includes a log format converter (e.g., |
|---|
| 1212 | clog2TOslog2) and a logfile viewer and converter (e.g., Jumpshot). These new |
|---|
| 1213 | tools, clog2TOslog2 and Jumpshot (Jumpshot-4), replace the old tools clog2slog, |
|---|
| 1214 | slog\_print, and logviewer (i.e., Jumpshot-2 and Jumpshot-3). |
|---|
| 1215 | |
|---|
| 1216 | \subsubsection{Tracing MPI calls using the MPIEXEC Wrapper} |
|---|
| 1217 | A developer can trace MPI calls by using the tracing functionality of the mpiexec |
|---|
| 1218 | wrapper (wmpiexec.exe). A step by step process is given below. |
|---|
| 1219 | |
|---|
| 1220 | \begin{enumerate} |
|---|
| 1221 | \item Launch the mpiexec wrapper application (wmpiexec.exe). |
|---|
| 1222 | |
|---|
| 1223 | \item After launching the mpiexec wrapper, select the application that you would |
|---|
| 1224 | like to run and select the number of |
|---|
| 1225 | processes. Now click on the ``more options'' |
|---|
| 1226 | checkbox to show the extended options for mpiexec. |
|---|
| 1227 | |
|---|
| 1228 | \item Check the ``produce clog2 file'' checkbox so that the clog2 file is generated |
|---|
| 1229 | when the application is run. |
|---|
| 1230 | |
|---|
| 1231 | \item Check the ``run in an separate window'' checkbox to enable your program to run in |
|---|
| 1232 | a separate window (for user interaction). |
|---|
| 1233 | |
|---|
| 1234 | \item Run your application by clicking on the ``Execute'' button. |
|---|
| 1235 | |
|---|
| 1236 | \item Once the application exits, click on the ``Jumpshot'' button to launch Jumpshot |
|---|
| 1237 | (the logfile viewer). |
|---|
| 1238 | |
|---|
| 1239 | \item Open your logfile (the default name of the logfile is \texttt{<APPLICATION NAME>.clog2}) |
|---|
| 1240 | using Jumpshot. Jumpshot will offer to convert the logfile to slog2 format. |
|---|
| 1241 | Click the ``Convert'' button in Jumpshot to convert your logfile to slog2 format. |
|---|
| 1242 | |
|---|
| 1243 | \item Now click the ``OK'' button in Jumpshot to view the logfile. |
|---|
| 1244 | \end{enumerate} |
|---|
| 1245 | |
|---|
| 1246 | \subsubsection{Tracing MPI calls from the command line} |
|---|
| 1247 | |
|---|
| 1248 | \begin{enumerate} |
|---|
| 1249 | \item Run your application using the ``-log'' option to the mpiexec command. |
|---|
| 1250 | |
|---|
| 1251 | \item Launch Jumpshot using the java command. |
|---|
| 1252 | (e.g., \texttt{java -jar "c:$\backslash$Program Files$\backslash$MPICH2$\backslash$bin$\backslash$jumpshot.jar"}) |
|---|
| 1253 | \item Follow the steps mentioned in the previous section to convert the logfile to slog2 |
|---|
| 1254 | format and view the log. |
|---|
| 1255 | \end{enumerate} |
|---|
| 1256 | |
|---|
| 1257 | \subsubsection{Customizing logfiles} |
|---|
| 1258 | In addition to using the predefined MPE logging libraries to log all MPI |
|---|
| 1259 | calls, MPE logging calls can be inserted into the user's MPI program to define |
|---|
| 1260 | and log states. These states are called user-defined states. States may |
|---|
| 1261 | be nested, allowing one to define a state describing a user routine that |
|---|
| 1262 | contains several MPI calls, and display both the user-defined state and |
|---|
| 1263 | the MPI operations contained within it. |
|---|
| 1264 | |
|---|
| 1265 | The simplest way to insert user-defined states is as follows: |
|---|
| 1266 | \begin{enumerate} |
|---|
| 1267 | \item Get handles from the MPE logging library. \texttt{MPE\_Log\_get\_state\_eventIDs} |
|---|
| 1268 | must be used to get unique event IDs (MPE logging handles). |
|---|
| 1269 | This is important if you are writing a library that uses |
|---|
| 1270 | the MPE logging routines from the MPE system. |
|---|
| 1271 | Hardwiring the eventIDs is considered a bad idea since it may cause |
|---|
| 1272 | eventID conflicts, so the practice is not supported. The older MPE library |
|---|
| 1273 | provides \texttt{MPE\_Log\_get\_event\_number}, which is still being supported but |
|---|
| 1274 | has been deprecated; users are strongly urged to use |
|---|
| 1275 | \texttt{MPE\_Log\_get\_state\_eventIDs} instead. |
|---|
| 1276 | |
|---|
| 1277 | \item Set the logged state's characteristics. \texttt{MPE\_Describe\_state} sets the |
|---|
| 1278 | name and color of the states. |
|---|
| 1279 | |
|---|
| 1280 | \item Log the events of the logged states. \texttt{MPE\_Log\_event} is called twice |
|---|
| 1281 | to log the user-defined states. |
|---|
| 1282 | \end{enumerate} |
|---|
| 1283 | |
|---|
| 1284 | Below is a simple example that uses the three steps outlined above. |
|---|
| 1285 | |
|---|
| 1286 | \begin{verbatim} |
|---|
| 1287 | int eventID_begin, eventID_end; |
|---|
| 1288 | ... |
|---|
| 1289 | MPE_Log_get_state_eventIDs( &eventID_begin, &eventID_end ); |
|---|
| 1290 | ... |
|---|
| 1291 | MPE_Describe_state( eventID_begin, eventID_end, "Multiplication", "red" ); |
|---|
| 1292 | ... |
|---|
| 1293 | void MyAmult( Matrix m, Vector v ) |
|---|
| 1294 | { |
|---|
| 1295 | /* Log the start event; no extra data (0) or string (NULL) is attached */ |
|---|
| 1296 | MPE_Log_event( eventID_begin, 0, NULL ); |
|---|
| 1297 | ... Amult code, including MPI calls ... |
|---|
| 1298 | MPE_Log_event( eventID_end, 0, NULL ); |
|---|
| 1299 | } |
|---|
| 1300 | \end{verbatim} |
|---|
| 1301 | |
|---|
| 1302 | The logfile generated by this code will have the MPI routines nested within |
|---|
| 1303 | the routine \texttt{MyAmult}. |
|---|
| 1304 | |
|---|
| 1305 | Besides user-defined states, MPE2 also provides support for user-defined |
|---|
| 1306 | events, which can be defined through use of \texttt{MPE\_Log\_get\_solo\_eventID} |
|---|
| 1307 | and \texttt{MPE\_Describe\_event}. For more details, see cpilog.c. |
|---|
| 1308 | |
|---|
| 1309 | For a user-defined state that was never described (where the corresponding \texttt{MPE\_Describe\_state} |
|---|
| 1310 | has not been issued), the new Jumpshot (Jumpshot-4) may display the legend name as |
|---|
| 1311 | ``UnknownType-INDEX'' where INDEX is the internal MPE category index. |
|---|
| 1312 | |
|---|
| 1313 | An example program, cpilog.c, is provided in the ``examples'' directory of your |
|---|
| 1314 | MPICH2 installation. This program can be used as a reference for customizing |
|---|
| 1315 | logfiles. |
|---|
| 1316 | |
|---|
| 1317 | \end{document} |
|---|