AMRMHD3D is a computer program for solving three-dimensional, compressible magnetohydrodynamics (MHD) problems using adaptive mesh refinement (AMR). It employs flux-corrected transport algorithms to integrate the MHD equations in a conservative, finite-volume representation, on structured blocks of Cartesian cells. The program has been optimized for high performance on the massively parallel Cray T3E computer system. The structure and input/output requirements of the program are detailed below.
AMRMHD3D is a new, three-dimensional, magnetohydrodynamics model
developed on the Cray T3E computer system. It extends our recently developed
FCTMHD3D model for fixed grids to
adaptively refined grids constructed by the parallel meshing package
PARAMESH.
The equations are solved conservatively in a finite-volume representation,
and explicit timestepping (two-step Runge-Kutta) is used to advance the
variables. The numerical developments and their application to selected
problems are described in the references below.
Scientific highlights of results obtained
with AMRMHD3D's predecessor code FCTMHD3D are available for
browsing, as is the documentation of the new code's
200-fold sustained speedup on the T3E
relative to the Cray C-90. The code itself is available via anonymous ftp
at
AMRMHD3D is a multiple subroutine package written in Fortran 90 which uses the Cray shared-memory communications library to exchange data between the processors. The full package consists of 79 Fortran source files and 27 header files in two groups: the FCTMHD3D application group (22 sources, 19 headers) and the PARAMESH amr group (57 sources, 8 headers). Both groups of files are included in the distribution; only the FCTMHD3D application group is documented here. The application group consists of the following Fortran source files:
amrmhd3d.F
Control the program flow and execution; open the formatted input and output files amrmhd3d.cnt, amrmhd3d.prt and amrmhd3d.inst; read the cntrol input data group.amr_bc_block.F
Set boundary conditions for each block of grid points.amr_test_refinement.F
Test for refinement of the adapted mesh.copy_lgrd_from_a.F
Copy grid quantities for the fluid solver from the AMR data structures into the temporary lcpfct3 arrays.copy_lgrd_to_a.F
Copy grid quantities for the fluid solver from the temporary lcpfct3 arrays into the AMR data structures.copy_mgrd_from_a.F
Copy grid quantities for the field solver from the AMR data structures into the temporary mhdfct3 arrays.copy_mgrd_to_a.F
Copy grid quantities for the field solver from the temporary mhdfct3 arrays into the AMR data structures.copy_psize_from_a.F
Copy grid sizes from the AMR data structures into the temporary amrmhd3d arrays.copy_psize_to_a.F
Copy grid sizes from the temporary amrmhd3d arrays into the AMR data structures.diagno.F
Perform diagnostics, evaluating the mass, momentum, and energy integrals.dtset.F
Determine the timestep using the Courant condition.from_mhd3.F
Copy the MHD variables from the temporary amrmhd3d arrays into the AMR data structures.from_trn3.F
Copy the auxiliary transport variables from the temporary amrmhd3d arrays into the AMR data structures.inital.F
Initialize the variable arrays and set up the physical problem to be solved; read the start input data group.newblock_grid.F
Construct the grid for each new block.timings_irtc.F
Convert for output timing information on the various parts of the AMR package.to_mhd3.F
Copy the MHD variables from the AMR data structures into the temporary amrmhd3d arrays.to_trn3.F
Copy the auxiliary transport variables from the AMR data structures into the temporary amrmhd3d arrays.trans.F
Perform the transport by solving the ideal MHD equations in conservative form with zero gravity; read the hydro input data group.
comms.shmem.F
comms_init
Initialize communications.comms_barrier_all
Set a barrier for all processors.comms_n_pes
Get the total number of processors allocated to the job.comms_my_pe
Get the processor number within the job.comms_bcast_real
Broadcast a real data array from a source processor to all processors.comms_bcast_integer
Broadcast an integer data array from a source processor to all processors.comms_bcast_logical
Broadcast a logical data array from a source processor to all processors.comms_reduce_real_sum
Reduce a sum of real data from all processors to a target processor.comms_reduce_integer_sum
Reduce a sum of integer data from all processors to a target processor.comms_reduce_real_max
Reduce a maximum of real data from all processors to a target processor.comms_reduce_real_min
Reduce a minimum of real data from all processors to a target processor.comms_allreduce_real_max
Reduce a maximum of real data from all processors to all processors.comms_allreduce_real_min
Reduce a minimum of real data from all processors to all processors.comms_reduce_integer_max
Reduce a maximum of integer data from all processors to a target processor.comms_reduce_integer_min
Reduce a minimum of integer data from all processors to a target processor.comms_allreduce_integer_max
Reduce a maximum of integer data from all processors to all processors.comms_allreduce_integer_min
Reduce a minimum of integer data from all processors to all processors.comms_sendrecv_fct
Send and receive data packets between processors for the FCT integrations.comms_sendrecv_real
Send and receive real arrays between processors.comms_sendrecv_integer
Send and receive integer arrays between processors.
lcpfct3.amr.F
lcpfct3
Integrate the generalized continuity equation for a flow scalar.lcpgrd3
Construct the necessary metric quantities for the lcpfct3 integrations.lcpsrc3
Construct the source terms for the lcpfct3 integrations.lcpcsm3
Evaluate the conservation sum for a flow scalar.
mhdfct3.amr.F
mhdfct3
Integrate the generalized hydromagnetic equation for a 3D magnetic field.mhdgrd3
Construct the necessary metric quantities for the mhdfct3 integrations.mhdsrc3
Construct the source terms for the mhdfct3 integrations.
The application group consists of the following header files, each of which contains datatype declarations and a common block declaration:
ageom1d.h
Common block a_geom1d; lcpfct3 metric data for all grid blocks.ageom1m.h
Common block a_geom1m; mhdfct3 metric data for all grid blocks.agridm.h
Common block a_gridm; field variable metric data for all grid blocks.agrids.h
Common block a_grids; flow variable metric data for all grid blocks.apsize.h
Common block a_psize; problem size data for all grid blocks.comms.h
Common block comms; communications data.const.h
Common block const; physical constants.fct3.h
Common block fct; work arrays for lcpfct3 and mhdfct3 on the grid block being integrated.geom1d.h
Common block geom1d; lcpfct3 metric data for the grid block being integrated.geom1m.h
Common block geom1m; mhdfct3 metric data for the grid block being integrated.gridm.h
Common block gridm; field variable metric data for the grid block being integrated.grids.h
Common block grids; flow variable metric data for the grid block being integrated.insts.h
Common block insts; instrumentation data for the code.lcpctl3.h
Common block lcpctl; boundary condition data for the lcpfct3 integrations.mhd3.h
Common block mhd; arrays for the basic MHD variables on the grid block being integrated.mhdctl3.h
Common block mhdctl; boundary condition data for the mhdfct3 integrations.psize.h
Common block psize; problem size data for the grid block being integrated.trn3.h
Common block trn; work arrays for the MHD timestepping in trans for the grid block being integrated.tstep.h
Common block tstep; time and timestep data.
One of the Cray Research benchlib libraries, lib_util.a, is required for the T3E implementation of the code and is included in the distribution. This library provides a prefetching routine pref which starts the flow of vectors of array data from fast memory to the processor before entering the loop which processes those data.
The control of the program flow and the physics inputs to the problem being solved by AMRMHD3D are passed in through the three input data groups cntrol, hydro, and start. These are contained in the formatted file amrmhd3d.cnt, which is assigned to Fortran unit 15. Examples of this file are provided with the code. The first line of the file is a label for the run. It is followed by the cntrol data group in the following format:
Spherical Blast Wave in a Magnetic Field
&cntrol
minstp maxstp time0 timef
1 26 0.0 100.
lrstrt lmfr
F F
idump idiag
0 1
dtdump dtdiag
0.0 0.0
maxref
6
&end
The text lines are read as 79-character lines and may be modified as desired, while the lines with numerical inputs are read as indefinitely formatted lines whose datatypes and sequences must match those above, but otherwise also may be modified. The other two input data groups follow the same standard.
The variables whose values are set by the cntrol input data group, their datatypes, and their functions in the code are as follows:
minstp
Integer; initial timestep counter.maxstp
Integer; final timestep counter.time0
Real; initial physical time.timef
Real; final physical time.lrstrt
Logical; is this calculation starting from an unformatted dataset?lmfr
Logical; does the unformatted restart dataset contain data at multiple time levels?idump
Integer; timestep interval for writing unformatted restart datasets.idiag
Integer; timestep interval for performing diagnostics.dtdump
Real; physical time interval for writing unformatted restart datasets.dtdiag
Real; physical time interval for performing diagnostics.maxref
Integer; maximum allowed number of refinement levels in the grid.
The calculation begins at timestep minstp and physical time time0; it ends when either timestep maxstp or time timef is reached. If the run is restarting from an unformatted dataset, the requested timestep minstp must match exactly that in the restart dataset, and the time time0 must match within 1%. Restart files will be written at timestep multiples of idump or physical time multiples of dtdump, or both, as specified. Zero values for either of these parameters prohibits a write of its type. If lmfr is false, the restart files are rewound after each write. If true, subsequent writes are appended to the end of the existing file. Finally, the grid may be refined locally up to maxref times from the inital, uniform grid that is constructed on the domain.
NOTE: The present release of AMRMHD3D does not support a restart capability in the manner of FCTMHD3D, though the inputs for it are retained. AMRMHD3D instead uses the checkpointing capability of the PARAMESH package.
The hydro input data group establishes boundary conditions for the FCT integrations of the MHD equations:
&hydro
crho1l frho1l crho1r frho1r
0.0 0.0 0.0 0.0
crho2l frho2l crho2r frho2r
0.0 0.0 0.0 0.0
crho3l frho3l crho3r frho3r
0.0 0.0 0.0 0.0
...
sbx1l sbx1r sby2l sby2r sbz3l sbz3r
0.0 0.0 0.0 0.0 0.0 0.0
iflx1l iflx1r iflx2l iflx2r iflx3l iflx3r
1 1 1 1 1 1
jflx1l jflx1r jflx2l jflx2r jflx3l jflx3r
1 1 1 1 1 1
&end
The ellipsis ... in the listing above indicates that the preceding six lines of data for the variable rho (the mass density) are to be replicated for rvx, rvy, and rvz (the x, y, and z momentum densities, respectively), then e (total energy density), p (plasma pressure), vx, vy, and vz (the x, y, and z velocities), and finally bx, by, and bz (the x, y, and z components of the magnetic field), in that order.
The values of the following variables are set by the hydro input data group:
crho1l, crho1r
Real; the constant portions of the mass density beyond the domain boundaries along the first (x) coordinate, at the left (x = x_min) and right (x = x_max), respectively.frho1l, frho1r
Real; the fractional portions of the mass density beyond the domain boundaries along the first coordinate, at the left and right, respectively.crho2l, crho2r, crho3l, crho3r
Real; the same roles as crho1l and crho1r along the second (y) and third (z) coordinates, respectively.frho2l, frho2r, frho3l, frho3r
Real; the same roles as frho1l and frho1r along the second and third coordinates, respectively.
Identical roles are played by the next eleven sets of twelve boundary values, for the remaining eleven MHD variables listed in the preceding paragraph. The rest of the input data in hydro sets the following parameters:
sbx1l, sbx1r
Real; the symmetric portions of the first (x) magnetic field component beyond the domain boundaries along the same coordinate, at the left (x = x_min) and right (x = x_max), respectively.sby2l, sby2r, sbz3l, sbz3r
Real; the same roles as sbx1l and sbx1r for the second (y) and third (z) components of the magnetic field along their coordinates, respectively.iflx1l, iflx1r
Integer; index indicating whether (iflx1_ = 1) or not (iflx1_ = 0) to apply fluxes at the domain boundaries to the flow variables in lcpfct3, for the first coordinate at the left and right, respectively.iflx2l, iflx2r, iflx3l, iflx3r
Integer; the same roles as iflx1l and iflx1r for the fluxes at the boundaries along the second and third coordinates, respectively.jflx1l, jflx1r, jflx2l, jflx2r, jflx3l, jflx3r
Integer; the same roles as iflx1l, ..., and iflx3r for the magnetic field variables in mhdfct3.
Values of the flow and field variables one cell beyond the boundaries of the computational domain are required in order to calculate the fluxes of mass, momentum, and energy at the bounding faces and the electric fields at the bounding edges. Given m1 cells subdividing the first coordinate with m1p = m1+1 interface locations, the conditions on the mass density at the domain boundaries along that direction take the form
rho( 0,j,k) = crho1l + frho1l * rho( 1,j,k)
+ ipbc1 * rho(m1,j,k),
rho(m1p,j,k) = crho1r + frho1r * rho(m1,j,k)
+ ipbc1 * rho( 1,j,k),
where j and k index the other two cell center coordinates. The three terms on the right sides of these equations correspond to constant, extrapolative, and periodic contributions to the boundary conditions at the ends of the grid. Similar expressions apply along the remaining coordinate directions, and for the rest of the flow variables. Identical relations also hold for each component of the magnetic field, along the two directions transverse to it ( e.g., for B_x along y and z). Along the coordinate aligned with the field component, however, the rule is different because the field values lie on the interfaces. In this instance, we also allow for distinct symmetric and extrapolative contributions to the boundary condition,
bx( 0,j,k) = cbx1l + fbx1l * bx( 1,j,k)
+ ipbc1 * bx(m1 ,j,k)
+ sbx1l * bx( 2,j,k),
bx(m1q,j,k) = cbx1r + fbx1r * bx(m1p,j,k)
+ ipbc1 * bx( 2,j,k)
+ sbx1r * bx(m1 ,j,k),
where m1q = m1p+1. It is crucial to ensure that every one of the boundary parameters is properly set and all are mutually consistent, or incorrect results can be expected. For example, selecting periodic boundaries does not automatically cause the remaining boundary terms to be zeroed out. The user must do this explicitly.
NOTE: The present release of AMRMHD3D supports only periodic boundary conditions. The inputs for more general boundary conditions have been retained in AMRMHD3D for use in a future release.
Finally, the start input data group sets the initialization and timestepping parameters for the code:
&start
rin rout pin pout bx0 by0 bz0
8.0 1.0 32.0 1.0 0.0 0.0 10.0
xlen ylen zlen rad
8.0 8.0 8.0 1.0
dtmin dtmax fdthyd
0.0 1.0 0.25
&end
The first two pairs of data lines fix the physical parameters for the problem initialized and solved by AMRMHD3D, in which a spherical bubble of plasma is centered in a uniform, magnetized plasma:
rin, rout
Real; the initial mass density inside and outside the bubble.pin, pout
Real; the initial pressure inside and outside the bubble.bx0, by0, bz0
Real; the initial, uniform magnetic field.xlen, ylen, zlen
Real; the extent of the domain in x, y, and z.rad
Real; the radius of the plasma bubble.dtmin, dtmax
Real; the minimum and maximum allowed time increments.fdtcfl
Real; the Courant factor for the timestep calculation.
The value for the Courant factor fdtcfl is used to fix the time increment, with the result limited by the input constraints dtmin and dtmax.
The physical diagnostics from the code are written to the formatted file amrmhd3d.prt, which is assigned to Fortran unit 16. All of the input data groups are replicated in this print file, as are the periodic diagnostics from the calls to diagno, timestep information from dtset, and the results of any restart read or write attempts by rstart and dumper. The timing diagnostics of the run appear in the formatted file amrmhd3d.inst, which is assigned to unit 99. System times and operation counts from all of the processors are combined and the results printed for the various routines called. The system times are specified in clock periods per timestep. The average performance figures are given in MFlops per processor, and the grand total for the machine is given in GFlops.
AMRMHD3D is compiled and linked by executing the following statement on the T3E:
make -f make_amrmhd3d
Two make files are included with the distribution: make_amrmhd3d, which compiles the FCTMHD3D application group of routines and links the code, and make_source, which compiles the PARAMESH amr group of routines and is called by make_amrmhd3d. The user must set the root directory for the full package in make_amrmhd3d. That root directory will contain four subdirectories:
AMRMHD3D can be modified by the user to solve a variety of problems in one or more of three places: the include file psize.h, which fixes the mesh block size for the code; the subroutine inital in the fortran file inital.f, which sets the initial conditions of the problem; and the file amrmhd3d.cnt, whose input data have already been described above. The parameters and definitions that can be reset by the user, beyond those listed previously, are the following:
In psize.h:
l1,l2,l3
Integer; the number of interior mesh points on each grid block along the 1st, 2nd, and 3rd coordinates, respectively.
m1,m2,m3
Integer; the number of total (interior plus guard-cell) mesh points on each grid block along the 1st, 2nd, and 3rd coordinates, respectively; always l1+6, l2+6, and l3+6.
The seven-cell-wide stencil of FCT requires three planes of data to be passed between grid blocks sharing an interior boundary on the grid. Thus, six additional storage locations are needed in any grid block that has other blocks to both its left and right along a particular coordinate direction. This always occurs unless the entire grid along that coordinate fits within one grid block. For efficiency, the grid blocks should be kept small, so that a refinement does not extend over an unnecessarily large region of the domain and introduce excessively many new grid points; but not so small that the needed FCT buffer of three zones to each side is larger than the block itself. The code distribution uses a compromise value of 8 for l1, l2, and l3.
In inital (amrmhd3d.f):
mhd3(:,1,:,:)
Real; the initial mass density.
mhd3(:,2-4,:,:)
Real; the initial momentum density along the 1st (2), 2nd (3), and 3rd (4) coordinates.
mhd3(:,6-8,:,:)
Real; the initial magnetic field along the 1st (6), 2nd (7), and 3rd (8) coordinates.
mhd3(:,10,:,:)
Real; the initial temperature.
The pressure and total energy density are calculated from the mass and mometum densities and magnetic field prescribed by the input data.
The data structure adopted consists of four-dimensional arrays in which the 1st, 3rd, and 4th indexes label the coordinates and the 2nd index labels the variables. This allows maximal reuse of the rather small data cache and minimizes cache misses and cache conflicts on the chip and page faults in the main memory. The result is a substantial speedup in the execution rate of the code.
This distribution of the code runs on the Cray T3E. Floprates can be found in the AMRMHD3D code performance report.
Numerics:
C. R. DeVore, Naval Research Laboratory Memorandum Report No. 6440-98-8330, December 1998.
C. R. DeVore, Journal of Computational Physics 92 (1991) 142.
C. R. DeVore, Naval Research Laboratory Memorandum Report No. 6544, September 1989.
E. S. Oran and J. P. Boris, Numerical Simulation of Reactive Flow (Elsevier, New York, 1987).
S. T. Zalesak, Journal of Computational Physics 31 (1979) 335.
J. P. Boris and D. L. Book, in Methods of Computational Physics 16, ed. J. Killeen (Academic Press, New York, 1976), p. 85.
J. P. Boris and D. L. Book, Journal of Computational Physics 20 (1976) 397.
D. L. Book, J. P. Boris, and K. Hain, Journal of Computational Physics 18 (1975) 248.
J. P. Boris and D. L. Book, Journal of Computational Physics 11 (1973) 38.
Applications:
C. R. DeVore, Astrophysical Journal (2000), submitted.
C. R. DeVore and S. K. Antiochos, Astrophysical Journal (2000), submitted.
S. K. Antiochos and C. R. DeVore, in Magnetic Helicity in Space and Laboratory Plasmas, ed. M. R. Brown, R. C. Canfield, and A. A. Pevtsov (AGU Press, Washington, 1999), p. 187.
S. K. Antiochos and C. R. DeVore, in Sun-Earth Plasma Connections, ed. J. L. Burch, R. L. Carovillano, and S. K. Antiochos (AGU Press, Washington, 1999), p. 113.
S. K. Antiochos, C. R. DeVore, and J. A. Klimchuk, Astrophysical Journal 510 (1999) 485.
J. T. Karpen, S. K. Antiochos, C. R. DeVore, and l. Golub, Astrophysical Journal 495 (1998) 491.
S. Parhi, P. DeBruyne, K. Murawski, M. Goossens, and C. R. DeVore, Solar Physics 167 (1996) 181.
J. T. Karpen, S. K. Antiochos, and C. R. DeVore, Astrophysical Journal 460 (1996) L73.
K. Murawski, C. R. DeVore, S. Parhi, and M. Goossens, Planetary and Space Sciences 44 (1996) 253.
J. T. Karpen, S. K. Antiochos, and C. R. DeVore, Astrophysical Journal 450 (1995) 422.
J. T. Karpen, S. K. Antiochos, and C. R. DeVore, Astrophysical Journal 382 (1991) 327.
J. T. Karpen, S. K. Antiochos, and C. R. DeVore, Astrophysical Journal 356 (1990) L67.
The original version of this code was written by Peter MacNeice, who added the calls to the PARAMESH amr package to the predecessor FCTMHD3D code. This contribution, and his willingness to answer many questions about the operations of and interfaces to the PARAMESH package were invaluable and are greatly appreciated.
The PARAMESH package was written by Peter MacNeice, Kevin Olson, Clark Mobarry, Rosalinda de Fainchtein, and Charles Packer at NASA/GSFC.
The FCT algorithms were devised by C. Richard DeVore. Discussions with Jay P. Boris, David L. Book, Steven T. Zalesak, John H. Gardner, and Gopal Patnaik on the computational aspects of FCT and with Judith T. Karpen and Spiro K. Antiochos on the applications of the codes are gratefully acknowledged. Thomas Clune of Cray Research provided many useful suggestions and a substantial rewrite of the lcpfct3 routine for the T3E.
This work was supported by the NASA HPCC/ESS program under Cooperative Agreement Notice 21425/041.