wiki:Tutorials/JobSizes


Allowable Job Sizes

Introduction

Job Size (RAM)

Understanding the feasibility of the desired size of a simulation is important. Doing so allows the user to avoid overly long simulations and memory allocation errors, or to rework the simulation so that it becomes feasible. Options for reworking include reducing the size of the domain, reducing the number of tracer variables, or even casting the simulation into a different reference frame, such as a shock front rest frame.

It is more complicated to calculate the expected cost for an AMR simulation than fixed-grid, so we discuss fixed-grid first.

Wikipedia can introduce you to the concept of double precision, which is how AstroBEAR represents its state variables (density, momentum, energy).

Each double precision number occupies 8 bytes. Thus, you can easily calculate how much memory M_tot a fixed-grid simulation requires by adding up all the numbers stored:

M_tot = (m_x * m_y * m_z) * (N_qvars + N_auxvars) * 8 bytes

where m_x, m_y, m_z are the number of zones in each direction, N_qvars is the number of variables in the q-array (that is, the state variables of the fluid), and N_auxvars is the number of "auxiliary" variables, that is, variables initialized for MHD simulations. The q-array contains the cell-centered quantities, and the aux-array contains face- and edge-centered quantities. Each cell carries a q-array (plus an aux-array in MHD), so the total number of cells, times the total number of variables, times 8 bytes gives the memory you expect to allocate for the physical domain (not including ghost cells or any other redundant storage).

We find that the total memory calculated this way is usually off by a factor of about 10 from what AstroBEAR actually uses (as reported in the standard out, on the global info allocations line). This is likely due to ghost cells plus other redundant allocations. Therefore, for all memory estimates, multiply your number by 10 to get a truer estimate.

For hydro simulations, N_auxvars = 0, so one need only count up N_qvars. In 2D there are 4 hydro q-variables: density, x-momentum, y-momentum, and energy. There are also 4 in 2.5D. In 3D there are 5: density, x-momentum, y-momentum, z-momentum, and energy. There are also 5 in 2.5D with angular momentum.

The situation is a little different for MHD simulations, where all three components of the momentum are coupled to the equations. N_qvars therefore contains all 8 state variables regardless of the dimension of the simulation: density, the 3 momenta, E, Bx, By, and Bz. In addition to these 8 q-variables, MHD initializes further variables, called the auxiliary variables: the B-fields on the faces of the cells and the EMFs on the edges of the cells. We also store parent-child EMFs. Therefore in 2D, N_auxvars = 5: Bx, By, Ez, Ez_child, and Ez_parent. In 3D, N_auxvars = 12: Bx, By, Bz, Ex, Ey, Ez, Ex_child, Ey_child, Ez_child, Ex_parent, Ey_parent, and Ez_parent.

For both hydro and MHD simulations, you must also add the number of tracers you are using to the total variable count, as tracers are stored in the q-array.
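This bookkeeping is easy to script. Below is a rough sketch in Python (not part of AstroBEAR; the function names and arguments are made up here for illustration) that applies the cells x variables x 8 bytes rule, together with the empirical overhead factor of ~10 discussed above:

{{{#!python
# Rough fixed-grid memory estimate following the counting rules above.
# Illustrative only -- not AstroBEAR's internal accounting; all names are made up.

def nqvars(ndim, mhd=False, angular_momentum=False, ntracers=0):
    """Number of cell-centered (q-array) variables."""
    if mhd:                          # density, 3 momenta, energy, Bx, By, Bz
        n = 8
    else:                            # density, momenta, energy
        n = ndim + 2
        if angular_momentum:         # 2.5D with angular momentum also carries z-momentum
            n += 1
    return n + ntracers

def nauxvars(ndim, mhd=False):
    """Number of face/edge-centered (aux-array) variables."""
    if not mhd:
        return 0
    return 5 if ndim == 2 else 12    # see the counts given in the text above

def fixed_grid_memory_bytes(mx, my, mz, ndim=3, mhd=False, ntracers=0, overhead=10.0):
    """Estimate RAM in bytes; 'overhead' is the empirical factor of ~10 for
    ghost zones and other redundant allocations."""
    nvars = nqvars(ndim, mhd, ntracers=ntracers) + nauxvars(ndim, mhd)
    return 8.0 * mx * my * mz * nvars * overhead

# Example: a 256^3 3D MHD run with 2 tracers
print(fixed_grid_memory_bytes(256, 256, 256, ndim=3, mhd=True, ntracers=2) / 2**30, "GiB")
}}}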

Luckily, AstroBEAR takes advantage of distributed memory, meaning that the more compute nodes are allocated, the more memory it can access. In practice, running across many nodes is the standard approach when a simulation is very large.

Examples of RAM Requirements by Problem Size

Each row pairs a 2D resolution with a 3D resolution of roughly the same total cell count (e.g. 4,096^2 is about 256^3 cells).

| Domain size (2D : 3D) | 1 var (1D advection) | 4 vars (2D hydro, min.) | 5 vars (3D hydro, min.) | 8 vars (MHD, min.) | 13 vars (2D MHD+aux) | 20 vars (3D MHD+aux) |
|---|---|---|---|---|---|---|
| 16^2 : 6^3 | 2 KB | 8 KB | 10 KB | 16 KB | 26 KB | 40 KB |
| 32^2 : 10^3 | 8 KB | 32 KB | 40 KB | 64 KB | 104 KB | 160 KB |
| 64^2 : 16^3 | 32 KB | 128 KB | 160 KB | 256 KB | 416 KB | 640 KB |
| 128^2 : 25^3 | 128 KB | 512 KB | 640 KB | 1024 KB | 1.63 MB | 2.5 MB |
| 256^2 : 40^3 | 512 KB | 2 MB | 2.5 MB | 4 MB | 6.5 MB | 10 MB |
| 512^2 : 64^3 | 2 MB | 8 MB | 10 MB | 16 MB | 26 MB | 40 MB |
| 1,024^2 : 102^3 | 8 MB | 32 MB | 40 MB | 64 MB | 104 MB | 160 MB |
| 2,048^2 : 161^3 | 32 MB | 128 MB | 160 MB | 256 MB | 416 MB | 640 MB |
| 4,096^2 : 256^3 | 128 MB | 512 MB | 640 MB | 1024 MB | 1.63 GB | 2.5 GB |
| 8,192^2 : 406^3 | 512 MB | 2 GB | 2.5 GB | 4 GB | 6.5 GB | 10 GB |
| 16,384^2 : 645^3 | 2 GB | 8 GB | 10 GB | 16 GB | 26 GB | 40 GB |
| 32,768^2 : 1,024^3 | 8 GB | 32 GB | 40 GB | 64 GB | 104 GB | 160 GB |
| 65,536^2 : 1,625^3 | 32 GB | 128 GB | 160 GB | 256 GB | 416 GB | 640 GB |
| 131,072^2 : 2,580^3 | 128 GB | 512 GB | 640 GB | 1024 GB | 1.63 TB | 2.5 TB |
| 262,144^2 : 4,096^3 | 512 GB | 2 TB | 2.5 TB | 4 TB | 6.5 TB | 10 TB |
| 524,288^2 : 6,502^3 | 2 TB | 8 TB | 10 TB | 16 TB | 26 TB | 40 TB |
| 1,048,576^2 : 10,321^3 | 8 TB | 32 TB | 40 TB | 64 TB | 104 TB | 160 TB |
| 2,097,152^2 : 16,384^3 | 32 TB | 128 TB | 160 TB | 256 TB | 416 TB | 640 TB |
| 4,194,304^2 : 26,008^3 | 128 TB | 512 TB | 640 TB | 1024 TB | 1.63 PB | 2.5 PB |
| 8,388,608^2 : 41,285^3 | 512 TB | 2 PB | 2.5 PB | 4 PB | 6.5 PB | 10 PB |
| 16,777,216^2 : 65,536^3 | 2 PB | 8 PB | 10 PB | 16 PB | 26 PB | 40 PB |

Job Length (Time)

The length of time a simulation will take to run is based on its dynamics. In particular, when a user specifies an end time they are implicitly specifying a certain number of crossing times, or number of times that information could cross the domain.

In the simple case of hydro with no elliptic components, the crossing time is set by the maximum acoustic wave speed v_cs,max in the grid,

t_cross = L / v_cs,max,   with   v_cs,max = max(|v| + c_s)   and   c_s = sqrt(gamma * p / rho)

where L is the length of the domain, p is the pressure, gamma is the adiabatic index (ratio of specific heats), and rho the mass density.
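As a quick illustration, here is a small Python sketch of these two quantities (the names and arguments are illustrative only; L is the domain length and p, rho, v_max describe the fluid state, all in consistent units):

{{{#!python
import math

def sound_speed(p, rho, gamma=5.0/3.0):
    """Adiabatic sound speed c_s = sqrt(gamma * p / rho)."""
    return math.sqrt(gamma * p / rho)

def crossing_time(L, v_max, p, rho, gamma=5.0/3.0):
    """Time for the fastest acoustic signal, |v| + c_s, to cross a domain of length L."""
    return L / (abs(v_max) + sound_speed(p, rho, gamma))
}}}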

This maximum speed (and its equivalents in MHD or with elliptic terms) determines the largest timestep that the code can take, bounded by the CFL number C:

dt <= C * dx / v_cs,max

where dx is the zone size.

Thus, one can estimate the number of timesteps needed to reach the end of the simulation via

N_steps ~ t_final / dt

meaning that the wall-clock time to finish the simulation is then approximately

T_wall ~ N_steps * dT_now

where dT_now is the wall-clock time it took to finish the present timestep.
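Putting the pieces together, a back-of-the-envelope estimator might look like the following sketch (purely illustrative; C is the CFL number and the argument names are made up here):

{{{#!python
def cfl_timestep(dx, v_cs_max, C=0.5):
    """Largest allowed timestep, dt <= C * dx / v_cs_max."""
    return C * dx / v_cs_max

def walltime_estimate(t_final, t_now, dt_now, seconds_per_step):
    """Remaining wall-clock time, assuming the current timestep dt_now and
    the current cost per step both stay roughly constant."""
    nsteps_remaining = (t_final - t_now) / dt_now
    return nsteps_remaining * seconds_per_step
}}}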


In practice these are only rough guesses. The maximum wave speed may vary wildly over the course of a simulation, so such approximations should be taken with a grain of salt.

In practice, you can quickly get an idea of how long your simulation will take by looking at the time between data output frames. In most cases there will be many individual timesteps between data frames, so short-term differences in timestep length are smoothed out. This gives a fairly good indication of whether the time between frames is likely to decrease, increase, or stay the same.
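For example, a simple extrapolation from the frames written so far (a sketch only; it assumes the wall-clock time per frame stays roughly constant):

{{{#!python
def projected_walltime(frames_done, frames_total, elapsed_seconds):
    """Extrapolate the total wall-clock time from the frames written so far."""
    return elapsed_seconds * frames_total / frames_done

# e.g. 10 of 100 frames written after 2 hours -> roughly 20 hours in total
print(projected_walltime(10, 100, 2 * 3600) / 3600.0, "hours")
}}}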

Maximum Job Sizes by Machine, Fixed-Grid

RAM is most often given in GB.

| Machine | cores/node | RAM/node | RAM/core (max.) | RAM/core (min.) |
|---|---|---|---|---|
| Bluehive, general queue | 8 | 16 GB | 16 GB | 2 GB |
| Bluehive, Afrank queue (a) | 8 | 12 GB / 8 GB | 12 GB / 8 GB | 1.5 GB / 1 GB |
| Bluegene (b) | 64 | 128 GB | 2 GB | 0.5 GB |

Notes: (a) As of 11/2010, there are 6 afrank nodes: 3 with 12 GB RAM and 2 with 8 GB RAM. (b) bg/p actually has quad-core nodes with 2 GB RAM each, but nodes can only be requested in packs of 64, so those 64 individual nodes can be treated as a single "node" with 64 * 2 GB = 128 GB RAM. You can, however, use 1-4 of those cores per node, giving 2 GB RAM/core (1 core/node) down to 0.5 GB RAM/core (4 cores/node).

Maximum Job Sizes on Bluehive

While one could take the above table and compute theoretical upper limits, in practice there will be other demands on RAM. The table below therefore assumes that only 80% of the theoretical number of cells is usable.

In practice, with AstroBEAR the limits are substantially less than even those given here.

| Configuration | 1 var (1D advection) | 4 vars (2D hydro, min.) | 5 vars (3D hydro, min.) | 8 vars (MHD, min.) | 13 vars (2D MHD+aux) | 20 vars (3D MHD+aux) |
|---|---|---|---|---|---|---|
| 1 node (8 procs) | 1198^3 | 754^3 | 700^3 | 599^3 | 509^3 | 441^3 |
| 4 nodes (32 procs) | 1901^3 | 1198^3 | 1112^3 | 951^3 | 809^3 | 700^3 |
| 8 nodes (64 procs) | 2395^3 | 1509^3 | 1401^3 | 1198^3 | 1019^3 | 882^3 |
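These entries follow from inverting the memory formula from the previous section. The sketch below is illustrative only; it assumes 8 bytes per variable and the 80% usable-RAM factor and, like the tables here, ignores the additional factor of ~10 overhead. It reproduces, for example, the 8-node MHD entry:

{{{#!python
def max_cube_size(ram_bytes, nvars, usable_fraction=0.8):
    """Largest N such that an N^3 fixed grid with 'nvars' doubles per cell
    fits in 'usable_fraction' of the available RAM."""
    ncells = usable_fraction * ram_bytes / (8.0 * nvars)
    return int(ncells ** (1.0 / 3.0))

# 8 Bluehive nodes at 16 GB each, 3D MHD with aux variables (20 doubles/cell)
print(max_cube_size(8 * 16 * 2**30, nvars=20))   # prints 882, cf. the 882^3 entry above
}}}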

Maximum Job Sizes on Bluegene/p

This table also includes the 80% factor.

| Configuration | 1 var (1D advection) | 4 vars (2D hydro, min.) | 5 vars (3D hydro, min.) | 8 vars (MHD, min.) | 13 vars (2D MHD+aux) | 20 vars (3D MHD+aux) |
|---|---|---|---|---|---|---|
| 1 "node" (64 nodes / 256 procs) | 2395^3 | 1509^3 | 1401^3 | 1198^3 | 1019^3 | 882^3 |
| 2 "nodes" | 3018^3 | 1901^3 | 1765^3 | 1509^3 | 1284^3 | 1112^3 |
| 8 "nodes" (half machine) | 4791^3 | 3018^3 | 2802^3 | 2395^3 | 2037^3 | 1765^3 |
| 16 "nodes" (full machine) | 6036^3 | 3802^3 | 3530^3 | 3018^3 | 2567^3 | 2224^3 |


Maximum Job Sizes by Machine, AMR

The case with AMR is complicated by the fact that multiple grids on different levels may represent the same physical space in the domain. One therefore may not know ahead of time exactly how many cells the simulation will contain at a given time.
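A common back-of-the-envelope approach is to guess a "filling fraction" for each refined level, i.e. the fraction of its parent level that ends up refined. The sketch below is purely illustrative (the filling fractions are guesses, not something known ahead of time):

{{{#!python
def amr_cell_estimate(base_cells, ndim, refinement_ratio, filling_fractions):
    """Rough total cell count for an AMR run: the base grid plus each refined
    level, assuming a guessed filling fraction is refined at every level."""
    total = level_cells = base_cells
    for f in filling_fractions:                  # one entry per refined level
        level_cells *= f * refinement_ratio ** ndim
        total += level_cells
    return total

# Example: 128^3 base grid, refinement ratio 2, three levels each refining ~12% of its parent
print(amr_cell_estimate(128**3, 3, 2, [0.12, 0.12, 0.12]))
}}}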

(table coming soon?)


Maximum Job Length by Machine

See the CRC's (or your own institution's) webpages for the maximum allowable job lengths by machine. In cases where the simulation is expected to run longer than this, the user must of course ensure that at least one data frame can be written in this time.

Ideally, the user would set up data frames so that the last ones are created only shortly before the job is killed (so as to not waste computation time).
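As a quick sanity check, you can estimate how many frames fit inside a walltime limit (a sketch; the per-frame cost is whatever your problem and machine actually give you):

{{{#!python
def frames_before_walltime(walltime_limit_s, seconds_per_frame, safety=0.9):
    """How many frames fit within a queue's walltime limit, with a safety
    margin so the last frame finishes before the job is killed."""
    return int(safety * walltime_limit_s / seconds_per_frame)

# Example: 24-hour limit, ~50 minutes of wall-clock time per frame
print(frames_before_walltime(24 * 3600, 50 * 60))   # -> 25 frames
}}}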
