Allowable Job Sizes
Introduction
Job Size (RAM)
Understanding the feasibility of a simulation of the desired size is important: it allows the user to avoid overly long runs, avoid memory allocation errors, or rework the simulation so that it becomes feasible. Options for the latter include reducing the size of the domain, reducing the number of tracer variables, or even casting the simulation into a different reference frame, such as a shock front rest frame.
It is more complicated to calculate the expected cost for an AMR simulation than fixed-grid, so we discuss fixed-grid first.
Wikipedia can introduce you to the concept of double precision, which is how AstroBEAR represents its state variables (density, momentum, energy).
Each double precision number is represented in 8 bytes. Thus, you can easily calculate how much memory M_{tot} a fixed-grid simulation requires by simply adding up all the numbers:

M_{tot} = 8 \,\mathrm{bytes} \times N_{cells} \times (N_{qvars} + N_{auxvars})

where N_{auxvars} is nonzero only with MHD: 5 extra variables in 2D or 12 extra in 3D. N_{qvars} counts the state variables—in hydro there are 4 in 2D & 2.5D and 5 in both 2.5D+angular momentum and 3D; in MHD there are always 8—as well as any tracer variables.
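As a quick sketch of this bookkeeping (the function name here is illustrative, not part of AstroBEAR):

```python
# Minimal sketch of the fixed-grid memory estimate above:
# M_tot = 8 bytes * N_cells * (N_qvars + N_auxvars).
def fixed_grid_memory_bytes(cells_per_dim, ndim, n_qvars, n_auxvars=0):
    """Total bytes for a fixed grid of doubles, per the formula above."""
    n_cells = cells_per_dim ** ndim
    return 8 * n_cells * (n_qvars + n_auxvars)

# 3D MHD at 256^3: 8 state variables + 12 aux fields = 20 doubles per cell
mem = fixed_grid_memory_bytes(256, 3, n_qvars=8, n_auxvars=12)
print(mem / 2**30, "GB")  # 2.5 GB, matching the table below
```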
Luckily, AstroBEAR takes advantage of distributed memory: the more compute nodes that are allocated, the more total memory it can access. In practice, allocating more nodes is the default remedy when a simulation is very large.
Examples of RAM Requirements by Problem Size
In the table below, each row pairs a 2D resolution with a 3D resolution of roughly the same cell count.
Number of variables | 1 (1D advection) | 4 (2D hydro, min.) | 5 (3D hydro, min.) | 8 (MHD, min.) | 13 (2D MHD+aux) | 20 (3D MHD+aux) |
---|---|---|---|---|---|---|
16^{2} : 6^{3} | 2 KB | 8 KB | 10 KB | 16 KB | 26 KB | 40 KB |
32^{2} : 10^{3} | 8 KB | 32 KB | 40 KB | 64 KB | 104 KB | 160 KB |
64^{2} : 16^{3} | 32 KB | 128 KB | 160 KB | 256 KB | 416 KB | 640 KB |
128^{2} : 25^{3} | 128 KB | 512 KB | 640 KB | 1024 KB | 1.63 MB | 2.5 MB |
256^{2} : 40^{3} | 512 KB | 2 MB | 2.5 MB | 4 MB | 6.5 MB | 10 MB |
512^{2} : 64^{3} | 2 MB | 8 MB | 10 MB | 16 MB | 26 MB | 40 MB |
1,024^{2} : 102^{3} | 8 MB | 32 MB | 40 MB | 64 MB | 104 MB | 160 MB |
2,048^{2} : 161^{3} | 32 MB | 128 MB | 160 MB | 256 MB | 416 MB | 640 MB |
4,096^{2} : 256^{3} | 128 MB | 512 MB | 640 MB | 1024 MB | 1.63 GB | 2.5 GB |
8,192^{2} : 406^{3} | 512 MB | 2 GB | 2.5 GB | 4 GB | 6.5 GB | 10 GB |
16,384^{2} : 645^{3} | 2 GB | 8 GB | 10 GB | 16 GB | 26 GB | 40 GB |
32,768^{2} : 1,024^{3} | 8 GB | 32 GB | 40 GB | 64 GB | 104 GB | 160 GB |
65,536^{2} : 1,625^{3} | 32 GB | 128 GB | 160 GB | 256 GB | 416 GB | 640 GB |
131,072^{2} : 2,580^{3} | 128 GB | 512 GB | 640 GB | 1024 GB | 1.63 TB | 2.5 TB |
262,144^{2} : 4,096^{3} | 512 GB | 2 TB | 2.5 TB | 4 TB | 6.5 TB | 10 TB |
524,288^{2} : 6,502^{3} | 2 TB | 8 TB | 10 TB | 16 TB | 26 TB | 40 TB |
1,048,576^{2} : 10,321^{3} | 8 TB | 32 TB | 40 TB | 64 TB | 104 TB | 160 TB |
2,097,152^{2} : 16,384^{3} | 32 TB | 128 TB | 160 TB | 256 TB | 416 TB | 640 TB |
4,194,304^{2} : 26,008^{3} | 128 TB | 512 TB | 640 TB | 1024 TB | 1.63 PB | 2.5 PB |
8,388,608^{2} : 41,285^{3} | 512 TB | 2 PB | 2.5 PB | 4 PB | 6.5 PB | 10 PB |
16,777,216^{2} : 65,536^{3} | 2 PB | 8 PB | 10 PB | 16 PB | 26 PB | 40 PB |
Job Length (Time)
The length of time a simulation will take to run is based on its dynamics. In particular, when a user specifies an end time they are implicitly specifying a certain number of crossing times, or number of times that information could cross the domain.
In the simple case of hydro with no elliptic components, the crossing time is a function of the maximum acoustic wave speed v_{cs, max} in the grid, where

v_{cs} = \sqrt{\gamma p / \rho}

and p is the pressure, \gamma is the adiabatic index (ratio of specific heats), and \rho the mass density.
This maximum speed (and its equivalents in MHD or with elliptic terms) determines the largest timestep the code can take, bounded by the CFL number C:

\Delta t \leq C \, \Delta x / v_{cs, max}

Thus, one can estimate the number of timesteps needed to reach the end of the simulation via

N_{steps} \approx t_{final} / \Delta t = t_{final} \, v_{cs, max} / (C \, \Delta x),
meaning that the wall-clock time to finish the simulation is then approximately

T_{wall} \approx N_{steps, remaining} \times dT_{now}

where dT_{now} is the wall-clock time it took to complete the present timestep.
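These estimates can be sketched in a few lines of Python; the function names and example numbers are illustrative, not AstroBEAR internals:

```python
import math

# Sketch of the run-length estimate above, under the hydro/no-elliptic
# assumptions stated in the text.

def sound_speed(gamma, pressure, rho):
    """Adiabatic sound speed, c_s = sqrt(gamma * p / rho)."""
    return math.sqrt(gamma * pressure / rho)

def estimated_steps(t_final, dx, v_max, cfl=0.5):
    """N_steps ~ t_final / dt, with the CFL bound dt = cfl * dx / v_max."""
    dt = cfl * dx / v_max
    return t_final / dt

def estimated_walltime(steps_remaining, dt_now):
    """T_wall ~ (steps remaining) * (wall-clock cost of the current step)."""
    return steps_remaining * dt_now

# Example: a 512-cell-wide unit domain, one crossing time, gamma = 5/3
v = sound_speed(5.0 / 3.0, 1.0, 1.0)
n = estimated_steps(t_final=1.0, dx=1.0 / 512, v_max=v)
print(round(n), "steps")  # roughly 1300 steps
```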
In practice, you're best off treating any such number as a rough guess: the maximum wave speed may vary wildly over the course of a simulation, so these approximations should be taken with a grain of salt.
In practice, you can quickly get an idea of how long your simulation will take by looking at the time between data output frames. In most cases, there will be many individual timesteps between data frames, so short-term differences in timestep length are smoothed out. This gives a fairly good indication of whether the time between frames will decrease, increase, or stay the same.
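As a concrete sketch of such an extrapolation (the timestamps here are made up; use the wall-clock times at which your own run's frames appeared):

```python
# Hypothetical wall-clock times (seconds) at which data frames were written.
frame_times = [0.0, 120.0, 250.0, 390.0, 545.0]
gaps = [b - a for a, b in zip(frame_times, frame_times[1:])]

# If the gaps are growing, the remaining frames will likely be slower still;
# as a crude lower bound, scale the latest gap by the frames left to write.
frames_remaining = 95
estimate = gaps[-1] * frames_remaining
print(f"at least ~{estimate / 3600:.1f} hours remaining")
```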
Maximum Job Sizes by Machine, Fixed-Grid
RAM is most often given in GB.
Machine | cores/node | RAM/node | RAM/core (max.) | RAM/core (min.) |
---|---|---|---|---|
Bluehive, general queue | 8 | 16GB | 16GB | 2GB |
Bluehive, Afrank queue^{a} | 8 | 12GB/8GB | 12GB/8GB | 1.5GB/1GB |
Bluegene^{b} | 64 | 128GB | 2GB | 0.5GB |
Notes: ^{a)} As of 11/2010, there are 6 afrank nodes: 3 with 12GB RAM and 2 with 8GB RAM. ^{b)} Bluegene/p actually has quad-core nodes with 2GB RAM each, but nodes can only be requested in packs of 64, so each pack can be treated as a single "node" with 64*2GB = 128GB RAM. You can, however, run on 1–4 of the cores per node, giving 2GB RAM/core (1 core/node) down to 0.5GB RAM/core (4 cores/node).
Maximum Job Sizes on Bluehive
While one could take the table above and compute theoretical upper limits, in practice there will be other demands on RAM. The table below therefore assumes only 80% of the theoretical maximum number of cells is usable.
In practice, the limits with AstroBEAR are substantially lower than even those given here.
Number of variables | 1 (1D advection) | 4 (2D hydro, min.) | 5 (3D hydro, min.) | 8 (MHD, min.) | 13 (2D MHD+aux) | 20 (3D MHD+aux) |
---|---|---|---|---|---|---|
1-node (8 proc) | 1198^{3} | 754^{3} | 700^{3} | 599^{3} | 509^{3} | 441^{3} |
4-node (32 proc) | 1901^{3} | 1198^{3} | 1112^{3} | 951^{3} | 809^{3} | 700^{3} |
8-node (64 proc) | 2395^{3} | 1509^{3} | 1401^{3} | 1198^{3} | 1019^{3} | 882^{3} |
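The entries above can be reproduced with a short script; the 80% factor and 8 bytes per double are as described in the text, and the function name is ours, not AstroBEAR's:

```python
# Largest cube that fits in 80% of the available RAM at 8 bytes per double.
def max_cube_side(ram_bytes, n_vars, usable_fraction=0.8):
    n_cells = usable_fraction * ram_bytes / (8.0 * n_vars)
    return int(n_cells ** (1.0 / 3.0))  # truncate to whole cells per side

# One Bluehive node (16GB) with 20 variables (3D MHD + aux):
print(max_cube_side(16 * 2**30, 20))  # truncates to 441, matching the table
```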
Maximum Job Sizes on Bluegene/p
This table also includes the 80% factor.
Number of variables | 1 (1D advection) | 4 (2D hydro, min.) | 5 (3D hydro, min.) | 8 (MHD, min.) | 13 (2D MHD+aux) | 20 (3D MHD+aux) |
---|---|---|---|---|---|---|
1-"node" (64 nodes/256 procs) | 2395^{3} | 1509^{3} | 1401^{3} | 1198^{3} | 1019^{3} | 882^{3} |
2-"node" | 3018^{3} | 1901^{3} | 1765^{3} | 1509^{3} | 1284^{3} | 1112^{3} |
8-"node" (half-machine) | 4791^{3} | 3018^{3} | 2802^{3} | 2395^{3} | 2037^{3} | 1765^{3} |
16-"node" (all-machine) | 6036^{3} | 3802^{3} | 3530^{3} | 3018^{3} | 2567^{3} | 2224^{3} |
Maximum Job Sizes by Machine, AMR
The case with AMR is complicated by the fact that multiple grids on different levels may represent the same physical space in the domain. One therefore may not know ahead of time exactly how many cells the simulation will contain at a given time.
(table coming soon?)
Maximum Job Length by Machine
See the CRC's (or your own institution's) webpages for the maximum allowable job length on each machine. If the simulation is expected to run longer than this limit, the user must of course ensure that at least one data frame can be written within it.
Ideally, the user would schedule data frames so that the last one is written only shortly before the job is killed, so as not to waste computation time.