Scaling on Bluestreak
Intro:
All tests were done with Shear15, about ¼ of the way through the simulation. I am assuming that the other runs will scale similarly. Astrobear was built with hypre library 2.9.0 without global partition.
Method:
In global.data, I increased the number of frames and the restart frame by a factor of 10: the final frame went from 200 to 2000, and the restart file changed from chombo00050.hdf to chombo00500.hdf. Running this simulation then produces frames spaced at 1/10th of the original frame interval dt. From this I can estimate the actual run time per frame by multiplying the run time I get for a 1/10th dt frame by 10. This allows for faster scaling tests.
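The bookkeeping for this change can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above: the helper names `rescale_frames` and `chombo_name` are hypothetical, and the five-digit zero padding is inferred from the two file names quoted in the text.

```python
def rescale_frames(final_frame, restart_frame, factor=10):
    """Scale the final frame and the restart frame by the same factor."""
    return final_frame * factor, restart_frame * factor


def chombo_name(frame):
    """Restart file name, zero-padded to five digits as in chombo00050.hdf."""
    return f"chombo{frame:05d}.hdf"


# Values quoted in the text: final frame 200, restart frame 50.
new_final, new_restart = rescale_frames(200, 50)
print(new_final, chombo_name(new_restart))
```

With the values from the text this gives a final frame of 2000 and a restart file of chombo00500.hdf.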
Results:
Nodes | Start time | Frame times | Average run time (not including first frame) | Avg. time to write to file | Frames/hr |
32 | 3:09 | 3:43, 4:15, 4:49 | 33 min / 0.1 frame | 1.9 min | 0.19 |
128 | 3:09 | 3:29, 3:43, 3:57, 4:12 | 14.3 min / 0.1 frame | 3.7 min | 0.55 |
256 | 3:14 | 3:29, 3:42, 3:55, 4:09 | 13.3 min / 0.1 frame | 6.8 min | 0.83 |
512 | 5:17 | 5:34, 5:47, 6:00, 6:13 | 13 min / 0.1 frame | 9.4 min | 1.3 |
In the table, I do not include the first frame when averaging the run time as it seems unusually slow given the additional time to reload the grids upon restart.
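As a rough check of the averages in the table, the sketch below recomputes them from the listed frame times. It assumes the times are wall-clock stamps in H:MM form within the same half-day (no wrap past 12:59), and it reads the last 512-node entry as 6:13. The function names are hypothetical.

```python
def to_minutes(clock):
    """Convert an 'H:MM' clock string to minutes."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


def average_frame_time(frame_times):
    """Mean interval in minutes between consecutive frame dumps.

    The start-to-first-frame interval is never formed, so the slow
    first frame after restart is left out of the average.
    """
    stamps = [to_minutes(t) for t in frame_times]
    intervals = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    return sum(intervals) / len(intervals)


# Frame times copied from the table, keyed by node count.
runs = {
    32: ["3:43", "4:15", "4:49"],
    128: ["3:29", "3:43", "3:57", "4:12"],
    256: ["3:29", "3:42", "3:55", "4:09"],
    512: ["5:34", "5:47", "6:00", "6:13"],  # last entry read as 6:13
}

for nodes, times in runs.items():
    print(nodes, round(average_frame_time(times), 1))
```

The results agree with the tabulated averages of 33, 14.3, 13.3 and 13 minutes per 1/10 frame.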
Note: as I increase the number of nodes, the time to write to file increases as well. Therefore, to get a truer estimate of the run time for one frame, we need to remove the write time before multiplying by 10, and add it back only afterward (see next section).
Frames / hour calculation:
Start by calculating the frame rate R. Since my data is given in minutes, R naturally has units of min/frame. Remember to subtract the write time before multiplying by 10, since only the computation time should be scaled up to a full frame. The write time is added back afterward:
R = 10 × (t_run − t_write) + t_write
where t_run is the average run time per 1/10 frame and t_write is the average time to write to file, both in minutes as listed in the table.
This gives a frame rate in minutes per (full) frame. To convert to hours/frame, divide by 60, and to get frames/hour, invert the result.
32 Nodes Example
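An example of how to do this for the 32 node case is as follows:
R = 10 × (33 − 1.9) + 1.9 = 312.9 min/frame ≈ 5.2 hr/frame
To get frames/hour, take the inverse: 1/R ≈ 0.19 frames/hr. Note that this assumes the write time will be the same at the end of the normal dt time step.
Similarly, the 128, 256, and 512 node cases give 0.55, 0.83, and 1.3 frames/hr, respectively.
Note the scaling inefficiency: doubling the processors does not cut the run time in half.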
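The same calculation can be sketched in Python for all four node counts. The function name `frames_per_hour` is hypothetical; the inputs are the rounded averages and write times from the results table, so the output can differ from the tabulated frames/hr in the last digit.

```python
def frames_per_hour(run_min, write_min, factor=10):
    """Estimate full frames per hour from a 1/factor-frame timing.

    run_min   -- average wall time per short frame, including the write (min)
    write_min -- average time to write one frame to file (min)
    """
    compute_min = run_min - write_min                  # computation only
    full_frame_min = factor * compute_min + write_min  # one write per full frame
    return 60.0 / full_frame_min


# (average run time, write time) in minutes, from the results table.
timings = {32: (33.0, 1.9), 128: (14.3, 3.7), 256: (13.3, 6.8), 512: (13.0, 9.4)}

rates = {nodes: frames_per_hour(run, write) for nodes, (run, write) in timings.items()}
for nodes, rate in rates.items():
    print(nodes, round(rate, 2))
```

Like the hand calculation, this assumes the write time for a full frame equals the write time measured for a 1/10 frame.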
Choosing the best set of runs:
The question is, which combination gets me the most frames per hour? With 3 simulations to run on 512 nodes, the viable options are: 1) 1 job on 512 nodes, 2) 2 jobs on 256 nodes each, 3) 2 jobs on 128 nodes each and a 3rd on 256 nodes. Adding up the frames per hour over the jobs in each option gives 1.3, 1.66, and 1.93 frames/hr, respectively, so option 3) yields the most frames per hour.
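A minimal sketch of this comparison, using the frames/hr values from the results table. The option labels and variable names are only illustrative, and the total is the combined output of all jobs running at once, not the speed of any single simulation.

```python
# Frames per hour for one job at each node count, from the results table.
rate = {128: 0.55, 256: 0.83, 512: 1.3}

# Each option lists the node count of every job running at the same time.
options = {
    "1 job on 512 nodes": [512],
    "2 jobs on 256 nodes": [256, 256],
    "2 jobs on 128 nodes + 1 job on 256 nodes": [128, 128, 256],
}

totals = {name: sum(rate[n] for n in jobs) for name, jobs in options.items()}
best = max(totals, key=totals.get)

for name, total in totals.items():
    print(f"{name}: {total:.2f} frames/hr")
print("best:", best)
```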
Colliding Flows Run - Hydro, Shear Angle 15
- Extremely poor scaling on Bluestreak
- Memory errors with more nodes (remedied with different hypre library)
- Delicate balance between speeding up sims with more nodes, and encountering memory issues that kill the sims (remedied with different hypre library)
Nodes | Frames | Time to make 1 frame | Notes |
32 | 32-39 | 3 hours | wall time ran out |
256 | 39-48 | 1 hour | memory error at 48.4 |
512 | 48-49 | 1 hour | memory error at 49.5 |
32 | 49-50 | 4 hours | wall time ran out |
As the number of nodes increases, more patches are made, leading to more ghost zones. Thus, the global info as reported in standard out (which includes both physical and ghost zones) increases with the number of nodes.
The peak goes down, as the amount of info is distributed over more and more cores.
Percent efficiency as given in the standard out.
The time to write to file increases with the number of nodes. This can affect the scaling calculation (see http://astrobear.pas.rochester.edu/trac/astrobear/wiki/u/erica/ScalingBluestreak for details).
The memory error is reported in 2 places: 1) the end of astrobear.log, and 2) a 'core' file that is written to the run directory. As an example, I am attaching the memory error reports for Shear15 on 256 nodes.
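A small helper for collecting both pieces of evidence from a run directory might look like the sketch below. The function name `memory_error_report` is hypothetical, and matching core files with the pattern `core*` is an assumption about how they are named. The demo runs against a temporary directory with placeholder file contents, not real AstroBEAR output.

```python
import tempfile
from pathlib import Path


def memory_error_report(run_dir, tail_lines=20):
    """Return the last lines of astrobear.log and the names of any core files."""
    run_dir = Path(run_dir)
    log = run_dir / "astrobear.log"
    tail = []
    if log.is_file():
        tail = log.read_text(errors="replace").splitlines()[-tail_lines:]
    # Assumption: core files sit in the run directory and start with "core".
    cores = sorted(p.name for p in run_dir.glob("core*") if p.is_file())
    return tail, cores


# Demo on a throwaway directory; the log lines are placeholders only.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "astrobear.log").write_text("line 1\nline 2\nlast line\n")
    (Path(tmp) / "core").write_text("")
    demo_tail, demo_cores = memory_error_report(tmp, tail_lines=2)

print(demo_tail, demo_cores)
```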