Changes between Version 25 and Version 26 of Scrambler


Timestamp: 05/17/11 10:47:54 (14 years ago)
Author: Brandon Shroyer
  • Scrambler

    v25 v26  
    4  4  The growing size of scientific simulations can no longer be accommodated simply by increasing the number of nodes in a cluster.  Completing larger jobs without increasing the wall time requires a decrease in the workload per processor (i.e., increased parallelism).  Unfortunately, increased parallelism often leads to increased communication time.  Minimizing the cost of this communication requires efficient parallel algorithms to manage the distributed AMR structure and calculations.
    5  5
    6  -  AstroBEAR's strength lies in its distributed tree structure.  Many AMR codes replicate the entire AMR tree on each computational node.  This approach incurs a heavy communication cost as the tree continuously broadcasts structural changes to all processors.  AstroBEAR, on the other hand, only keeps as much tree information as the local grids need to communicate with processors containing nearby grid regions.  This approach saves us memory usage as well as communication time, leaving us well-positioned to take advantage of low-memory architectures such as !BlueGene systems and GPUs.
    -  6  AstroBEAR's strength lies in its distributed tree structure.  Many AMR codes replicate the entire AMR tree on each computational node.  This approach incurs a heavy communication cost as the tree continuously broadcasts structural changes to all processors.  AstroBEAR, on the other hand, only keeps as much tree information as the local grids need to communicate with processors containing nearby grid regions.  This approach saves us memory usage as well as communication time, leaving us well-positioned to take advantage of massively parallel low-memory architectures such as [http://en.wikipedia.org/wiki/Blue_Gene BlueGene] systems and [http://www.nvidia.com/object/GPU_Computing.html GPUs].
    7  7
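The local-view idea described above can be sketched as follows.  This is our own illustrative Python (AstroBEAR itself is Fortran); `Grid`, `local_view`, and the 2-D bounds are hypothetical names, not AstroBEAR's actual data structures:

```python
class Grid:
    def __init__(self, gid, owner, bounds):
        self.gid = gid        # global grid id
        self.owner = owner    # rank that holds this grid's data
        self.bounds = bounds  # (xlo, xhi, ylo, yhi) region covered

def local_view(all_grids, rank):
    """Keep only the grids this rank owns, plus remote grids whose
    regions touch them.

    A replicated-tree code would store all_grids on every rank; here
    each rank keeps just what its own grids need for communication."""
    mine = [g for g in all_grids if g.owner == rank]

    def touches(a, b):
        # Shared edges count as touching (needed for ghost-zone exchange).
        return not (a.bounds[1] < b.bounds[0] or b.bounds[1] < a.bounds[0] or
                    a.bounds[3] < b.bounds[2] or b.bounds[3] < a.bounds[2])

    nearby = [g for g in all_grids
              if g.owner != rank and any(touches(g, m) for m in mine)]
    return mine + nearby
```

Under this sketch a rank's memory footprint scales with the grids it actually communicates with, rather than with the global tree.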
    8  -  In our distributed tree system, processors can have children and parents in much the same way that grids do.  All of a processor's tree information--grids, overlaps, neighbors, and processors--comes from the parent processor.  Thus, each processor only needs to communicate with its parent processor and the processors whose data directly interact with it.
    -  8  In our distributed tree system, processors can have children and parents in much the same way that grids do.  All of a processor's tree information--grids, overlaps, neighbors, and processors--comes from the parent processor.  Thus, each processor only needs to communicate with its parent processor and the processors whose data directly interact with it.  Grids are distributed among processors via a [http://en.wikipedia.org/wiki/Hilbert_curve Hilbert ordering], allowing AstroBEAR to take full advantage of the cluster's topology.
    9  9
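The Hilbert ordering mentioned above can be illustrated with the standard 2-D Hilbert-index computation: grids sorted by their position along the curve stay spatially clustered, so contiguous chunks of the sorted list make good per-processor assignments.  This is a 2-D Python sketch (AstroBEAR works in Fortran and up to 3-D); `hilbert_index` and `distribute` are our hypothetical names:

```python
def hilbert_index(n, x, y):
    """Map (x, y) in an n x n grid (n a power of two) to its 1-D
    distance along the Hilbert curve (classic bit-twiddling form)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate the quadrant so the sub-curve is oriented consistently.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        s //= 2
    return d

def distribute(grid_coords, nprocs, n):
    """Sort grids along the curve, then hand out contiguous chunks.
    Neighboring chunks along the curve are neighbors in space, which
    keeps most communication between nearby ranks."""
    ordered = sorted(grid_coords, key=lambda p: hilbert_index(n, *p))
    chunk = -(-len(ordered) // nprocs)  # ceiling division
    return [ordered[i * chunk:(i + 1) * chunk] for i in range(nprocs)]
```

Consecutive Hilbert indices are always spatially adjacent cells, which is the locality property that makes this ordering topology-friendly.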
    10 10 Under this new system, a processor has four classes of processors with which it interacts:
     
    14 14  * ''Overlap'': As an AMR simulation evolves, grids are constantly being created and destroyed as the refined regions change.  An overlap processor is associated with grids from the previous step that overlap the local processor's current grids.
    15 15
    16 -
    -  16 Workload is managed on AstroBEAR through careful work scheduling and distributed load balance calculations.  AstroBEAR calculates a given level's workload for each processor based on the total workload for the level and the processor's remaining workload from the previous level.  This approach allows us to balance the workload globally rather than level by level.  As a result, AstroBEAR can handle simulations with extremely coarse base grids and many levels of AMR.
    17 17
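The carry-over balancing described above can be sketched as arithmetic: each rank's share of a level's new work is chosen so that its new share plus its leftover from the previous level evens out across ranks.  A minimal serial sketch (the real scheduler is distributed; `level_shares` is our hypothetical name):

```python
def level_shares(total_work, leftover):
    """Split a level's total workload across ranks so that each rank's
    new share plus its leftover from the previous level approaches a
    common target.  leftover[i] is rank i's unfinished prior work."""
    nprocs = len(leftover)
    target = (total_work + sum(leftover)) / nprocs  # ideal combined load
    shares = [max(0.0, target - l) for l in leftover]
    # Renormalize: ranks already over target get zero, so the remainder
    # is rescaled to hand out exactly total_work.
    s = sum(shares)
    scale = total_work / s if s else 0.0
    return [sh * scale for sh in shares]
```

Because the target folds in leftover work from earlier levels, the balance is global rather than recomputed from scratch level by level, which matters most when a coarse base level leaves a few ranks with residual work.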
    18 18 {{{