In situ Visualization with Task-based Parallelism

(In Proceedings of In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization, Denver, CO, USA, November 12–17, 2017 (ISAV’17),

  • This short paper describes an experimental prototype of in situ visualization in a task-based parallel programming framework. A set of reusable visualization tasks were composed with an existing simulation. The visualization tasks include a local OpenGL renderer, a parallel image compositor, and a display task. These tasks were added to an existing fluid-particle-radiation simulation and weak scaling tests were run on up to 512 nodes of the Piz Daint supercomputer. Benchmarks showed that the visualization components scaled and did not reduce the simulation throughput. The compositor latency increased logarithmically with increasing node count.

Multipass Shader Partitioning by Dynamic Programming (Graphics Hardware 2005)

  • The multi pass shader partitioning problem was defined in a pair of papers by Chan et al [1,2]. The first successful solution to this problem was presented by Riffel et al [3]. The current paper presents another solution based on dynamic programming and argues that it is scalable [4]. After publication of the paper errors were found in the data presented to support this argument. Nonetheless the algorithm is semi-scalable as claimed. Dynamic Programming is commonly used for instruction selection and this problem is an instance of instruction selection. If scalability failed on this problem it would also fail on instruction selection.
[1] Proceedings of Graphics Hardware (2002)Efficient Partitioning of Fragment Shaders for Multipass Rendering on Programmable Graphics HardwareEric Chan, Ren Ng, Pradeep Sen, Kekoa Proudfoot, and Pat Hanrahan.
[2] Proceedings of Graphics Hardware (2004)Efficient Partitioning of Fragment Shaders for Multiple-Output HardwareTim Foley, Mike Houston and Pat Hanrahan.
[3] Proceedings of Graphics Hardware (2004). Mio: Fast Multipass Partitioning via Priority-Based Instruction Scheduling.  Andrew Riffel, Aaron E. Lefohn, Kiril Vidimce, Mark Leone, and John D. Owens.
[4] Proceedings of Graphics Hardware (2005).  Optimal Automatic Multi-pass Shader Partitioning by Dynamic Programming.  A. Heirich.  Click here for slides of the presentation.


Competitive Analysis of Load Balancing Strategies for Parallel Ray Tracing, with James Arvo [5].

  • This paper gives a fundamental formula for predicting workload imbalance as a result of static load balancing strategies like tiling and randomization in parallel ray tracing. The results can equally be applied to any parallel algorithm for graphics or image processing. It predicts that these strategies will fail at large numbers of computers, and for NTSC resolution images this was true at 128-way parallelism and above. The solution to this failure is dynamic load balancing such as the diffusion strategy in [6,7].

[5] The Journal of SupercomputingA Competitive Analysis of Load Balancing Strategies for Parallel Ray TracingA. Heirich and J. Arvo, vol. 12, no. 1/2, pp. 57-68 (1998).
[6] The International Journal for Foundations of Computer Science (1997).  A Scalable Diffusion Algorithm for Dynamic Mapping and Load Balancing on Networks of Arbitrary Topology.  A. Heirich, vol. 8, no. 3, September 1997, pp. 329-346.
[7] In proceedings of the International Conference on Parallel Processing (1995).  A Parabolic Load Balancing Method.  A. Heirich and S. Taylor, vol. III, pp. 192-202.  Winner of "outstanding paper of the year". With thanks to Andrew Conley.
Analysis of scalable algorithms for dynamic load balancing and mapping with application to photo-realistic rendering. (Dissertation)


Movie: Scalable Interactive Volume Rendering

Scalable Graphics Cluster Supercomputer Architecture

  • These papers describe work at HP/Compaq to build a commodity based scalable graphics architecture. The results include world record setting performance and scalability on volume rendering [10] and a commercial product the HP Scalable Visualization Array. A similar project was developed by Stoll et al [13] called Lightning-2. Although it was intended as a sort-last architecture Lightning-2 was not scalable and could not support volume rendering or applications that require ordered blending.

[9] Parallel Computing (2003).  Distributed Rendering of Interactive Soft ShadowsM. Isard, M. Shand and A. Heirich.  Parallel Computing, vol. 29, no. 3, March 2003, pp. 311-323.
[10] IEEE Visualization 2002.  Workshop on commodity-based visualization clusters (presentation October 27, 2002). 
Alpha/Depth Acquisition Through DVIA. Heirich, M. Shand, E. Oertli, G. Lupton and P. Ezolt.
[11] IEEE Parallel and Large-Data Visualization and Graphics Symposium (2001).  Scalable Interactive Volume Rendering Using Off-the-Shelf Components.  S. Lombeyda, L. Moll, M. Shand, D. Breen and A. Heirich.
[12] IEEE Parallel Visualization and Graphics Symposium (1999).  Scalable Distributed Visualization Using Off-the-Shelf Components.  A. Heirich and L. Moll.
[13] IEEE Symposium on Field Programmable Custom Computing Machines (1999).  Sepia: Scalable 3D Compositing Using PCI PametteL. Moll, A. Heirich, and M. Shand.
[14] In Proceedings of ACM SIGGRAPH (2001).
Lightning-2: a high-performance display subsystem for PC clusters. G. Stoll, M. Eldridge, D. Patterson, A. Webb, S. Berman, R. Levy, C. Caywood, M. Taveira, S. Hunt and P. Hanrahan.
Scalability in large-data scientific visualization. A. Heirich. Pittsburgh Supercomputing Center (2002).

siggraph_cover dandelion

Interactive soft shadows

  • with Maneesh Agrawala, Ravi Ramamoorthi and Laurent Moll.

[16] Proceedings of ACM SIGGRAPH (2000).  Efficient image-based methods for rendering soft shadowsM. Agrawala, R. Ramamoorthi, L. Moll and A. Heirich.

Neuronal Simulations

[17] In Connectionist Models: Proceedings of the 1990 Summer School (Eds. Touretzky, Elman, Sejnowski & Hinton, pp. 369-374). Neuronal Signal Strength Is Enhanced By Rhythmic Firing. A. Heirich & C. Koch.