Efficient and Reliable Algorithms for Challenging Matrix Computations targeting Multicore Architectures and Massive Parallelism

Along with the evolution towards massively parallel HPC systems with multicore nodes, there is an immense demand for new and improved numerical algorithms and library software that are scalable, efficient, and reliable for fundamental and challenging matrix computations. Such algorithms and software are used as building blocks for solving current and future very large-scale computational problems in science and engineering.

Recently, the Umeå research group has presented several novel results concerning challenging matrix computations, including (1) parallel and cache-efficient in-place matrix storage format conversion; (2) parallel two-stage reduction to Hessenberg form using shared memory; (3) parallel QR and QZ multishift algorithms with advanced deflation strategies; (4) parallel eigenvalue reordering in real Schur forms; (5) the SCASY library, a set of parallel solvers for Sylvester-type matrix equations with applications in condition estimation.
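As a small serial illustration of the problem class targeted by item (5), the sketch below solves a dense Sylvester equation AX + XB = C using SciPy's standard solver (which works on Schur forms internally). This is only a reference point for what a Sylvester-type solve looks like, not the SCASY library's API; SCASY provides parallel, distributed-memory variants of such solvers together with condition estimation.

```python
import numpy as np
from scipy.linalg import solve_sylvester

# Small random instance; for random A and B the solvability condition
# spec(A) and spec(-B) being disjoint holds almost surely.
rng = np.random.default_rng(0)
n, m = 5, 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((m, m))
C = rng.standard_normal((n, m))

# Solve A X + X B = C (dense, serial Bartels-Stewart-style solve).
X = solve_sylvester(A, B, C)

# Check the residual of the computed solution.
residual = np.linalg.norm(A @ X + X @ B - C) / np.linalg.norm(C)
assert residual < 1e-10
```

The relative residual check mirrors how such solvers are typically validated; condition estimation, as in SCASY, additionally bounds how much the solution X can be perturbed by perturbations in A, B, and C.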

Topic (1) concerns techniques and algorithms for efficient in-place conversion between standard and blocked matrix storage formats. Such functionality enables numerical libraries to use various data layouts internally, matching blocked algorithms and data structures to the memory hierarchy. Topics (2)-(5) concern the solution of large-scale dense eigenvalue problems and matrix equations via two-sided transformation methods and provide new functionality for scalable HPC computations. In this presentation, we review and highlight some of our most recent results on topics (1)-(5).
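To make the two layouts in topic (1) concrete, the sketch below converts a column-major matrix to a blocked (tiled) format in which each nb-by-nb tile is stored contiguously, and back. This is an out-of-place sketch with a simplifying assumption that nb divides both dimensions; the contribution cited above is precisely doing such conversions in place (without a second buffer), cache-efficiently, and in parallel. The function names are illustrative, not from the group's software.

```python
import numpy as np

def to_blocked(A, nb):
    """Copy a column-major matrix into a blocked (tile) layout:
    each nb-by-nb tile is stored contiguously in column-major order,
    and tiles are laid out in column-major tile order."""
    n, m = A.shape
    assert n % nb == 0 and m % nb == 0  # simplifying assumption
    out = np.empty(n * m, dtype=A.dtype)
    pos = 0
    for jb in range(0, m, nb):          # tile column
        for ib in range(0, n, nb):      # tile row
            out[pos:pos + nb * nb] = A[ib:ib + nb, jb:jb + nb].ravel(order="F")
            pos += nb * nb
    return out

def from_blocked(buf, n, m, nb):
    """Inverse conversion: blocked buffer back to a column-major matrix."""
    A = np.empty((n, m), order="F", dtype=buf.dtype)
    pos = 0
    for jb in range(0, m, nb):
        for ib in range(0, n, nb):
            A[ib:ib + nb, jb:jb + nb] = buf[pos:pos + nb * nb].reshape(nb, nb, order="F")
            pos += nb * nb
    return A

# Round trip: the blocked layout reproduces the original matrix.
A = np.asfortranarray(np.arange(24.0).reshape(6, 4, order="F"))
buf = to_blocked(A, 2)
assert np.array_equal(from_blocked(buf, 6, 4, 2), A)
```

The point of the blocked layout is that each tile touched by a blocked algorithm is a single contiguous memory region, which improves cache and TLB behavior compared with strided column-major tile accesses.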