Repository landing page

We are not able to resolve this OAI Identifier to the repository landing page. If you are the repository manager for this record, please head to the Dashboard and adjust the settings.

An MPI-based 2D data redistribution library for dense arrays

Abstract

In HPC, data redistributions (reorganizations) are used in parallel applications to improve performance and/or provide data-locality compatibility with sequences of parallel operations. Data reorganization refers to changing the logical and physical arrangement of data (such as dense arrays or matrices distributed over peer processes in a parallel program). This operation can be achieved by applying transformations such as transpositions or rotations or changing how data is mapped across the process grid P by Q, all of which are accomplished either with message passing or distributed shared memory operations. In this project, we restrict ourselves to a distributed memory model and message passing, not a shared-memory model, nor do we use distributed shared memory APIs. Our primary goal is to generate a library capable of diverse data reorganizations. We aim to develop a high-level Application Programming Interface (API) that directly works with the Message Passing Interface (MPI) to accomplish data redistributions in data-parallel applications and libraries, such as the polymath library, a library of many algorithms all of which implement parallel dense matrix multiplication algorithms with flexible data layouts. Using the reorganization mode of process grid shapes with constant total processes, we plan to observe the performance trends of the polymath dense parallel matrix-multiplication library (which is related research) based on different grid shapes, problem sizes, numbers of processes and decide whether it is more efficient to redistribute data or not to achieve the highest performance. We will test other redistributions for some of the process shapes used with the polymath library to identify how redistribution impacts performance. These tests will provide us with the information to determine if redistribution is worthwhile for non-optimal process layout (as established with the polymath system). Besides changing the shape of the data in terms of its layout on a logical process topology, we also plan to study and demonstrate data transpose algorithms in this library, another useful redistribution mechanism for dense linear algebra in distributed-memory parallel computing

Similar works

This paper was published in UTC Scholar.

Having an issue?

Is data on this page outdated, violates copyrights or anything else? Report the problem now and we will take corresponding actions after reviewing your request.