A Survey of Parallel Algorithms for Shared-Memory Machines
Karp, Richard M.
Technical Report Identifier: CSD-88-408
Abstract: This paper is a survey of the growing body of theory concerned with parallel algorithms and the complexity of parallel computation. The principal model of computation that we consider is the parallel random-access machine (PRAM), in which it is assumed that each processor has random access in unit time to any cell of a global memory. This model permits the logical structure of parallel computation to be studied in a context divorced from issues of interprocessor communication.
Section 2 surveys efficient parallel algorithms for bookkeeping operations such as compacting an array by squeezing out its "dead" elements, for evaluating algebraic expressions, for searching a graph and decomposing it into various kinds of components, and for sorting, merging and selection. These algorithms are typically completely different from the best sequential algorithms for the same problems, and their discovery has required the creation of a new set of paradigms for the construction of parallel algorithms.
Section 3 studies the relationships among several variants of the PRAM model which differ in their implementation of concurrent reading and/or concurrent writing, presents lower bounds on the time to solve certain elementary problems on various kinds of PRAMs, and compares the PRAM with other models such as bounded-fan-in and unbounded-fan-in circuits, alternating Turing machines and vector machines.
Section 4 discusses NC, a hierarchy of problems solvable by deterministic algorithms that operate in polylog time using a polynomial-bounded number of processors. Among the problems shown to lie at low levels within this hierarchy are the basic arithmetic operations, transitive closure and Boolean matrix multiplication, the computation of the determinant, the rank and the inverse of a matrix, and the evaluation of certain classes of straight-line programs. Section 4 also introduces the randomized version of NC, and gives fast randomized parallel algorithms for problems such as finding a maximum matching or a maximal independent set of vertices in a graph. Section 4 concludes by exhibiting several problems that are complete in the sequential complexity class P with respect to logspace reducibility, and hence unlikely to lie in NC.