Work-First and Help-First Scheduling Policies for Terminally Strict Parallel Programs
(2008-11-13)
Multiple programming models are emerging to address an increased need for dynamic task parallelism in applications for multicore processors and shared-address space parallel computing. Examples include OpenMP 3.0, Java Concurrency Utilities, Microsoft Task Parallel Library, Intel Threading Building Blocks, Cilk, X10, Chapel, and Fortress. Scheduling ...
The Platform-Aware Compilation Environment: Status and Future Directions
(2012-06-13)
The Platform-Aware Compilation Environment (PACE) is an ambitious attempt to construct a portable compiler that produces code capable of achieving high levels of performance on new architectures. The key strategies in PACE are the design and development of an optimizer and runtime system that are parameterized by system characteristics, the automatic ...
The Concurrent Collections Programming Model
(2010-01-04)
We introduce the Concurrent Collections (CnC) programming model. In this model, programs are written in terms of high-level operations. These operations are partially ordered according to only their semantic constraints. These partial orderings correspond to data dependences and control dependences. The role of the domain expert, whose interest and ...
Register Allocation using Bipartite Liveness Graphs
(2010-10-12)
Register allocation is an essential optimization for all compilers. A number of sophisticated register allocation algorithms have been developed based on Graph Coloring (GC) over the years. However, these algorithms pose three major limitations in practice. First, construction of a full interference graph can be a major source of space and time ...
User-Specified and Automatic Data Layout Selection for Portable Performance
(2013-04-25)
This paper describes a new approach to managing array data layouts to optimize performance for scientific codes. Prior research has shown that changing data layouts (e.g., interleaving arrays) can improve performance. However, there have been two major reasons why such optimizations are not widely used: (1) the need to select different layouts for ...
Communication Optimizations for Distributed-Memory X10 Programs
(2010-04-10)
X10 is a new object-oriented PGAS (Partitioned Global Address Space) programming language with support for distributed asynchronous dynamic parallelism that goes beyond past SPMD message-passing models such as MPI and SPMD PGAS models such as UPC and Co-Array Fortran. The concurrency constructs in X10 make it possible to express complex computation ...
Compiler Support for Work-Stealing Parallel Runtime Systems
(2010-03-03)
Multiple programming models are emerging to address an increased need for dynamic task parallelism in multicore shared-memory multiprocessors. Examples include OpenMP 3.0, Java Concurrency Utilities, Microsoft Task Parallel Library, Intel Threading Building Blocks, Cilk, X10, Chapel, and Fortress. Scheduling algorithms based on work-stealing, as ...
Automatic Detection of Inter-application Permission Leaks in Android Applications
(2013-01-23)
Due to their growing prevalence, smartphones can access an increasing amount of sensitive user information. To better protect this information, modern mobile operating systems provide permission-based security, which restricts applications to only access a clearly defined subset of system APIs and user data. The Android operating system builds upon ...
Interprocedural Strength Reduction of Critical Sections in Explicitly-Parallel Programs
(2013-05-01)
In this paper, we introduce novel compiler optimization techniques to reduce the number of operations performed in critical sections that occur in explicitly-parallel programs. Specifically, we focus on three code transformations: 1) Partial Strength Reduction (PSR) of critical sections to replace critical sections by non-critical sections on certain ...
The Concurrent Collections Programming Model
(2010-12-16)
Parallel computing has become firmly established since the 1980s as the primary means of achieving high performance from supercomputers. Concurrent Collections (CnC) was developed to address the need for making parallel programming accessible to non-professional programmers. One approach that has historically addressed this problem is the creation ...