Efficient Selection of Vector Instructions using Dynamic Programming
Accelerating program performance via SIMD vector units is very common in modern processors, as evidenced by the use of SSE, MMX, VSE, and VSX SIMD instructions in multimedia, scientific, and embedded applications. To take ...
A Hierarchical Region-Based Static Single Assignment Form
Modern compilation systems face the challenge of incrementally reanalyzing a program’s intermediate representation each time a code transformation is performed. Current approaches typically either re-analyze the entire ...
User-Specified and Automatic Data Layout Selection for Portable Performance
This paper describes a new approach to managing array data layouts to optimize performance for scientific codes. Prior research has shown that changing data layouts (e.g., interleaving arrays) can improve performance. ...
Communication Optimizations for Distributed-Memory X10 Programs
X10 is a new object-oriented PGAS (Partitioned Global Address Space) programming language with support for distributed asynchronous dynamic parallelism that goes beyond past SPMD message-passing models such as MPI and SPMD ...
The Concurrent Collections Programming Model
Parallel computing has become firmly established since the 1980s as the primary means of achieving high performance from supercomputers. Concurrent Collections (CnC) was developed to address the need for making parallel ...
BMS-CnC: Bounded Memory Scheduling of Dynamic Task Graphs
It is now widely recognized that increased levels of parallelism are a necessary condition for improved application performance on multicore computers. However, as the number of cores increases, the memory-per-core ratio ...
Scalable and Precise Dynamic Datarace Detection for Structured Parallelism
Existing dynamic race detectors suffer from at least one of the following three limitations: i) space overhead per memory location grows linearly with the number of parallel threads, severely limiting the parallelism ...
Automatic Detection of Inter-application Permission Leaks in Android Applications
Due to their growing prevalence, smartphones can access an increasing amount of sensitive user information. To better protect this information, modern mobile operating systems provide permission-based security, which ...
Interprocedural Strength Reduction of Critical Sections in Explicitly-Parallel Programs
In this paper, we introduce novel compiler optimization techniques to reduce the number of operations performed in critical sections that occur in explicitly-parallel programs. Specifically, we focus on three code ...
Support for Complex Numbers in Habanero