Performance Optimizations for Software Transactional Memory
The transition from single-core processors to multi-core processors demands a change from sequential programming to concurrent programming for mainstream programmers. However, concurrent programming has long been widely ...
Function Shipping in a Scalable Parallel Programming Model
Increasingly, a large number of scientific and technical applications exhibit dynamically generated parallelism or irregular data access patterns. These applications pose significant challenges to achieving scalable ...
Exploring the potential for accelerating sparse matrix-vector product on a Processing-in-Memory architecture
As the importance of memory access delays on performance has mushroomed over the past few decades, researchers have begun exploring Processing-in-Memory (PIM) technology, which offers higher memory bandwidth, lower memory ...
Performance analysis for parallel programs from multicore to petascale
Cutting-edge science and engineering applications require petascale computing. Petascale computing platforms are characterized by both extreme parallelism (systems of hundreds of thousands to millions of cores) and hybrid ...
Expressiveness, programmability and portable high performance of global address space languages
The Message Passing Interface (MPI) is the library-based programming model employed by most scalable parallel applications today; however, it is not easy to use. To simplify program development, Partitioned Global Address ...