
dc.contributor.advisor Sarkar, Vivek
dc.creator Tasirlar, Sagnak
dc.date.accessioned 2016-01-27T17:37:11Z
dc.date.available 2016-01-27T17:37:11Z
dc.date.created 2015-05
dc.date.issued 2015-11-06
dc.date.submitted May 2015
dc.identifier.citation Tasirlar, Sagnak. "Optimized Event-Driven Runtime Systems for Programmability and Performance." (2015) Diss., Rice University. https://hdl.handle.net/1911/88175.
dc.identifier.uri https://hdl.handle.net/1911/88175
dc.description.abstract Modern parallel programming models perform their best under the particular patterns they are tuned to express and execute, such as OpenMP for fork/join and Cilk for divide-and-conquer patterns. In cases where the model does not fit the problem, shoehorning of the problem to the model leads to performance bottlenecks, for example by introducing unnecessary dependences. In addition, some of these models, like MPI, have a performance model which thinly veils a particular machine's parameters from the problem that is to be solved. We postulate that an expressive parallel programming model should not over-constrain the problem it expresses and should not require the application programmer to code for the underlying machine and sacrifice portability. In our former work, we proposed the Data-Driven Tasks model, which constitutes expressive and portable parallelism by only requiring the application programmer to declare the inherent dependences of the application. In this work, we observe another instantiation of macro-dataflow, the Open Community Runtime (OCR) with work-stealing support for directed-acyclic graph (DAG) parallelism. First, we assess the benefits of these macro-dataflow models over traditional fork/join models using work-stealing, where we match the performance of hand-tuned parallel libraries on today's architectures through DAG parallelism. Secondly, we address work-stealing granularity optimizations for DAG parallelism to address how work stealing can be extended to perform better under complex dependence graphs. Lastly, we observe the impact of locality optimizations for work-stealing runtimes for DAG-parallel applications. On our path to exascale computations, the priority is shifting from minimizing latency to energy saving as the current trend makes powering an exascale machine very challenging. 
The trend of providing more parallelism to fit power budgets succeeds if applications can be declared to be more parallel and also scale. We argue that macro-dataflow is a framework that allows programmers to declare unconstrained parallelism. We provide an underlying work-stealing runtime to execute this framework for load balance and scalability, and propose heuristics to extend the default work-stealing approach to better perform with DAG parallel programs. We present our results on a multi-socket many-core machine and a many-core accelerator to showcase the feasibility of our approach on architectures signaling what future architectures may resemble.
dc.format.mimetype application/pdf
dc.language.iso eng
dc.subject event driven
dc.subject macro dataflow
dc.subject work stealing
dc.subject DAG parallelism
dc.title Optimized Event-Driven Runtime Systems for Programmability and Performance
dc.contributor.committeeMember Cooper, Keith D
dc.contributor.committeeMember Zhong, Lin
dc.date.updated 2016-01-27T17:37:11Z
dc.type.genre Thesis
dc.type.material Text
thesis.degree.department Computer Science
thesis.degree.discipline Engineering
thesis.degree.grantor Rice University
thesis.degree.level Doctoral
thesis.degree.name Doctor of Philosophy

