dc.contributor.author Dinu, Florin
dc.contributor.author Ng, T. S. Eugene
dc.date.accessioned 2017-08-02T22:03:10Z
dc.date.available 2017-08-02T22:03:10Z
dc.date.issued 2011-08-11
dc.identifier.uri https://hdl.handle.net/1911/96398
dc.description.abstract Failures are common in today’s data center environment and can significantly impact the performance of important jobs running on top of large-scale computing frameworks. In this paper we analyze Hadoop’s behavior under compute-node and process failures. Surprisingly, we find that even a single failure can have a large detrimental effect on job running times. We uncover several important design decisions underlying this distressing behavior: the inefficiency of Hadoop’s statistical speculative execution algorithm, the lack of failure-information sharing, and the overloading of TCP failure semantics. We hope that our study will add new dimensions to the pursuit of robust large-scale computing framework designs.
dc.format.extent 10 pp
dc.language.iso eng
dc.rights You are granted permission for the noncommercial reproduction, distribution, display, and performance of this technical report in any format, but this permission is only for a period of forty-five (45) days from the most recent time that you verified that this technical report is still available from the Computer Science Department of Rice University under terms that include this permission. All other rights are reserved by the author(s).
dc.title Analysis of Hadoop’s Performance under Failures
dc.type Technical report
dc.date.note August 11, 2011
dc.identifier.digital TR11-05
dc.type.dcmi Text
dc.identifier.citation Dinu, Florin and Ng, T. S. Eugene. "Analysis of Hadoop’s Performance under Failures." (2011) https://hdl.handle.net/1911/96398.
