
    RCMP: A System Enabling Efficient Re-computation Based Failure Resilience for Big Data Analytics

    Name:
    TR13-04.pdf
    Size:
    516.9Kb
    Format:
    PDF
    Author
    Dinu, Florin; Ng, T. S. Eugene
    Date
    April 30, 2013
    Abstract
    Multi-job I/O-intensive big-data computations can suffer a significant performance hit due to relying on data replication as the main failure resilience strategy. Data replication is inherently an expensive operation for I/O-intensive jobs because the datasets to be replicated are very large. Moreover, since the failure resilience guarantees provided by replication are fundamentally limited by the number of available replicas, jobs may fail when all replicas are lost. In this paper we argue that job re-computation should also be a first-order failure resilience strategy for big data analytics. Re-computation support is especially important for multi-job computations because they can require cascading re-computations to deal with the data loss caused by failures. We propose RCMP, a system that performs efficient job re-computation. RCMP improves on state-of-the-art big data processing systems, which rely on data replication and consequently lack any dedicated support for re-computation. RCMP can speed up a job’s re-computation by leveraging outputs that it stored during that job’s successful run. During re-computation, RCMP can efficiently utilize the available compute node parallelism by switching to a finer-grained task scheduling granularity. Furthermore, RCMP can mitigate hot-spots specific to re-computation runs. Our experiments on a moderate-sized cluster show that, compared to using replication, RCMP can provide significant benefits during failure-free periods while still finishing multi-job computations in comparable or better time when impacted by single and double data loss events.
    Citation
    Dinu, Florin and Ng, T. S. Eugene. "RCMP: A System Enabling Efficient Re-computation Based Failure Resilience for Big Data Analytics." (2013) https://hdl.handle.net/1911/96407.
    Type
    Technical report
    Citable link to this page
    https://hdl.handle.net/1911/96407
    Rights
    You are granted permission for the noncommercial reproduction, distribution, display, and performance of this technical report in any format, but this permission is only for a period of forty-five (45) days from the most recent time that you verified that this technical report is still available from the Computer Science Department of Rice University under terms that include this permission. All other rights are reserved by the author(s).
    Collections
    • Computer Science Technical Reports [245]

    Home | FAQ | Contact Us | Privacy Notice | Accessibility Statement
    Managed by the Digital Scholarship Services at Fondren Library, Rice University
    Physical Address: 6100 Main Street, Houston, Texas 77005
    Mailing Address: MS-44, P.O. Box 1892, Houston, Texas 77251-1892
    Site Map

     
