Rice Univesrity Logo
    • FAQ
    • Deposit your work
    • Login
    View Item 
    •   Rice Scholarship Home
    • Rice University Graduate Electronic Theses and Dissertations
    • Rice University Electronic Theses and Dissertations
    • View Item
    •   Rice Scholarship Home
    • Rice University Graduate Electronic Theses and Dissertations
    • Rice University Electronic Theses and Dissertations
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    An Experimental Comparison of Complex Objects Implementations in Big Data Systems

    Thumbnail
    Name:
    SIKDAR-DOCUMENT-2016.pdf
    Size:
    515.2Kb
    Format:
    PDF
    View/Open
    Author
    Sikdar, Sourav
    Date
    2017-06-07
    Advisor
    Jermaine, Christopher
    Degree
    Master of Science
    Abstract
    Many data management and analytics systems support complex objects. Dataflow platforms such as Spark and Flink allow programmers to manipulate sets consisting of objects from a host programming language, often Java. Document databases such as MongoDB make use of hierarchical interchange formats--most popularly JSON--which embody a data model where individual records can themselves contain sets of records. Systems such as Dremel and AsterixDB allow complex nesting of data structures. The desire to support such complex objects forces a system designer to ask: how should complex objects be implemented in a modern data management system? In this thesis, over a suite of representative data management tasks, I experimentally evaluate the performance implications of a wide variety of complex object implementations. The choice of object implementation can have a profound effect on performance. For example, the same external sort to perform a duplicate removal can take anywhere between a half hour to fourteen and a half hours depending upon the complex object implementation. A corollary is that a bad object implementation can doom system performance. In addition, we reaffirm the value of the classical database way of storing complex objects - where there is no distinction between the in-memory and over-the-wire data representation, within a modern big data system.
    Keyword
    Complex Objects Implementations; Experimental Evaluation; Big Data Systems
    Citation
    Sikdar, Sourav. "An Experimental Comparison of Complex Objects Implementations in Big Data Systems." (2017) Master’s Thesis, Rice University. https://hdl.handle.net/1911/96019.
    Metadata
    Show full item record
    Collections
    • Rice University Electronic Theses and Dissertations [13409]

    Home | FAQ | Contact Us | Privacy Notice | Accessibility Statement
    Managed by the Digital Scholarship Services at Fondren Library, Rice University
    Physical Address: 6100 Main Street, Houston, Texas 77005
    Mailing Address: MS-44, P.O.BOX 1892, Houston, Texas 77251-1892
    Site Map

     

    Searching scope

    Browse

    Entire ArchiveCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsTypeThis CollectionBy Issue DateAuthorsTitlesSubjectsType

    My Account

    Login

    Statistics

    View Usage Statistics

    Home | FAQ | Contact Us | Privacy Notice | Accessibility Statement
    Managed by the Digital Scholarship Services at Fondren Library, Rice University
    Physical Address: 6100 Main Street, Houston, Texas 77005
    Mailing Address: MS-44, P.O.BOX 1892, Houston, Texas 77251-1892
    Site Map