dc.contributor.advisor Jermaine, Christopher
dc.creator Sikdar, Sourav
dc.date.accessioned 2017-08-01T16:34:43Z
dc.date.available 2017-08-01T16:34:43Z
dc.date.created 2016-12
dc.date.issued 2017-06-07
dc.date.submitted December 2016
dc.identifier.citation Sikdar, Sourav. "An Experimental Comparison of Complex Objects Implementations in Big Data Systems." (2017) Master’s Thesis, Rice University. http://hdl.handle.net/1911/96019.
dc.identifier.uri http://hdl.handle.net/1911/96019
dc.description.abstract Many data management and analytics systems support complex objects. Dataflow platforms such as Spark and Flink allow programmers to manipulate sets consisting of objects from a host programming language, often Java. Document databases such as MongoDB make use of hierarchical interchange formats--most popularly JSON--which embody a data model where individual records can themselves contain sets of records. Systems such as Dremel and AsterixDB allow complex nesting of data structures. The desire to support such complex objects forces a system designer to ask: how should complex objects be implemented in a modern data management system? In this thesis, over a suite of representative data management tasks, I experimentally evaluate the performance implications of a wide variety of complex object implementations. The choice of object implementation can have a profound effect on performance. For example, the same external sort to perform a duplicate removal can take anywhere from half an hour to fourteen and a half hours, depending upon the complex object implementation. A corollary is that a bad object implementation can doom system performance. In addition, I reaffirm the value, within a modern big data system, of the classical database way of storing complex objects, in which there is no distinction between the in-memory and over-the-wire data representations.
dc.format.mimetype application/pdf
dc.language.iso eng
dc.subject Complex Objects Implementations
dc.subject Experimental Evaluation
dc.subject Big Data Systems
dc.title An Experimental Comparison of Complex Objects Implementations in Big Data Systems
dc.date.updated 2017-08-01T16:34:43Z
dc.type.genre Thesis
dc.type.material Text
thesis.degree.department Computer Science
thesis.degree.discipline Engineering
thesis.degree.grantor Rice University
thesis.degree.level Masters
thesis.degree.name Master of Science

