ShuFFLE: Automated Framework for HArdware Accelerated Iterative Big Data Analysis
Mohammadgholi Songhori, Ebrahim
Master of Science
This thesis introduces ShuFFLE, a set of novel methodologies and tools for automated analysis and hardware acceleration of large and dense (non-sparse) Gram matrices. Such matrices arise in most contemporary data mining; they are hard to handle because of the complexity of known matrix transformation algorithms and the inseparability of non-sparse correlations. ShuFFLE learns the properties of the Gram matrices and their rank for each particular application domain. It then utilizes the underlying properties for reconfiguring accelerators that scalably operate on the data in that domain. The learning is based on new factorizations that work at the limit of the matrix rank to optimize the hardware implementation by minimizing the costly off-chip memory as well as I/O interactions. ShuFFLE also provides users with a new Application Programming Interface (API) to implement a customized iterative least squares solver for analyzing big and dense matrices in a scalable way. This API is readily integrated within the Xilinx Vivado High Level Synthesis tool to translate user's code to Hardware Description Language (HDL). As a case study, we implement Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) as an l1 regularized least squares solver. Experimental results show that during FISTA computation using Field-Programmable Gate Array (FPGA) platform, ShuFFLE attains 1800x iteration speed improvement compared to the conventional solver and about 24x improvement compared to our factorized solver on a general purpose processor with SSE4 architecture for a Gram matrix with 4.6 billion non-zero elements.