Distributed Gaussian Mixture Model Summarization Using the MapReduce Framework
dc.contributor.advisor | Far, Behrouz H. | |
dc.contributor.author | Esmaeilpour, Arina | |
dc.date.accessioned | 2016-06-13T21:32:48Z | |
dc.date.available | 2016-06-13T21:32:48Z | |
dc.date.issued | 2016 | |
dc.date.submitted | 2016 | en |
dc.description.abstract | As the rate of data generation accelerates, sophisticated techniques are essential to meet scalability requirements. One promising avenue for handling large datasets is distributed storage and processing, for which Hadoop is a well-known framework. Data summarization is another useful concept for managing large datasets: summarization techniques aim to produce compact yet representative summaries of an entire dataset. Combining these tools allows a distributed implementation of data summarization. This thesis achieves that goal by proposing and implementing a distributed Gaussian Mixture Model Summarization using the MapReduce framework (MR-SGMM). The proposed method first partitions a dataset with the density-based clustering algorithm DBSCAN and then summarizes each discovered cluster using the SGMM approach, all in a distributed manner. Tests of the implementation on synthetic and real datasets demonstrate its validity and efficiency. | en_US |
dc.identifier.citation | Esmaeilpour, A. (2016). Distributed Gaussian Mixture Model Summarization Using the MapReduce Framework (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca. doi:10.11575/PRISM/25727 | en_US |
dc.identifier.doi | http://dx.doi.org/10.11575/PRISM/25727 | |
dc.identifier.uri | http://hdl.handle.net/11023/3053 | |
dc.language.iso | eng | |
dc.publisher.faculty | Graduate Studies | |
dc.publisher.institution | University of Calgary | en |
dc.publisher.place | Calgary | en |
dc.rights | University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission. | |
dc.subject | Artificial Intelligence | |
dc.subject | Computer Science | |
dc.subject.classification | Distributed density-based clustering | en_US |
dc.subject.classification | Distributed cluster summarization | en_US |
dc.subject.classification | Gaussian mixture model | en_US |
dc.subject.classification | MapReduce | en_US |
dc.title | Distributed Gaussian Mixture Model Summarization Using the MapReduce Framework | |
dc.type | master thesis | |
thesis.degree.discipline | Electrical and Computer Engineering | |
thesis.degree.grantor | University of Calgary | |
thesis.degree.name | Master of Science (MSc) | |