The Automatic Grouping of Sensor Data Layers Using Semantic Clustering and Classification to Group Semantically Similar Sensor Data Layers

atmire.migration.oldid: 554
dc.contributor.advisor: Liang, Steve
dc.contributor.author: Knoechel, Ben Charles
dc.date.accessioned: 2013-01-25T17:52:00Z
dc.date.available: 2013-06-15T07:01:36Z
dc.date.issued: 2013-01-25
dc.date.submitted: 2013 [en]
dc.description.abstract: The Sensor Web is a growing phenomenon in which an increasing number of sensors collect data in the physical world and make it available over the Internet. Open standards have been proposed and are being implemented to address the problem of semantic interoperability, with the goal of allowing systems to share data automatically. Spatial Data Infrastructures (SDIs) are tools that have been developed to manage geospatial data from many different sources. However, interoperability problems remain because naming is not standardized, even among data collected using the same open standard. The objective of this thesis is to automatically group similar sensor data layers based on the phenomenon they measure. Our methodology is based on a unique bottom-up approach that uses text processing, approximate string matching, and semantic string matching of data layers. Text processing includes normalization and tokenization to standardize syntactic differences in the naming. Approximate string matching techniques include Levenshtein Distance, a Length Adjusted Levenshtein Dissimilarity, Jaro Dissimilarity, Jaro-Winkler Dissimilarity, Jaccard Dissimilarity, and Cosine Dissimilarity. For semantic string matching, we use WordNet as a lexical database to compute word pair similarities and derive a set-based dissimilarity function from those similarity scores. These string matching algorithms produce dissimilarity values between data layers, which are in turn used to provide data-layer-to-data-layer mappings, clusters of similar data layers, and mappings between a set of class names and data layers. For clustering, we tested three algorithms: K-Medoids, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and Hierarchical Agglomerative Clustering (HAC). We evaluate and discuss the results of our methodology, and introduce a proof-of-concept Virtual SOS service to show the utility of such research. [en_US]
dc.identifier.citation: Knoechel, B. C. (2013). The Automatic Grouping of Sensor Data Layers Using Semantic Clustering and Classification to Group Semantically Similar Sensor Data Layers (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca. doi:10.11575/PRISM/28017 [en_US]
dc.identifier.doi: http://dx.doi.org/10.11575/PRISM/28017
dc.identifier.uri: http://hdl.handle.net/11023/480
dc.language.iso: eng
dc.publisher.faculty: Graduate Studies
dc.publisher.institution: University of Calgary [en]
dc.publisher.place: Calgary [en]
dc.rights: University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.
dc.subject: Geotechnology
dc.title: The Automatic Grouping of Sensor Data Layers Using Semantic Clustering and Classification to Group Semantically Similar Sensor Data Layers
dc.type: master thesis
thesis.degree.discipline: Geomatics Engineering
thesis.degree.grantor: University of Calgary
thesis.degree.name: Master of Science (MSc)
ucalgary.item.requestcopy: true
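
The text processing and approximate string matching steps named in the abstract can be sketched roughly as follows. This is a minimal Python illustration, not the thesis's implementation: the function names, the camelCase splitting rule, the example layer names, and the choice to normalize Levenshtein distance by the longer string's length are all assumptions made here. In particular, the thesis's Length Adjusted Levenshtein Dissimilarity may be defined differently.

```python
import re


def tokenize(name):
    """Normalize a layer name and split it into lowercase alphanumeric tokens."""
    # Assumed rule: insert a space at camelCase boundaries ("AirTemp" -> "Air Temp").
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Treat every run of non-alphanumeric characters as a separator.
    return [t for t in re.split(r"[^a-z0-9]+", spaced.lower()) if t]


def levenshtein(a, b):
    """Classic edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def normalized_levenshtein(a, b):
    """Edit distance divided by the longer length (an assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def jaccard_dissimilarity(tokens_a, tokens_b):
    """One minus the Jaccard index of the two token sets."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


# Hypothetical layer names, used only to exercise the functions.
layer_a = tokenize("AirTemperature")
layer_b = tokenize("water_temperature")
print(layer_a, layer_b)
print(jaccard_dissimilarity(layer_a, layer_b))
print(normalized_levenshtein(" ".join(layer_a), " ".join(layer_b)))
```

Pairwise values like these could fill a dissimilarity matrix, which is the kind of input that K-Medoids, DBSCAN, and HAC can cluster.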
Files
Original bundle
Name: ucalgary_2013_ben_knoechel.pdf
Size: 1.28 MB
Format: Adobe Portable Document Format