AUTOMATIC INTEGRATION OF RELATIONAL DATABASE SCHEMAS

Date
2000-10-16
Abstract
This paper focuses on capturing the semantics of data stored in databases, with the goal of integrating data sources within a company, across a network, and even on the World-Wide Web. Our approach to capturing data semantics revolves around the definition of a standardized dictionary that provides terms for referencing and categorizing data. These standardized terms are then used in semantic specifications called X-Specs, which store metadata and semantic descriptions of the data. Using these semantic specifications, it becomes possible to integrate diverse data sources even though they were not originally designed to work together. We present the centralized version of the architecture, which allows for the independent integration of data source information (represented using X-Specs) into a unified view of the data. The architecture preserves full autonomy of the underlying databases, which are transparently accessed by the user from a central portal. Distributing the architecture would bypass the central portal and allow integration of web data sources to be performed by a user's browser. A system that achieves such automatic integration of data sources would have a major impact on how the Web is used and delivered. Unlike wrapper or mediator systems, which achieve data source integration by manually defining an integrated view, our architecture automatically constructs an integrated view from information independently provided by the data sources. Thus, the contribution is an algorithm for schema integration, not just a methodology for accessing data sources whose knowledge has been precombined into mediated views. The integrated view is a hierarchy of concepts that is queried by semantic name. Thus, the system provides both logical and physical access transparency by mapping user queries on high-level concepts to physical schema elements in the underlying data sources.
Notes: Jointly released technical report. Released as TR-00-15 for the University of Manitoba, and 2000-662-14 for the University of Calgary.
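As a rough illustration of the idea summarized in the abstract, the sketch below merges independently written specifications into a hierarchy of concepts and resolves a query by semantic name to physical schema elements. It is not the algorithm from the report. The `Mapping`, `Concept`, `integrate`, and `resolve` names, the representation of a semantic name as a tuple of dictionary terms, and the sample databases are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Mapping:
    """One entry of a hypothetical X-Spec: a semantic name bound to a physical field."""
    semantic_name: tuple  # assumed form: path of dictionary terms, e.g. ("Order", "Id")
    source: str           # hypothetical data source name
    table: str
    column: str


@dataclass
class Concept:
    """A node in the integrated view; children are more specific concepts."""
    term: str
    children: dict = field(default_factory=dict)
    mappings: list = field(default_factory=list)


def integrate(specs):
    """Merge specs into one concept hierarchy keyed by shared dictionary terms."""
    root = Concept("root")
    for spec in specs:
        for mapping in spec:
            node = root
            for term in mapping.semantic_name:
                # Sources that use the same term share the same concept node.
                node = node.children.setdefault(term, Concept(term))
            node.mappings.append(mapping)
    return root


def resolve(root, semantic_name):
    """Map a query on a semantic name to (source, table, column) triples."""
    node = root
    for term in semantic_name:
        node = node.children.get(term)
        if node is None:
            return []
    return [(m.source, m.table, m.column) for m in node.mappings]


# Two made-up sources described independently of each other.
sales_spec = [
    Mapping(("Order", "Id"), "sales_db", "orders", "order_no"),
    Mapping(("Order", "Customer", "Name"), "sales_db", "orders", "cust_name"),
]
shipping_spec = [
    Mapping(("Order", "Id"), "shipping_db", "shipments", "ord_id"),
]

view = integrate([sales_spec, shipping_spec])
print(resolve(view, ("Order", "Id")))
# [('sales_db', 'orders', 'order_no'), ('shipping_db', 'shipments', 'ord_id')]
```

In this sketch the two sources never reference each other. They line up only because both use the same dictionary terms, which is the property the abstract attributes to the standardized dictionary.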
Keywords
Computer Science