Electronic Thesis and Dissertation Repository

Thesis Format

Monograph

Degree

Doctor of Philosophy

Program

Electrical and Computer Engineering

Supervisor

McIsaac, Kenneth

Abstract

Collaborative intelligence in the context of information management can be defined as "a shared intelligence that results from the collaboration between various information systems". In open environments, these collaborating information systems can be heterogeneous, dynamic, and loosely coupled. Information systems in open environments can also possess a certain degree of autonomy. The integration of data residing in various heterogeneous information systems is essential in order to derive this intelligence efficiently and accurately. Because of the heterogeneous, loosely coupled, and dynamic nature of open environments, integration between these information systems at the data level is not efficient. Several approaches and models have been proposed to perform the task of data integration. Many of the existing approaches are designed for closed environments, tightly coupled systems, and enterprise data integration. They make explicit or implicit assumptions about the semantic structure of the data. Because of the heterogeneous and loosely coupled nature of open environments, such assumptions are deemed unintuitive. Data integration approaches based on models that are extensional in nature are also inadequate for open environments, because they do not account for the dynamic nature of such environments. The need for an adequate model for describing data integration systems in open environments is therefore evident. Intensional modeling is found to be an adequate and natural choice for open environments, because it addresses their dynamic and loosely coupled nature. In this work, an intensional model for conceptualization is presented. This model is based on the theory of Properties, Relations, and Propositions (PRP). The proposed description takes concepts, relations, and properties as primitive and, as such, irreducible entities.
A formal intensional account of both Ontology and Ontological Commitment is also proposed in light of the intensional model for conceptualization. An intensional model for ontology-driven mediated data integration in open environments is also proposed. The proposed model accounts for the dynamic nature of open environments and intensionally describes the information held by data sources. The interface between global and local ontologies and the formal intensional semantics of query answering are then described.

Summary for Lay Audience

In today’s world, data can be found anywhere: in databases, web pages, email inboxes, and many other types of data sources. Some of these data sources are structured, i.e. they have tables and fields, as is the case with databases. Other data sources are unstructured, as is the case with information that resides on a web page or in an email inbox. This means that these data sources are heterogeneous. Another factor that adds to the heterogeneity is that even structured data sources are created by different parties. These parties created their data sources with different needs in mind, and so they tailored each data source to satisfy those particular needs. When it comes to generating intelligence for the purpose of driving decision making, one should attempt to take advantage of all available data sources. For example, it has been found that most of the information about customer satisfaction or frustration with a business can’t be found in an enterprise database. Rather, most of this information is on web pages, blogs, and forums, or in the email inbox of a customer care representative. Communication on the web today is also very dynamic. Agents, computers, phones, servers, and other equipment can connect to or disconnect from the web at any time. This is an example of what we refer to as an open environment. In an open environment, agents can enter and leave at any time, and the environment should still continue to function. As mentioned earlier, in order to generate intelligence, one should attempt to utilize the data from various data sources. To do so, the data from these sources needs to be aligned and combined. This is referred to as data integration. In this work, we propose a model for data integration that accounts for the characteristics of open environments described above.
