BNOSA: A Bayesian Network and Ontology based Semantic Annotation Framework

Quratulain Rajput, Sajjad Haider

Abstract


The paper presents a semantic annotation framework that is capable of extracting relevant
information from unstructured, ungrammatical and incoherent data sources. The framework, named
BNOSA, uses ontology to conceptualize a problem domain and to extract data from the given corpora,
and Bayesian networks to resolve conflicts and to predict missing data. The framework is extensible as
it is capable of dynamically extracting data from any problem domain given a pre-defined ontology and
a corresponding Bayesian network. Experiments have been conducted to analyze the performance of
BNOSA on several problem domains. The sets of corpora used in the experiments belong to sellingpurchasing
websites where product information is entered by ordinary web users in a structure-free
format. The results show that BNOSA performs reasonably well to find location of the data of interest
using context keywords provided as part of the domain ontology. In case of more than one value being
extracted for an attribute or if the value is missing, Bayesian networks identify the most appropriate
value for that attribute.

Full Text: PDF
Show BibTex format: BibTeX