BNOSA: A Bayesian Network and Ontology based Semantic Annotation Framework

Quratulain Rajput, Sajjad Haider

Abstract


This paper presents a semantic annotation framework capable of extracting relevant information from unstructured, ungrammatical, and incoherent data sources. The framework, named BNOSA, uses an ontology to conceptualize a problem domain and to extract data from the given corpora, and Bayesian networks to resolve conflicts and predict missing data. The framework is extensible: it can dynamically extract data from any problem domain, given a pre-defined ontology and a corresponding Bayesian network. Experiments were conducted to analyze the performance of BNOSA on several problem domains. The corpora used in the experiments come from selling-purchasing websites, where product information is entered by ordinary web users in a structure-free format. The results show that BNOSA performs reasonably well at locating the data of interest using the context keywords provided as part of the domain ontology. When more than one value is extracted for an attribute, or when the value is missing, Bayesian networks identify the most appropriate value for that attribute.
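The two stages described in the abstract can be illustrated with a small Python sketch. It is not the BNOSA implementation: the ontology fragment, the attribute names (`model_year`, `price`), the band thresholds, and every probability below are invented for illustration, and a single conditional probability table lookup stands in for inference over a full Bayesian network.

```python
import re

# Hypothetical ontology fragment (names and patterns are assumptions, not from
# the paper): each attribute lists context keywords and a regex for its value.
ONTOLOGY = {
    "model_year": {"keywords": ["model", "year"], "pattern": r"\b(?:19|20)\d{2}\b"},
    "price": {"keywords": ["price", "rs"], "pattern": r"\b\d{5,7}\b"},
}


def extract_candidates(text, attribute, window=20):
    """Collect values that start within `window` characters after a context keyword."""
    spec = ONTOLOGY[attribute]
    value_re = re.compile(spec["pattern"])
    lowered = text.lower()
    found = []
    for keyword in spec["keywords"]:
        for hit in re.finditer(r"\b" + re.escape(keyword) + r"\b", lowered):
            for match in value_re.finditer(lowered, hit.end()):
                if match.start() - hit.end() > window:
                    break  # later matches are even further from the keyword
                if match.group() not in found:
                    found.append(match.group())
    return found


# Toy conditional probability table P(price_band | year_band).
# The numbers are invented; a real model would be learned from data.
CPT = {
    "old": {"low": 0.7, "high": 0.3},
    "new": {"low": 0.2, "high": 0.8},
}


def year_band(year):
    return "new" if int(year) >= 2005 else "old"


def price_band(price):
    return "high" if int(price) >= 500000 else "low"


def resolve_conflict(price_candidates, year):
    """Pick the candidate price whose band is most probable given the year."""
    dist = CPT[year_band(year)]
    return max(price_candidates, key=lambda p: dist[price_band(p)])


def predict_missing_band(year):
    """When no price was extracted, return the most probable price band."""
    dist = CPT[year_band(year)]
    return max(dist, key=dist.get)


ad = "Honda Civic model 2008, price 950000 or price 95000 negotiable"
years = extract_candidates(ad, "model_year")
prices = extract_candidates(ad, "price")
best_price = resolve_conflict(prices, years[0])
```

Here the sample advertisement yields two conflicting price candidates, and the lookup keeps the one that is more plausible for the extracted model year. The fixed character window is a simplification chosen for brevity.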

Type of Paper: Research Paper
Keywords: Ontology, Information Extraction, Bayesian Network, Machine Learning, Semantic Annotation