On Building and Publishing Linked Open Schema from Social Web Sites

Tianxing Wu, Haofen Wang, Guilin Qi, Jiangang Zhu, Tong Ruan


Schema-level knowledge is important for semantic applications such as reasoning, data integration and question answering. Yet while Linking Open Data holds billions of triples describing millions of instances, it has only a limited number of triples representing schema-level knowledge. To facilitate multilingual schema-level knowledge mining, we propose a general approach to learning Linked Open Schema (LOS) in different languages from social Web sites, which offer rich sources for mining large-scale schema-level knowledge, i.e. taxonomies composed of categories and folksonomies consisting of tags. The core of the approach is a semi-supervised learning method that integrates rules to capture equal, subClassOf and relate relations among the collected categories and tags. We apply the approach separately to selected English and Chinese social Web sites, resulting in an English LOS and a Chinese LOS. We publish both as open data on the Web with three access levels, i.e. data dump, lookup service and SPARQL endpoint. Experimental results show that the relations in both the English LOS and the Chinese LOS are highly accurate. Compared with DBpedia, Yago, BabelNet, and Freebase, both the English LOS and the Chinese LOS not only contain a large number of concepts, but also the largest number of subClassOf relations.

Type of Paper: Research Paper
Keywords: Linked Data, Linked Open Schema, Schema-Level Knowledge, Social Web Sites