Issue title: Special Issue on Benchmarking Linked Data
Guest editors: Axel-Cyrille Ngonga Ngomo, Irini Fundulaki and Anastasia Krithara
Article type: Research Article
Authors: Rashid, Mohammad [a],* | Torchiano, Marco [a] | Rizzo, Giuseppe [b] | Mihindukulasooriya, Nandana [c] | Corcho, Oscar [c]
Affiliations: [a] Politecnico di Torino, Italy. E-mails: [email protected], [email protected] | [b] Istituto Superiore Mario Boella, Italy. E-mail: [email protected] | [c] Universidad Politécnica de Madrid, Spain. E-mails: [email protected], [email protected]
Correspondence: [*] Corresponding author. E-mail: [email protected].
Abstract: Knowledge bases are nowadays essential components for any task that requires automation with some degree of intelligence. Assessing the quality of a knowledge base is a complex task, as it often means measuring the quality of structured information, ontologies and vocabularies, and queryable endpoints. Popular knowledge bases such as DBpedia, YAGO2, and Wikidata have chosen the RDF data model to represent their data due to its capabilities for semantically rich knowledge representation. Despite its advantages, there are challenges in using the RDF data model, for example, data quality assessment and validation. In this paper, we present a novel knowledge base quality assessment approach that relies on evolution analysis. The proposed approach uses data profiling on consecutive knowledge base releases to compute quality measures that allow detecting quality issues. Our quality characteristics are based on the evolution analysis, and we used high-level change detection for measurement functions. In particular, we propose four quality characteristics: Persistency, Historical Persistency, Consistency, and Completeness. Persistency and Historical Persistency measure the degree of changes and lifespan of any entity type. Consistency and Completeness identify properties with incomplete information and contradictory facts. The approach has been assessed both quantitatively and qualitatively on a series of releases from two knowledge bases: eleven releases of DBpedia and eight releases of 3cixty. The capability of the Persistency and Consistency characteristics to detect quality issues varies significantly between the two case studies. Persistency gives observational results for evolving knowledge bases. It is highly effective in the case of knowledge bases with periodic updates, such as 3cixty. The Completeness characteristic is extremely effective and was able to achieve 95% precision in error detection for both use cases. The measures are based on simple statistical operations that make the solution both flexible and scalable.
Keywords: Quality assessment, quality issues, evolution analysis, knowledge base, linked data
DOI: 10.3233/SW-180324
Journal: Semantic Web, vol. 10, no. 2, pp. 349-383, 2019