Completeness of Information Sources

Authors: 
Naumann, Felix; Freytag, Johann-Christoph; Leser, Ulf
Year: 
2004
Venue: 
Information Systems 29(7):583-615
URL: 
http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V0G-4BMJ9C7-1&_user=964000&_handle=B-WA-A-B-WZ-MsSAYVW-UUA-AUYYEBAWVE-AUYZCAWUVE-VVEYWCBEB-WZ-U&_fmt=full&_coverDate=03%2F03%2F2004&_rdoc=9&_orig=browse&_srch=%23toc%235646%239999%23999999999%23999
Citations: 
18
Citations range: 
10 - 49

Abstract: 
Information quality plays a crucial role in every application that integrates data from autonomous sources. However, information quality is hard to measure and complex to consider for the tasks of information integration, even if the integrating sources cooperate. We present a systematic and formal approach to the measurement of information quality and the combination of such measurements for information integration. Our approach is based on a value model that incorporates both extensional value (coverage) and intensional value (density) of information. For both aspects we provide merge functions for adequately scoring integrated results. We also combine the two criteria into an overall completeness criterion that formalizes the intuitive notion of completeness of query results. This completeness measure is a valuable tool to assess source size and to predict result sizes of queries in integrated information systems. We propose this measure as an important step toward using information quality for source selection, query planning, query optimization, and quality feedback to users.
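
The abstract does not state the formulas, so the following Python sketch is only one plausible reading of the three measures, not the authors' definitions. It assumes that coverage is the fraction of real-world entities a source stores, that density is the fraction of attribute values in the stored tuples that are not null, and that completeness is their product. The function names, the sample data, and the constants WORLD_SIZE and NUM_ATTRIBUTES are hypothetical and introduced purely for illustration.

```python
from typing import Optional, Sequence

# A tuple is a sequence of attribute values; None marks a missing value.
Row = Sequence[Optional[str]]


def coverage(rows: Sequence[Row], world_size: int) -> float:
    """Extensional value: share of world entities present in the source (assumed definition)."""
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    return len(rows) / world_size


def density(rows: Sequence[Row], num_attributes: int) -> float:
    """Intensional value: share of non-null attribute values in the stored tuples (assumed definition)."""
    if not rows or num_attributes <= 0:
        return 0.0
    filled = sum(1 for row in rows for value in row if value is not None)
    return filled / (len(rows) * num_attributes)


def completeness(rows: Sequence[Row], world_size: int, num_attributes: int) -> float:
    """Overall score, assumed here to be coverage multiplied by density."""
    return coverage(rows, world_size) * density(rows, num_attributes)


# Hypothetical source: 2 of 4 world entities, 3 attributes each, 3 of 6 values filled.
WORLD_SIZE = 4
NUM_ATTRIBUTES = 3
source = [
    ("Alice", "Berlin", None),
    ("Bob", None, None),
]

print(coverage(source, WORLD_SIZE))                      # 0.5
print(density(source, NUM_ATTRIBUTES))                   # 0.5
print(completeness(source, WORLD_SIZE, NUM_ATTRIBUTES))  # 0.25
```

Under these assumptions the product equals the number of non-null values in the source divided by the number of values a perfect source would hold (WORLD_SIZE times NUM_ATTRIBUTES), which is one way to read "completeness of query results". The merge functions for integrated results mentioned in the abstract are not sketched here, because the abstract gives no detail on how overlapping sources are combined.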