Declarative Data Fusion - Syntax, Semantics, and Implementation

Authors: 
Bleiholder, Jens; Naumann, Felix
Year: 
2005
Venue: 
Advances in Databases and Information Systems (ADBIS) 2005
URL: 
http://www.hpi.uni-potsdam.de/fileadmin/hpi/FG_Naumann/publications/ADBIS05.pdf

In today’s integrating information systems, data fusion, i.e., the merging of multiple tuples about the same real-world object into a single tuple, is left to ETL tools and other specialized software. While much attention has been paid to architecture, query languages, and query execution, the final step of actually fusing data from multiple sources into a consistent and homogeneous set is often ignored.
This paper states the formal problem of data fusion in relational databases and discusses which parts of the problem can already be solved with standard SQL. To bridge the final gap, we propose the SQL Fuse By statement and define its syntax and semantics. A first implementation of the statement in a prototypical database system shows the usefulness and feasibility of the new operator.
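
To illustrate what fusing multiple tuples into one can look like with standard SQL alone, here is a minimal sketch. It does not show the Fuse By statement or the paper's prototype. It assumes a hypothetical table person(name, age, city) in which name identifies the real-world object, and it uses MAX as a stand-in conflict resolution function, relying on the fact that SQL aggregates ignore NULL values. The table, the sample rows, and the function name fuse_with_group_by are all made up for this example.

```python
import sqlite3


def fuse_with_group_by(rows):
    """Fuse tuples that share a name into one tuple per name.

    Assumption: 'name' identifies the real-world object, and MAX is an
    acceptable way to resolve conflicting values. MAX skips NULLs, so a
    value missing in one source is filled in from another.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT, age INTEGER, city TEXT)")
    conn.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)
    fused = conn.execute(
        "SELECT name, MAX(age), MAX(city) "
        "FROM person GROUP BY name ORDER BY name"
    ).fetchall()
    conn.close()
    return fused


# Hypothetical tuples from two sources describing the same two people.
rows = [
    ("Alice", 30, None),       # source 1: city unknown
    ("Alice", None, "Berlin"), # source 2: age unknown
    ("Bob", 41, "Potsdam"),    # source 1
    ("Bob", 42, None),         # source 2: conflicting age
]

fused = fuse_with_group_by(rows)
for row in fused:
    print(row)
```

Grouping with an aggregate handles missing values and simple conflicts such as Bob's two ages, but the choice of MAX here is arbitrary. Other strategies, for example preferring one source over another, are harder to express this way.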