AJAX: an extensible data cleaning tool

Authors: 
Galhardas, H; Florescu, D; Shasha, D; Simon, E
Year: 
2000
Venue: 
ACM SIGMOD Record
URL: 
http://portal.acm.org/citation.cfm?id=336568
Citations: 
175
Citations range: 
100 - 499
Attachment: 
Galhardas2000AJAXanextensibledata.pdf
Size: 
87.06 KB

... groups together matching pairs with a high similarity value by applying a given grouping criterion (e.g. by transitive closure). Finally, merging collapses each individual cluster into a tuple of the resulting data source. AJAX provides @@@@ for specifying data cleaning programs, which consists of SQL statements enriched with a set of specific primitives to express these transformations. AJAX also @@@@. It allows the user to interact with an executing data cleaning program to handle exceptional cases and to inspect intermediate results. Finally, AJAX provides @@@@ @@@@ that permits users to determine the source and processing of data for debugging purposes. We will present the AJAX system applied to two real-world problems: the consolidation of a telecommunication database, and the conversion of a dirty database of bibliographic references into a set of clean, normalized, and redundancy-free relational tables maintaining the same data.
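
The clustering and merging steps mentioned in the abstract can be illustrated with a small sketch. This is not AJAX's own language or implementation: it is plain Python, and the function names, the similarity threshold, the "keep the longest value" merge policy, and the sample records are all assumptions made for illustration. Matching pairs above the threshold are grouped by transitive closure (using a union-find structure), and each resulting cluster is collapsed into one tuple.

```python
from collections import defaultdict


def cluster_by_transitive_closure(pairs, threshold):
    """Group record ids connected by pairs whose similarity >= threshold.

    `pairs` is an iterable of (id_a, id_b, similarity) triples (assumed shape).
    """
    parent = {}

    def find(x):
        # Register unseen ids as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, sim in pairs:
        # Both ids are registered even when the pair is below the threshold,
        # so unmatched records end up as singleton clusters.
        root_a, root_b = find(a), find(b)
        if sim >= threshold and root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for x in list(parent):
        clusters[find(x)].add(x)
    return sorted(sorted(c) for c in clusters.values())


def merge_cluster(cluster, records):
    """Collapse one cluster into a single tuple.

    Hypothetical policy: for each field, keep the longest value in the cluster.
    """
    fields = records[cluster[0]].keys()
    return {f: max((records[r][f] for r in cluster), key=len) for f in fields}


# Hypothetical sample data: four records and three scored candidate pairs.
records = {
    1: {"name": "J. Smith", "city": "Paris"},
    2: {"name": "John Smith", "city": "Paris"},
    3: {"name": "John Smith", "city": ""},
    4: {"name": "A. Jones", "city": "Lyon"},
}
pairs = [(1, 2, 0.9), (2, 3, 0.85), (3, 4, 0.2)]

clusters = cluster_by_transitive_closure(pairs, threshold=0.8)
merged = [merge_cluster(c, records) for c in clusters]
```

Records 1 and 3 are never compared directly with a high score, but they land in the same cluster because both match record 2; that is the effect of grouping by transitive closure. A real cleaning program would likely use a domain-specific merge policy rather than the longest-value rule assumed here.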