XML

XML Duplicate Detection Using Sorted Neighborhoods

Authors: Puhlmann, Sven; Weis, Melanie; Naumann, Felix
Year: 2006
Venue: Conference on Extending Database Technology (EDBT) 2006

Detecting duplicates is a problem with a long tradition in many domains, such as customer relationship management and data warehousing. The problem is twofold: first, define a suitable similarity measure; second, efficiently apply the measure to all pairs of objects. With the advent and spread of the XML data model, it is necessary to find new similarity measures and to develop efficient methods to detect duplicate elements in nested XML data.

DogmatiX tracks down duplicates in XML

Authors: Weis, M; Naumann, F
Year: 2005
Venue: Proceedings of the 2005 ACM SIGMOD international conference

Duplicate detection is the problem of detecting different entries in a data source that represent the same real-world entity. While research abounds on duplicate detection in relational data, there is as yet little work on duplicates in other, more complex data models, such as XML.
