Structure Aware XML Object Identification

Milano, D.; Scannapieco, M.; Catarci, T.
Clean DB, 2006

The object identification problem is particularly hard for XML data because
of its structural flexibility. Tree edit distances have been proposed for
approximate comparisons among XML trees. However, such distances ignore the
semantics implicit in the structure of XML data, and their use is
computationally infeasible for unordered data. In this paper, we define a new
distance for XML data, the structure aware XML distance, that overcomes these
issues, together with a polynomial-time algorithm to calculate it, and we
present experimental results that demonstrate its effectiveness and efficiency.

