When integrating information from multiple websites, the same data objects can
exist in inconsistent text formats across sites, making it difficult to identify
matching objects using exact text match. We have developed an object identification
system called Active Atlas, which compares the objects' shared attributes in order
to identify matching objects. Certain attributes are more important for decid-
ing if a mapping should exist between two objects. Previous methods of object
identification have required manual construction of object identification rules or
mapping rules for determining the mappings between objects, as well as domain-
dependent transformations for recognizing format inconsistencies. This manual
process is time-consuming and error-prone. In our approach, Active Atlas learns
to simultaneously tailor both mapping rules and a set of general transformations
to a specific application domain, through limited user input. The experimental
results demonstrate that we achieve higher accuracy and require less user
involvement than previous methods across various application domains.
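
To make the idea of attribute-based matching concrete, the following is a minimal
sketch and not the Active Atlas implementation. It assumes two hypothetical general
transformations (token equality and prefix abbreviation), and it uses hand-set
attribute weights and a hand-set threshold where Active Atlas would learn the
mapping rules from user input. All names, weights, and records are illustrative.

```python
import re

def tokens(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def token_matches(a, b):
    """Two illustrative general transformations: equality and prefix abbreviation."""
    if a == b:
        return True
    shorter, longer = sorted((a, b), key=len)
    # Require at least two characters so single letters do not match everything.
    return len(shorter) >= 2 and longer.startswith(shorter)

def attribute_similarity(value_a, value_b):
    """Fraction of tokens in value_a that some token in value_b matches."""
    ta, tb = tokens(value_a), tokens(value_b)
    if not ta or not tb:
        return 0.0
    matched = sum(1 for x in ta if any(token_matches(x, y) for y in tb))
    return matched / max(len(ta), len(tb))

# Assumed values: a learned mapping rule would replace these hand-set numbers.
WEIGHTS = {"name": 0.6, "street": 0.3, "phone": 0.1}
THRESHOLD = 0.7

def match_score(obj_a, obj_b, weights=WEIGHTS):
    """Weighted sum of per-attribute similarities over the shared attributes."""
    return sum(w * attribute_similarity(obj_a[attr], obj_b[attr])
               for attr, w in weights.items())

def is_match(obj_a, obj_b, weights=WEIGHTS, threshold=THRESHOLD):
    """Declare a mapping when the weighted score reaches the threshold."""
    return match_score(obj_a, obj_b, weights) >= threshold

# Hypothetical records describing objects from two different sites.
site_a = {"name": "Joe's Delicatessen", "street": "100 Main Street", "phone": "555-0100"}
site_b = {"name": "Joe's Deli", "street": "100 Main St", "phone": "555 0100"}
site_c = {"name": "Main Street Hotel", "street": "9 Harbor Road", "phone": "555-0199"}

print(is_match(site_a, site_b))  # same object, different text formats
print(is_match(site_a, site_c))  # different objects
```

Giving the name attribute the largest weight reflects the observation above that
certain attributes matter more when deciding whether a mapping should exist.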