Learning domain-independent string transformation weights for high accuracy object identification

Authors: Tejada, S; Knoblock, CA; Minton, S
Year: 2002
Venue: Proceedings of the eighth ACM SIGKDD international

The task of object identification occurs when integrating information from multiple websites. The same data objects can exist in inconsistent text formats across sites, making it difficult to identify matching objects using exact text matching. Previous methods of object identification have required manual construction of domain-specific string transformations or manual setting of general transformation parameter weights for recognizing format inconsistencies. This manual process can be time-consuming and error-prone.
