
Methods for linking and mining massive heterogeneous databases

Authors: Pinheiro, J.C.; Sun, D.X.
Year: 1998
Venue: Fourth International Conference on Knowledge Discovery and Data Mining, 1998

Many real-world KDD expeditions involve investigation of relationships between variables in different, heterogeneous databases. We present a dynamic programming technique for linking records in multiple heterogeneous databases using loosely defined fields that allow free-style verbatim entries. We develop an interestingness measure based on non-parametric randomization tests, which can be used for mining potentially useful relationships among variables. This measure uses distributional characteristics of historical events, hence accommodating variable-length
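
As a rough illustration of linking records on free-text fields, the sketch below uses Levenshtein edit distance, a standard dynamic programming algorithm. The abstract does not state the authors' actual formulation, so this is a swapped-in generic technique and not the paper's method. The names `edit_distance`, `normalize`, `link_records`, and the `max_ratio` threshold are all hypothetical, chosen only for this example.

```python
from typing import List, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def link_records(left: List[str], right: List[str],
                 max_ratio: float = 0.3) -> List[Optional[int]]:
    """For each entry in `left`, return the index of the closest entry in
    `right`, or None when no candidate is within `max_ratio` (an assumed
    cutoff on distance divided by the longer string's length)."""
    links: List[Optional[int]] = []
    for entry in left:
        a = normalize(entry)
        best_idx, best_score = None, None
        for idx, candidate in enumerate(right):
            b = normalize(candidate)
            longest = max(len(a), len(b)) or 1  # avoid division by zero
            score = edit_distance(a, b) / longest
            if best_score is None or score < best_score:
                best_idx, best_score = idx, score
        if best_score is not None and best_score <= max_ratio:
            links.append(best_idx)
        else:
            links.append(None)
    return links
```

For example, `link_records(["Power supply  failure"], ["power suply failure", "fan noise"])` links the first entry to index 0 despite the misspelling and the extra whitespace. The pairwise comparison is quadratic in the number of records, so a real linkage job on massive databases would need some form of candidate pruning first.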
