Personal names

Adaptive name matching in information integration

Bilenko, M; Mooney, R; Cohen, W; Ravikumar, P; Fienberg, S
Intelligent Systems

Identifying approximately duplicate database records that refer to the same entity is essential for information integration. The authors compare and describe methods for combining and learning textual similarity measures for name matching.

Efficient topic-based unsupervised name disambiguation

Song, Y; Huang, J; Councill, IG; Li, J; Giles, CL
Proc. 2007 Conf. on Digital Libraries

Name ambiguity is a special case of identity uncertainty where one person can be referenced by multiple name variations in different situations or even share the same name with other people. In this paper, we focus on the problem of disambiguating person names within web pages and scientific documents. We present an efficient and effective two-stage approach to disambiguate names. In the first stage, two novel topic-based models are proposed by extending two hierarchical Bayesian text models, namely Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA).

Personal Name Matching: New Test Collections and a Social Network based Approach.

Reuther, P
Tech. Report, Univ. Trier

This paper gives an overview of Personal Name Matching. Personal name matching is of great importance for all applications that deal with personal names. The problem with personal names is that they are not unique and sometimes even for one name many variations exist. This leads to the fact that databases on the one hand may have several entries for one and the same person and on the other hand have one entry for many different persons. For the evaluation of Personal Name Matching algorithms test collections are of great

Managing the Quality of Person Names in DBLP

Reuther, P; Walter, B; Ley, M; Weber, A; Klink, S

Quality management is, not only for digital libraries, an important task in which many dimensions and different aspects have to be considered. The following paper gives a short overview on DBLP in which the data acquisition and maintenance process underlying DBLP is discussed from a quality point of view. The paper finishes with a new approach to identify erroneous person names.

Getty's Synoname and its cousins: A survey of applications of personal name-matching algorithms

Borgman, CL; Siegfried, SL
Journal of the American Society for Information Science

The study reported in this article was commissioned by the Getty Art History Information Program (AHIP) as a background investigation of personal name-matching programs in fields other than art history, for purposes of comparing them and their approaches with AHIP’s Synoname™ project. We review techniques employed in a variety of applications, including art history, bibliography, genealogy, commerce, and government, providing a framework of personal name characteristics, factors in selecting matching techniques, and types of applications.

