Getty's Synoname and its cousins: A survey of applications of personal name-matching algorithms

Borgman, CL; Siegfried, SL
Journal of the American Society for Information Science
The study reported in this article was commissioned by
the Getty Art History Information Program (AHIP) as a
background investigation of personal name-matching
programs in fields other than art history, for purposes of
comparing them and their approaches with AHIP’s Synoname™
project. We review techniques employed in a
variety of applications, including art history, bibliography,
genealogy, commerce, and government, providing a
framework of personal name characteristics, factors in
selecting matching techniques, and types of applications.
Personal names, as data elements in information
systems, vary for a wide range of legitimate reasons,
including cultural and historical traditions, translation
and transliteration, reporting and recording variations,
as well as typographical and phonetic errors. Some
matching applications seek to link variants, while others
seek to correct errors. Matching techniques differ in
the amount of domain knowledge about names that they
incorporate, the sources of data, and the human and
computing resources they require.
Personal name-matching techniques may be included in
name authority work, information retrieval, or duplicate
detection, with some applications matching on name
only, and others combining personal names with other
data elements in record linkage techniques. We discuss
both phonetic- and pattern-matching techniques, reviewing
a range of implemented and proposed name-matching
techniques in the context of these factors.