Towards Next Generation CiteSeer: A Flexible Architecture for Digital Library Deployment

Councill, Isaac G.; Giles, C. Lee; Iorio, Ernesto Di; Gori, Marco; Maggini, Marco; Pucci, Augusto
Research and Advanced Technology for Digital Libraries, 10th European Conference (ECDL 2006): 111-122, 2006
CiteSeer began as the first search engine for scientific literature to
incorporate Autonomous Citation Indexing, and has since grown to be a
well-used, open archive for computer and information science publications,
currently indexing over 730,000 academic documents. However, CiteSeer
currently faces significant challenges that must be overcome in order to
improve the quality of the service and guarantee that CiteSeer will
continue to be a valuable, up-to-date resource well into the foreseeable
future. This paper describes a new architectural framework for CiteSeer
system deployment, named CiteSeer Plus. The new framework supports
distributed indexing and storage for load balancing and fault-tolerance as
well as modular service deployment to increase system flexibility and
reduce maintenance costs. In order to facilitate novel approaches to
information extraction, a blackboard framework is built into the
architecture.
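
The abstract does not describe how the blackboard framework is organized, so the following is only a minimal sketch of the general blackboard pattern, not the CiteSeer Plus implementation. It assumes independent extraction modules that read from and write to a shared store, with a simple control loop that runs each module once its inputs are available. Every name here (`Blackboard`, `KnowledgeSource`, `control_loop`, the three example extractors, and the entry keys) is hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Blackboard:
    """Shared store that extraction modules read from and write to."""
    entries: Dict[str, object] = field(default_factory=dict)


@dataclass
class KnowledgeSource:
    """One extraction module: the keys it needs and the key it produces."""
    name: str
    needs: List[str]
    produces: str
    run: Callable[[Blackboard], object]


def control_loop(board: Blackboard, sources: List[KnowledgeSource]) -> List[str]:
    """Run every source whose inputs exist and whose output is still missing,
    repeating until a full pass makes no progress. Returns the firing order."""
    fired = []
    progress = True
    while progress:
        progress = False
        for ks in sources:
            ready = all(key in board.entries for key in ks.needs)
            if ready and ks.produces not in board.entries:
                board.entries[ks.produces] = ks.run(board)
                fired.append(ks.name)
                progress = True
    return fired


# Hypothetical extractors (assumptions, not taken from the paper).
def extract_title(board: Blackboard) -> str:
    # Treat the first non-empty line as the title.
    lines = [ln.strip() for ln in board.entries["text"].splitlines() if ln.strip()]
    return lines[0] if lines else ""


def extract_citations(board: Blackboard) -> List[str]:
    # Treat lines starting with "[" as citation strings.
    return [ln.strip() for ln in board.entries["text"].splitlines()
            if ln.strip().startswith("[")]


def count_citations(board: Blackboard) -> int:
    # Depends on the output of another module, not on the raw text.
    return len(board.entries["citations"])


# Registration order is deliberately not the dependency order.
SOURCES = [
    KnowledgeSource("citation_counter", ["citations"], "citation_count", count_citations),
    KnowledgeSource("title_extractor", ["text"], "title", extract_title),
    KnowledgeSource("citation_extractor", ["text"], "citations", extract_citations),
]


def process_document(text: str) -> Dict[str, object]:
    """Post the raw text to a fresh blackboard and let the modules fill it in."""
    board = Blackboard({"text": text})
    control_loop(board, SOURCES)
    return board.entries
```

In this sketch a new extraction approach is added by appending another `KnowledgeSource` to `SOURCES`; existing modules need no changes because they communicate only through the shared entries, which is the kind of modularity a blackboard design is usually chosen for.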