3rd International ICST Conference on Scalable Information Systems

Research Article

Collection Selection ...now, with more documents!

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.4108/ICST.INFOSCALE2008.3515,
        author={Diego Puppin},
        title={Collection Selection ...now, with more documents!},
        booktitle={3rd International ICST Conference on Scalable Information Systems},
        publisher={ICST},
        series={INFOSCALE},
        year={2010},
        month={5},
        keywords={Distributed IR, Collection Selection, Index Update, Web Search Engines},
        doi={10.4108/ICST.INFOSCALE2008.3515}
    }
    
  • Diego Puppin
    Year: 2010
    Collection Selection ...now, with more documents!
    INFOSCALE
    ICST
    DOI: 10.4108/ICST.INFOSCALE2008.3515
Diego Puppin (1, 2, *)
  • 1: ISTI-CNR (Pisa, Italy)
  • 2: Google (Boston, USA)
*Contact email: diego.puppin@alum.mit.edu

Abstract

One way to reduce the computing load in a distributed IR system is to partition the documents into collections and perform collection selection, so that each query is sent only to a subset of the collections. With suitable training and/or modeling, the collection selection function can choose the most promising collections for each query with high confidence. Unfortunately, if the collections need to be updated, we must retrain the selection function, update its statistics, or accept some loss of result quality. This paper introduces a simple but very effective technique for adding new documents to collections in a system that uses collection selection. We show that the individual collections can be updated while guaranteeing the same selection performance, with no need to update or retrain the selection function.

Keywords
Distributed IR, Collection Selection, Index Update, Web Search Engines
Published
2010-05-16
Publisher
ICST
Modified
2010-05-16
http://dx.doi.org/10.4108/ICST.INFOSCALE2008.3515
Copyright © 2008–2025 ICST