Research Article
Link-Based Ranking of the Web with Source-Centric Collaboration
@INPROCEEDINGS{10.1109/COLCOM.2006.361840,
  author={James Caverlee and Ling Liu and William B. Rouse},
  title={Link-Based Ranking of the Web with Source-Centric Collaboration},
  proceedings={2nd International ICST Conference on Collaborative Computing: Networking, Applications and Worksharing},
  publisher={IEEE},
  proceedings_a={COLLABORATECOM},
  year={2007},
  month={5},
  keywords={Algorithm design and analysis, Collaboration, Computer applications, Computer displays, Couplings, Humans, Image databases, Large-scale systems, Paper technology, Web pages},
  doi={10.1109/COLCOM.2006.361840}
}
James Caverlee
Ling Liu
William B. Rouse
Year: 2007
COLLABORATECOM
IEEE
DOI: 10.1109/COLCOM.2006.361840
Abstract
Web ranking is one of the most successful and widely used collaborative computing applications, in which Web pages collaborate in the form of varying degrees of relationships to assess their relative quality. Though many observe that links display strong source-centric locality, for example, in terms of administrative domains and hosts, most Web ranking analysis to date has focused on the flat page-level Web linkage structure. In this paper we develop a framework for link-based collaborative ranking of the Web that exploits this strong source-centric link structure. We argue that source-centric link analysis is promising because it captures the natural link-locality structure of the Web, can support more appealing and efficient Web applications, and reflects many natural types of structured human collaboration. Concretely, we propose a generic framework for source-centric collaborative ranking of the Web. This paper makes two unique contributions. First, we provide a rigorous study of the critical parameters that can impact source-centric link analysis, such as source size, the presence of self-links, and different source-citation link weighting schemes (e.g., uniform, link count, source consensus). Second, we conduct a large-scale experimental study to understand how different parameter settings affect the time complexity, stability, and spam-resilience of Web ranking. We find that careful tuning of these parameters is vital both to succeed on each objective and to balance performance across all of them.
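The parameters named in the abstract can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' implementation: a source is assumed to be a URL host, "uniform" is assumed to give every source edge equal weight, "link count" is assumed to count page-level links between two sources, and "source consensus" is assumed to count the distinct pages in the citing source that link to the target source. The function names (`source_of`, `source_edge_weights`, `rank_sources`), the sample links, and the damping value are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def source_of(url):
    """Map a page URL to its source (assumption: one source per host)."""
    return urlparse(url).netloc


def source_edge_weights(page_links, scheme="consensus", keep_self_links=False):
    """Aggregate page-level links into row-normalized source-level edge weights."""
    link_count = defaultdict(int)    # (src, dst) -> number of page-level links
    citing_pages = defaultdict(set)  # (src, dst) -> distinct citing pages in src
    for from_page, to_page in page_links:
        src, dst = source_of(from_page), source_of(to_page)
        if src == dst and not keep_self_links:
            continue  # the self-link parameter: drop intra-source links
        link_count[(src, dst)] += 1
        citing_pages[(src, dst)].add(from_page)

    raw = {}
    for edge in link_count:
        if scheme == "uniform":
            raw[edge] = 1.0
        elif scheme == "link_count":
            raw[edge] = float(link_count[edge])
        elif scheme == "consensus":
            raw[edge] = float(len(citing_pages[edge]))
        else:
            raise ValueError(f"unknown scheme: {scheme}")

    # Normalize so the outgoing weights of each source sum to 1.
    out_total = defaultdict(float)
    for (src, _), weight in raw.items():
        out_total[src] += weight
    return {edge: weight / out_total[edge[0]] for edge, weight in raw.items()}


def rank_sources(weights, sources, damping=0.85, iterations=50):
    """PageRank-style power iteration over the weighted source graph."""
    n = len(sources)
    score = {s: 1.0 / n for s in sources}
    has_out = {src for src, _ in weights}
    for _ in range(iterations):
        # Sources without outgoing edges spread their score evenly.
        dangling = sum(score[s] for s in sources if s not in has_out)
        nxt = {s: (1.0 - damping) / n + damping * dangling / n for s in sources}
        for (src, dst), weight in weights.items():
            nxt[dst] += damping * score[src] * weight
        score = nxt
    return score


# Hypothetical page-level links: one page on a.example links to b.example
# three times, while two different pages on a.example each link to c.example.
PAGE_LINKS = [
    ("http://a.example/1", "http://b.example/x"),
    ("http://a.example/1", "http://b.example/y"),
    ("http://a.example/1", "http://b.example/w"),
    ("http://a.example/1", "http://c.example/z"),
    ("http://a.example/2", "http://c.example/z"),
    ("http://a.example/2", "http://a.example/1"),
]
SOURCES = ["a.example", "b.example", "c.example"]

consensus_scores = rank_sources(source_edge_weights(PAGE_LINKS, "consensus"), SOURCES)
```

In this toy graph the link-count scheme favors b.example, because a single page links to it three times, whereas the consensus scheme favors c.example, because two distinct pages cite it. That difference hints at why the choice of weighting scheme could matter for spam resilience, since one page can inflate a link count but not a count of distinct citing pages.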