2nd International ICST Conference on Scalable Information Systems

Research Article

Scalable Keyword Search Based on Semantic in DHT Based Peer-to-Peer System

  • @INPROCEEDINGS{10.4108/infoscale.2007.940,
        author={Wenhui Ma and Wenfang Wang and Jing Liu},
        title={Scalable Keyword Search Based on Semantic in DHT Based Peer-to-Peer System},
        proceedings={2nd International ICST Conference on Scalable Information Systems},
        proceedings_a={INFOSCALE},
        year={2010},
        month={5},
        keywords={Peer-to-Peer Distributed Hash Tables ontology search},
        doi={10.4108/infoscale.2007.940}
    }
    
  • Wenhui Ma
    Wenfang Wang
    Jing Liu
    Year: 2010
    Scalable Keyword Search Based on Semantic in DHT Based Peer-to-Peer System
    INFOSCALE
    ICST
    DOI: 10.4108/infoscale.2007.940
Wenhui Ma¹*, Wenfang Wang¹*, Jing Liu¹*
  • 1: College of Information Science and Technology, University of Nankai, Tianjin 300071, China
*Contact email: wenhuima_nk@hotmail.com, wwwfonline@eyou.com, jingliu@nankai.edu.cn

Abstract

The common approach to keyword search in Peer-to-Peer (P2P) systems based on Distributed Hash Tables (DHTs) is to build a distributed inverted index over keywords. However, this approach scales poorly in its consumption of resources such as bandwidth and storage. In this paper, we present SKS, a scalable keyword search approach for DHT-based P2P systems. SKS uses an ontology to organize a specific domain and capture the semantic relations between words. It builds the distributed inverted index over concepts rather than keywords, which reduces the number of index entries published per document and avoids intersecting inverted lists across nodes when executing multi-keyword searches. With the concept index, SKS transforms keyword search into a concept-matching process, thereby implementing semantic search. Simulation experiments show that SKS incurs lower index-publishing overhead and lower query overhead than a distributed inverted index over keywords.
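
The idea of indexing by concepts rather than keywords can be illustrated with a minimal sketch. It is not the SKS implementation: the toy ontology, the `ConceptIndex` class, the `node_for` placement function, and the choice to store each document's full concept set in its index entries are all assumptions made for illustration. The DHT is simulated in a single process, with a hash of the key modulo a fixed node count standing in for DHT routing.

```python
import hashlib
from collections import defaultdict

# Hypothetical toy ontology (keyword -> concept); a real domain ontology
# would be far richer than this flat mapping.
ONTOLOGY = {
    "car": "vehicle",
    "automobile": "vehicle",
    "truck": "vehicle",
    "engine": "engine",
    "motor": "engine",
}

NUM_NODES = 8  # assumed size of the simulated DHT


def node_for(key):
    """Map a key to a simulated node id (stand-in for DHT routing)."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_NODES


def to_concepts(words):
    """Translate words to concepts; words outside the ontology are ignored."""
    return frozenset(ONTOLOGY[w] for w in words if w in ONTOLOGY)


class ConceptIndex:
    """Inverted index keyed by concept, spread over simulated nodes."""

    def __init__(self):
        # node id -> concept -> {doc id: full concept set of that document}
        self.nodes = defaultdict(lambda: defaultdict(dict))
        self.entries_published = 0

    def publish(self, doc_id, words):
        concepts = to_concepts(words)
        for concept in concepts:
            # One entry per distinct concept, not per distinct keyword.
            self.nodes[node_for(concept)][concept][doc_id] = concepts
            self.entries_published += 1

    def search(self, query_words):
        wanted = to_concepts(query_words)
        if not wanted:
            return set()
        # Assumption: because each entry carries the document's concept set,
        # one node can answer the whole query by filtering locally, so no
        # inverted lists are intersected across nodes.
        first = min(wanted)
        postings = self.nodes[node_for(first)][first]
        return {doc for doc, cs in postings.items() if wanted <= cs}


index = ConceptIndex()
index.publish("doc1", ["car", "automobile", "truck", "motor"])
index.publish("doc2", ["truck"])
print(index.entries_published)
print(sorted(index.search(["automobile", "engine"])))
```

In this sketch, `doc1` contains four distinct keywords but maps to only two concepts, so it publishes two index entries instead of four. A query for "automobile engine" matches `doc1` even though the word "engine" never appears in it, because "motor" and "engine" share a concept.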