Context-Aware Systems and Applications. First International Conference, ICCASA 2012, Ho Chi Minh City, Vietnam, November 26-27, 2012, Revised Selected Papers

Research Article

Improving Vietnamese Web Page Classification by Combining Hybrid Feature Selection and Label Propagation with Link Information

Cite
  • @INPROCEEDINGS{10.1007/978-3-642-36642-0_32,
        author={Ngo Linh and Nguyen Thi Kim Anh and Cao Dat},
        title={Improving Vietnamese Web Page Classification by Combining Hybrid Feature Selection and Label Propagation with Link Information},
        proceedings={Context-Aware Systems and Applications. First International Conference, ICCASA 2012, Ho Chi Minh City, Vietnam, November 26-27, 2012, Revised Selected Papers},
        proceedings_a={ICCASA},
        year={2013},
        month={2},
        keywords={Feature Selection, Label Propagation, Web Classification, Web Mining},
        doi={10.1007/978-3-642-36642-0_32}
    }
    
  • Ngo Linh
    Nguyen Thi Kim Anh
    Cao Dat
    Year: 2013
    Improving Vietnamese Web Page Classification by Combining Hybrid Feature Selection and Label Propagation with Link Information
    ICCASA
    Springer
    DOI: 10.1007/978-3-642-36642-0_32
Ngo Linh¹, Nguyen Thi Kim Anh¹,*, Cao Dat¹,*
  • 1: Hanoi University of Science and Technology
*Contact email: anhnk@soict.hut.edu.vn, caomanhdat317@gmail.com

Abstract

Classification of web pages is essential to many information management and retrieval tasks, such as maintaining web directories and focused crawling. One problem in web page classification is that unlabeled training examples are readily available, while labeled ones are often costly to obtain. Furthermore, the uncontrolled nature of web content presents additional challenges, whereas the interconnected nature of hypertext can provide useful information for the process. To address these problems, we propose a graph-based semi-supervised classification framework that iteratively combines hybrid semi-supervised feature selection with Label Propagation learning using link information to improve Vietnamese web page classification. The experimental results show that the proposed method outperforms state-of-the-art methods applied to Vietnamese web page classification.
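
To make the label propagation idea concrete, the following is a minimal sketch of generic label propagation with clamped seed labels over a link graph. It is not the authors' implementation: the abstract does not give the graph construction, the edge weighting, or the feature selection step, so the function name `propagate_labels`, the edge weights, and the toy graph are all illustrative assumptions.

```python
def propagate_labels(edges, seeds, n_nodes, n_classes, max_iter=100, tol=1e-6):
    """Generic label propagation on an undirected weighted graph.

    edges: list of (u, v, weight) tuples; here a weight is a hypothetical
           link strength between two pages.
    seeds: dict mapping a labeled node index to its class index.
    Returns the predicted class index for every node.
    """
    # Build a symmetric adjacency list.
    neighbors = [[] for _ in range(n_nodes)]
    for u, v, w in edges:
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))

    # Labeled pages start one-hot; unlabeled pages start uniform.
    scores = [[1.0 / n_classes] * n_classes for _ in range(n_nodes)]
    for node, label in seeds.items():
        scores[node] = [1.0 if c == label else 0.0 for c in range(n_classes)]

    for _ in range(max_iter):
        new_scores = []
        max_change = 0.0
        for node in range(n_nodes):
            # Seed labels stay clamped; isolated nodes keep their scores.
            if node in seeds or not neighbors[node]:
                new_scores.append(scores[node])
                continue
            total = sum(w for _, w in neighbors[node])
            # Each unlabeled page takes the weighted average of its neighbors.
            row = [
                sum(w * scores[nb][c] for nb, w in neighbors[node]) / total
                for c in range(n_classes)
            ]
            change = max(abs(a - b) for a, b in zip(row, scores[node]))
            max_change = max(max_change, change)
            new_scores.append(row)
        scores = new_scores
        if max_change < tol:
            break

    # Assign each page the class with the highest propagated score.
    return [max(range(n_classes), key=lambda c: row[c]) for row in scores]


# Toy graph (assumed data): two groups of pages joined by one weak link.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 0.1), (3, 4, 1.0), (4, 5, 1.0)]
seeds = {0: 0, 5: 1}  # page 0 is labeled class 0, page 5 is labeled class 1
labels = propagate_labels(edges, seeds, n_nodes=6, n_classes=2)
print(labels)
```

In this toy graph the weak link between pages 2 and 3 acts as the boundary, so each unlabeled page ends up with the label of the seed on its own side. A real pipeline would presumably derive edge weights from both hyperlinks and content similarity over the selected features.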

Keywords
Feature Selection, Label Propagation, Web Classification, Web Mining
Published
2013-02-04
http://dx.doi.org/10.1007/978-3-642-36642-0_32