First EAI International Conference on Computer Science and Engineering

Research Article

Unsupervised Text Feature Selection Technique Based on Particle Swarm Optimization Algorithm for Improving the Text Clustering

  • @INPROCEEDINGS{10.4108/eai.27-2-2017.152282,
        author={Laith Mohammad Abualigah and Ahamad Tajudin Khader and Mohammed Azmi AlBetar and Essam Said Hanandeh},
        title={Unsupervised Text Feature Selection Technique Based on Particle Swarm Optimization Algorithm for Improving the Text Clustering},
        proceedings={First EAI International Conference on Computer Science and Engineering},
        publisher={EAI},
        proceedings_a={COMPSE},
        year={2017},
        month={3},
        keywords={unsupervised feature selection, informative features, particle swarm optimization algorithm, K-mean text clustering technique},
        doi={10.4108/eai.27-2-2017.152282}
    }
    
  • Laith Mohammad Abualigah
    Ahamad Tajudin Khader
    Mohammed Azmi AlBetar
    Essam Said Hanandeh
    Year: 2017
    Unsupervised Text Feature Selection Technique Based on Particle Swarm Optimization Algorithm for Improving the Text Clustering
    COMPSE
    EAI
    DOI: 10.4108/eai.27-2-2017.152282
Laith Mohammad Abualigah¹, Ahamad Tajudin Khader, Mohammed Azmi AlBetar, Essam Said Hanandeh
  • 1: School of Computer Sciences, Universiti Sains Malaysia (USM), Pulau Pinang, Malaysia 11800

Abstract

As the amount of text information on internet web pages increases, dealing with this information becomes very complex because of its volume. Text clustering is an appropriate technique for handling a huge number of text documents by partitioning a set of documents into groups. Text documents contain uninformative features, which decrease the performance of the text clustering technique. Unsupervised feature selection is a technique that selects informative features by creating a new subset of informative features. This technique is used to improve the performance of the underlying algorithm. Recently, several complex optimization problems have been solved successfully by metaheuristic algorithms. In this paper, we propose using the particle swarm optimization algorithm to solve the feature selection problem; the resulting method is named FSPSOTC. The feature selection technique helps the k-means text clustering technique to obtain more accurate clusters. Experiments were conducted using four standard benchmark text datasets with different characteristics. Experimental results showed that the proposed method (FSPSOTC) enhanced the performance of the text clustering technique by working with a new subset of informative features.
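
The idea of selecting a feature subset with particle swarm optimization can be illustrated with a minimal binary PSO sketch. It is not the FSPSOTC implementation: the abstract does not state the fitness function or parameter values, so the `fitness` objective (variance of term weights minus a size penalty), the names `binary_pso` and `variance`, and all parameter defaults are assumptions made for illustration. Each particle is a 0/1 mask over the features of a small hypothetical document-term matrix.

```python
import math
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def fitness(mask, docs, penalty=0.01):
    # Placeholder objective (an assumption, not the paper's): reward features
    # whose weights vary across documents and charge a small cost per feature.
    score = 0.0
    for j, keep in enumerate(mask):
        if keep:
            score += variance([doc[j] for doc in docs]) - penalty
    return score


def binary_pso(docs, n_particles=10, n_iters=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness`."""
    rng = random.Random(seed)
    n_features = len(docs[0])
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, docs) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][j]
                     + c1 * r1 * (pbest[i][j] - pos[i][j])
                     + c2 * r2 * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-6.0, min(6.0, v))  # clamp the velocity
                # Sigmoid of the velocity gives the probability of bit = 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i], docs)
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Hypothetical term weights: 3 documents x 4 features.
docs = [
    [0.9, 0.1, 0.0, 0.5],
    [0.0, 0.1, 0.8, 0.5],
    [0.8, 0.1, 0.1, 0.5],
]
mask, score = binary_pso(docs)
# Keep only the selected columns; this reduced matrix is what a clustering
# step such as k-means would then receive.
reduced = [[x for x, keep in zip(doc, mask) if keep] for doc in docs]
```

The sigmoid transfer function is one common way to adapt PSO to binary decisions; other encodings are possible, and the source does not say which one the authors used.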