First EAI International Conference on Computer Science and Engineering

Research Article

Text Segmentation for Analysing Different Languages

  • @INPROCEEDINGS{10.4108/eai.27-2-2017.152280,
        author={Irina Pak and Phoey Lee Teh},
        title={Text Segmentation for Analysing Different Languages},
        proceedings={First EAI International Conference on Computer Science and Engineering},
        publisher={EAI},
        proceedings_a={COMPSE},
        year={2017},
        month={3},
        keywords={Text Segmentation, Text Analysis, Text Processing, Languages, Online Reviews, Opinion Mining},
        doi={10.4108/eai.27-2-2017.152280}
    }
    
  • Irina Pak
    Phoey Lee Teh
    Year: 2017
    Text Segmentation for Analysing Different Languages
    COMPSE
    EAI
    DOI: 10.4108/eai.27-2-2017.152280
Irina Pak¹,*, Phoey Lee Teh
  • 1: Department of Computing and Information Systems, Sunway University, Bandar Sunway, Malaysia
*Contact email: irina.p@imail.sunway.edu.my

Abstract

Over the past several years, researchers have applied different methods of text segmentation. Text segmentation is defined as a method of splitting a document into smaller segments, each assumed to carry its own relevant meaning. These segments can be classified as tags, words, sentences, topics, phrases or any other information unit. This study first reviews the text segmentation methods used for different types of documents, and then discusses the various reasons for utilising them in opinion mining. The main contribution of this study is a summary of research papers from the past 10 years that applied text segmentation as their main approach to text analysis. The results show that word segmentation has been widely and successfully used for processing different languages.