Industrial Networks and Intelligent Systems. Second International Conference, INISCOM 2016, Leicester, UK, October 31 – November 1, 2016, Revised Selected Papers

Research Article

Improving Classification of Tweets Using Linguistic Information from a Large External Corpus

  • @INPROCEEDINGS{10.1007/978-3-319-52569-3_11,
        author={Hugo Hammer and Anis Yazidi and Aleksander Bai and Paal Engelstad},
        title={Improving Classification of Tweets Using Linguistic Information from a Large External Corpus},
        booktitle={Industrial Networks and Intelligent Systems. Second International Conference, INISCOM 2016, Leicester, UK, October 31 -- November 1, 2016, Revised Selected Papers},
        proceedings_a={INISCOM},
        year={2017},
        month={6},
        keywords={Classification, Co-occurrence information, Text mining, Tweets},
        doi={10.1007/978-3-319-52569-3_11}
    }
    
  • Hugo Hammer
    Anis Yazidi
    Aleksander Bai
    Paal Engelstad
    Year: 2017
    Improving Classification of Tweets Using Linguistic Information from a Large External Corpus
    INISCOM
    Springer
    DOI: 10.1007/978-3-319-52569-3_11
Hugo Hammer¹*, Anis Yazidi¹*, Aleksander Bai¹*, Paal Engelstad¹*
  • 1: Oslo and Akershus University College of Applied Sciences
*Contact email: hugo.hammer@hioa.no, anis.yazidi@hioa.no, aleksander.bai@hioa.no, paal.engelstad@hioa.no

Abstract

The bag-of-words representation of documents is often unsatisfactory because it ignores relationships between important terms that do not co-occur literally. Improvements might be achieved by expanding the vocabulary with other relevant words, such as synonyms.