Improving Classification of Tweets Using Linguistic Information from a Large External Corpus

Hugo Hammer; Anis Yazidi; Aleksander Bai; Paal Engelstad

Industrial Networks and Intelligent Systems. Second International Conference, INISCOM 2016, Leicester, UK, October 31 – November 1, 2016, Revised Selected Papers

Research Article

@INPROCEEDINGS{10.1007/978-3-319-52569-3_11,
    author={Hugo Hammer and Anis Yazidi and Aleksander Bai and Paal Engelstad},
    title={Improving Classification of Tweets Using Linguistic Information from a Large External Corpus},
    proceedings={Industrial Networks and Intelligent Systems. Second International Conference, INISCOM 2016, Leicester, UK, October 31 -- November 1, 2016, Revised Selected Papers},
    proceedings_a={INISCOM},
    year={2017},
    month={6},
    keywords={Classification, Co-occurrence information, Text mining, Tweets},
    doi={10.1007/978-3-319-52569-3_11}
}

Hugo Hammer, Anis Yazidi, Aleksander Bai, Paal Engelstad (2017). Improving Classification of Tweets Using Linguistic Information from a Large External Corpus. INISCOM. Springer. DOI: 10.1007/978-3-319-52569-3_11

Hugo Hammer¹,*, Anis Yazidi¹,*, Aleksander Bai¹,*, Paal Engelstad¹,*

1: Oslo and Akershus University College of Applied Sciences

*Contact email: hugo.hammer@hioa.no, anis.yazidi@hioa.no, aleksander.bai@hioa.no, paal.engelstad@hioa.no

Abstract

The bag-of-words representation of documents is often unsatisfactory because it ignores relationships between important terms that do not co-occur literally. Improvements might be achieved by expanding the vocabulary with other relevant words, such as synonyms.
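The limitation described above can be illustrated with a minimal Python sketch. It is not the authors' method: the `RELATED` table, the `weight` parameter, and all function names are hypothetical, and the table merely stands in for word relations that would in practice be derived from a large external corpus. Two short texts with no literal word overlap become comparable once each bag is expanded with related words.

```python
from collections import Counter

# Hypothetical related-word table (an assumption for illustration only);
# a real system would derive such relations from an external corpus.
RELATED = {
    "film": ["movie"],
    "movie": ["film"],
    "great": ["excellent"],
    "excellent": ["great"],
}


def bag_of_words(text):
    """Plain bag of words: lowercase, whitespace-tokenized term counts."""
    return Counter(text.lower().split())


def expanded_bag_of_words(text, related=RELATED, weight=0.5):
    """Bag of words in which each term also contributes a down-weighted
    count to its related words (weight is an assumed tuning parameter)."""
    bag = Counter()
    for token, count in bag_of_words(text).items():
        bag[token] += count
        for other in related.get(token, []):
            bag[other] += weight * count
    return bag


def overlap(a, b):
    """Sum of the minimum counts over the terms shared by both bags."""
    return sum(min(a[t], b[t]) for t in a.keys() & b.keys())


doc_a = "great film"
doc_b = "excellent movie"

# The plain bags share no terms; the expanded bags do.
plain = overlap(bag_of_words(doc_a), bag_of_words(doc_b))
expanded = overlap(expanded_bag_of_words(doc_a), expanded_bag_of_words(doc_b))
print(plain, expanded)
```

The down-weighting keeps literally occurring words dominant over inferred ones, which is one plausible design choice among several.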

Keywords: Classification, Co-occurrence information, Text mining, Tweets

Published: 2017-06-05
Appears in: SpringerLink

DOI: http://dx.doi.org/10.1007/978-3-319-52569-3_11