Cloud Infrastructures, Services, and IoT Systems for Smart Cities. Second EAI International Conference, IISSC 2017 and CN4IoT 2017, Brindisi, Italy, April 20–21, 2017, Proceedings

Research Article

An Analysis of Social Data Credibility for Services Systems in Smart Cities – Credibility Assessment and Classification of Tweets

  • @INPROCEEDINGS{10.1007/978-3-319-67636-4_14,
        author={Iman Abu Hashish and Gianmario Motta and Tianyi Ma and Kaixu Liu},
        title={An Analysis of Social Data Credibility for Services Systems in Smart Cities -- Credibility Assessment and Classification of Tweets},
        proceedings={Cloud Infrastructures, Services, and IoT Systems for Smart Cities. Second EAI International Conference, IISSC 2017 and CN4IoT 2017, Brindisi, Italy, April 20--21, 2017, Proceedings},
        proceedings_a={IISSC \& CN4IOT},
        year={2017},
        month={11},
        keywords={Smart cities, Smart citizens, Social data, Twitter, Twitter bot, Credibility, Veracity, Classification, Social media mining, Machine learning},
        doi={10.1007/978-3-319-67636-4_14}
    }
    
  • Iman Abu Hashish
    Gianmario Motta
    Tianyi Ma
    Kaixu Liu
    Year: 2017
    An Analysis of Social Data Credibility for Services Systems in Smart Cities – Credibility Assessment and Classification of Tweets
    IISSC & CN4IOT
    Springer
    DOI: 10.1007/978-3-319-67636-4_14
Iman Abu Hashish¹*, Gianmario Motta¹*, Tianyi Ma¹*, Kaixu Liu¹*
  • 1: University of Pavia
*Contact email: imanhishamjami.abuhashish01@universitadipavia.it, motta05@unipv.it, tianyi.ma01@universitadipavia.it, kaixu.liu01@universitadipavia.it

Abstract

In the “Information Age”, Smart Cities rely on a wide range of data sources. Among them, social networks can play a major role, provided that the veracity of their information is assessed. Veracity assessment has been, and remains, an active research field. Specifically, our work investigates the credibility of data from Twitter, an online social network and news medium, by considering not only credibility and type, but also origin. Our analysis proceeds in four phases: Features Extraction, Features Analysis, Features Selection, and Classification. In the final phase, we classify whether a Tweet is credible or not credible, whether it is a rumor or spam, and whether it was generated by a human or a Bot. We use Social Media Mining and Machine Learning techniques. Our analysis reaches an overall accuracy higher than the benchmark, and it adds the origin dimension to the credibility analysis method.
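As a rough illustration of the four phases named in the abstract, the following Python sketch runs a toy version of such a pipeline on a handful of invented tweets. Everything in it is an assumption made for illustration: the example texts and labels, the four features (`n_urls`, `n_hashtags`, `n_exclaims`, `n_words`), the mean-gap selection rule, and the nearest-centroid classifier are not taken from the paper, which does not specify its features or models here. Only the credible versus not credible decision is sketched; the rumor/spam and human/Bot dimensions are omitted.

```python
import re
from statistics import mean

# Hypothetical toy data: (tweet text, label). Invented for illustration,
# not drawn from the paper's dataset.
TWEETS = [
    ("Road closed on Main St after flooding, official notice: http://city.example/alerts", "credible"),
    ("City council confirms new bus timetable starting Monday", "credible"),
    ("WIN FREE PHONE!!! CLICK NOW!!! http://a.example http://b.example #free #win #prize", "not_credible"),
    ("OMG!!! aliens landed downtown!!! RT RT RT #breaking #wow #omg", "not_credible"),
]


def extract_features(text):
    """Phase 1, feature extraction: map a tweet to simple numeric features."""
    return {
        "n_urls": len(re.findall(r"https?://\S+", text)),
        "n_hashtags": len(re.findall(r"#\w+", text)),
        "n_exclaims": text.count("!"),
        "n_words": len(text.split()),
    }


def analyze(rows):
    """Phase 2, feature analysis: mean of every feature within each class."""
    classes = sorted({label for _, label in rows})
    names = sorted(rows[0][0])
    return {
        c: {f: mean(x[f] for x, y in rows if y == c) for f in names}
        for c in classes
    }


def select(class_means, k=2):
    """Phase 3, feature selection: keep the k features whose class means
    differ most. Assumes exactly two classes."""
    a, b = class_means.values()
    gaps = {f: abs(a[f] - b[f]) for f in a}
    return sorted(gaps, key=gaps.get, reverse=True)[:k]


def classify(features, class_means, selected):
    """Phase 4, classification: pick the class whose centroid is nearest
    (squared Euclidean distance) on the selected features."""
    def dist(c):
        return sum((features[f] - class_means[c][f]) ** 2 for f in selected)
    return min(class_means, key=dist)


if __name__ == "__main__":
    rows = [(extract_features(text), label) for text, label in TWEETS]
    class_means = analyze(rows)
    selected = select(class_means)
    new_tweet = "FREE MONEY!!!!! #cash #win"
    print(selected, classify(extract_features(new_tweet), class_means, selected))
```

A real system would replace each stage with something stronger, for example account-level features for the origin dimension and a trained classifier evaluated against a benchmark, but the order of the stages would stay the same.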