Big Data Processing Using Hadoop and Spark: The Case of Meteorology Data

Eslam Hussein; Ronewa Sadiki; Yahlieel Jafta; Muhammad Sungay; Olasupo Ajayi; Antoine Bagula

e-Infrastructure and e-Services for Developing Countries. 11th EAI International Conference, AFRICOMM 2019, Porto-Novo, Benin, December 3–4, 2019, Proceedings

Research Article

@INPROCEEDINGS{10.1007/978-3-030-41593-8_13,
    author={Eslam Hussein and Ronewa Sadiki and Yahlieel Jafta and Muhammad Sungay and Olasupo Ajayi and Antoine Bagula},
    title={Big Data Processing Using Hadoop and Spark: The Case of Meteorology Data},
    booktitle={e-Infrastructure and e-Services for Developing Countries. 11th EAI International Conference, AFRICOMM 2019, Porto-Novo, Benin, December 3--4, 2019, Proceedings},
    proceedings_a={AFRICOMM},
    year={2020},
    month={2},
    keywords={Hadoop, MapReduce, Spark, Hive, Meteorology, Big data},
    doi={10.1007/978-3-030-41593-8_13}
}

Eslam Hussein, Ronewa Sadiki, Yahlieel Jafta, Muhammad Sungay, Olasupo Ajayi, Antoine Bagula (2020). Big Data Processing Using Hadoop and Spark: The Case of Meteorology Data. AFRICOMM. Springer. DOI: 10.1007/978-3-030-41593-8_13

Eslam Hussein¹, Ronewa Sadiki¹, Yahlieel Jafta¹, Muhammad Sungay¹, Olasupo Ajayi, Antoine Bagula*

1: University of the Western Cape

*Contact email: abagula@uwc.ac.za

Abstract

Meteorology is a branch of science that can be leveraged to gain useful insight into many phenomena that have significant impacts on our daily lives, such as precipitation, cyclones, thunderstorms and climate change. It is a highly data-driven field that involves large datasets of images captured by both radar and satellite, and it therefore requires efficient technologies for storing, processing and mining these datasets to find hidden patterns. Different big data tools and ecosystems, most of them integrating Hadoop and Spark, have been designed to address big data issues. However, despite the importance of meteorology, few studies have applied these tools and ecosystems to meteorological problems. This paper proposes and evaluates the performance of a precipitation data processing system that builds upon the Cloudera ecosystem to analyse large datasets of images as a classification problem. The system can be used as an alternative to machine learning techniques when the classification problem consists of finding zones of high, moderate and low precipitation in satellite images.

Keywords: Hadoop, MapReduce, Spark, Hive, Meteorology, Big data

Published: 2020-02-14
Appears in: SpringerLink

http://dx.doi.org/10.1007/978-3-030-41593-8_13
