Air Quality Monitoring Systems with Multiple Data Sources for Ho Chi Minh City

In this paper, we present an air quality monitoring system with multiple data sources for smart cities. We deploy the system in Ho Chi Minh City, one of the biggest cities in Vietnam. The proposed system uses data collected by our sensors and data extracted from remote sensing images. The system also allows users to contribute by providing alerts through a portal. With data collected from sensors, we can provide exact values of the fundamental parameters for calculating the air quality index (AQI), while data extracted from remote sensing images help governors estimate AQI values in surrounding areas where no sensors are deployed. Although this estimation cannot provide information as exact as the sensors, it helps us quickly understand the AQI over an extremely large area at low cost. Along with these data sources, notifications from users allow governors to react faster to problems they are not yet aware of. Experimental results show that the error (difference) between our system and commercial devices is less than 24% for the sensing system and less than 9% for the remote sensing image estimation. The sensing system presented in this paper has low energy consumption, using only 900 mW on average.


Introduction
According to the World Health Organization, about 12.6 million people have died due to unhealthy environments [1]. In Vietnam, this number is about 52,000 people per year. Therefore, air quality monitoring systems have become essential, especially in big cities, because of their large populations and their poorer air quality compared with other cities or rural areas.
Although big cities suffer from poor air quality, smart cities provide many benefits for people. According to the Smart City Tracker report, as of Q3/2017 there were more than 400 smart cities in the world [2]. Along with benefits of smart cities such as smart transportation or technology-based applications, inhabitants of big, smart cities need better air quality for their health. Therefore, a smart air quality monitoring system that can collect data from multiple sources is essential. Such a system can help governors react quickly to hazardous air quality issues and bring a pleasant environment back to the inhabitants of the city.
Many countries in the world have built and operated air monitoring and air quality forecasting systems. These systems can collect and predict levels of harmful gases such as O3, NO2, or CO, as well as fine particulate matter (PM2.5) [4][5][6]. These parameters are used to issue early alerts to governors and local inhabitants so that they can respond to these harmful conditions, for example by idling factories, prohibiting vehicles in city centers, or using green vehicles [7].
Although Ho Chi Minh City (HCMC) is the biggest city in Vietnam with the highest population, it has only a few fixed air quality monitoring stations. According to these stations, the PM2.5 level in HCMC rises to the unhealthy level (25 µg/m³) [3]. Therefore, to provide people in HCMC with a better life, air quality monitoring systems with multiple data sources and multiple collected parameters, such as CO, CO2, temperature, and humidity, should be investigated.
In this paper, we present our hazard monitoring system, which uses multiple data sources, including sensors, remote sensing images, and notifications from inhabitants, to build an air quality monitoring system. The system can provide early alerts to both governors and inhabitants in HCMC so that they can react quickly. Data collected by the system can also be used for forecasting the AQI in the future.
The rest of the paper is organized as follows. Related work is presented in Section 2. Section 3 presents an overview of our system. We introduce our sensor nodes for sensing data as well as our remote sensing image processing engine in Section 4. In Section 5, we report values collected by the system and compare them with values measured by commercial devices. In Section 6, we conclude the paper.

Related work
In both the literature and industry, there exist a number of air quality monitoring systems. However, these systems use either sensors for collecting AQI parameters or remote sensing images for estimating the AQI. In this work, we combine both techniques in our system so that they can complement each other.
For the first approach, using sensors for calculating the AQI, other systems can be listed in addition to those mentioned above. One of the earliest systems in the world is GEMS/AIR (Global Environment Monitoring System), built in Sweden for monitoring air quality as well as climate change [8]. According to the paper by Kenneth L. Demerjian, in North America there exist more than 4,000 monitoring stations controlled by the USA, Canada, and Mexico for air quality sensing [8]. Asian countries also operate similar systems for monitoring the AQI, such as those reported in [10][11].
The second approach uses remote sensing images for estimating land surface temperature and PM2.5 [19]. However, the estimation of land surface temperature and fine particulate matter depends on the conditions at the research locations, such as buildings, grass, forests, or lakes. Therefore, we conducted our research for Ho Chi Minh City to build an estimation tool for these parameters, reported in our previous work [20]. In this paper, we present the entire system, in which we combine both of the above approaches.

System overview
In this section, we present an overview of our system. Figure 1 illustrates the proposed system for monitoring air quality in big cities using multiple data sources. As depicted in the figure, the system supports data from four different sources:
• Sensing system: the sensors collect exact values for different parameters at the sensor deployment positions. These values can be provided to governors and inhabitants and can also be used for building estimation models for remote sensing images.
• Server for remote sensing image processing: automatically collecting and processing remote sensing images from satellites such as Landsat or MODIS. The results of this process are estimated values for PM2.5 and land surface temperature at locations inside the city. This engine is useful for estimating harmful parameters at locations without sensors. More details of the estimation models are presented in our previous work [20].
• GUI and portal: providing a portal that allows users to access data and to report any hazardous situations they observe themselves. Feedback from users needs to be double-checked and approved by administrators or governors before it is made public.
• Database systems: storing raw and preprocessed data collected from the three other sources for future purposes such as AI-based air pollution forecasting.
Finally, a cloud-based service is deployed to process data from the different sources. However, because the data sources have different data structures (data size, types, time stamps, and so on), we need to build application programming interfaces (APIs) for integrating them. Figure 2 illustrates the API architecture developed in the system. These APIs follow the HTTP and REST standards so that other developers can easily connect to the system to develop related applications.
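To illustrate why an integration layer is needed, the following minimal sketch converts records from two differently structured sources into one common record type. The `Measurement` class, the field names, and both payload layouts are hypothetical and chosen only for illustration; they are not the schema of the deployed system.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Measurement:
    """Common record shape shared by all data sources (hypothetical schema)."""
    source: str          # "sensor" or "remote_sensing"
    parameter: str       # e.g. "pm25", "temperature"
    value: float
    latitude: float
    longitude: float
    timestamp: datetime  # always stored as timezone-aware UTC


def from_sensor(payload: dict) -> list[Measurement]:
    """Assumed gateway payload: one dict with a Unix timestamp and several readings."""
    ts = datetime.fromtimestamp(payload["ts"], tz=timezone.utc)
    return [
        Measurement("sensor", name, float(value), payload["lat"], payload["lon"], ts)
        for name, value in payload["readings"].items()
    ]


def from_remote_sensing(pixel: dict) -> Measurement:
    """Assumed image-server payload: one estimate per pixel with an ISO 8601 time."""
    ts = datetime.fromisoformat(pixel["acquired"]).astimezone(timezone.utc)
    lat, lon = pixel["center"]
    return Measurement("remote_sensing", pixel["band"], float(pixel["estimate"]), lat, lon, ts)


# Illustrative inputs only, not measured data.
sensor_records = from_sensor({
    "ts": 1500000000, "lat": 10.77, "lon": 106.70,
    "readings": {"pm25": 21.5, "temperature": 31.2},
})
image_record = from_remote_sensing({
    "acquired": "2017-07-14T03:00:00+00:00", "band": "pm25",
    "estimate": 19.8, "center": (10.80, 106.65),
})
```

Once every source is mapped to the same record type, the storage and alerting code only has to handle one structure, regardless of how each source encodes time stamps or groups its values.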

System implementation
In this section, we present the implementation of our system based on the architecture proposed in the previous section. We mainly focus on the sensing systems and the cloud services.

Sensing systems
In this work, we build our sensing system from sensor nodes and gateway nodes. Sensor nodes directly collect the monitored values, including levels of CO, CO2, and PM2.5 as well as temperature and humidity. These values are sent directly to a gateway node via radio frequency (RF) signals, more precisely Long-Range RF (LoRa). A gateway node, in turn, collects data from a number of sensor nodes and sends them to the cloud computing services through the internet (4G or Wi-Fi). This architecture reduces the dependency on internet services compared with the current monitoring systems in HCMC. Thanks to the LoRa technology, the distance between the sensor nodes and the corresponding gateway node can be a few kilometers. To secure the communication channel between the gateway node and the sensor nodes, we use 128-bit AES encryption [21].
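LoRa frames carry only small payloads, so readings are commonly packed into a compact binary frame before transmission. The sketch below shows one possible frame layout using Python's `struct` module. The field order, the scaling factors, and the node identifier are assumptions made for illustration, and the AES-128 encryption step is deliberately omitted because it requires a cryptographic library.

```python
import struct

# Hypothetical frame layout (little-endian, 12 bytes in total):
#   node id (uint16), CO ppm x10 (uint16), CO2 ppm (uint16),
#   PM2.5 ug/m3 x10 (uint16), temperature C x10 (int16), humidity % x10 (uint16)
FRAME_FORMAT = "<HHHHhH"


def pack_reading(node_id, co, co2, pm25, temperature, humidity):
    """Scale the readings to integers and pack them into a fixed-size frame."""
    return struct.pack(
        FRAME_FORMAT,
        node_id,
        round(co * 10),
        round(co2),
        round(pm25 * 10),
        round(temperature * 10),
        round(humidity * 10),
    )


def unpack_reading(frame):
    """Inverse of pack_reading, as a gateway could run it after decryption."""
    node_id, co, co2, pm25, temperature, humidity = struct.unpack(FRAME_FORMAT, frame)
    return {
        "node_id": node_id,
        "co": co / 10,
        "co2": co2,
        "pm25": pm25 / 10,
        "temperature": temperature / 10,
        "humidity": humidity / 10,
    }


# Illustrative values only, not measured data.
frame = pack_reading(7, co=1.2, co2=415, pm25=23.4, temperature=31.5, humidity=68.0)
decoded = unpack_reading(frame)
```

A fixed-size frame keeps the airtime per transmission short and predictable, which matters for a battery-powered node. The signed temperature field allows negative values, although they are unlikely in HCMC.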
To build the sensor nodes, we design and implement our own boards, which are equipped with a microcontroller, a LoRa module, sensors, and a power supply. Figure 3 presents the architecture of our sensor nodes. Figure 4 shows the printed circuit board layout of our sensor nodes. Along with the sensor nodes, we also develop our gateway nodes with a self-designed mainboard. The main function of the gateway nodes is to collect data from the surrounding sensor nodes and send them to the cloud computing services through the internet. To secure the communication channels, we use 128-bit AES encryption between the sensor nodes and the gateway nodes and 256-bit AES encryption between the gateway nodes and the cloud services. For the mainboard, we choose the RK3128 processor with four ARM Cortex-A7MP cores that can run at up to 1.3 GHz. Table 1 presents the main characteristics of the RK3128.

Power: input voltage 5 V, peak current 2 A.
Figure 5 presents the architecture of the mainboard for the gateway nodes. The RK3128 processor supports different I/O connections that can be extended for other applications in the future. Figure 6 shows the printed circuit board layout of the gateway node. Currently, the cost of building a sensor node in this project is much lower than the price of similar commercial products such as the Libelium-Aerostate AQI Solution Kit [22].

Remote sensing images processing
In this work, along with data collected by sensors, we use remote sensing images (Landsat [22] and MODIS [23] images) for estimating AQI-related values (temperature and PM2.5) in areas without sensors. To build precise estimation models, values collected by sensors are used to calibrate the values extracted from the images. The processing steps as well as the mathematical models for estimating land surface temperature and PM2.5 levels are presented in our previous work [20].
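The idea of calibrating image-derived values against sensor values can be sketched with a simple linear fit. This is a generic least-squares illustration under the assumption of a linear relationship, not the estimation model of [20]; the function names and the sample numbers are hypothetical.

```python
def fit_linear(image_values, sensor_values):
    """Ordinary least squares for: sensor_value ~ slope * image_value + intercept."""
    n = len(image_values)
    mean_x = sum(image_values) / n
    mean_y = sum(sensor_values) / n
    sxx = sum((x - mean_x) ** 2 for x in image_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(image_values, sensor_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def calibrate(image_value, slope, intercept):
    """Apply the fitted correction to a pixel value at a location without sensors."""
    return slope * image_value + intercept


# Illustrative numbers only: values extracted from pixels that contain a sensor
# node, paired with the readings of that node at the same time.
image_pm25 = [10.0, 20.0, 30.0, 40.0]
sensor_pm25 = [12.0, 22.0, 32.0, 42.0]

slope, intercept = fit_linear(image_pm25, sensor_pm25)
estimate = calibrate(25.0, slope, intercept)
```

The fit uses only locations where both an image value and a sensor reading exist; the resulting coefficients are then applied to pixels that have no sensor.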

Cloud services
The cloud services can be considered the heart of the system because they connect all subsystems together. Figure 7 presents the three-layer architecture of our cloud services.
• Data layer: including the sensing and remote sensing image systems that collect air quality monitoring data. One service passively receives the data sent from gateway nodes, with values of CO, CO2, PM2.5, temperature, and humidity. The other service connects to the remote sensing server to collect values of PM2.5 and land surface temperature extracted from remote sensing images.
• Cloud layer: functioning as the central management of the cloud services. Values collected from sensors and remote sensing images are processed at this layer to be stored in databases and provided to users or governors. The layer is also responsible for managing users and for monitoring the values in order to raise alerts when needed.
• Application layer: including graphical user interfaces that allow users to access information processed by the system. The interfaces also help governors manage the entire system through our configuration page, where phone numbers and value thresholds can be configured. We also provide a portal at this layer that allows users to submit warnings about situations they become aware of at their locations. This information is then examined by administrators or governors before it is made public to inhabitants.
… the system to process real-time event-oriented behaviors that are suitable for our air quality monitoring system.
The database processing is designed according to the MVC (Model-View-Controller) model. This model makes the processing engine easy to maintain and extend. It partitions the operations of the database processing into three different components: Model, View, and Controller.
Model interacts with the database management system (DBMS) to process and retrieve data. Model is executed under the control of Controller. View is responsible for receiving and displaying information and for interacting with users.
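The threshold-based alerting described for the cloud layer can be sketched as follows. The PM2.5 limit reuses the 25 µg/m³ level quoted in the introduction; the CO limit, the parameter names, and the function name are hypothetical, since in the described system the thresholds are configured by governors.

```python
# Hypothetical thresholds; in the described system they are set on the
# configuration page rather than hard-coded.
THRESHOLDS = {"pm25": 25.0, "co": 9.0}


def check_alerts(reading, thresholds=THRESHOLDS):
    """Return (parameter, value, limit) for every value above its configured limit."""
    return [
        (name, value, thresholds[name])
        for name, value in reading.items()
        if name in thresholds and value > thresholds[name]
    ]


# Illustrative reading only: PM2.5 exceeds its limit, CO does not, and
# humidity has no configured threshold.
alerts = check_alerts({"pm25": 31.0, "co": 2.5, "humidity": 70.0})
```

Parameters without a configured threshold are ignored, so new sensor types can be added without changing the alerting code.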

System validation
To validate the proposed system, we conduct a number of tests that compare the values monitored by our sensing systems with those of commercial devices. We also run tests that compare the values extracted by the remote sensing image processing system with the values collected by our sensing system. In this section, we present the results of these validations.

Validation of the sensing systems
To validate our sensing systems, we compare the values collected by our sensor nodes with the values measured by Tenmars commercial devices [24]. These devices are already certified and are exported to many countries around the world. We collect data in different scenarios, including peak hours.

Validation of the remote sensing image processing
To verify the remote sensing image processing, we compare the values estimated from remote sensing images with the values collected by our sensing systems. Table 3 and Table 4 show the errors for PM2.5 and land surface temperature between the two methods. According to Table 3, the error of the PM2.5 estimation is at most 7.69%. The error of the land surface temperature values is at most 8.77%. With these errors, our remote sensing image processing can be used for estimating PM2.5 and land surface temperature in locations without sensors.
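The text does not state how the error percentages are defined. A common choice, assumed here, is the absolute difference relative to the reference value. The sketch below computes it for pairs of estimated and reference values; the function names and the sample numbers are illustrative and are not taken from Table 3 or Table 4.

```python
def percent_error(estimated, reference):
    """Absolute difference relative to the reference value, in percent (assumed definition)."""
    return abs(estimated - reference) / abs(reference) * 100.0


def max_percent_error(pairs):
    """Largest percent error over (estimated, reference) pairs."""
    return max(percent_error(est, ref) for est, ref in pairs)


# Illustrative pairs only: (value estimated from an image, value from a sensor).
pairs = [(27.0, 30.0), (19.0, 20.0), (41.0, 40.0)]
worst = max_percent_error(pairs)
```

Reporting the maximum over all comparison points gives an "at most" figure of the kind quoted in this section.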

Energy consumption measurement
The model in Figure 9 is used to measure the current and the voltage used by our sensing system. In this model, "Load" denotes our system. We connect a one-ohm resistor in series with our system to calculate the current drawn by the system. An oscilloscope is used to measure the voltage between A and B and then between B and C. We supply 5 V to the entire circuit (U_AC = 5 V).
Figure 9. Current measurement model
Table 5 presents the current and voltage values of our system for five different measurements. According to the table, on average the proposed system draws a current of 176 mA at a supply voltage of 4.88 V. Therefore, the system consumes about 900 mW on average.
Compared with other systems reported in the literature, our system has low energy consumption.
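The arithmetic behind this measurement can be sketched as follows: the load current follows from the voltage drop across the series resistor, and the power is the product of load voltage and current. The function names are hypothetical, and the 0.176 V drop is derived from the reported 176 mA average rather than being a reported measurement.

```python
SHUNT_OHMS = 1.0  # the one-ohm series resistor from the measurement setup


def load_current(shunt_voltage, shunt_ohms=SHUNT_OHMS):
    """Ohm's law: the current through the series resistor is also the load current (A)."""
    return shunt_voltage / shunt_ohms


def load_power(load_voltage, current):
    """Electrical power drawn by the load (W)."""
    return load_voltage * current


current = load_current(0.176)      # a 0.176 V drop across 1 ohm corresponds to 176 mA
power = load_power(4.88, current)  # reported averages: 4.88 V and 176 mA
```

With the reported averages, the product is about 0.86 W, which is consistent with the rounded figure of roughly 900 mW.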

Conclusion
In this paper, we introduce our air quality monitoring system with multiple data sources, which can be applied in smart cities like Ho Chi Minh City. We use data collected from sensors and data extracted from remote sensing images to monitor the air quality index. We build our sensing systems from sensor nodes and gateway nodes. Based on the values collected from sensors, we calibrate the remote sensing image processing models to estimate fine particulate matter and land surface temperature. The models can be used to evaluate the air quality index in locations where we have not yet deployed sensing systems. To combine all components into a complete system, we build cloud services that include APIs for collecting data as well as databases for storing values. The cloud services also provide a graphical user interface portal for users and governors to explore the air quality index. The system is able to receive notifications from users when they become aware of hazardous situations at their locations. Compared with commercial devices, the values measured by our system differ by at most 24% for the sensing system and by at most 9% for the remote sensing image estimation.
The proposed system has low energy consumption, using only about 900 mW on average.