Predicting the least air polluted path using the neural network approach

Air pollution exposure during daily transportation is becoming a critical issue worldwide because of its adverse effects on human health. Predicting the least air-polluted, healthier path is an effective way to mitigate personal air pollution exposure risk. Computing the least polluted path only for the current time might not be helpful for real-time applications. Therefore, we develop a routing algorithm based on CNN-LSTM-EBK (CLE), a neural network-based temporal-spatial interpolation model. The proposed model predicts pollution levels at high temporal granularity. This paper introduces a weight function to compute the air pollution concentration on the road network. It also predicts the least air-polluted path among all possible paths from a source to a destination at different time granularities. The results show that the predicted path may be longer than the shortest route but consistently reduces pollution exposure risk, which demonstrates its effectiveness during daily transportation.
Received on 06 April 2021; accepted on 19 June 2021; published on 29 June 2021


Introduction
The unpredictable development of urbanization in recent years has resulted in many undesirable environmental challenges. The steady rise in ambient air pollution levels is one of them [1,2]. In past years, it has been observed that higher concentrations of air pollution arise near the road network, which affects pedestrians, drivers, school children, older people, and patients during their travel. Traffic-related air pollution is one of the main reasons behind the increasing level of outdoor air pollution and the deterioration of public health. PM10 (particulate matter with a diameter of 10 micrometers or less) is considered the major harmful pollutant among all pollutants because of its severe negative impact on the environment and human health [3]. It can affect drivers and pedestrians, causing heart disease, asthma, and respiratory problems [4]. People perform physical activities such as walking to improve their health, but their health can instead worsen because of unhealthy air pollution levels on the road network.
To address this problem, the proposed solution is to predict the air pollution level on the road network and to predict the path with the least air pollution exposure among all possible paths, providing an alternative safe route navigation system. Predicting the least air-polluted path for walking, cycling, and driving can minimize pollution exposure risk during daily transportation. The first step in predicting the least air-contaminated path is continuous pollutant concentration mapping. To obtain a proper surface mapping of air quality, meteorological and road network datasets can be collected and used to model a better routing solution. This mapping can be developed by performing temporal and spatial analysis of the data. Subsequently, based on those temporal and spatial prediction models, a continuous surface of a pollutant can be obtained.
* Corresponding author. Email: 517cs6019@nitrklac.in
Predicting air pollutant values from past observations in a real-world scenario is challenging. Predicting the pollution level on road networks and ranking paths according to their pollution level is difficult because accurate air pollution data are not available at every geographical point. The main reason is that air pollution monitoring stations are few and sparsely distributed. As air quality has both temporal and spatial dependencies, it is challenging to analyze both dependencies and predict the air pollution level at unmeasured locations on the road network.
Air pollution levels can be predicted with several types of methods, e.g., statistical [5][6][7][8], deterministic, numerical [9], geostatistical, machine learning [10], and deep learning techniques [11]. The existing techniques are not efficient enough to predict air pollution levels on the road network at different time granularities. Therefore, this paper proposes a deep learning-based interpolation framework to mitigate the limitations of the existing prediction models and compute the road network's air pollution level at different temporal resolutions. This paper uses the PM10 pollutant concentration and the OpenStreetMap data of Odisha to evaluate the performance of the proposed framework. In addition, the interpolated PM10 value is treated as a cost function to predict the shortest path from a starting point to an endpoint.
The rest of this paper is organized as follows. Section 2 formulates the problem that motivates this research work. Section 3 surveys the literature. Sections 4 and 5 explain the data source and the methodology used for the proposed routing algorithm. Section 6 presents the experimental results, and Section 7 briefly describes the conclusions drawn from them.

Problem Formulation
Let $\hat{G} = (V, E)$ be an undirected graph with N nodes V. Vertices $v_i, v_j \in V$ and the edge $e_{ij} \in E$ between two vertices are often used for spatially modelling geographical data. Each air pollution monitoring station in V has time series pollution data for time t with f features. These input features can be represented on $\hat{G}$ as a feature matrix $F_t \in \mathbb{R}^{N \times f}$. The road network feature f can be any pollutant value. The problem of spatial-temporal analysis for the road network is to develop a weight assignment mapping function w on the road network $\hat{G}$ that predicts the feature values for the next d days:
$$[F_{t+1}, \dots, F_{t+d}] = w\big(\hat{G}; (F_{t-n}, \dots, F_{t-1}, F_t)\big)$$
where n is the length of the historical time series dataset. The spatial relation among air pollution monitoring stations can then be represented as a weighted undirected graph $\hat{G} = (V, E, w)$. The next problem is to find the best alternative path R from a source node s to any destination node h on $\hat{G} = (V, E, w)$, which can be considered the healthier path.

Literature survey
Many smart initiatives are making citizens' lives more comfortable and safer. One of the government's smart initiatives is to improve the health and wellness of people. Deteriorating air quality has become a major obstacle to these initiatives. To manage public health, the government is taking actions such as personalized air pollution management, air quality monitoring, air quality modeling, and developing a smart transportation system with minimum air pollution exposure risk. To support these activities, IoT devices are installed at many locations and pollution monitoring stations are established. These devices collect air pollution levels across many geo-locations to identify the pollution level at each particular location [4]. Still, those sensors are unable to predict air pollution levels on road networks. Existing personalized routing solutions are based on user preferences such as traffic conditions [12], the shortest path [13], the fastest path [13][14][15], the least expensive path, the lower-crime path [16][17][18], and the less accident-prone path [19][20], but not on the air pollution exposure level. Nowadays, travellers have to be more concerned about the impact of adverse air pollution [21] and weather [22] conditions on the road network for safer travel. Personalized routing, as part of personalized pollution management, is a critical aspect of air pollution management. Exposure to air pollution over a long duration may cause severe health issues for older people, school children, and pregnant women [23]. Still, very little research has been conducted to develop a smart navigation system based on air pollution exposure risk [24][25][26][27].
This section reviews some state-of-the-art methods for safe route navigation that minimize air pollution exposure risk. Muller et al. conducted experiments using PM10 emission data of Berlin [24], as PM10 is a major indicator of pollution. Their method suggested an air quality-adjusted shortest path for cyclists and pedestrians. Sharker et al. [26] proposed a framework that provides a health-optimal routing solution by computing an optimal health weight for various scenarios. The authors used Bayesian modeling and an influence diagram technique to design a weighting model that considers individual and environmental variables, and recommended a health-optimal navigation service for pedestrians. An IoT-based novel architecture is proposed in [27] to obtain a real-time routing solution. Air quality on the road network is of great importance for taking precautionary measures for a healthier life; to make citizens aware of this, Ramos et al. [25] promote pollution-free routes based on the existing air quality sensor network. Their technology-agnostic framework is based on air quality index interpolation and helps trace the city's air pollution-free routes. A study was conducted in Taiwan [28] to compute air pollution exposure levels for different transportation modes such as walking, motorcycle, and bicycle. Zahmatkesh et al. [29] proposed a geostatistical interpolation technique, i.e., kriging, to generate a spatial prediction map of the air quality index so that polluted zones and their polluted routes can be avoided. All of the existing work on intelligent navigation systems is based on the spatial distribution [30] of past air pollution data, which in practice does not significantly improve human health. Moreover, the existing work does not predict the air pollution level on the road network for the near future.
EAI Endorsed Transactions on Scalable Information Systems Online First
Therefore, it is better to develop an early warning navigation system that alerts people by predicting air pollution levels on the road network in advance. Based on the above survey, this paper develops an algorithm based on the temporal-spatial interpolation model to minimize personal air pollution exposure risk and improve public health.

Data collection
The ambient air quality data of Odisha, India [31,32] are used to evaluate the proposed model. The dataset includes day-wise sampled PM10, PM2.5, SO2, and NO2 pollutant information in micrograms (one-millionth of a gram) per cubic meter from 2005 to 2015. The dataset covers 11 years of data for 30 monitoring sites in Odisha. Because of many missing values, the data of 14 sites are dropped, and the data of the remaining 16 monitoring sites are used for evaluation. The dataset also contains the station code, sampling date, monitoring station name, type of location, longitude, and latitude as attributes. The study area's road network graph and boundary are collected from OpenStreetMap to evaluate the proposed algorithms. The descriptive analysis of past observations shows that PM10 is the major contributing air pollutant in Odisha [33][34][35], so it is used in the experiments to determine air pollution exposure risk in the study area.

Predicting the least PM10 exposure risk path
This section explains, step by step, the proposed methodology for predicting the path with minimum PM10 exposure. First, we integrate the PM10 concentration with its spatial features and the road network of Odisha. The proposed methodology combines the time and space dimensions to perform spatial and temporal analysis simultaneously and generates a prediction map of the study area at each time instance. The temporal-spatial interpolation is required to assign an average PM10 concentration value to the road network, which is treated as a weight function to compute the shortest path from a source to a destination.

Temporal-Spatial air quality modeling
A person may take one or two days to travel from a source to a destination. Hence, we first need to identify how the PM10 concentration changes over time throughout the journey. To complete this task, we have to predict PM10 on the road network for the next day or the next week. The architecture developed to achieve this objective is presented in Figure 1.
Experiments and discussions for each part of the proposed architecture are given in the following subsections.
Data normalization. Before the data analysis steps, the data must be normalized and the outliers in the dataset removed. Outliers may degrade the model's performance; hence, Z-Score normalization is conducted in the preprocessing step, followed by outlier removal. The equation for Z-Score normalization [36] is
$$z_i = \frac{p_i - \bar{x}}{S}$$
where $p_i$ is the pollutant value, $\bar{x}$ is the mean, and $S$ is the standard deviation. The outputs of the preprocessing step are then used for temporal air quality modeling to obtain prediction values for the next few days.
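The preprocessing step above can be sketched as follows. This is a minimal illustration, not the authors' code: the sample standard deviation, the cut-off on |z|, and the PM10 values are all assumptions made for the example.

```python
from statistics import mean, stdev

def z_scores(values):
    """Return (p_i - mean) / S for every pollutant value p_i."""
    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation (assumption: the source does not say which)
    return [(p - x_bar) / s for p in values]

def drop_outliers(values, threshold=3.0):
    """Keep values whose |z| does not exceed the threshold (an assumed cut-off)."""
    zs = z_scores(values)
    return [p for p, z in zip(values, zs) if abs(z) <= threshold]

# Hypothetical daily PM10 readings with one suspicious spike.
pm10 = [80.0, 85.0, 90.0, 95.0, 100.0, 400.0]
normalized = z_scores(pm10)
cleaned = drop_outliers(pm10, threshold=2.0)
print(cleaned)
```

With so few samples a single spike can only reach a limited z-score, so the example uses a tighter cut-off than the default.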

Feature extraction.
A convolutional neural network (CNN) [37] is used to extract features from the preprocessed, normalized historical pollution dataset and to identify the relationships between the features that might influence the increasing air pollution level. Weight sharing and sparse connectivity enable the CNN to handle multiple layers and nonlinearity while reducing the number of parameters, which improves performance. The shared weights are updated during training and contribute to the prediction results. Then, a max-pooling operation with a stride of two is applied to decrease the size of the convolution output and hence reduce the computational cost. The schematic diagram of the convolutional neural network is shown in Figure 2.
Temporal modelling. The features extracted by the CNN layers are fed into the long short-term memory (LSTM) [11] network to capture the long-term and short-term dependencies of air quality for sequential modeling. The LSTM model follows a selective write, read, and forget strategy to deal with longer sequences and to perform air pollution prediction at low and high time granularity. The basic schematic diagram of the LSTM model is shown in Figure 3.
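To make the effect of weight sharing and stride-two pooling concrete, the following is a minimal sketch in plain Python. The kernel values, the ReLU activation, and the input series are hypothetical; in the actual network the kernel weights are learned.

```python
def conv1d(series, kernel):
    """Valid 1-D convolution followed by ReLU; the same kernel slides over every position."""
    k = len(kernel)
    out = []
    for i in range(len(series) - k + 1):
        acc = sum(series[i + j] * kernel[j] for j in range(k))
        out.append(max(0.0, acc))  # ReLU non-linearity (assumed activation)
    return out

def max_pool1d(features, size=2, stride=2):
    """Max pooling; with stride two the output is roughly half as long as the input."""
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, stride)]

series = [0.1, 0.4, 0.3, 0.9, 0.7, 0.2, 0.5, 0.6, 0.8]  # normalized PM10 values (made up)
kernel = [0.5, 0.0, -0.5]                                # hypothetical kernel weights
features = conv1d(series, kernel)
pooled = max_pool1d(features)
print(len(series), len(features), len(pooled))
```

Only three kernel weights are needed regardless of the series length, and the pooled output passed to the next layer is shorter than the convolution output.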
The goal of the basic LSTM unit is to compute the state $s_t$. While computing $s_t$, it takes the past information $s_{t-1}$ and the current input $x_t$ and keeps only the relevant information. Instead of writing all parts of $s_{t-1}$, it passes only selected parts to the next step, which is called selective write; the output gate $o_t$ decides how much of the information should be retained and passed on. The unit may not need to read all of the output of the selective write, so an input gate $i_t$ is applied to perform a selective read. The forget gate $f_t$ is then used to forget unnecessary information. After the selective write, read, and forget gates are computed, the states are computed under the control of these three operations. The computation of the gates and states at time t can be represented mathematically as
$$o_t = \sigma(W_o h_{t-1} + U_o x_t + b_o)$$
$$i_t = \sigma(W_i h_{t-1} + U_i x_t + b_i)$$
$$f_t = \sigma(W_f h_{t-1} + U_f x_t + b_f)$$
$$\tilde{s}_t = \tanh(W h_{t-1} + U x_t + b)$$
$$s_t = f_t \odot s_{t-1} + i_t \odot \tilde{s}_t$$
$$h_t = o_t \odot \tanh(s_t)$$
where $\sigma$ is the logistic sigmoid, $\odot$ is element-wise multiplication, $\tilde{s}_t$ is the temporary state, $s_t$ is the combination of the current temporary value computed in the selective read and the part of the previous state kept by the selective forget operation, and $h_t$ signifies how much of the information is written to the next stage, which is controlled by the output gate $o_t$ and the state $s_t$. Across the three gates and the temporary state, there are 12 parameters (a W, a U, and a b for each) that are learned with the backpropagation algorithm: the derivative of the loss function with respect to each of those parameters is computed and used to reduce the loss.
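The selective write, read, and forget operations can be illustrated with a single-unit LSTM step. This is a sketch of the standard LSTM formulation with scalar states for readability, which is an assumption: the parameter names and initial values are hypothetical, and a real layer uses weight matrices and vectors.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, s_prev, p):
    """One time step of a scalar LSTM cell; p holds the 12 learnable parameters."""
    o_t = sigmoid(p["W_o"] * h_prev + p["U_o"] * x_t + p["b_o"])  # output gate (selective write)
    i_t = sigmoid(p["W_i"] * h_prev + p["U_i"] * x_t + p["b_i"])  # input gate (selective read)
    f_t = sigmoid(p["W_f"] * h_prev + p["U_f"] * x_t + p["b_f"])  # forget gate (selective forget)
    s_tilde = math.tanh(p["W"] * h_prev + p["U"] * x_t + p["b"])  # temporary state
    s_t = f_t * s_prev + i_t * s_tilde                            # new cell state
    h_t = o_t * math.tanh(s_t)                                    # output passed to the next step
    return h_t, s_t

names = ["W_o", "U_o", "b_o", "W_i", "U_i", "b_i",
         "W_f", "U_f", "b_f", "W", "U", "b"]
params = {name: 0.1 for name in names}  # hypothetical initial values

h, s = 0.0, 0.0
for x in [0.2, 0.5, 0.9]:  # a short, made-up normalized PM10 sequence
    h, s = lstm_step(x, h, s, params)
print(h, s)
```

Training would adjust the twelve entries of `params` by backpropagating the loss through these steps.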
Temporal-spatial interpolation prediction. Each road network has different features, such as road lengths, widths, and road segments. Because monitoring stations are irregularly located, it is challenging to compute the PM10 level at each road segment. In past years, much work has been done to identify the spatial distribution of air pollution and estimate values at unknown points in a study area using spatial interpolation techniques. Although these techniques are widely used, critical problems remain unsolved because traditional spatial interpolation considers time and space separately. This may not be satisfactory when predicting the spatial distribution at different time granularities in practical applications. Spatial interpolation for the next few days makes it possible to alert people in advance and keep them safe. Most of the existing interpolation methods, e.g., radial basis function (RBF) [38,39], ordinary kriging (OK) [40][41][42][43], and inverse distance weighting (IDW) [44,45], only predict the spatial distribution of air pollution for the current time, not for a future period. Thus, spatial interpolation must be performed on top of temporal prediction to obtain interpolation results at each unsampled point and time instance.
These spatial modeling techniques, including IDW, kriging, and RBF interpolation, are based on the first law of geography, i.e., nearer points are more strongly correlated and have more similar characteristics. IDW interpolation is a function of distance: it estimates the value at an unsampled point as a linear combination of the sampling point values, weighted by an inverse function of the distance from the interpolation point to the sampling points. Kriging is a weighted average of the measured data in which the weights are based on the semivariogram. However, kriging methods can be inaccurate because of poor standard error estimates, and they assume intrinsic stationarity, whereas air pollution data are not stationary. RBF is a deterministic method similar to kriging, but it does not benefit from spatial analysis using a variogram. It is used to generate continuous surfaces from a large number of data points [46,47]. Unlike other interpolation techniques, the EBK (empirical Bayesian kriging) method does not require details of the prior distribution. It can handle local and global data stationarity and provides fast performance with the default parameter settings [48]. EBK differs from the traditional interpolation techniques because it accounts for the error introduced by semivariogram estimation by using many semivariogram models instead of one. The conventional model uses a single semivariogram model whose parameters are adjusted manually. The EBK method follows a subsetting and simulation process to adjust the parameters $\varphi_k$ and develop a better model automatically. It follows three main steps to obtain interpolation results. First, it estimates a semivariogram model and its parameters $\varphi_k$ from the observed data. Secondly, using this semivariogram and $\varphi_k$, new values are simulated at each input location M times. Thirdly, the newly simulated inputs are used to generate a new semivariogram model.
Bayes' rule can be used to assign a weight $w_k$ to this newly estimated semivariogram model and its new parameters $\varphi_k$. In this formulation, $w_k$ is the weight, $\varphi_k$ denotes the simulated parameters, P is the observed pollutant value, and $f(P \mid \varphi_k)$ is the conditional probability of the observed pollutant value P given the model parameters $\varphi_k$. It signifies how likely it is that the observed PM10 value was generated by the estimated model. The second and third steps are then repeated: in each repetition, the semivariogram model estimated in step one is used to simulate a new set of values at each input point, and these simulated data are used to estimate a new semivariogram model and its weight. The weights $w_k$ are used to generate predictions and prediction errors at each unsampled point, which makes EBK an effective spatial interpolation technique for the road network. Hence, this research work tests the effectiveness of the EBK model by adding it on top of the CNN-LSTM model to obtain the prediction result for t + d days on the road network. The prediction results of multiple nodes are used as input to the spatial interpolation layer to predict the spatial distribution of PM10, where the nodes' coordinates are treated as the boundary of the study area. Thus, adding time series prediction results as input to the EBK interpolation layer yields better prediction results. Finally, the proposed model outputs a temporal-spatial interpolation prediction map of the study area. The pseudocode of the proposed algorithm is given in Algorithm 1, where the first part performs temporal modeling and the second part performs spatial modeling.
Temporal modeling is the most crucial step of PM10 prediction because a journey may take more than one day. Sometimes the PM10 value must be predicted for the next week or the next month to make the right decision. The second part of the algorithm is responsible for spatial interpolation. Because the air pollution monitoring stations are irregularly located, with some close to each other and others far apart, EBK interpolation is used to assign the average PM10 value as the weight of each road segment and thereby predict each road segment's pollution level for t + d days, depending on the arrival time at the destination.
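The Bayes' rule weighting of semivariogram models described above can be illustrated as follows. This is only a sketch of the weighting step under stated assumptions, not the EBK algorithm itself: the log-likelihood values and per-model predictions are made up, and a uniform prior over the simulated parameter sets is assumed when none is given.

```python
import math

def bayes_weights(log_likelihoods, priors=None):
    """Weights proportional to f(P | phi_k) times a prior, normalized to sum to one."""
    n = len(log_likelihoods)
    if priors is None:
        priors = [1.0 / n] * n  # uniform prior (assumption)
    shift = max(log_likelihoods)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(ll - shift) * pr
                    for ll, pr in zip(log_likelihoods, priors)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def weighted_prediction(predictions, weights):
    """Combine the predictions of the individual semivariogram models."""
    return sum(p * w for p, w in zip(predictions, weights))

log_liks = [-12.0, -10.0, -15.0]  # hypothetical log f(P | phi_k) for three models
preds = [82.0, 90.0, 70.0]        # hypothetical PM10 predictions at one unsampled point
weights = bayes_weights(log_liks)
estimate = weighted_prediction(preds, weights)
print(weights, estimate)
```

The model that explains the observed values best receives the largest weight, so the combined estimate leans towards its prediction.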

Application of the proposed model for safe navigation
The proposed CNN-LSTM-EBK (CLE) model predicts the PM10 level at each location in the study area over t + d days. The model is therefore applied to the transportation system of a smart city to predict the healthiest and safest path for citizens. The temporal-spatial interpolation prediction map generated by the CLE model is used as input to a routing algorithm that reduces PM10 exposure risk. The algorithm may suggest a path that is longer than the shortest path, but it optimizes both distance and pollution level for safe travel. Existing navigation systems usually use time and distance as the weight to compute the shortest path. A few of the existing models compute the least air-polluted path based on the polluted area [25][26][27][28][29]49], but only for the current time, not for different temporal resolutions. To overcome this limitation, the proposed framework uses a deep learning approach to predict the safest path over t + d days, i.e., the path with the least exposure to air pollution. The pseudocode of the proposed routing solution is presented in Algorithm 2.
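The difference between the shortest path and the least-exposure path can be sketched with Dijkstra's algorithm run under two different edge weights. This is an illustration rather than Algorithm 2: the toy road graph, the PM10 values, and the choice of segment length multiplied by predicted PM10 as the exposure weight are assumptions.

```python
import heapq

def add_road(graph, u, v, length_km, pm10):
    """Add an undirected road segment with a length and a predicted PM10 level."""
    attrs = {"length": length_km, "pm10": pm10}
    graph.setdefault(u, []).append((v, attrs))
    graph.setdefault(v, []).append((u, attrs))

def best_path(graph, source, target, weight):
    """Dijkstra's algorithm; weight maps edge attributes to a non-negative cost."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist[u]:
            continue  # stale queue entry
        for v, attrs in graph.get(u, []):
            nd = d + weight(attrs)
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]

roads = {}
add_road(roads, "A", "B", 2.0, 120.0)  # short but polluted
add_road(roads, "B", "D", 2.0, 110.0)
add_road(roads, "A", "C", 3.0, 40.0)   # longer but cleaner
add_road(roads, "C", "D", 3.0, 45.0)

shortest, km = best_path(roads, "A", "D", lambda e: e["length"])
cleanest, exposure = best_path(roads, "A", "D", lambda e: e["length"] * e["pm10"])
print(shortest, km)
print(cleanest, exposure)
```

In this toy graph the shortest route runs through B, while the exposure-weighted route runs through C: it is longer but accumulates less PM10.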

Parameter optimization
To demonstrate the efficiency of the proposed model and algorithm, four groups of models were evaluated using PM10 concentration data, which consist of a series of day-wise sampled values. After the preprocessing step, the model was trained with 80% of the data as the training set (January 2005 to October 2015), 10% (November 2015) as the validation set, and the remaining 10%, i.e., the last month of 2015, as the test set. We used two CNN layers, two LSTM layers with 1078 iterations, and one further layer with 343 iterations to obtain the best prediction value during temporal air quality modeling at the large temporal resolution and to identify the most polluted locations of Odisha over the next four weeks of December 2015. Furthermore, the learning rate is set to 1e-3 and the dropout to 0.2; Adam is used as the optimizer and Mean Square Error (MSE) as the loss function to train the model. The window size is set to 50 and the number of time steps to 28 to evaluate the model's long-term prediction performance.
Algorithm 2 (fragment, final steps): 14: cost(h, z) + d(h) ← final cost; 15: return final cost; 16: end for
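One way to read the window size and time-step settings above is as a sliding window that maps 50 past days to the next 28 days (four weeks), followed by a chronological split. That reading is an assumption, as are the helper names and the synthetic series in this sketch.

```python
def make_windows(series, window=50, horizon=28):
    """Build (input, target) pairs: `window` past values predict the next `horizon` values."""
    samples = []
    for start in range(len(series) - window - horizon + 1):
        x = series[start:start + window]
        y = series[start + window:start + window + horizon]
        samples.append((x, y))
    return samples

def chronological_split(samples, train=0.8, val=0.1):
    """Split without shuffling so that validation and test data lie after the training data."""
    n = len(samples)
    a = int(n * train)
    b = int(n * (train + val))
    return samples[:a], samples[a:b], samples[b:]

series = [float(day) for day in range(200)]  # stand-in for a daily PM10 series
samples = make_windows(series)
train_set, val_set, test_set = chronological_split(samples)
print(len(samples), len(train_set), len(val_set), len(test_set))
```

Keeping the split chronological avoids leaking future observations into training, which matters for a forecasting task.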

Prediction performance comparison at large temporal resolution
Most of the existing air pollution prediction models predict air quality concentration levels for the next few hours at existing air pollution monitoring sites. Predicting air quality levels for the entire area adds value by providing air quality predictions over a considerable period. Generally, air quality prediction for an extended period has lower accuracy than prediction for a short duration, possibly because only a small sample is available for the long-term forecast. Hence, an air quality prediction model is required that can predict air pollution for the entire area and its road network over an extended period.
The first part of the proposed model is a neural network that predicts air pollution only at the available monitoring stations. Another vital property of the proposed method is its EBK interpolation layer. With this layer, the CLE model provides prediction values at every point where no monitoring station is available. This is the most important feature of the proposed method. Researchers and policymakers can use it to obtain the overall air quality level of a particular location and its road network at different times. Although this information is vital for public safety, very few models can predict air quality levels at unmeasured locations. By conducting temporal-spatial interpolation of PM10 concentrations, it is possible to obtain an average PM10 distribution surface at large time granularity. The temporal-spatial prediction performance of the CLE model is evaluated by cross-validation. During cross-validation, the value at an observed location is predicted from the other observations after removing the original observed value at that location. The observed value and the temporal-spatial prediction are then compared to calculate the prediction error. The model performance was evaluated using the Root Mean Square Error (RMSE) [50], which can be written as [51,52]
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (y_t - \hat{y}_t)^2}$$
where $y_t$ is the observed pollutant value at time t, $\hat{y}_t$ is the predicted value at t, and n is the total number of samples. The comparison is shown in Table 1. The proposed CLE model is compared with baseline deep learning models to evaluate its prediction efficiency for the next four weeks. The BILSTM model had the worst performance, with the highest error metrics. The error metrics for BIGRU, CNN-BILSTM, CNN-BIGRU, CNN-GRU, LSTM, and GRU are comparatively lower. The comparative results show that the proposed model outperforms the other deep learning models in long-term prediction, since a lower error metric signifies higher prediction performance.
The proposed model's temporal-spatial prediction performance is also compared with a deterministic model (IDW), a machine learning model (RBF neural network), and geostatistical techniques (spherical, exponential, and universal kriging), as shown in Table 1. The comparison shows that the CLE model has lower error metrics than any of the geostatistical, deterministic, and machine learning-based spatial prediction models because it integrates the time and space dimensions. This result indicates that it is necessary to consider the space and time dimensions simultaneously, instead of performing only spatial interpolation, to obtain better interpolation results.
Figures 4 and 5 show the PM10 prediction maps calculated by the CLE model at weekly and two-day granularities. As shown in Figures 4 and 5, the highly polluted areas are located near the Eastern Ghats and the eastern coastal plains of Odisha. As the eastern part of Odisha is the most developed part of the state, it shows a high level of pollution due to high traffic emissions. Consequently, the government has consistently taken more steps to improve air quality in these locations than elsewhere.
It is also noticed that the average PM10 concentration ranges from 35 µg/m³ to 121 µg/m³, which shows the unfavorable exposure level of PM10 in Odisha.
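The cross-validation procedure and the RMSE metric used in this section can be sketched as follows. The station coordinates and values are made up, and `mean_of_others` is a deliberately trivial stand-in interpolator (an assumption for illustration), not the CLE or EBK model.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)) / n)

def leave_one_out(stations, predict):
    """Hide each station in turn, predict it from the others, and score with RMSE."""
    observed, predicted = [], []
    for i, (coord, value) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        observed.append(value)
        predicted.append(predict(others, coord))
    return rmse(observed, predicted)

def mean_of_others(others, coord):
    """Placeholder interpolator: the average of the remaining stations."""
    return sum(value for _, value in others) / len(others)

# Hypothetical stations: ((x, y), PM10 value).
stations = [((0.0, 0.0), 60.0), ((10.0, 0.0), 100.0), ((0.0, 10.0), 80.0)]
score = leave_one_out(stations, mean_of_others)
print(score)
```

Any interpolator with the same `predict(others, coord)` signature can be substituted to compare methods on equal terms.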

Web Client to Trace temporal-spatial prediction of PM10
The interpolated map is overlaid on Google Maps using the ArcGIS [64] Online cloud server to build a prediction map web application. It is developed for visualization purposes and is shown in Figure 6; it displays each area's size with its corresponding predicted class and its minimum and maximum pollution levels. The Web AppBuilder for ArcGIS service is used to build the web client application, and it provides all the facilities needed to create an HTML/JavaScript web application. It also provides facilities to run the application online on its server.

Routing service to avoid polluted areas
Once the web map layer displaying the prediction map is published, the computed polluted areas can be treated as input barriers when computing routes in a city. The ArcGIS Online network analysis feature is used to perform this operation. The ArcGIS Online routing service models driving and walking transportation modes. It computes routes based on the shortest distance and time from source to destination, and it can take traffic flow directions and restrictions into account. The network service routing layer treats highly polluted areas as input barriers; it excludes all paths that intersect them from the routing analysis and predicts the paths with the least air pollution exposure.
Previous research computed the pollution-free route by performing network analysis over an IDW-interpolated map [25][26][27][28]. IDW interpolation can predict average, maximum, and minimum air pollution concentrations for the current and past time only, not for the future. The prediction error of IDW (RMSE = 24.07) is also higher than that of the CLE model (RMSE = 18.16), as shown in Table 1; this corresponds to a relative RMSE reduction of about 24.6%.
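For reference, the IDW baseline discussed above can be sketched in a few lines. The power parameter of 2 and the station layout are assumptions for illustration; the sketch interpolates in space only, which is exactly the limitation noted above.

```python
import math

def idw(stations, target, power=2.0):
    """Inverse distance weighting: nearer stations receive larger weights."""
    numerator = 0.0
    denominator = 0.0
    for (x, y), value in stations:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # the target coincides with a station
        w = 1.0 / d ** power
        numerator += w * value
        denominator += w
    return numerator / denominator

# Hypothetical stations: ((x, y), PM10 value).
stations = [((0.0, 0.0), 60.0), ((10.0, 0.0), 100.0), ((0.0, 10.0), 80.0)]
estimate = idw(stations, (5.0, 0.0))
print(estimate)
```

The estimate always lies between the smallest and largest station values, and it carries no notion of time unless the station values themselves are forecasts.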
This research work performs network analysis over the temporal-spatial interpolation prediction map, instead of a purely spatial prediction map, using ArcGIS Online to minimize air pollution level and travel time simultaneously. It can therefore provide an optimized routing solution at different temporal resolutions, which distinguishes it from other route navigation systems. ArcGIS, a geographical information system [65], is used to visualize the routing solutions. An IoT-based platform helps to access real-time traffic data with the details of each route and overlays the air pollution interpolation map on the street map. There may be multiple paths for a given source and destination, and users may select any of them based on their preferences. The user's preferred start point and endpoint can be updated through this IoT-based framework, and the routing solutions can be accessed through the web application.
To assess the proposed model's effectiveness, we evaluated its performance in terms of minimum pollution level and distance. We simulated the model using the 16 selected monitoring sites of Odisha and proposed the routing solution. The web application helps users make decisions more easily, as it shows both the shortest and the least polluted path and distinguishes between them.
The result of the optimized routing solution is compared with the shortest path provided by Google Maps, as shown in Figure 7. Figure 7 (a1) shows the shortest path from the source to the destination, which passes through the highly polluted area marked in red. It takes car drivers 1 hour 38 minutes to travel from the source to the destination. Figure 7 (b1) shows the path predicted by the proposed routing solution for car drivers for the next day (1st December 2015). It takes more time, i.e., 2 hours 6 minutes, to travel from the source to the destination, but it passes through the least polluted area and offers a healthier path for traveling. Figure 8 shows the shortest route and the least air-polluted path for pedestrians for 2nd December 2015. Figure 8 (c1) shows the shortest path from Sophia junior college to Kendriya Vidyala, which takes 18 minutes on foot. In contrast, Figure 8 (d1) shows the least polluted path, which takes a pedestrian four minutes longer. This small increase in travel time shows the benefit of the proposed algorithm. The web application predicts air pollution levels and also recommends pollution-free paths for Odisha. Users can choose the path for a safe journey based on their preferences.

Conclusion
For the first time, the present study proposes a newly designed deep learning-based temporal-spatial interpolation method to provide an optimized routing solution that minimizes users' air pollution exposure risk. The current work demonstrates models that take the PM10 pollutant value into account to predict the healthier path in advance. The predicted least polluted path might be longer than the shortest path, but it avoids pollution that varies over space and time. The proposed method can be implemented in smart cities to support a better quality of life by using the temporal-spatial interpolation prediction map and the proposed routing solution in advance. A comparative analysis shows the effectiveness of the CLE model in recommending a healthier path for the coming days.
The main findings of this research are briefly described as follows:
• This paper integrates machine learning, deep learning, and spatial prediction techniques for air quality modeling at high temporal-spatial resolution. This could be essential information for public safety.
• The proposed model has a smaller prediction error at high temporal resolution than the baseline neural network models. Its air quality modeling performance is almost 78% and 61% better than that of the ordinary BILSTM and GRU models, respectively.
• The CLE model has 21-29% better prediction performance at high temporal resolution than the traditional interpolation techniques. It can also address data imputation for geospatial air pollution data while conducting air quality prediction.
• This research uses the output of the proposed temporal-spatial interpolation model as input for GIS modeling, which treats highly polluted areas as barriers to predict the healthier path at larger temporal resolution. Users can thus obtain overall air quality information on roads for the next four weeks.
• To the best of our knowledge, this is the first research study conducted in Odisha, India to predict air pollution levels and the safest path for the entire study area.
Still, this research work can be improved further. Meteorological factors, climate change, traffic emissions, and rainfall have direct and indirect impacts on the ambient air pollution level and should be analyzed during air quality modeling. We have not considered the meteorological and traffic impact on air pollution because of a lack of available data. There is further scope for applying this approach to a larger area rather than limiting it to a particular confined region. In addition, more pollutant parameters can be included, which would benefit users given the severe impact of those pollutants on individuals.