A fireworks algorithm based Pi-Sigma neural network (FWA-PSNN) for modelling and forecasting chaotic crude oil price time series

Capturing the complex correlations in crude oil price time series is challenging, which makes accurate prediction difficult. In contrast to multilayer artificial neural networks, the Pi-Sigma neural network (PSNN) is characterized by stronger approximation ability, faster learning and higher fault tolerance. The fireworks algorithm (FWA) is a recent metaheuristic inspired by fireworks explosions and is characterized by fast convergence, parallelism and the ability to find global optima. This article hybridizes the two to achieve a synergetic effect. It uses FWA to optimize the weights and biases of the PSNN (FWA-PSNN) and so overcomes the limitations of gradient based training. The PSNN has a single hidden layer with one set of trainable weights, hence it is fast and robust. FWA-PSNN is evaluated on the prediction of crude oil price time series. Extensive simulation results, performance analysis, and statistical significance tests suggest the suitability of FWA-PSNN.


Introduction
Crude oil price plays a major role in the economic growth of a country. During the last few decades the crude oil price has moved erratically. It depends upon multiple socioeconomic as well as political factors. The random fluctuations and high nonlinearities associated with the crude oil price time series make its forecasting difficult. Even a nominal change in the crude oil price affects the petroleum price, crude oil products and the global economy. The volatility in the price series is generally affected by factors like population, demand and supply, political climate and international relationships [1]. Thus, an accurate and efficient prediction tool is crucial for crude oil price forecasting. Advances in computational intelligence techniques, including artificial neural networks (ANNs), offer a better alternative in this domain [2]. ANN-based models have been applied successfully to forecast the crude oil price [3-6]. An ANN resembles and mimics the judgment power of the human intellect [7-9]. It can replicate the way humans solve nonlinear problems, a characteristic that has made it popular for modelling complex systems. ANNs are considered a valuable modelling approach for mapping input-output relationships where the data contain both regularities and exceptions, as in financial time series. These properties attract financial stakeholders to forecast financial time series with ANN based models. Dealing with the uncertainty and nonlinearity of financial data with an ANN-based forecasting method primarily involves identifying patterns in the data and using those patterns to forecast.

Multilayer neural networks have certain issues, such as added computational complexity, which leads to slower learning and lower accuracy. They are characterized by structural complexity and computational overhead, and they behave as black boxes. In contrast, higher order neural networks (HONNs) offer stronger approximation capability, faster learning, larger storage capacity and higher fault tolerance. They also achieve a powerful mapping with a single layer of tunable weights, which overcomes the black box nature [10]. The use of higher order terms can increase the information capacity and help solve intricate nonlinear tasks even with a small network, without compromising convergence. On the other hand, as the network order increases, the number of tunable weights grows, which may lead to additional computation cost. To alleviate this drawback, another kind of HONN, the PSNN, was introduced in [11]; it uses fewer connection weights. A few applications of HONNs to financial time series prediction are proposed in [12-16].

Adjusting the neuron weights and biases is the key factor in ANN training and requires frequent human intervention. The performance of an ANN depends largely on the adjustment of its weight and bias vectors. To circumvent the limitations of gradient descent based ANN training, a large number of nature and bio-inspired optimization techniques have been proposed and applied [17]. Evolutionary computing techniques are based on the behaviour of nature. Usually, these algorithms are inspired by the theory of evolution and are termed evolutionary optimization algorithms or metaheuristics. The idea of imitating concepts from nature has great potential for developing algorithms to solve engineering problems. In the recent past, these techniques have become popular in a wide range of areas including engineering, computer science, medicine, economics, finance and social networks. Often, their performance depends on proper adjustment of several algorithm-specific control parameters. No single technique performs well on all problems. Evolutionary learning techniques such as GA [18], PSO [19], DE [20], ant colony optimization [21] and FWA [22] are more proficient than gradient descent based methods in searching for the optimal solution.

FWA is a recently proposed metaheuristic which simulates the explosion of fireworks at night [22]. Like other nature inspired optimizers, it is a population-based evolutionary algorithm. It tries to find the best fit solution in the search space through the explosion of fireworks. Several applications of FWA to real data mining problems are found in the literature. Meanwhile, improved and enhanced versions of FWA have been proposed and their superiority established. However, its application to crude oil price time series is limited. An analysis and improvement of the fireworks algorithm is suggested in [23]. A cooperative framework for FWA is suggested in [24]. An enhanced version of FWA is proposed and evaluated on standard benchmark functions in [25]. Adaptive FWA and dynamic search in FWA are introduced in [26-27].

The intention of the current work is to investigate the potential of the FWA based PSNN (FWA-PSNN) hybrid model for forecasting the crude oil price time series. The hybrid model employs FWA to optimize the weight and bias vector of the PSNN. Each location (individual) of FWA can be viewed as a possible weight and bias vector for a PSNN. FWA applies local as well as global search in the form of fireworks explosions to explore the best possible weight and bias vector in the search space. The proposed hybrid FWA-PSNN model is evaluated on forecasting crude oil price time series. Its performance is compared with that of four other models, namely PSO-PSNN, GA-PSNN, gradient descent based PSNN (GD-PSNN) and MLP, all trained similarly. The major advantages of the proposed FWA-PSNN based forecasting model are as follows.

The article is structured in five major sections. Section 1 gives a short description of crude oil price forecasting, ANN and FWA. An overview of FWA is given in Section 2. The FWA-PSNN based forecasting is presented in Section 3. Section 4 summarizes the experimental outcomes and analysis. Section 5 gives the concluding remarks, followed by a list of relevant references.

Related studies
Crude oil is a precious commodity for the economic and industrial development of a country as well as for the global economy. Several recent studies have addressed crude oil forecasting with both linear and nonlinear models. The stochastic and nonlinear behaviour of crude oil prices has been demonstrated in [28]. The authors compared statistical and ANN based models and concluded that ANN models produce better forecasts. Traditional approaches, including linear and parametric models, are found to be inefficient in capturing the dynamics of financial time series [29,30].
The first category of forecasting models includes statistical models such as autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA), generalized autoregressive conditional heteroscedasticity (GARCH) and their hybrids. Many authors have proposed this approach for financial and oil market forecasting. Gavriilidis et al. studied the effect of including oil price shocks of different origins in a set of GARCH-X models [31]. Herrera et al. evaluated the relative performance of various econometric models using high frequency intra-day volatility data [32]. A hybrid metaheuristic approach based on ANN, ARIMA, and support vector machines (SVM) was proposed by Naderi et al. [33]. A nonlinear metabolic grey model corrected by ARIMA (NMGM-ARIMA), aimed at more accurate forecasting of China's foreign oil dependence, was demonstrated by Wang et al. [34].
The second category includes artificial intelligence, machine learning and soft computing methods and their hybrids. Huang and Wu [35] developed a deep multiple kernel learning approach for energy commodity price forecasting. Hybrid linear and nonlinear techniques for energy demand forecasting in China and India are proposed in [36]. To study the nonlinear, complex nature of crude oil price movement, Chen et al. [37] proposed a deep learning based model and achieved improved forecasting accuracy. A GA and fast ensemble empirical mode decomposition (GA-FEEMD) approach for forecasting crude oil price time series is proposed in [38]; compared with ARIMA and ANN, the proposed model showed improvement. A PSO optimized grey Markov model was suggested by Hu et al. [39], who found smaller error and better accuracy than competing models. A hybrid of PSO and a radial basis function neural network is proposed in [40] and outperformed other models. A forecasting model based on the Grey Wolf Optimizer is proposed for short term forecasting of energy commodities [41]. The model was compared with models trained by artificial bee colony optimization and differential evolution and was found to be more competitive. A combination of kernel principal component analysis and a DE optimized support vector machine is proposed for crude oil price prediction [42]. A survey of computational methods for crude oil price forecasting is given in [43]. During the last decades, several nature-inspired metaheuristic algorithms as well as machine learning techniques have shown their capability in application areas such as healthcare [44,46], medical image processing [47], query optimization [48], crude oil price forecasting [49], and optimal feature selection and classification [50]. These metaheuristics are population based and able to reach global optima at a reasonable computational cost.

FWA is a recent optimization technique which tries to find the best fit solution in the search space through the explosion of fireworks. It has characteristics such as fast convergence and the ability to reach global optima. An analysis and improvement of FWA is discussed in [23]. A cooperative framework for the fireworks algorithm is proposed in [24]. An enhanced version of the fireworks algorithm is proposed and evaluated on standard benchmark functions in [25]. Adaptive FWA and dynamic search in FWA are introduced in [26-27]. An icing forecasting system based on FWA and a weighted least squares support vector machine is proposed in [44], and the results show that the model achieves high prediction accuracy. FWA has also been applied to portfolio optimization and found to be better than other swarm intelligence methods [45]. Applications of FWA and adjusted FWA to multilevel image thresholding and retinal image registration are found in [46,47].

Methods
This section describes the metaheuristic used in this article, i.e. FWA, for optimizing the weight and bias vector of the PSNN, followed by the proposed FWA-PSNN based forecasting. The full mathematical detail of FWA is beyond the scope of this article; the base articles are cited wherever required.

FWA metaheuristic
The fireworks algorithm is a recently proposed optimization technique which mimics the explosion process of fireworks [22]. It selects a certain number of locations in a search space for the explosion of fireworks, each producing a set of sparks. Locations with good quality fireworks are carried over to the subsequent generation. The procedure continues iteratively until a desired optimum is reached or the stopping condition is met. The process mainly comprises three steps: setting off N fireworks at N selected locations; obtaining the locations of the sparks after explosion and evaluating them; and stopping on reaching the optimal location, or otherwise selecting N other locations for the next generation of explosions. An explosion of fireworks can be viewed as a search process in the local space. According to the basic FWA [22], for each firework $x_i$, the amplitude of explosion ($A_i$) and the number of sparks ($s_i$) are defined as follows:

$$A_i = \hat{A} \cdot \frac{f(x_i) - f_{min} + \varepsilon}{\sum_{l=1}^{p} \left(f(x_l) - f_{min}\right) + \varepsilon} \quad (1)$$

$$s_i = m \cdot \frac{f_{max} - f(x_i) + \varepsilon}{\sum_{l=1}^{p} \left(f_{max} - f(x_l)\right) + \varepsilon} \quad (2)$$

where $\hat{A}$ is the maximum amplitude of explosion, and $f_{max}$ and $f_{min}$ are the maximum and minimum objective function values among the p fireworks. The parameter m controls the total number of sparks generated by the fireworks and $\varepsilon$ is a small constant used to avoid division by zero. Bounds are imposed on $s_i$ to limit the overwhelming effect of exceptionally good fireworks as follows:

$$\hat{s}_i = \begin{cases} round(a \cdot m) & \text{if } s_i < a \cdot m \\ round(b \cdot m) & \text{if } s_i > b \cdot m,\ a < b < 1 \\ round(s_i) & \text{otherwise} \end{cases} \quad (3)$$

where a and b are constant parameters. The location of each spark $x_j$ generated by $x_i$ is calculated by selecting z directions randomly and, for each dimension k, setting the component $x_j^k$ based on $x_i^k$, where $1 \le j \le s_i$, $1 \le k \le z$.

The component $x_j^k$ can be set in two ways:

• For most sparks, a displacement is added to $x_j^k$:

$$x_j^k = x_j^k + A_i \cdot rand(-1, 1) \quad (4)$$

• To maintain diversity, for a few specific sparks an explosion coefficient based on the Gaussian distribution is applied to $x_j^k$:

$$x_j^k = x_j^k \cdot Gaussian(1, 1) \quad (5)$$

A new location that falls outside the search space is mapped back into it as:

$$x_j^k = x_{min}^k + |x_j^k| \; \% \; (x_{max}^k - x_{min}^k) \quad (6)$$

where % is the modulo operator and $x_{min}^k$, $x_{max}^k$ are the bounds of the search space in dimension k.
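The explosion step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names are hypothetical, the formulas follow the basic FWA of [22] for a minimization problem, the search bounds are taken as scalars, and the number of affected dimensions is drawn uniformly at random.

```python
import numpy as np

def spark_counts_and_amplitudes(fitness, m=50, a=0.04, b=0.8, A_hat=40.0, eps=1e-12):
    """Return (bounded spark count, explosion amplitude) per firework (minimization)."""
    f = np.asarray(fitness, dtype=float)
    f_max, f_min = f.max(), f.min()
    # Fireworks with lower error receive more sparks ...
    s = m * (f_max - f + eps) / (np.sum(f_max - f) + eps)
    # ... bounded below by round(a*m) and above by round(b*m).
    s_hat = np.where(s < a * m, round(a * m),
                     np.where(s > b * m, round(b * m), np.round(s))).astype(int)
    # Fireworks with lower error explode with a smaller amplitude (finer local search).
    A = A_hat * (f - f_min + eps) / (np.sum(f - f_min) + eps)
    return s_hat, A

def map_into_bounds(x, lower, upper):
    """Map out-of-range components back into [lower, upper) with the modulo rule."""
    out = x.copy()
    outside = (out < lower) | (out > upper)
    out[outside] = lower + np.abs(out[outside]) % (upper - lower)
    return out

def displacement_spark(x, amplitude, lower, upper, rng):
    """Ordinary spark: add one random displacement to randomly chosen dimensions."""
    spark = x.copy()
    dims = rng.choice(x.size, size=rng.integers(1, x.size + 1), replace=False)
    spark[dims] += amplitude * rng.uniform(-1.0, 1.0)
    return map_into_bounds(spark, lower, upper)

def gaussian_spark(x, lower, upper, rng):
    """Diversity spark: scale randomly chosen dimensions by a Gaussian(1, 1) factor."""
    spark = x.copy()
    dims = rng.choice(x.size, size=rng.integers(1, x.size + 1), replace=False)
    spark[dims] *= rng.normal(1.0, 1.0)
    return map_into_bounds(spark, lower, upper)
```

With three fireworks of error 0.1, 0.5 and 0.9, the first receives the most sparks and the smallest amplitude, so the search is densest around the most promising location.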
The next step is the selection of N locations for the next fireworks explosion. This step always keeps the current best location $x^*$ for the next generation. The remaining N-1 locations are chosen on the basis of their distance to the other locations. The distance between a location $x_i$ and the other locations (K) is calculated as the sum of the Euclidean distances between them:

$$R(x_i) = \sum_{j \in K} d(x_i, x_j) = \sum_{j \in K} \lVert x_i - x_j \rVert \quad (7)$$

A location $x_i$ is selected for the next generation with probability:

$$P(x_i) = \frac{R(x_i)}{\sum_{j \in K} R(x_j)} \quad (8)$$

Since the sparks are driven by the power of the explosion, they move along z directions simultaneously, which lets FWA converge faster. FWA also avoids premature convergence through the two types of spark generation and the distance-based location selection [22]. The advantages of FWA over standard PSO and its improved variants are demonstrated in [22].
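The selection step can be illustrated as follows. This is a hedged sketch: `select_locations` is a hypothetical helper, the probabilities are normalized over the non-best candidates only, and sampling is done without replacement, which the text does not specify.

```python
import numpy as np

def select_locations(locations, fitness, n, rng):
    """Return indices of n locations: the best one plus n-1 drawn by distance."""
    locs = np.asarray(locations, dtype=float)
    f = np.asarray(fitness, dtype=float)
    best = int(np.argmin(f))                      # current best is always kept
    # R[i] = sum of Euclidean distances from location i to all other locations
    diff = locs[:, None, :] - locs[None, :, :]
    R = np.sqrt((diff ** 2).sum(axis=2)).sum(axis=1)
    rest = np.delete(np.arange(len(locs)), best)
    total = R[rest].sum()
    # Fall back to a uniform draw if all candidates coincide
    prob = R[rest] / total if total > 0 else None
    chosen = rng.choice(rest, size=n - 1, replace=False, p=prob)
    return np.concatenate(([best], chosen))
```

Locations far from the crowd get a higher selection probability, which is what keeps the population diverse from one generation to the next.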

FWA-PSNN based forecasting
The PSNN belongs to the class of HONNs and has a two-layer architecture, as shown in Figure 1. The network is feed forward and fully connected [11]. The first layer is composed of summing units and the second layer of product units. The network inputs are fed to the layer of summing units, and their outputs are fed to the output layer, which consists of a product unit.
The weights between the input and the summing layer are trainable, whereas the weights between the summing and product layers are non-tunable and fixed to unity. Since only one tunable weight set is required, the network achieves a significant reduction in training time. Linear activation functions are used at the summing units and a nonlinear activation function at the product unit. The order of the network increases by one with each additional summing unit. The product unit gives the network higher order capability by expanding the input space to a higher dimension. In this way, the network avoids an exponential growth in the number of weights while still offering better nonlinear separability.

Figure 1. FWA-PSNN based forecasting
The output of FWA-PSNN is calculated as follows. The output $h_j$ of the j-th summing unit of the hidden layer is computed by summing the products of each input $x_i$ and the corresponding weight $w_{ij}$, as in Eq. 9.

$$h_j = \sum_{i=1}^{n} w_{ij} x_i \quad (9)$$

where n is the number of input signals fed to the input neurons. The output unit computes the product of the outputs of the summing units of the hidden layer and passes it through a nonlinear sigmoid activation, as in Eq. 10.

$$\hat{y} = f\left(\prod_{j=1}^{m} h_j\right), \quad f(u) = \frac{1}{1 + e^{-u}} \quad (10)$$

where m is the order of the network and $\prod$ is the product operator. The target (y) is then supplied to the output neuron and compared with the model estimate ($\hat{y}$), and the error value is calculated as follows.

error = |y − ŷ| (11)
The objective is to minimize the error function value in Eq. 11.
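A minimal sketch of the PSNN forward pass and the error of Eq. 11 is given below. The bias term `b` added at each summing unit is an assumption made here because the model optimizes a weight and bias vector; with `b` set to zero the summing layer is a plain weighted sum of the inputs. Function names are hypothetical.

```python
import numpy as np

def psnn_forward(x, W, b):
    """PSNN output for one input pattern.

    x: inputs, shape (n,); W: trainable weights, shape (m, n);
    b: summing-unit biases, shape (m,) (assumed, see lead-in).
    """
    h = W @ x + b                      # linear summing units
    net = np.prod(h)                   # product unit, fixed unit weights
    return 1.0 / (1.0 + np.exp(-net))  # sigmoid activation

def abs_error(y, y_hat):
    """Absolute error between target and estimate (Eq. 11)."""
    return abs(y - y_hat)
```

For a network of order m with n inputs, only the m x n weights (and m biases) are tunable, which is the source of the reduced training cost discussed above.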
Each candidate solution represents one potential weight set for the model. The candidate with the least error is considered the best candidate solution. The weight set is optimized by FWA. A location (individual) of FWA can be viewed as a potential weight and bias vector for the PSNN in the search space. At the beginning, a set of such locations is initialized and, for each location, the two types of explosion discussed in Section 2 are carried out. These explosion methods achieve both exploration and exploitation of the search space. The locations are then evaluated in terms of the error signal they generate. The location with the lowest error signal is considered the best location. The selection process is then carried out, retaining this best location and choosing the remaining locations as described in Section 2.
The above process continues until an optimal location is found, at which point the search terminates. The best location is the optimal weight and bias vector for the PSNN model. The FWA-PSNN based forecasting is presented in Algorithm 1.
The FWA parameters are chosen as suggested in [22]. TrainData and TestData are generated from the original time series using a fixed-size window sliding over it. The training and test datasets are normalized using the sigmoid normalization method. This is explained in Subsection 5.
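The overall training procedure can be sketched as a single loop. This is an illustrative sketch under several assumptions rather than the algorithm listing of the paper: the fitness is taken as the mean absolute error over the training patterns, the search space is assumed to be [-1, 1] in every dimension (hence the smaller default `A_hat`), each dimension is perturbed with probability 0.5, and a fixed iteration budget replaces the stopping test. All function names are hypothetical.

```python
import numpy as np

def psnn_predict(theta, X, order):
    """PSNN outputs for X (samples x n); theta packs weights, then biases."""
    n = X.shape[1]
    W = theta[:order * n].reshape(order, n)
    bias = theta[order * n:]
    h = X @ W.T + bias                                   # summing layer
    return 1.0 / (1.0 + np.exp(-np.prod(h, axis=1)))     # product unit + sigmoid

def fitness(theta, X, y, order):
    """Mean absolute error of one location (assumed fitness)."""
    return float(np.mean(np.abs(y - psnn_predict(theta, X, order))))

def wrap(x, lo, hi):
    """Map out-of-range components back into the search space."""
    out = x.copy()
    bad = (out < lo) | (out > hi)
    out[bad] = lo + np.abs(out[bad]) % (hi - lo)
    return out

def fwa_psnn_train(X, y, order=2, N=5, m=50, a=0.04, b=0.8, A_hat=1.0,
                   m_hat=5, iters=50, lo=-1.0, hi=1.0, seed=0, eps=1e-12):
    rng = np.random.default_rng(seed)
    dim = order * X.shape[1] + order
    fireworks = rng.uniform(lo, hi, size=(N, dim))       # initial locations
    for _ in range(iters):
        f = np.array([fitness(t, X, y, order) for t in fireworks])
        s = m * (f.max() - f + eps) / (np.sum(f.max() - f) + eps)
        s = np.clip(np.round(s), round(a * m), round(b * m)).astype(int)
        A = A_hat * (f - f.min() + eps) / (np.sum(f - f.min()) + eps)
        pool = list(fireworks)
        for fw, s_i, A_i in zip(fireworks, s, A):        # ordinary sparks
            for _k in range(s_i):
                spark = fw.copy()
                dims = rng.random(dim) < 0.5
                spark[dims] += A_i * rng.uniform(-1.0, 1.0)
                pool.append(wrap(spark, lo, hi))
        for _k in range(m_hat):                          # Gaussian sparks
            spark = fireworks[rng.integers(N)].copy()
            dims = rng.random(dim) < 0.5
            spark[dims] *= rng.normal(1.0, 1.0)
            pool.append(wrap(spark, lo, hi))
        pool = np.array(pool)
        pf = np.array([fitness(t, X, y, order) for t in pool])
        best = int(np.argmin(pf))                        # always retained
        dist = np.linalg.norm(pool[:, None, :] - pool[None, :, :], axis=2).sum(axis=1)
        rest = np.delete(np.arange(len(pool)), best)
        total = dist[rest].sum()
        prob = dist[rest] / total if total > 0 else None
        keep = rng.choice(rest, size=N - 1, replace=False, p=prob)
        fireworks = np.vstack([pool[best], pool[keep]])
    f = np.array([fitness(t, X, y, order) for t in fireworks])
    return fireworks[int(np.argmin(f))], float(f.min())
```

Because the best location in the pool is always carried over, the best training error never increases from one generation to the next.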

Experimental data
The crude oil prices (dollars per barrel) are obtained from the US Department of Energy, Energy Information Administration web site (http://www.eia.doe.gov/) for the period April 1983 to July 2019. The crude oil price series are shown in Figures 2-5. Information about the datasets and their descriptive statistics are summarized in Table 1 and Table 2 respectively.

Input selection and normalization
A sliding window of fixed size is used to select the inputs for the forecasting models [55]. In this method, rather than using all data points observed so far, or some sample of them, the decision is based only on the most recent data points. Each time the window slides, a new data point is incorporated and the oldest one is discarded. The window moves through the whole time series. Selecting the window size is a matter of experimentation. For instance, in one-step-ahead forecasting each pattern generated by the sliding window consists of blen consecutive observations as inputs and the next observation as the target, where blen is the window size and l is the training length. All the time series are normalized before being fed to the ANN model [56].
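The pattern generation and normalization can be sketched as below. The exact form of the sigmoid normalization is not given in the text; the version here, a logistic function of the standardized value, is an assumption, and both function names are hypothetical.

```python
import numpy as np

def sliding_window_patterns(series, blen):
    """Build one-step-ahead patterns: blen past values -> next value."""
    s = np.asarray(series, dtype=float)
    X = np.array([s[i:i + blen] for i in range(len(s) - blen)])
    y = s[blen:]
    return X, y

def sigmoid_normalize(x, mean, std):
    """Squash values into (0, 1); mean and std should come from training data."""
    z = (np.asarray(x, dtype=float) - mean) / std
    return 1.0 / (1.0 + np.exp(-z))
```

For a series 1, 2, ..., 6 and blen = 3, the patterns are (1, 2, 3) -> 4, (2, 3, 4) -> 5 and (3, 4, 5) -> 6. Computing the mean and standard deviation on the training portion only avoids leaking test information into the model.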

EAI Endorsed Transactions on Energy Web, Online First

The number of iterations for GA, PSO, FWA, and GD was set to 400, 250, 250, and 600 respectively. The optimal architecture of the MLP is 6-18-1 and that of the PSNN is 6-12-1. The average performance of each model over 20 runs is used for comparison.
To validate the proposed model, we developed four other forecasting models: GA-PSNN, PSO-PSNN, GD-PSNN and MLP. All five models are trained in a similar way.
The error statistics (minimum, maximum, average and standard deviation) generated by the five models on the four datasets are summarized in Table 3. The predicted crude oil prices are plotted against the actual prices in Figures 6-9.
For the daily price data, all models generate acceptably low error values. However, FWA-PSNN obtains an average error of 0.0016, which is lower than that of the other models. The computed DM statistics (Table 4) lie beyond the critical range. Hence, the null hypothesis of no difference between the proposed model and the other models is rejected.
To compare the forecasts considered, we computed the relative worth (RW) of each model [57]. It considers the average percentage reduction in forecast error over all datasets.

Conclusions
To capture the uncertainties associated with crude oil prices, this paper proposed a hybrid forecasting model termed FWA-PSNN. The model combines the fast and effective learning ability of FWA with the better generalization property of the PSNN to model and forecast crude oil prices efficiently. FWA is used to search for the most favourable weight and bias vector of the PSNN based forecast and is able to minimize the forecasting error.
Figures 2-5 show that the crude oil price time series are highly fluctuating and chaotic by nature, with random peaks and falls. All experiments were carried out in the MATLAB-2015 environment on an Intel® Core™ i3 CPU with 2.27 GHz processing speed and 2.42 GB of memory.

Figure 4. Monthly crude oil price data

The normalized data are fed to the model. To reduce the effect of the stochastic behaviour of ANN based forecasts, we simulated the model twenty times for each training dataset. The mean error over the twenty simulations is taken as the average performance of a model. Adaptive training was followed as suggested in [13,15,55]. During model simulation, different feasible values of the model parameters were tried and the best values recorded. The acceleration coefficients for PSO were chosen as c1 = 2.15 and c2 = 2.15. The population size was set to 60 and the selection mechanism followed the global best. The GA used a population of size 70, a crossover probability of 0.6 and a mutation probability of 0.004. It used elitism for the selection operation (20% best-fit individuals + 80% binary tournament selection). The learning rate and momentum factor of GD were set to 0.1 and 0.3 respectively. The parameters of FWA were set as suggested in [22] (i.e. N = 5, m = 50, a = 0.04, b = 0.8, Â = 40 and m̂ = 5).

Figure 7. Actual crude oil prices vs. prices estimated by FWA-PSNN from the weekly crude oil price time series

In the RW computation, $err_{ij}$ denotes the error from the j-th model on the i-th dataset and $err_i^w$ is the error of the worst individual model for the same dataset. D is the total number of datasets. The computed RW values are summarized in Table 5.

Table 1.
Information about four time series

Table 2.
Descriptive statistics from crude oil price datasets

Table 3.
Forecasting errors from all models

Figure 6. Actual crude oil prices vs. prices estimated by FWA-PSNN from the daily crude oil price time series

Figure 6 shows the closeness of the FWA-PSNN estimated crude oil prices to the actual prices. Similar results are observed for the weekly and monthly price datasets. For the annual crude oil time series, the average error generated by FWA-PSNN is slightly higher than for the other three time series. This might happen due to an insufficient training sample, i.e. only 36 data points are available in the annual crude oil price time series. The observations from the experimental work can be summarized as follows:

• Compared to the multilayer ANN, the higher order neural network (PSNN here) produced more accurate forecasts.

Table 4.
Computed DM statistic values

To establish the advantage of the proposed model, we conducted a statistical significance test, i.e. the Diebold-Mariano (DM) test [55,58]. It compares the forecasts pairwise. The computed DM statistics are presented in Table 4.
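The pairwise comparison can be illustrated with a basic form of the DM statistic for one-step-ahead forecasts. This sketch assumes a power loss on the forecast errors (squared error by default), no autocorrelation correction, and a normal approximation for the p-value; `dm_test` is a hypothetical helper and the settings used in the experiments are not stated in the text.

```python
import math
import numpy as np

def dm_test(actual, pred1, pred2, power=2):
    """Diebold-Mariano statistic and two-sided p-value for horizon h = 1.

    A negative statistic means forecast 1 has the lower average loss.
    """
    actual = np.asarray(actual, dtype=float)
    e1 = actual - np.asarray(pred1, dtype=float)
    e2 = actual - np.asarray(pred2, dtype=float)
    d = np.abs(e1) ** power - np.abs(e2) ** power   # loss differential
    T = d.size
    var_d = d.var()                                  # lag-0 autocovariance
    if var_d == 0:
        raise ValueError("loss differential has zero variance")
    dm = d.mean() / math.sqrt(var_d / T)
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))   # normal approximation
    return dm, p_value
```

At the 5% level, a statistic outside the range [-1.96, 1.96] leads to rejecting the null hypothesis that the two forecasts are equally accurate.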

It can be observed that the relative worth value of the proposed model is highest, i.e. 69.7968% followed by PSO-PSNN and GA-PSNN with values 42.0950% and 28.9129%.The GD-PSNN model obtained the lowest worth value of 11.1994%.
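One plausible reading of the relative worth measure is the percentage reduction in error of each model relative to the worst model on the same dataset, averaged over the D datasets. The sketch below implements that reading; it is an assumption, since the defining equation is not available here, and the error values in the usage note are made up purely for illustration.

```python
import numpy as np

def relative_worth(errors):
    """Relative worth (%) per model from an error matrix.

    errors: shape (D datasets, M models); errors[i, j] is the error of
    model j on dataset i. Assumes every dataset's worst error is positive.
    """
    E = np.asarray(errors, dtype=float)
    worst = E.max(axis=1, keepdims=True)            # worst model per dataset
    return 100.0 * np.mean((worst - E) / worst, axis=0)
```

For a hypothetical error matrix `[[1, 2, 4], [2, 2, 4]]` (two datasets, three models), the first model halves or better the worst error on both datasets and so receives the highest relative worth, while the worst model receives zero.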

Table 5.
Relative worth values of forecasting models

The proposed FWA-PSNN based forecasting is validated on one-step-ahead prediction for four real crude oil price datasets. The performance of FWA-PSNN is compared with that of four other forecasts, namely PSO-PSNN, GA-PSNN, GD-PSNN and MLP, all trained similarly. Extensive simulation studies and result analysis show that FWA-PSNN performed better than the others. The current work may be extended by applying the proposed model to other data mining problems. Exploring other nature inspired optimization techniques is another direction.

[35] Huang, S. C., & Wu, C. F. (2018).