Predictive Analytics In Weather Forecasting Using Machine Learning Algorithms

Agriculture is the backbone of every economy. In a country like India, where demand for food keeps increasing with the rising population, advances in the agriculture sector are required to meet these needs. In addition, the present economic conditions and government policies of India necessitate the adoption of precision farming, or smart farming. With the help of this model, it will enable farmers to maximize their crop yields and minimize input costs as well as losses due to causes such as uncertain rainfall and droughts. To forecast the weather, we use machine learning algorithms such as linear regression and decision trees.


Introduction
Machine learning is among the most robust techniques for weather forecasting. In the past, we had to give explicit instructions to a system before it produced a result. With machine learning algorithms, we can instead supply the inputs and features directly, and the result is generated automatically; we only need to train on the data, and training produces the model. [1] Most of the work on machine learning for agriculture either addresses the cultivation of a crop or suggests weather data based on statistical information. [2] Most of this work does not handle the planting of crops based on the climate. [3] Plant diseases and insect pests cause a significant reduction in the quality as well as the quantity of agricultural products, so forecasting plant diseases and insect pests is of great significance and quite necessary.

Machine Learning Algorithms
Machine learning algorithms are described as learning a target function (f) that best maps input variables (X) to an output variable (Y): Y = f(X).

Linear Regression
Linear regression is the most basic and most frequently used predictive model for analysis. Regression estimates are generally used to describe the data and to elucidate the relationship between one or more independent variables and a dependent variable. Graphically, linear regression finds the best-fit line through the data points; this line is known as the regression line.

OLS Model
The ordinary least squares (OLS) model is the most common estimation method used for linear models. It obtains the best estimates by minimizing the sum of squared differences between the actual and predicted values of the dependent variable. Because it measures the distance between the predicted value and the actual value, it helps us find the relationship between the dependent variable and the independent variables.
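
As a minimal sketch of the idea above, the following NumPy snippet fits a straight line by ordinary least squares. The humidity and temperature values are made-up illustrative numbers, not data from this paper, and all variable names are assumptions.

```python
import numpy as np

# Hypothetical data: humidity (%) as the independent variable,
# temperature (deg C) as the dependent variable. Values are illustrative only.
humidity = np.array([30.0, 40.0, 50.0, 60.0, 70.0])
temperature = np.array([35.0, 32.0, 29.0, 26.0, 23.0])

# Design matrix with a column of ones so that the model has an intercept.
X = np.column_stack([np.ones_like(humidity), humidity])

# OLS chooses the coefficients that minimize the sum of squared residuals.
coef, _, _, _ = np.linalg.lstsq(X, temperature, rcond=None)
intercept, slope = coef

# Residual sum of squares: the quantity that OLS minimizes.
predicted = X @ coef
residual_sum_of_squares = float(np.sum((temperature - predicted) ** 2))
print(f"intercept={intercept:.2f}, slope={slope:.2f}, RSS={residual_sum_of_squares:.4f}")
```

The column of ones is a design choice that lets the fitted line have a nonzero intercept; without it the line would be forced through the origin.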

Advantage
• Simple mathematical representation.
• It does not require much memory.
• It is very easy to interpret, because it produces numerical results.

Disadvantage
• It requires data that follows a linear trend. If there are many features, it does not provide accurate results.
• The linear regression model fails when the data is nonlinear.

Algorithm Steps
• Import all libraries and read weather data.
• Split the data, train on the training set, then test on the test set.
• Create the linear regression model.
• Predict the future weather.
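
The steps above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' implementation: the weather data is synthetic (a real workflow would read it from a file), and the column choice (humidity, pressure) and all variable names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for weather data (assumption).
# Columns: humidity (%), pressure (hPa).
rng = np.random.default_rng(0)
humidity = rng.uniform(20, 90, size=200)
pressure = rng.uniform(990, 1030, size=200)
features = np.column_stack([humidity, pressure])
# Temperature generated from a linear rule plus noise, purely for illustration.
temperature = (40.0 - 0.2 * humidity + 0.01 * (pressure - 1000)
               + rng.normal(0, 0.5, size=200))

# Split the data into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    features, temperature, test_size=0.25, random_state=0
)

# Create and train the linear regression model.
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate on the held-out data, then predict for a new observation.
r2 = model.score(X_test, y_test)
new_observation = np.array([[55.0, 1012.0]])
forecast = model.predict(new_observation)[0]
print(f"R^2 on test data: {r2:.3f}; forecast temperature: {forecast:.1f}")
```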

Decision Tree
A decision tree is a type of supervised learning algorithm that is mostly used for classification problems. It works for both categorical and continuous dependent variables. In this type of algorithm, we split the population into two or more homogeneous sets. The split is based on the most significant attributes (independent variables), so as to make the groups as distinct as possible.
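
One common way to measure how homogeneous a group is uses Gini impurity. The paper does not name a splitting criterion, so its use here is an assumption, and the toy data and function names below are hypothetical. The sketch compares the impurity before and after one candidate split.

```python
import numpy as np

def gini_impurity(labels):
    """Return the Gini impurity of a 1-D array of class labels (0 means pure)."""
    _, counts = np.unique(labels, return_counts=True)
    proportions = counts / counts.sum()
    return 1.0 - float(np.sum(proportions ** 2))

def weighted_impurity_after_split(feature, labels, threshold):
    """Impurity of the two groups formed by feature <= threshold, weighted by size."""
    left = labels[feature <= threshold]
    right = labels[feature > threshold]
    n = len(labels)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)

# Hypothetical toy data: humidity (%) and whether it rained (1) or not (0).
humidity = np.array([35, 40, 45, 70, 80, 85])
rained = np.array([0, 0, 0, 1, 1, 1])

before = gini_impurity(rained)
after = weighted_impurity_after_split(humidity, rained, threshold=50)
print(f"impurity before split: {before:.2f}, after split: {after:.2f}")
```

A split that lowers the weighted impurity produces more homogeneous groups; a tree-building algorithm would compare many candidate features and thresholds in this way and keep the best one.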

Data Shaping

Label Selection
After shaping the data, we select the features to use as labels. Labels are created in order to classify the data; we select the first two columns of the data for labelling. After labelling, we move on to splitting.

Splitting
After labelling, we split on the labels to find the best feature. The accuracy of the predicted result depends on this best feature.

Advantage
• This algorithm handles both continuous and categorical data.
• A decision tree is useful when the data is nonlinear, because it partitions the data set recursively instead of fitting a single line.

Disadvantage
• If you have more features, your decision tree is likely to be deeper and bigger.
• It normally overfits a lot, as it creates high-variance models.
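
The overfitting tendency can be illustrated with a small experiment. This is a sketch on synthetic, deliberately noisy data (an assumption, not the paper's data set); limiting the depth with max_depth is one common remedy and is not something the paper prescribes.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, noisy data (assumption): rain is likelier at high humidity,
# but 20% of the labels are flipped to imitate measurement noise.
rng = np.random.default_rng(1)
features = rng.uniform(0, 100, size=(400, 2))  # humidity, cloud cover
labels = (features[:, 0] > 60).astype(int)
flip = rng.random(400) < 0.2
labels[flip] = 1 - labels[flip]

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.3, random_state=1
)

# An unrestricted tree keeps splitting until it memorizes the training set.
deep_tree = DecisionTreeClassifier(random_state=1).fit(X_train, y_train)
# Limiting the depth is one common way to reduce variance.
shallow_tree = DecisionTreeClassifier(max_depth=2, random_state=1).fit(X_train, y_train)

for name, tree in [("unrestricted", deep_tree), ("max_depth=2", shallow_tree)]:
    print(name, "depth:", tree.get_depth(),
          "train acc:", round(tree.score(X_train, y_train), 3),
          "test acc:", round(tree.score(X_test, y_test), 3))
```

The unrestricted tree fits the noisy training labels perfectly, which is the memorization behaviour that typically hurts accuracy on unseen data.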

Algorithm Steps
• Import all libraries and read weather data.
• Shape all data.
• Train the classifier, then classify the data.
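
The steps above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions: the records are synthetic, the labelling rule is hypothetical, and "shaping" is interpreted here as arranging the columns into a two-dimensional feature matrix, which the paper does not spell out.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for weather records (assumption); a real workflow would
# read these from a data file. Columns: humidity (%), temperature (deg C).
rng = np.random.default_rng(42)
humidity = rng.uniform(20, 100, size=300)
temperature = rng.uniform(10, 40, size=300)
# Hypothetical labelling rule: "rain" when it is humid and not too hot.
labels = np.where((humidity > 65) & (temperature < 30), "rain", "no rain")

# Shape the data into a 2-D feature matrix with one row per record.
features = np.column_stack([humidity, temperature])

# Split the data, then train the classifier on the training portion.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=42
)
classifier = DecisionTreeClassifier(random_state=42)
classifier.fit(X_train, y_train)

# Classify the held-out records and measure the accuracy.
predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"test accuracy: {accuracy:.3f}")
```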

SKlearn
scikit-learn (sklearn) is a very useful library for machine learning modelling. It was initially released in 2007 and includes many machine learning algorithms. From this library we use components such as DecisionTreeClassifier, train_test_split, and accuracy_score.

Numpy
NumPy is mainly used for mathematical operations. Data is read into NumPy arrays for manipulation, and the library provides fast mathematical functions for calculation. It is a very common library in machine learning.

Data Preprocessing
In general, the more carefully the data set is preprocessed, the more accurate the results will be. Preprocessing is the process of removing unwanted, unhelpful, or noisy data from the collected data. If null values or empty fields are not removed, we cannot get proper results. It is therefore a very important step in developing the model.
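
As a minimal sketch of removing records with null values, the snippet below drops every row that contains a missing entry. The readings are made-up, and representing a missing value as np.nan is an assumption about how the data was loaded.

```python
import numpy as np

# Hypothetical raw readings: temperature (deg C), humidity (%).
# np.nan marks a missing value.
raw = np.array([
    [30.5, 60.0],
    [np.nan, 55.0],
    [28.0, np.nan],
    [31.2, 58.0],
])

# Keep only the rows in which no field is missing.
complete_rows = ~np.isnan(raw).any(axis=1)
cleaned = raw[complete_rows]
print(cleaned)
```

Dropping rows is the simplest option; filling missing values with an estimate is an alternative when too many records would otherwise be lost.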

Normalization
This stage is also known as the machine learning module. Here we train on the collected dataset, test on the dataset, and then generate the new model; for cross-validation, we again blind (hold out) part of the dataset.
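
Cross-validation with held-out data can be sketched as follows. The paper does not state which scheme is used, so 5-fold cross-validation is an assumption, as are the synthetic data and the choice of a decision tree as the model.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (assumption): humidity and temperature with a rain/no-rain label.
rng = np.random.default_rng(7)
features = np.column_stack([rng.uniform(20, 100, 200), rng.uniform(10, 40, 200)])
labels = (features[:, 0] > 65).astype(int)

# 5-fold cross-validation: each fold is held out ("blinded") once while the
# model is trained on the other four folds.
scores = cross_val_score(DecisionTreeClassifier(random_state=7), features, labels, cv=5)
print("accuracy per fold:", np.round(scores, 3), "mean:", round(scores.mean(), 3))
```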

Learn Model
This is the last stage. In this phase we learn the model and use it to predict the result. Learning the model properly is important for obtaining properly evaluated results. The model artefact produced by the training process is obtained here.

Future Scope
As for future scope, linear regression cannot be used when it comes to a huge data set, because it does not give accurate results. For prediction on large volumes of data, a neural network system can therefore be developed for better results and more accurate weather forecasting. The analysis process can also be connected to IoT technology: analysis and prediction cannot be performed without data, and IoT is a major source of data. Data generated by IoT devices will help improve decision making.

Conclusion
Machine learning algorithms play a major role in predictive analytics, which uses current and historical data sets to discover knowledge and, from that knowledge, to predict future occurrences. In this paper we have proposed two algorithms, linear regression and decision tree, for weather forecasting and prediction. We conclude that linear regression is best for weather forecasting when the dataset contains linearly dependent data, because linear regression requires linear data, whereas for the decision tree the labels must be given manually. The main disadvantages of the decision tree are that, with more features, the tree is likely to become deeper and bigger, and that it normally overfits a lot because it creates high-variance models.