Analysis on Improving the Response Time with PID-SARSA-RAL in ClowdFlows Mining Platform

This paper presents an improved approach to parallel data processing for Big Data mining on the ClowdFlows platform. The approach improves a Proportional Integral Derivative (PID) controller using Reinforcement Adaptive Learning (RAL). The RAL component uses actor-critic State–action–reward–state–action (SARSA) learning, which is well suited to the stream mining module of the ClowdFlows platform. The study concentrates on batch-mode processing in the Big Data mining model using the proposed PID-SARSA-RAL. Experimental evaluation against the conventional ClowdFlows platform shows the effectiveness of the proposed method over continuous parallel workflow execution.


Introduction
ClowdFlows is an open cloud platform for the composition, execution and interactive sharing of data mining workflows. The platform follows the principle of service-based knowledge discovery in large databases with interactive workflows. It has a user programming interface that provides access to third-party services, and it executes workflows in the cloud environment over static data [1].
The ClowdFlows platform has been improved to operate on real-time data. These improvements allow it to run on different operating systems and to make full use of server resources, reducing computational complexity while increasing the data transfer rate [2].
The ClowdFlows platform uses batch-mode processing to perform distributed computing with the MapReduce framework [3].
The main problem associated with data mining in large databases is data scalability. Hence, an active or semi-supervised learning approach is suitable as a data selection strategy [4] in big data mining systems. Similarly, the use of active learning in the ClowdFlows platform increases the parallelism of data selection. Kranjc et al. [5] used online dynamic adaptive analysis in the ClowdFlows platform, which reduces the computational effort. There, active learning is improved with a Support Vector Machine on microblogging data streams [5].
Few active learning approaches currently exist for improving data selection in big data sets on the ClowdFlows platform. The proposed method therefore concentrates on a feedback loop to improve the active learning strategy in large datasets. A PID controller in a data mining system can be efficient in selecting data from big data environments. Feedforward active learning systems update the PID controller with learning rules [6], [8] and an inverse learning model [7] to improve its adaptation to a particular environment. The adaptive learning of the PID controller has been improved with a reinforcement learning method that takes its input from a state converter [9], and Carlucho et al. [10] used a Q-learning strategy. In [9] the feedback is taken from the input, whereas in [10] it is taken from the output. In [10], temporal memory is used to improve active learning within limited subspaces.
The proposed technique uses reinforcement adaptive learning with the SARSA [11] learning model, which achieves higher rewards per trial [12] than the Q-learning strategy. Hence, using the SARSA learning model as feedback to the PID controller is expected to improve the adaptation rate of the controller in an environment. Further, the PID-SARSA-RAL technique is analysed experimentally as a data selection strategy over large data on the ClowdFlows platform.
The remainder of the paper is organized as follows: Section II describes the proposed reinforcement adaptive learning in the PID controller. Section III evaluates the proposed method against the conventional system. Finally, Section IV concludes the paper with future work.

Reinforcement Learning with PID
The actor-critic model uses the Temporal Difference (TD) method with a separate memory structure that explicitly represents the policy independently of the value function. The policy structure is the actor, which selects the actions. The estimated value function is the critic, which criticizes the actor's actions. Here, the active learning is treated as on-policy: the critic learns with TD and critiques the current data selection policy followed by the actor, that is, the ClowdFlows platform.
The PID controller [11] is a popular method used in different environments because of its fast and efficient computation. The performance of this low-level control procedure depends strongly on its parametric setup. The adaptive PID controller is considered a technology-specific system designed to study the behavior of a particular environment through its tuning parameters.
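For reference, the standard one-step TD quantities behind an actor-critic scheme can be written as below. This is textbook notation assumed for illustration, not taken from the paper: $u(t)$ is the state, $V$ the critic's value estimate, $\delta(t)$ the TD error, $p$ the actor's action preference, and $\alpha_c$, $\alpha_a$ the critic and actor step sizes.

```latex
\begin{align}
\delta(t) &= r(t) + \gamma\, V\bigl(u(t+1)\bigr) - V\bigl(u(t)\bigr) \\
V\bigl(u(t)\bigr) &\leftarrow V\bigl(u(t)\bigr) + \alpha_c\, \delta(t) \\
p\bigl(u(t), a(t)\bigr) &\leftarrow p\bigl(u(t), a(t)\bigr) + \alpha_a\, \delta(t)
\end{align}
```

A positive $\delta(t)$ means the action turned out better than the critic expected, so the actor's preference for it is strengthened; a negative value weakens it.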

PID control Parameter:
The worker node in the ClowdFlows platform [3] implements SARSA learning with feedback from the PID controller. A discrete-time incremental PID, calculated using backward differentiation [13], is selected to process the control variables from the SARSA algorithm. Since automation is an important concern for improving data selection in big datasets, the following Eq. (1) is adopted.
and the integral action of the PID is given as:
The output of the PID controller is given as input to the actor of the SARSA algorithm. The proposed SARSA actor-critic model is implemented over the PID controller, as shown in Fig. 1.
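A minimal sketch of a velocity-form (incremental) PID with backward differences is given here. It uses the common textbook form and is not claimed to match the paper's Eq. (1); the class name `IncrementalPID`, the gain values, and the first-order plant in `simulate` are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class IncrementalPID:
    """Velocity-form discrete PID using backward differences.

    Assumed textbook form (may differ from the paper's Eq. (1)):
        du(t) = kp*(e(t) - e(t-1)) + ki*e(t) + kd*(e(t) - 2*e(t-1) + e(t-2))
        u(t)  = u(t-1) + du(t)
    """
    kp: float
    ki: float
    kd: float
    u: float = 0.0    # previous controller output u(t-1)
    e1: float = 0.0   # previous error e(t-1)
    e2: float = 0.0   # error two steps back, e(t-2)

    def step(self, setpoint: float, measurement: float) -> float:
        e = setpoint - measurement
        du = (self.kp * (e - self.e1)
              + self.ki * e
              + self.kd * (e - 2.0 * self.e1 + self.e2))
        self.u += du
        # Shift the error history for the next call.
        self.e2, self.e1 = self.e1, e
        return self.u


def simulate(steps: int = 200) -> float:
    """Drive a hypothetical plant y(t+1) = 0.9*y(t) + 0.1*u(t) towards 1.0."""
    pid = IncrementalPID(kp=1.0, ki=0.2, kd=0.0)
    y = 0.0
    for _ in range(steps):
        u = pid.step(1.0, y)
        y = 0.9 * y + 0.1 * u
    return y
```

Because this form accumulates increments instead of keeping a running error sum, it is often cited as convenient when gains are adjusted between steps, which is the situation an online tuner creates.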

SARSA Learning Model:
The major difference between the Q-learning and SARSA models lies in the update policy: Q-learning updates with the maximum action value available in the next state, whereas SARSA updates with the value of the action actually selected by the current policy. During active learning, SARSA therefore includes the control policy in its update, while Q-learning ignores it. Hence, the SARSA algorithm stores this information before updating the action values. This allows the SARSA algorithm to reach its target value in fewer intervals than the Q-learning algorithm.
The SARSA algorithm uses reinforcement learning to learn a Markovian decision policy. The update of the Q-values depends entirely on the current state $u_1(t)$ with action $a_1(t)$, which yields the reward $r(t)$. The next state is $u_2(t)$, or $u(t+1)$, with next action $a_2(t)$, or $a(t+1)$. Therefore, the quintuple for the SARSA algorithm is $Q(u_1(t), a_1(t), r(t), u_2(t), a_2(t))$.
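For comparison, the two textbook update rules can be written in terms of this quintuple, with the time argument dropped for brevity. Here $\alpha$ is the learning rate and $\gamma$ the discount factor; these are the standard forms and are not claimed to reproduce the paper's own equations.

```latex
\begin{align}
\text{Q-learning:}\quad Q(u_1, a_1) &\leftarrow Q(u_1, a_1) + \alpha\bigl[r + \gamma \max_{a} Q(u_2, a) - Q(u_1, a_1)\bigr] \\
\text{SARSA:}\quad Q(u_1, a_1) &\leftarrow Q(u_1, a_1) + \alpha\bigl[r + \gamma\, Q(u_2, a_2) - Q(u_1, a_1)\bigr]
\end{align}
```

The only difference is the bootstrap term: Q-learning takes the best next action regardless of what the agent does, while SARSA uses the action $a_2$ that the current policy actually chose.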
Selecting two actions is needed to find the successive state-action pair in addition to the first pair. The learning rate ($\alpha$) and discount factor ($\gamma$) parameters of SARSA are the same as those of Q-learning. With this, the algorithm for the adaptive learning of SARSA using PID is defined.
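The quintuple-based update can be sketched as follows, assuming a tabular Q-function and epsilon-greedy action selection. The function names, the toy five-state chain, and the exploration rate are hypothetical; the defaults 0.2 and 0.9 are assumed here to stand for the learning rate and discount factor. This is an illustration of generic SARSA, not the authors' algorithm listing.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Explore with probability epsilon, else pick a greedy action (random ties)."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    best = max(q[(state, a)] for a in actions)
    return rng.choice([a for a in actions if q[(state, a)] == best])


def sarsa_update(q, s1, a1, reward, s2, a2, alpha=0.2, gamma=0.9):
    """On-policy update from the quintuple (s1, a1, reward, s2, a2)."""
    td_target = reward + gamma * q[(s2, a2)]
    q[(s1, a1)] += alpha * (td_target - q[(s1, a1)])
    return q[(s1, a1)]


def run_chain(episodes=500, n_states=5, epsilon=0.1, seed=0):
    """Toy chain: start in state 0, reward 1 on reaching the last state."""
    rng = random.Random(seed)
    actions = (-1, +1)          # move left or right along the chain
    q = defaultdict(float)      # unseen (state, action) pairs start at 0
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(q, s, actions, epsilon, rng)
        while s != goal:
            s2 = min(max(s + a, 0), goal)
            r = 1.0 if s2 == goal else 0.0
            # The next action is chosen before the update and then executed.
            a2 = epsilon_greedy(q, s2, actions, epsilon, rng)
            sarsa_update(q, s, a, r, s2, a2)
            s, a = s2, a2
    return q
```

Note that `a2` is both the action used in the update and the action taken on the next step, which is what makes the method on-policy.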
Here, $\alpha = 0.2$ and $\gamma = 0.9$ are used, together with a third parameter set to 1. The output of the PID controller, $u(t)$, is taken as the state function of the SARSA algorithm, and the operation continues as a feedback operator. Instead of direct feedback from the process output, the proposed method uses the action variable to find the error difference on the PID controller. Finally, the proposed controller design is modified using $k(t)$, which the actor supplies as a feedback input to the PID. This is represented as:
The data preparation uses a snowflake storage schema for training the agents for up to 15,000 episodes with a time window of 500 over 10 iterations [14]. Features are selected through Mutual Information Maximization and the Adaptive Genetic Algorithm (MIMAGA) [15].
Finally, the selection of the action sequence uses a similar procedure to measure the impact of the proposed method on the data mining approach. Two parameters are used for evaluation: the former is a support measure and the latter is a confidence measure.
The support measure defines the rate of usage of $a(t)$:
$$\mathrm{support}(a(t)) = \frac{c(a(t))}{c(t_n)}$$
Here, $c$ denotes a count, $c(a(t))$ is the number of times the agent performed action $a(t)$, and $c(t_n)$ is the total number of actions performed by the agent.
The confidence measure defines the success of the action $a(t)$:
$$\mathrm{confidence}(a(t)) = \frac{c(a(t) \rightarrow u(t))}{c(a(t))}$$
where $c(a(t) \rightarrow u(t))$ is the number of actions in each mining operation for which the agent attains success, and $c(a(t))$ is the number of actions performed by the agent irrespective of success.
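The two measures can be computed from a log of agent actions, as in the sketch below. It assumes each record is an `(action, succeeded)` pair; the function name `support_and_confidence` and the example log are hypothetical and serve only to illustrate the two ratios.

```python
from collections import Counter
from typing import Hashable, Iterable, Tuple


def support_and_confidence(
    log: Iterable[Tuple[Hashable, bool]], action: Hashable
) -> Tuple[float, float]:
    """Return (support, confidence) of `action` from (action, succeeded) records.

    support    = count(action) / count(all actions)
    confidence = count(action and success) / count(action)
    """
    total = 0
    used = Counter()
    succeeded = Counter()
    for act, ok in log:
        total += 1
        used[act] += 1
        if ok:
            succeeded[act] += 1
    # Avoid division by zero for an empty log or an action never used.
    if total == 0 or used[action] == 0:
        return 0.0, 0.0
    return used[action] / total, succeeded[action] / used[action]


# Hypothetical action log: (action label, whether the mining step succeeded).
example_log = [("select", True), ("select", False), ("skip", True),
               ("select", True), ("skip", False)]
support, confidence = support_and_confidence(example_log, "select")
```

In this example "select" accounts for three of the five logged actions and succeeds in two of its three uses.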

Experimental Analysis
The ClowdFlows platform is evaluated on a set of 10 nodes, each with a 3.60 GHz Intel i7-6567U CPU and 16 GB of memory running 64-bit Ubuntu 16.04.2 LTS. The virtual machines run 64-bit Fedora 25 Workstation with 4 GB of memory. The system is exercised with the Phoronix Test Suite 7.0. The data mining uses confidence and support measures to assess the quality and impact of the data retrieved from large databases. The core objective is to retrieve the relevant data from a large database. With the SARSA algorithm, the PID controller takes less time to settle at the set-point value than the conventional PID controllers, as shown in Fig. 2.
This resulted in an improved response to a given query at the user end. The proposed PID controller is used to measure the success rate of feature selection. The selection of features is based purely on inputs from the SARSA algorithm. The error variable in the PID controller is reduced considerably, which reduces the response time in finding the samples or datasets in the big data. The success rate is shown in Fig. 3. The support rate recorded without PID and SARSA as input decreases as the number of actions or agents in the data space increases (Fig. 4). We also measured the confidence (Fig. 5) and found the values to be too low, which reduced the success rate of the SARSA algorithm. In contrast, with the improved response time, or reduced delay, of the PID controller, the support and confidence rates improved.

Figure 2. Comparative analysis of PID controllers
Here, searching unwanted datasets is avoided by the MIMAGA algorithm, and the correct value is delivered to the end user faster with the PID controller. This is supported by the results obtained for the success and confidence rates.

Conclusions
This paper improves the design of the PID controller by improving its response time. The PID feedback system is integrated with the SARSA actor-critic model. The integration improves the success rate of feature selection and the user response time in the big data domain on the ClowdFlows platform. In the evaluation on the ClowdFlows platform, the proposed method reduces the response time. The method could be further implemented on other big data platforms to improve the data extraction process with improved response time.

Figure 3. Success rate of the PID controllers for feature selection