Intelligent Control Algorithms in Power Industry
D. V. Mikheev

Energy development problems are receiving more and more attention. Conventional ways of producing energy are limited by the planet's resources of coal, natural gas and petroleum, and the harmful effects of the waste produced by the energy industry have driven the growth of alternative ways of producing energy, such as wind farms, geothermal and solar energy. The article considers a solution to the problem of creating energy technologies for autonomous decentralized energy supply using intelligent automated control systems.


Introduction
One of the main problems of the energy industry is its variability. This is why the pursuit of greater energy efficiency creates a need for a high-level control algorithm. To solve this problem we need to develop a complex top-level control algorithm. To this end, reinforcement learning combined with fuzzy logic techniques was used.[1][2][3][4][5][6][7][8][9][10][11]

Reinforcement learning
There are situations in which the resources provided by the problem are too poor to build supervised learning algorithms, there is no precise information about the data on which learning should be done, and there may be no prior sample data set of the kind that some unsupervised learning approaches depend on. Because of all these limitations, Reinforcement Learning (RL) seems the best option for such situations.
RL is a way of learning behaviours for agents by interacting with their environment without any explicit teacher. At each moment, or time step, $t$ the agent B perceives the state of its environment through its sensors, denoted as the input $i_t$, and decides to perform an action $a_t$ based on this sensory information and on the feedback $r_t$ from the environment on the previous action (the reinforcement signal), working towards a specified goal that the environment imposes on it. The goal of learning is therefore simply to find the best possible action in each state, that is, the action that results in the maximum total sum of the reinforcement signals (rewards).
The input function I is the sensation function of the current state of the environment; it shows the way the agent views its surroundings. R is the reward function, which is mostly unknown to the learner in detail.
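The perceive, act and receive-reinforcement cycle described above can be illustrated with a minimal sketch. The environment `TwoStateEnv`, its reward rule and the two example policies are hypothetical and chosen only to make the loop concrete; they are not part of the described system.

```python
import random


class TwoStateEnv:
    """Hypothetical toy environment with states 0 and 1 (an assumption for illustration)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.state = 0

    def sense(self):
        # Input function I: in this toy case the agent observes the state directly.
        return self.state

    def step(self, action):
        # Reward function R: hidden from the agent, which only sees the returned r_t.
        reward = 1.0 if action != self.state else 0.0
        self.state = self.rng.randint(0, 1)
        return reward


def run_episode(env, policy, steps):
    """Run the interaction loop and return the total sum of reinforcement signals."""
    total = 0.0
    for _ in range(steps):
        i_t = env.sense()      # perceive the input i_t
        a_t = policy(i_t)      # choose the action a_t
        r_t = env.step(a_t)    # receive the reinforcement r_t
        total += r_t
    return total


good = run_episode(TwoStateEnv(), lambda i: 1 - i, 100)
bad = run_episode(TwoStateEnv(), lambda i: i, 100)
print(good, bad)  # prints 100.0 0.0
```

The two policies differ only in the action chosen per state, which is exactly what the learner has to discover from the rewards alone.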
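Thus the model has been constructed on top of three formal bases:
1) Set of environment states, S: it can be discrete or continuous, finite or infinite.
2) Set of agent actions, A: it can be discrete or continuous.
3) Set of scalar reinforcement signals: for example, Boolean valued. [Ref. 2]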

Agent model
The working agent is based on the structure of the adaptive heuristic critic (AHC).
The structure is based on two main concepts:
1) Adaptive Critic Element (ACE or Critic);
2) Associative Search Element (ASE or Actor).
The role of the critic (ACE) is to provide a more informative critique (internal reward) signal than the raw and rather poor primary reinforcement signal from the environment. It represents an approximation of the value function of the current policy running in the actor component. Both the critic and the actor can learn simultaneously: the actor tries to develop the t-optimal policy (the optimal policy at time step t) with respect to the critique value, while at the same time the critic tries to learn the value function of the current policy initiated in the actor. Critic learning is based on Temporal Difference (TD) learning, whereas actor learning is called the reinforcement comparison method. The actor implementation is application dependent to some extent and also depends on the method of output generalization; meanwhile, its learning, an update rule for the parameters involved in action selection, can be driven by the TD error provided by the critic so as to increase the tendency towards selecting more beneficial actions. [Ref. 3]
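The interplay of critic and actor can be sketched in tabular form. This is a minimal illustration under stated assumptions, not the implementation used in the project: the two-state task, the reward rule, the softmax action selection and the learning rates `alpha`, `beta` and discount `gamma` are all hypothetical. The critic is updated with the TD(0) error, and the same error adjusts the actor's preference for the action just taken.

```python
import math
import random


def softmax_choice(prefs, rng):
    """Sample an action index with probability proportional to exp(preference)."""
    top = max(prefs)
    weights = [math.exp(p - top) for p in prefs]
    return rng.choices(range(len(prefs)), weights=weights)[0]


def train_ahc(steps=5000, alpha=0.1, beta=0.1, gamma=0.9, seed=0):
    rng = random.Random(seed)
    n_states, n_actions = 2, 2
    value = [0.0] * n_states                              # critic (ACE)
    prefs = [[0.0] * n_actions for _ in range(n_states)]  # actor (ASE)
    s = 0
    for _ in range(steps):
        a = softmax_choice(prefs[s], rng)
        r = 1.0 if a != s else 0.0       # hypothetical reward rule
        s_next = rng.randint(0, 1)       # hypothetical random transition
        # Internal reward: the TD error computed by the critic.
        td_error = r + gamma * value[s_next] - value[s]
        value[s] += alpha * td_error     # critic update
        prefs[s][a] += beta * td_error   # actor update driven by the critic
        s = s_next
    return value, prefs


value, prefs = train_ahc()
greedy = [max(range(2), key=lambda a: prefs[s][a]) for s in range(2)]
print(greedy)
```

In this toy task the rewarded action is 1 in state 0 and 0 in state 1, so the learned preferences should favour those actions.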

Fuzzy inference system
A fuzzy inference system (FIS) is a system that deduces an output decision from fuzzified inputs by applying fuzzy rules (in effect, a fuzzy expert system). Fuzzy inference is the process of formulating the mapping from a given input to an output using fuzzy logic. The mapping then provides a basis from which decisions can be made. There are mainly two types of fuzzy inference systems that can be implemented: Mamdani-type (M-FIS) and Takagi-Sugeno-type (TS-FIS). These two types differ in the way their outputs are determined. A FIS consists of a rule base including a collection of fuzzy If-Then rules that mimic the human expert decision making process. [Ref. 3]
The structure of the fuzzy output can be presented in the form of a multilayer neural network. This type of neural network is easy to implement, and the network structure offers a simple way to describe the given architecture.
This neural network, whose structure is similar to that of a feed-forward neural network, consists of 4 layers: an input layer, two hidden layers, and an output layer. Neuron weights for each layer, except the output layer, are fixed and equal to one. The weights of the output layer are the values of the corresponding fuzzy logic rules. The described system is presented in Figure 3.
The system is made for the "many inputs, one output" pattern; however, a "many inputs, many outputs" system is obtained simply by adding output neurons to the output layer. [Ref. 2]
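A four-layer network of this kind can be sketched as a zero-order Takagi-Sugeno system. The mapping of the two hidden layers to a fuzzification layer and a rule layer, the triangular membership functions, the product as the rule operator and the fuzzy sets in `SETS` are assumptions made for illustration; only the output-layer weights (the rule conclusions) are adjustable, as in the text. Inputs are assumed to be normalised to the range [0, 1].

```python
from itertools import product

# Hypothetical fuzzy sets, each given as (left, peak, right) of a triangle.
SETS = {"low": (-1.0, 0.0, 1.0), "high": (0.0, 1.0, 2.0)}


def triangular(x, left, peak, right):
    """Triangular membership function with its maximum of 1 at `peak`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fis_output(inputs, conclusions):
    # Layer 1 (input): the crisp inputs are passed through unchanged.
    # Layer 2 (hidden): fuzzification, one neuron per input and fuzzy set.
    memberships = [
        {name: triangular(x, *shape) for name, shape in SETS.items()}
        for x in inputs
    ]
    # Layer 3 (hidden): one neuron per rule; truth value as a product.
    strengths = {}
    for labels in product(SETS, repeat=len(inputs)):
        w = 1.0
        for member, label in zip(memberships, labels):
            w *= member[label]
        strengths[labels] = w
    # Layer 4 (output): weighted average, weights are the rule conclusions.
    total = sum(strengths.values())
    return sum(strengths[k] * conclusions[k] for k in strengths) / total


# Hypothetical rule conclusions for two inputs (say, wind speed and load).
conclusions = {
    ("low", "low"): 0.0,
    ("low", "high"): -1.0,
    ("high", "low"): 1.0,
    ("high", "high"): 0.5,
}
out = fis_output([0.25, 0.5], conclusions)
print(out)  # prints -0.1875
```

Adding a second output would only require a second table of conclusions evaluated over the same rule strengths.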

Adaptive heuristic critic system using FIS
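The most important quality of a decision making algorithm that uses reinforcement learning techniques and AHC with FIS for controlling an energy station is that it makes it possible to combine expert knowledge and data from sensors. This approach reduces the learning time and simplifies the control by using simple, understandable rules. In simplified form, the working algorithm can be presented as the following sequence of steps:
1) Fuzzification of the newly perceived input state $X_{t+1}$ and truth value computations.
2) Finding the value of the state $X_{t+1}$.
3) Updating the FIS output, the critic parameters, and the action.
4) Calculating and saving the value of the current state, based on the new value function after tuning the conclusion values.
5) Action selection procedure with maximum utility.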
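One iteration of such a controller could be sketched as follows. This is an illustrative sketch under stated assumptions rather than the project's implementation: the class `FuzzyAHC`, its single scalar input, the evenly spaced triangular fuzzy sets, the discrete action set and the parameters `alpha`, `beta` and `gamma` are all hypothetical.

```python
class FuzzyAHC:
    """Actor-critic sketch in which critic and actor share one fuzzy rule base."""

    def __init__(self, centers, n_actions, alpha=0.1, beta=0.1, gamma=0.9):
        self.centers = centers                 # peaks of evenly spaced triangular sets
        self.width = centers[1] - centers[0]
        self.n_actions = n_actions
        self.alpha, self.beta, self.gamma = alpha, beta, gamma
        self.v = [0.0] * len(centers)          # critic conclusion value per rule
        self.q = [[0.0] * n_actions for _ in centers]  # actor utility per rule and action
        self.prev = None                       # (truth values, action, state value)

    def fuzzify(self, x):
        """Truth value of every rule for the crisp input x."""
        return [max(0.0, 1.0 - abs(x - c) / self.width) for c in self.centers]

    def value(self, phi):
        """Critic output: rule conclusions weighted by the truth values."""
        return sum(p * v for p, v in zip(phi, self.v))

    def step(self, x_next, reward):
        # Fuzzify the newly perceived state and compute the truth values.
        phi_next = self.fuzzify(x_next)
        if self.prev is not None:
            phi, action, value = self.prev
            # Value of the new state gives the TD error for the previous action.
            td_error = reward + self.gamma * self.value(phi_next) - value
            # Update critic and actor conclusions of the rules that fired.
            for i, p in enumerate(phi):
                self.v[i] += self.alpha * td_error * p
                self.q[i][action] += self.beta * td_error * p
        # Re-evaluate and save the state value with the tuned conclusions.
        value_next = self.value(phi_next)
        # Select the action with maximum utility.
        utilities = [
            sum(p * row[a] for p, row in zip(phi_next, self.q))
            for a in range(self.n_actions)
        ]
        action = max(range(self.n_actions), key=utilities.__getitem__)
        self.prev = (phi_next, action, value_next)
        return action


agent = FuzzyAHC(centers=[0.0, 0.5, 1.0], n_actions=2)
first = agent.step(0.25, reward=0.0)   # the reward is ignored on the first call
second = agent.step(0.75, reward=1.0)  # reward earned by the first action
print(first, second, agent.v)
```

Purely greedy selection, as written here, can lock onto the first action that happens to be rewarded, so a practical controller would normally add some form of exploration.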

The architecture of a distributed system to collect energy consumption data
The architecture of a distributed system to collect data on energy consumption can be separated into several basic levels, see Figure 4:
1) Users of energy resources. The main goal of the system under development is to collect user data about energy consumption, analyze it, and give the users of the system recommendations on how to use energy resources in the most efficient and economical way. Therefore, system users are the main elements in the architecture of the data collection system and are the reason why this system exists.
2) Measuring equipment provides direct measurement and collects data about the consumption of various energy resources.
3) The first level of the monitoring equipment collects data from all measuring devices and performs preliminary processing of this data, as well as reliability checks. This makes it possible to keep the measuring equipment in an efficient state.
4) The second level of the controlling equipment collects data from the monitoring equipment of the first level, monitors its operability and evaluates the working capacity of the system. Controlling equipment of the second level is not compulsory in the system, but it is necessary for creating large local data collection centres that handle large amounts of data which cannot be handled by the first level equipment.
5) The global data collection and processing center is a single place where data from all local centres is collected. It handles all received information and provides end-users with recommendations about the most efficient way to use energy resources.
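The flow of data through these levels can be illustrated with a small sketch. The `Reading` record, the rule that negative readings are unreliable, and the function names are hypothetical and serve only to show how partial totals could move from the first level up to the global center.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    user: str        # consumer of the energy resource
    resource: str    # e.g. "electricity" or "heat"
    amount: float    # consumption in the unit of the resource


def first_level(readings):
    """Preliminary processing: drop implausible readings, sum per user and resource."""
    totals = defaultdict(float)
    for r in readings:
        if r.amount >= 0:  # assumed reliability check
            totals[(r.user, r.resource)] += r.amount
    return dict(totals)


def merge(partials):
    """Second-level or global collection: merge totals reported by lower levels."""
    totals = defaultdict(float)
    for partial in partials:
        for key, amount in partial.items():
            totals[key] += amount
    return dict(totals)


site_a = first_level([
    Reading("u1", "electricity", 2.0),
    Reading("u1", "electricity", -1.0),  # rejected by the reliability check
    Reading("u2", "heat", 3.0),
])
site_b = first_level([Reading("u1", "electricity", 1.5)])

local_centre = merge([site_a])                   # optional second level
global_totals = merge([local_centre, site_b])    # global center
print(global_totals[("u1", "electricity")])      # prints 3.5
```

Because the second level uses the same merge operation as the global center, it can be omitted without changing the result, which matches its optional role in the architecture.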

Summary
An intelligent algorithm for controlling an autonomous wind farm has been presented. The stochastic environment and the imprecise inputs to wind farm systems lead us to use modern computer science techniques for decision making under uncertainty. The algorithm is based on known approaches in neurocontrol theory, decision making under uncertainty, and fuzzy logic theory. The use of the Fuzzy Inference technique in combination with an adaptive heuristic critic system was examined. This algorithm was developed as part of a scientific project on developing an intelligent algorithm and management system for controlling an autonomous wind farm, described in [Ref. 2]. The algorithm was chosen because of its ability to combine expert data and sensor data smoothly, and because of its learnability.

Figure 2. Adaptive Heuristic Critic (AHC) model

Figure 3. AHC and FIS

Figure 4. The architecture of data collection system