Intelligent reliability management in hyper-convergence cloud infrastructure using fuzzy inference system

Hyper-convergence is a recent innovation in data center technology that changes the way clouds manage and maintain enterprise IT infrastructure. It offers a more efficient and agile technology environment. Cloud computing delivers services over the internet, but cloud service providers cannot guarantee the reliability of those services, for example when provisioning problems or software or hardware failures occur. The reliability of cloud computing services depends on their ability to tolerate faults during service execution. Many factors can cause faults, such as network failure, browser crashes, request time-outs or hacker attacks. When users encounter these faults, they usually resubmit their requests. However, if a key element is involved in the fault or error, additional action may be needed to deal with system logs. If anomalous behavior occurs in a faulted virtual machine (VM), that VM may need extra attention from the point of view of cloud system protection and security. This paper proposes reliability management for hyper-convergence cloud infrastructure, together with self-healing techniques for software as a service that respond to failures in cloud services. The intelligent cloud service reliability framework is intended to increase reliability during the execution of cloud services.


Introduction
In the current era, data is produced every day, on demand and on command, by organizations, institutions and many other firms. A huge amount of data is consumed and produced by smart devices such as smartphones, computers and sensors. Most tools and individuals generate data at a high rate, and managing such large volumes of data is difficult [1]. New technologies have been introduced to overcome these issues. To address them cost-effectively, we use cloud computing services. Cloud computing is inexpensive and pay-per-use, which encourages placing resources on cloud infrastructure. A cloud computing service uses the internet to deliver services to consumers and uses data centers to host applications. Cloud services are available to consumers over the internet as pay-per-use, quality-oriented services [2].

It is in this setting that the concept of a ranking system came into being. A ranking system receives requests from different users, whose requirements may differ. It then looks for suitable services for the users and assigns each a rank according to its Quality of Service (QoS) [3].

Nadia Tabassum 1,*, Muhammad Saleem Khan 2, Sagheer Abbas 2, Tahir Alyas 3, Atifa Athar 4 and Muhammad Adnan Khan 2

Essential characteristics of Cloud Computing
Some of the most important characteristics of cloud computing are the following:
1.1.1. On-Demand Service: On-demand service is a model in which cloud users can obtain services from service providers at any time and in any place. Users can call on this facility whenever a task or application requires it, and service providers make the resources available at any time [4].
1.1.2. Broad Network Access: As the name suggests, cloud services are offered over a broad network so that everyone can access them. Companies can use this facility to stay up to date with their clients or other organizations. Who can access the services depends on whether the network is private or public. In a private cloud, all information is available only to the members who are signed in to the cloud services. In a public cloud, anyone can access the information and take advantage of the services provided to cloud users [5].

Resource Pooling:
Resource pooling is a technique in which the consumer acquires a resource on demand and releases it when it is no longer required. PaaS users can obtain a resource from the resource pool on demand, make use of it, and then return it to the pool. Resource pooling also reduces the complexity that the cloud has to manage [6].

Rapid Elasticity:
Rapid elasticity refers to scaling cloud services without affecting cloud users and buyers, so that all services are delivered in a reliable and flexible manner. The cloud makes its services easy to obtain, and users can take advantage of them on demand. For example, cloud users can obtain extra storage space or additional resources from the cloud provider and thereby use more services [7].

Measured Service:
Measured service means paying for cloud services according to usage. It is also known as metered service. In measured services, problems and faults are controlled and monitored. Measured service is a term that IT experts apply to distribute computing

Cloud Service Models
The most commonly used service models through which the cloud system provides services to consumers are the following:
1.2.1. Infrastructure-as-a-Service: Infrastructure is the foundation used to provide services to customers and cloud users. With IaaS, the provider supplies the resources that underlie different workloads, such as virtualization, storage and networking. These resources can be offered to customers, buyers and users to enhance the services they receive. For example, the provider can increase the storage capacity and the network speed available to customers so that they benefit from the best networking speed available [8].

Platform-as-a-Service:
Platform as a service is a paradigm for delivering operating systems and related facilities to cloud users over the internet. The facilities are provided by cloud service providers, and users pay only for what they use on the cloud. PaaS provides a platform with tools to test, develop and host applications in the same environment [9].

Software as-a-Service:
With software as a service, the user does not need to install an application on his or her own computer, because the application is available through SaaS and can be accessed from any place, not only from the user's own machine. It is a model in which users access applications hosted by another party over the internet. It also decreases expenses such as installation and provisioning costs [10].

Data reliability in cloud computing
Reliability means that virtual machines keep working even when exceptions and malfunctions occur, so that the system functions without errors and remains in good condition for service delivery. Service consumers consider reliability to be proper functioning, security and ease of use. Service providers also consider reliability in service creation, deployment, integration and separation [11]. Reliability in a cloud computing environment also includes proper functioning at the different stages of the service lifecycle. Service integration and separation allow service providers to offer either the full set of functionality or part of it to service consumers, according to service level agreements. Reliability covers various aspects of cloud computing. The baseline of reliability is to provide functioning services [12].

Types of Failure in cloud computing
In conventional software reliability engineering, there are four main approaches to designing a reliable software system: fault prevention, fault removal, fault tolerance and fault forecasting. However, in a cloud computing environment, cloud applications rely only on fault-prevention and fault-removal techniques to develop fault-free software as a service [13].
Large-scale cloud services involve a large number of virtual machines and middleware layers. Failures of these components directly affect the reliability of cloud applications.
Most cloud service providers deploy their services in large data centers. All services run in virtual machines that reside on physical machines, and there are usually multiple virtual machines running on one physical machine. When a virtual machine is initialized, the administrator or the virtual machine monitoring system takes resources from a resource pool to build the requested virtual machine. In cloud computing, conditions relating to network resources, latency and cloud monitoring can result in low performance and thus affect reliability. Such conditions must be observed by an autonomous system to avoid failures in the delivery of cloud services [14].

Figure 1. VM Checker
To enhance reliability, we need to identify faults. System event logs record most system events, including events related to system faults, so system faults can be traced through them. Critical system events always occur before a system enters a fault state, as shown in figure 1. Therefore, if a system can predict critical events, it can predict system faults before they actually happen. Researchers have approached this problem from different angles, and the following study presents several techniques for system fault monitoring. Machine learning techniques can reveal patterns that always appear when system faults occur. Statistical data is used to mine and detect fault patterns in service event logs, which are normalized according to the domain information in the Memory Module.
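The observation that critical events precede fault states can be illustrated with a sliding-window scan over an event log. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `(timestamp, severity)` tuple format, the `CRITICAL` label, the 60-second window and the threshold of three events are all hypothetical.

```python
from collections import deque


def warn_before_fault(events, window=60.0, threshold=3):
    """Return the timestamps at which a fault warning would be raised.

    events: iterable of (timestamp_seconds, severity) tuples sorted by time.
    A warning is raised when `threshold` CRITICAL events fall within
    `window` seconds (both values are illustrative assumptions).
    """
    recent = deque()  # timestamps of recent CRITICAL events
    warnings = []
    for ts, severity in events:
        if severity != "CRITICAL":
            continue
        recent.append(ts)
        # Drop critical events that have left the time window.
        while recent and ts - recent[0] > window:
            recent.popleft()
        if len(recent) >= threshold:
            warnings.append(ts)
    return warnings


log = [(0, "INFO"), (10, "CRITICAL"), (30, "CRITICAL"),
       (50, "CRITICAL"), (200, "CRITICAL")]
print(warn_before_fault(log))
```

In this toy log the three critical events between 10 s and 50 s trigger a single warning, while the isolated event at 200 s does not. A learned model could replace the fixed threshold, as the machine learning techniques mentioned above suggest.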

Related Work
Cloud computing has enabled the efficient management, deployment and configuration of clusters on which the aforementioned frameworks can run. It does so by taking advantage of both the elastic nature of the cloud, where reserved resources are used only for as long as they are needed (i.e., pay-as-you-go), and the ease of deployment and management. To exploit these two properties and meet enterprise needs for minimizing infrastructure maintenance and operation costs, companies follow a similar approach for both public and on-premise cloud offerings: they use a service that launches and manages big data clusters to execute the requested workloads, while a single storage back-end hosts the data [15].
Ian Andrusaik proposed a framework in his paper titled "A Reliability-Aware Framework for Service-Based Software Development". The idea of hyper-convergence is to simplify the operation and management of data centers by converging the computing, storage and networking components into a single, software-driven appliance. It is defined as an IT infrastructure framework in which storage, virtualized computing and networking are tightly integrated within a data center. A prototype implementation was developed as a proof of concept of the design; its evaluation showed that the system succeeds in providing availability when failures occur, at a cost to overall performance [16]. Researchers have proposed various autonomic monitoring systems. A monitoring framework for web service-enabled applications is proposed in [17]. That paper discusses hardware and software resources exposed as web services and monitors the elasticity of both. For autonomous systems, a self-optimized monitoring algorithm is proposed that updates dynamic information and self-adaptive events.
Fault prevention and fault tolerance aim to provide the ability to deliver a service that can be trusted, while fault removal and fault forecasting aim to reach confidence in that ability by justifying that the functional, dependability and security specifications are adequate and that the system is likely to meet them. It is worth noting that repair and fault tolerance are related concepts; the distinction between fault tolerance and maintenance in this paper is that maintenance involves the participation of an external agent [18].
In a cloud storage system, many components, such as storage, services and hardware, can cause data failures. Data failures in turn lead to cloud service failures. The fundamental causes of cloud data failures are hardware, system, software and power failures. Data reliability involves maximizing the durability and availability of data. Durability mitigates permanent failures and availability mitigates transient failures.
For automated reliability monitoring, an agent-based approach is helpful where diverse provisioning of software services is required. This approach supports an automated system in any situation where the software behaviour can be specified. In an autonomous setting, an agent can evolve, learn, cooperate with other entities and negotiate. An expanding system requires agents to adapt their behaviour and roles when rapid changes occur [19].
During the indexing procedure in cloud services, the key factor is that the user's requirements should be satisfied, so a framework that fulfils these requirements is desirable. The indexing manager receives the information and then processes it according to ranking parameters such as performance, usability and cost. It selects the best service according to the user's requirements. The indexing manager is also responsible for other activities, namely collecting the characteristics used for ranking, keeping a track record of characteristic values, and recording the ranking result [20].
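Ranking services by performance, usability and cost can be sketched as a weighted sum. This is only an illustration of the idea: the function `rank_services`, the service names, the scores and the weights are hypothetical, and the cited work [20] may use a different scoring scheme.

```python
def rank_services(services, weights):
    """Rank services by a weighted sum of normalised attribute scores.

    services: dict name -> dict of attribute scores in [0, 1], where a
              higher score is better (cost is assumed to be inverted
              already, so a cheaper service has a higher cost score).
    weights:  dict attribute -> weight reflecting the user's requirements.
    """
    def score(attrs):
        return sum(weights[k] * attrs[k] for k in weights)

    return sorted(services, key=lambda name: score(services[name]), reverse=True)


candidates = {
    "svc_a": {"performance": 0.9, "usability": 0.5, "cost": 0.4},
    "svc_b": {"performance": 0.6, "usability": 0.8, "cost": 0.9},
    "svc_c": {"performance": 0.3, "usability": 0.4, "cost": 0.5},
}
user_weights = {"performance": 0.5, "usability": 0.2, "cost": 0.3}
ranking = rank_services(candidates, user_weights)
print(ranking)
```

Changing the weights changes the ranking, which is how different user requirements would lead to different recommended services.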

Transformation of hyper-converged
The journey of transformation starts from the conventional system. In the traditional system, each module needs a different skill set to manage, and all entities are configured and tested separately, as shown in figure 2. In the converged era, hardware-defined infrastructure was introduced along with monitoring software and backup. In a hyper-converged system, all server components reside in a single unit and are integrated through a software-defined environment. All the components are readily available and ready to use [21].
A data centre is a facility that contains several computers connected together for the purpose of storing and transmitting data. The facility is designed to be used by many people and is equipped with hardware, software, peripherals, power conditioning, backup, communication and security systems. Data centre architectures include traditional infrastructure (TI), converged infrastructure (CI) and hyper-converged infrastructure (HCI) [22].

Proposed Methodology
In the proposed model, the system first collects the service history, service weight, QoS parameters and execution time. After collecting the service details, it analyzes the status of the virtual machines and calculates the utility. If a matching pattern is already available in the Memory Module, no new plan is needed. To provide reliable services in cloud computing, the system determines the planning strategy in the decide phase. In the plan phase, the system generates a plan according to the situation, and finally it executes the plan.
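The collect, analyze, decide, plan and execute sequence can be sketched as a single function that consults a stored-plan lookup before planning. This is a minimal sketch under stated assumptions: the observation fields `vm_status` and `qos_ok`, the plain dictionary standing in for the Memory Module, and the plan names are all hypothetical.

```python
def handle_service(observation, memory, make_plan, execute):
    """One pass of the collect/analyze/decide/plan/execute cycle.

    observation: dict with the collected service details.
    memory:      dict mapping a failure pattern to a stored plan
                 (a stand-in for the Memory Module).
    make_plan:   callable that builds a new plan for a pattern.
    execute:     callable that carries out a plan.
    """
    # Analyze: reduce the observation to a hashable pattern.
    pattern = (observation["vm_status"], observation["qos_ok"])
    # Decide: reuse a stored plan when the pattern is already known.
    if pattern in memory:
        plan, reused = memory[pattern], True
    else:
        plan, reused = make_plan(pattern), False
        memory[pattern] = plan  # remember the plan for next time
    execute(plan)
    return plan, reused


def plan_for(pattern):
    """Hypothetical planner: re-compose a VM that is down."""
    return "recompose" if pattern[0] == "down" else "no_action"


memory = {}
executed = []
obs = {"vm_status": "down", "qos_ok": False}
first = handle_service(obs, memory, plan_for, executed.append)
second = handle_service(obs, memory, plan_for, executed.append)
print(first, second)
```

The second call reuses the plan stored by the first, which mirrors the statement that no new plan is needed when the pattern already exists in the Memory Module.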
The proposed framework for Intelligent Cloud Service Reliability management is shown in figure 3. The Service Monitoring Agent performs three types of monitoring checks:
1. Virtual Machine Check
2. Service Event Log Check
3. Network Resource Check
The Service Monitoring Agent is responsible for monitoring the provision of cloud services over the network.
When cloud services are provisioned over the network through the user interface, the services being executed are monitored by the proposed intelligent cloud service reliability framework. The Service Monitoring Agent performs both service monitoring and service analysis. The service examination layer contains three sub-modules that monitor the virtualized cloud service through different checkpoints.
In the service examination layer, the cloud service is examined through the network resource check.

Figure 3. Proposed Intelligent reliability management Framework
In virtual machine checking, the health status of all virtual machines is monitored by the VM checker, as shown in figure 3. The fabric controller monitors all virtual machines through fabric agents. If a virtual machine is not working, the fabric agent reports this to the fabric controller, which assigns an alternate virtual machine to provide error-free service delivery over the network. Similarly, the fabric controller checks the status of the host machine. Service event log checking is responsible for auditing the cloud service through log checking; the logs are analyzed together with the history checking in the service usability layer. The network resource check covers the physical resources used during the provision of cloud services.
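The interaction between fabric agents and the fabric controller can be sketched as a mapping from failed virtual machines to standby ones. This is an illustrative sketch only: the function name, the boolean health reports and the spare-VM list are assumptions, not part of the framework as described.

```python
def reassign_failed_vms(vm_health, spare_vms):
    """Map each unhealthy VM to a spare VM, as a fabric controller might.

    vm_health: dict vm_id -> bool (True means healthy), as reported by
               the fabric agents.
    spare_vms: list of standby vm_ids, consumed in order.
    Returns (replacements, unresolved), where unresolved lists the failed
    VMs for which no spare was left.
    """
    spares = list(spare_vms)  # do not mutate the caller's list
    replacements, unresolved = {}, []
    for vm_id in sorted(vm_health):
        if vm_health[vm_id]:
            continue
        if spares:
            replacements[vm_id] = spares.pop(0)
        else:
            unresolved.append(vm_id)
    return replacements, unresolved


health = {"vm1": True, "vm2": False, "vm3": False, "vm4": False}
result = reassign_failed_vms(health, ["spare1", "spare2"])
print(result)
```

Returning the unresolved VMs separately makes it explicit when the pool of alternates is exhausted, a case the controller would have to escalate.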

Service usability Layer
This layer is responsible for cloud service history analysis. It is divided into two sub-layers:

1. History analysis 2. Event analysis
The history analysis layer keeps a record of failure types and stores them in the Memory Module for future failure prediction.
The event analysis layer is responsible for the event failure status and sends this status for event prediction so that such failures can be recovered.
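Recording failure types for later prediction can be sketched with a simple frequency count. The class below is a toy stand-in for the Memory Module, and predicting the most frequent failure type is an assumed, deliberately simple predictor; the failure-type names are hypothetical.

```python
from collections import Counter


class MemoryModule:
    """Toy stand-in for the Memory Module: counts failures per type."""

    def __init__(self):
        self.history = Counter()

    def record(self, failure_type):
        """Store one observed failure of the given type."""
        self.history[failure_type] += 1

    def most_likely(self):
        """Return the most frequent failure type seen so far, or None."""
        if not self.history:
            return None
        return self.history.most_common(1)[0][0]


mem = MemoryModule()
for failure in ["network", "vm_crash", "network"]:
    mem.record(failure)
print(mem.most_likely())
```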

Self-Healing Layer
This layer is responsible for recovering failed components during cloud service provision. It keeps track of all the available virtual machines and recovers from failures by tracking them and learning from previous event failures. The self-healing layer is divided into two sub-modules: 1. Switching 2. Re-composition

In the provision of reliable cloud services, switching is performed if the virtual machine has not recovered after re-composition. The fabric controller activates the new cloud service after checking the parameters of the service examination layer.
During the service healing process, the recovery of a service is subject to system security analysis, and the required service is for the same host or system. During the healing process, the system checks the service examination parameters: virtual machine, host machine, designation machine, current service status and network resources.
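The order of the two healing actions, re-composition first and switching as the fallback, can be sketched as follows. The callables `recompose` and `switch`, the retry count and the VM identifiers are hypothetical placeholders for whatever the fabric controller actually invokes.

```python
def heal(vm_id, recompose, switch, max_attempts=2):
    """Try re-composition first and fall back to switching.

    recompose: callable(vm_id) -> bool, True if the VM recovered.
    switch:    callable(vm_id) -> str, id of the replacement VM.
    max_attempts is an illustrative assumption.
    """
    for _ in range(max_attempts):
        if recompose(vm_id):
            return ("recomposed", vm_id)
    # Re-composition did not recover the VM, so switch to another one.
    return ("switched", switch(vm_id))


print(heal("vm7", lambda v: False, lambda v: v + "-standby"))
print(heal("vm7", lambda v: True, lambda v: v + "-standby"))
```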

Implementation and Evaluation
In this section, a Mamdani Fuzzy Inference System is used to simulate the proposed reliability management in cloud computing. The fuzzy inference system (FIS) is described in Figure 4.

Inference Engine
The inference engine defines the operators and the defuzzifier used in the inference process (Eq-1). The rule viewer shows that the reliability is at the re-composition state, where virtual machines need to be re-composed.
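For orientation, the standard textbook form of Mamdani inference with four inputs can be written as follows. This is a generic formulation and is not claimed to be the authors' Eq-1. The rule index k, the antecedent sets A_k, B_k, C_k, D_k, the consequent sets R_k, the abbreviation EL for the event log input, and the choice of centroid defuzzification are notation and assumptions introduced here.

```latex
% Firing strength of rule k (AND implemented as min)
\alpha_k = \min\bigl(\mu_{A_k}(\mathrm{SRT}),\ \mu_{B_k}(\mathrm{VMU}),\ \mu_{C_k}(\mathrm{EL}),\ \mu_{D_k}(\mathrm{NRU})\bigr)

% Aggregated output set (clip each consequent, combine with max)
\mu_{R}(y) = \max_{k}\ \min\bigl(\alpha_k,\ \mu_{R_k}(y)\bigr)

% Crisp reliability value by centroid defuzzification
y^{*} = \frac{\int y\,\mu_{R}(y)\,dy}{\int \mu_{R}(y)\,dy}
```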

Simulation and Results
MATLAB 2017b is used for the simulation. If the SRT value is 5, the VMU value is approximately 5, the event log value is 5 and the NRU value is 5.5, then the reliability is low, as shown in figure 8.
If the SRT value is 1.22, the VMU value is approximately 10, the event log value is 3.7 and the NRU value is 10, then the reliability is medium, as shown in figure 9. If the SRT value is 10, the VMU value is approximately 9, the event log value is 9 and the NRU value is 8, then the reliability is high, as shown in figure 10.

EAI Endorsed Transactions on Scalable Information Systems, Online First
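The mechanics of a Mamdani system with the four inputs SRT, VMU, event log and NRU can be sketched in plain Python. This is a minimal sketch under loud assumptions: the triangular term sets, the 0 to 10 universe and the three-rule base below are placeholders invented for illustration, they are not the paper's membership tables or rule base, and the sketch is not tuned to reproduce figures 8 to 10.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Hypothetical term sets on a 0-10 universe (not the paper's tables).
TERMS = {"low": (-5, 0, 5), "medium": (0, 5, 10), "high": (5, 10, 15)}


def reliability(srt, vmu, event_log, nru, steps=100):
    """Crisp reliability in [0, 10] for inputs that each lie in [0, 10]."""
    inputs = (srt, vmu, event_log, nru)
    mu = {t: [tri(x, *TERMS[t]) for x in inputs] for t in TERMS}
    # Placeholder rule base: OR is max, AND is min.
    strength = {
        "low": max(mu["low"]),        # IF any input is low THEN low
        "medium": max(mu["medium"]),  # IF any input is medium THEN medium
        "high": min(mu["high"]),      # IF all inputs are high THEN high
    }
    num = den = 0.0
    for i in range(steps + 1):
        y = 10.0 * i / steps
        # Clip each output set at its rule strength, aggregate with max.
        agg = max(min(strength[t], tri(y, *TERMS[t])) for t in TERMS)
        num += y * agg
        den += agg
    return num / den  # discrete centroid of the aggregated output set


print(reliability(10, 10, 10, 10))
print(reliability(0, 0, 0, 0))
```

With these placeholder rules, inputs that are all at the top of the scale give a crisp value in the upper part of the range and inputs at the bottom give a value in the lower part. Reproducing the reported cases would require the actual membership functions and rule base.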

Conclusion
In cloud computing, service providers always want to provide reliable services to customers or service consumers. However, there are obstacles between service providers and consumers. Customers need customized services with various configurations, and they expect these customizations and configurations to be error-free and to function seamlessly during service execution. In software as a service, virtual resources are hard to monitor and the stability of the service is hard to control, especially in end-to-end service provision.

Figure 2. Evolution of hyper-convergence

The proposed framework consists of five major modules: Service Monitoring Agent, Self-Healing Layer, Service usability Layer, Learning Module and Memory Module.

3.1 Service Monitoring Agent
The Service Monitoring Agent is divided into three types of monitoring checks.

Figure 4. Proposed Fuzzy Inference System for IRM

For the reliability management in cloud computing, Service response time (SRT), Virtual machine utilization (VMU), Event Log and network resource check will perform

Figure 6. Rule base for Proposed IRM

Figure 7. Rule surface for Reliability Management

Figures 8-10 show the performance of the proposed Intelligent Reliability Management system in terms of low, medium and high reliability.

Figure 10. Lookup diagram for High Reliability

The four input variables are SRT, VMU, EventLog and NRU. The SRT parameter expresses that if the response time of cloud execution is low, then the reliability of the cloud is high. Three membership functions are used for SRT, namely low, medium and high, which express the degree of reliability. The reliability is therefore judged according to whether the response time is low, medium or high. The second input variable is virtual machine utilization (VMU), which has three membership functions: low, medium and high. The third input variable is EventLog, which has three membership functions: No, Yes and Critical. If the value of EventLog is No, no uncertainty is seen during cloud service execution. If the value of EventLog is Yes, the cloud service is not functioning during cloud service execution.

4.2. Membership Functions
Graphical and mathematical representations of the above-mentioned input/output membership functions (MF) of the AFIS variables are shown in table 7. Each input variable is detailed in Tables 2-6. Table 1 shows the possible outcomes of the four inputs. The details are explained in table 7.

Table 1. Proposed Lookup table for IRM

Table 7. Mathematical & Graphical MF of AFIS Input/output variables

Table 8. Mathematical & Graphical MF of AFIS Input/output variables