Oil and Gas supply chain optimization using Agent-based modelling (ABM) integration with Big Data technology

The worldwide oil and gas industry is one of the world's most complex business networks and is connected with almost every branch of the supply chain. It includes international and domestic transportation, materials handling, ordering and inventory visibility and control, import/export facilitation, and social networks. Traditionally, the industry has been dominated by large oilfield companies. In recent years, however, it has been changing into a more heterogeneous and diverse network of businesses, and oilfields are becoming smaller and more diverse. One reason could be dwindling oil reserves and the growth of specialized companies able to extract hydrocarbons; another is the restructuring and globalization of the entire business, together with the adoption of new technologies. By integrating agent-based modelling with big data technology, the supply chain in the oil and gas industry can be optimized.


Introduction
Global demand for oil and gas, together with the ease of international trade and the inflexibility of the petroleum industry's supply chain, has made its management more challenging [1]. The oil and gas industry is principally a Supply Chain Management (SCM) industry: it involves managing every step in the delivery of a product or service to consumers. It consists of exploration, drilling, operation of pipelines, and operation of refineries for the production of fuel, plastics, and so on. Trunk lines, or "transmission pipes", are the arteries that deliver refined products such as gasoline and aviation fuel to terminals at various locations. Distribution refers to the sale and delivery of these products to consumers from storage terminals. Forrest et al. [2] identified that the majority of the oil industry still operates its planning, central engineering, upstream operations, midstream operations, downstream operations and refining, supply, and transportation units as completely separate entities. Each part of this process produces a huge amount of data, and dealing with it requires a substantial infrastructure. In view of this, systematic methods for efficiently managing the oil and gas supply chain as one continuous unit must be exploited. More efficient and cost-effective SCM practices in the oil and gas industry are critical for ensuring the continuous supply of crude oil, reducing lead times, and lowering production and distribution costs. Because of the inflexibility of the petroleum industry's supply chain network, other researchers have identified cost containment, visibility, globalization, risk, information technology logistics, knowledge management, and greening the supply chain as some of the challenges facing SCM in oil marketing companies.
Integrated process management (IPM), information sharing and management, the variety and complexity of the data generated by each sector, organizational restructuring, and cultural reorientation are crucial and equally important factors.

EAI Endorsed Transactions on Smart Cities 06 2018 - 02 2020 | Volume 4 | Issue 9 | e1
Jamal Maktoubian et al.

A wide range of optimization models have been proposed for oil and gas supply chains, often without taking into consideration the inherent risks and vulnerabilities arising from events, different routes, and supply chain units/nodes. These uncertainties and risks often interrupt supply chain operations, causing significant adverse effects in the energy sector. It is important to develop risk-based optimization models using advanced technology in order to predict, reduce, or mitigate the impact of these uncertainties on the oil and gas critical infrastructure supply chain. Although the oil and gas supply chain follows a simple procedure, a small optimization in each agent (storage facilities, refineries, retailers, pipelines, suppliers, carriers, freight forwarders, and tankers, to name but a few) could save millions of dollars a month. Moreover, with visibility in the supply chain being the main issue, the key challenge lies in optimizing the processes of each enterprise. For a mass-production industry such as petroleum, an efficient supply chain depends on an adaptive supply chain, which takes a more holistic approach to supply chain optimization [3]. The integration of big data technology with agent-based modelling (ABM) and simulation offers a reliable and efficient method for solving today's complex problems. Many components of supply chain systems have been modelled using agent-based software. Garcia-Flores and Wang constructed an ABM to simulate the dynamic behavior of cooperating agents along a single supply chain [4]. The specific operation of a warehouse system was designed by Ito and Abadi [5]. Models of refinery supply chains have been used to determine optimal business processes and configurations in one specific plant [6,7].
Combining big data methodologies with agent-based modelling would, first, help uncover new relationships and behaviours among the agents in agent-based models that traditional approaches did not recognize. Second, it could help extract more parameters for agent-based models as the size of the data sets increases. Finally, the results can be used to validate simulations and compare them with existing agent-based models. In this way, oil and gas suppliers could take advantage of both approaches to identify agent behaviours and options in an unprecedented manner. This requires big data and simulation frameworks, together with visualization tools that allow end users to access simulations without needing to know the underlying operations. Fortunately, these types of analyses and computational challenges are already being addressed by data science experts.

Big Data in Supply Chain Management
Nowadays, it is becoming more and more important for oil and gas companies to manage and oversee their supply chains effectively in order to reduce costs and to enhance and guarantee efficient operations. Due to the incipient digital transformation process, which is expected to radically alter business ecosystems, change management practice, and revolutionize supply chain dynamics, the management of data and information, the raw materials of the digital age, is becoming increasingly important for businesses. The rationale is that the amount of data and information generated by, available to, and collected through companies is growing at an unprecedented pace.
The term Big Data Analytics has been coined in this respect, reflecting the surge in volume, velocity, and variety of digital data, which increasingly poses a challenge for companies because it complicates the identification and extraction of the most relevant and valuable information required for managing the business and ultimately the supply chain [8]. However, having access to accurate and up-to-date information is paramount for informed decision-making at the corporate as well as the supply chain level. Conversely, not having access to up-to-date, accurate, and meaningful information represents a risk for companies and subsequently for the supply chain, as decisions need to be made on a reliable, evidence-driven basis. Other factors are the increased need for end-to-end visibility along the supply chain, enhanced automation levels, and the efficiency gains required at the manufacturing level.
Apart from analysis, machine learning is helping to improve the accuracy and efficiency of supply chain management. The evolution towards using artificial intelligence and machines that learn in supply chain planning is inevitable. In fact, there are early examples of the potential of AI both to improve supply chain planner efficiency and to provide better or optimized supply chain decisions. Oil and gas supply and trading aims to optimize commercial margins in a diverse, dynamic, and global market environment. Data mining, machine learning, and predictive analytics solutions could empower decision makers to analyze trading risks. Effective optimisation requires analysis of the interaction of several variables in supply chain datasets. In supply chain machine learning, we need to measure relationships between attributes to find hidden patterns among them, for instance how factors such as materials, ordering and inventory visibility and control, import/export facilitation, and transportation can influence the oil and gas industry.
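As a minimal sketch of what measuring relationships between attributes can look like, the snippet below computes pairwise Pearson correlations for a few supply chain variables. The attribute names and all numbers are hypothetical and invented purely for illustration; Pearson correlation is only one possible measure and captures linear relationships only.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


# Hypothetical weekly observations (illustrative values, not real data).
records = {
    "transport_delay_days": [1.0, 2.0, 3.0, 4.0, 5.0],
    "inventory_level_kbbl": [50.0, 45.0, 40.0, 35.0, 30.0],
    "delivery_cost_kusd": [10.0, 12.0, 15.0, 15.0, 19.0],
}


def correlation_table(data):
    """Return {(a, b): r} for every unordered pair of attributes."""
    return {(a, b): pearson(data[a], data[b]) for a, b in combinations(data, 2)}


if __name__ == "__main__":
    for (a, b), r in correlation_table(records).items():
        print(f"{a} vs {b}: r = {r:+.2f}")
```

In this toy data the delay and inventory series move in exactly opposite directions, so their coefficient is -1; real datasets would of course show weaker and noisier relationships.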

Agent-Based Modelling (ABM) in Oil & Gas Industry
ABMs are computational systems with dynamic behavior and special characteristics, built from "interacting autonomous entities" called agents [9]. Using agents, we are able to represent individuals at several levels, with learning capability and the ability to make the best decisions in both space and time. With these features, researchers are able to examine complex systems and environments such as the oil and gas supply chain. Cutting costs by using new information technologies and creating conditions for better collaboration is a highly topical problem. ABM is a computational approach to modelling complex systems with large numbers of agents, such as the oil and gas industry. The modeller defines the parts of the system and their behaviours and generates simulations of their interactions to identify how those parts influence the behaviour of the entire system. The main objective of this paper is to propose a framework that uses big data technologies to formulate principles for collaboration, to develop tools for incorporating all data into a single set, and to visualize data for further analysis. To do that, we need to investigate the relationships between the different risk-related variables in oilfield data and then develop a simulation to identify the higher risks of this business. Oil and gas organizations could employ analytics to optimize the delivery of materials, reducing inventory levels and supply chain costs, and to evaluate performance and optimize supply chain operations. Agent-based modelling is one of the most flexible modelling methods. It takes its name from the essential role that agents play in the model. In this type of modelling, each real-world actor is modelled as a fully autonomous decision-making entity, called an agent.
Repeated competitive interaction between agents and sub-agents is a feature of agent-based modelling, which relies on computing power to explore dynamics out of reach of purely mathematical methods. Each agent has components for perceiving the environment, analysing it, and ultimately acting. In modelling each agent, we attempt to simulate the real-world decision-making process by means of analogous agents. To apply agent-based modelling, the following procedure is essential:

3-1 Data Collection:
First, most of the required data (major terminals, tanks, connectivity, commodity types, pools, default routes, business rules, and the batches in pipelines at any time, to name but a few) could be collected from online sources; the main difficulty in collecting these data is the volume of information and the complexity of the system as a whole. Second, detailed structured questionnaires could be designed to identify the way in which oil marketing companies manage their supply chains. Finally, data would be obtained through a literature review of publications on supply chain management and on production and operations management related to the oil and gas industry.

3-2 Model the Process:
The behaviour of all agents and sub-agents associated with the oil and gas supply chain needs to be designed, and the challenges they face should be simulated. The modelling proceeds in four steps. First, existing supply chain models are analysed and a risk model is developed to categorize the various types of risk, derived by analysing the impacts of prior events on the oil and gas supply chain, and to derive their ratings from the weighted risk. Second, a risk-based SCM is developed that includes a risk-based network reliability analysis model, using the modified minimum cut-set method to locate critical links/nodes in the network; this also makes it possible to analyse the risk ratings of failures caused by events, activities, and threats that severely affect the supply chain network. Third, a risk-based Linear Programming (LP) Supply Chain Model (SCM) is developed for strategic and tactical planning in the oil and gas supply chain, using the risk ratings obtained above to simulate different scenarios and alternatives and so incorporate the likely impact of these events and activities on the critical links and nodes. Finally, Fault Tree Analysis and Model-Based Vulnerability Analysis (FTA and MBVA) models are developed to show how scarce resources can be allocated for optimum results in hardening and protecting these oil and gas supply chain nodes against failure from the likely impacts of the events and threats analysed.
However, agent-based models can also generate large amounts of data, which can be difficult to analyse and understand. Hence the question arises whether agent-based models could be combined with Big Data methods in a way that helps address this problem.
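To illustrate how cut sets can point to critical links, the sketch below enumerates the minimal cut sets of a very small network by brute force. This is the plain textbook notion of a minimal cut set, not the modified minimum cut-set method mentioned above, whose details are not given here. The network topology and node names are hypothetical.

```python
from itertools import combinations

# Hypothetical directed network of (from_node, to_node) links.
LINKS = [
    ("field", "hub"),
    ("hub", "refinery_a"),
    ("hub", "refinery_b"),
    ("refinery_a", "terminal"),
    ("refinery_b", "terminal"),
]


def is_connected(links, source, sink):
    """Depth-first search: can product still flow from source to sink?"""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, []).append(b)
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        if node == sink:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def minimal_cut_sets(links, source, sink):
    """Enumerate minimal sets of links whose failure disconnects source from sink.

    Subsets are tried in order of increasing size, so any cut that contains a
    smaller, already-found cut is skipped as non-minimal. The search is
    exponential in the number of links and only suits tiny networks.
    """
    cuts = []
    for size in range(1, len(links) + 1):
        for removed in combinations(links, size):
            removed_set = frozenset(removed)
            if any(cut <= removed_set for cut in cuts):
                continue
            remaining = [link for link in links if link not in removed_set]
            if not is_connected(remaining, source, sink):
                cuts.append(removed_set)
    return cuts


def critical_links(links, source, sink):
    """Links that form a cut set on their own (single points of failure)."""
    return [next(iter(c)) for c in minimal_cut_sets(links, source, sink) if len(c) == 1]
```

For this toy network, `critical_links(LINKS, "field", "terminal")` singles out the field-to-hub link, since every other link has a parallel alternative through the second refinery.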
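The perceive, decide, and act cycle of an agent, and the way a simulation accumulates data, can be sketched with a toy two-agent model. Everything here is an assumption made for illustration: the `Refinery` and `Supplier` classes, the reorder rule, and all numeric values are hypothetical and are not taken from the proposed framework.

```python
import random
from dataclasses import dataclass


@dataclass
class Refinery:
    """Agent that holds crude inventory and reorders below a threshold."""
    inventory: float = 100.0
    reorder_point: float = 60.0
    order_size: float = 80.0

    def decide_order(self):
        # Perceive own state, then act: order only when stock is low.
        return self.order_size if self.inventory < self.reorder_point else 0.0


@dataclass
class Supplier:
    """Agent that ships what is requested, limited by per-step capacity."""
    capacity: float = 50.0

    def ship(self, requested):
        return min(requested, self.capacity)


def run(steps=20, seed=0):
    """Run the toy simulation and return one log record per time step."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    refinery, supplier = Refinery(), Supplier()
    log = []
    for t in range(steps):
        demand = rng.uniform(20.0, 40.0)  # hypothetical downstream demand
        served = min(demand, refinery.inventory)
        refinery.inventory -= served
        delivered = supplier.ship(refinery.decide_order())
        refinery.inventory += delivered
        log.append({
            "t": t,
            "demand": demand,
            "served": served,
            "delivered": delivered,
            "inventory": refinery.inventory,
        })
    return log
```

Even this small model emits one record per step; with many agents, many steps, and many scenario runs, the log grows quickly, which is the data-volume issue raised above.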

4. How big data integrates with the Oil and Gas supply chain:
Oil and gas companies can analyse data streams from suppliers to evaluate performance and optimize supply chain operations. They can use analytics to improve "real-time" delivery of materials, reducing inventory levels and supply chain costs. In brief, this research project proposes to develop a framework (Figure 1) that addresses real-time big data management, storage, and computation challenges, as well as predictive data analytics, in oil and gas supply chain organizations in order to predict and monitor different variables and customers' behaviour.
To collect real-time streaming data coming from different agents, we need state-of-the-art technology such as Apache Kafka and Flume [10], distributed messaging and data-collection systems, to ingest unstructured and semi-structured data. It is unrealistic to expect that data will be perfect after extraction. Before being processed, data should go through a series of steps known as "data cleaning". Data is cleaned through processes such as filling in missing values, smoothing noisy data, or resolving inconsistencies in the data. Raw data need to be pre-processed in several steps, including data integration, data transformation, data reduction, and data discretization. Since good models usually need good data, a thorough cleansing of the data is an important step in improving the quality of data mining methods. Not only the correctness but also the consistency of values is important. Data preparation includes techniques such as Min-Max normalization and standardization (z-scoring) of the data [11]. In many application areas, datasets can have a very large number of features. As the dimensionality of the feature space increases, many types of data analysis and classification become significantly harder; in addition, the data become increasingly sparse in the space they occupy, which can lead to serious difficulties for both supervised and unsupervised learning.
Cleaned data are delivered to Spark Streaming, a distributed stream processing engine. At this stage, Spark Streaming breaks the input data stream up into small batches, represented as Resilient Distributed Datasets (RDDs). A continuous sequence of RDDs passes through the Spark engine to be processed and can be used with machine learning libraries such as MLlib for data analysis [12]. Different clustering and classification algorithms could be applied to find patterns in supply chain data or to predict risks. These analysis methods could help the oil and gas sector extract knowledge in order to predict various risks in real time. There is considerable potential for delivering more targeted, wide-reaching, and cost-efficient supply chain operations by exploiting big data trends and technologies.
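The cleaning and preparation steps named above can be sketched in a few lines. The example below shows mean imputation of missing values, Min-Max normalization, and z-scoring using only the Python standard library; the sensor readings are hypothetical, and mean imputation is just one simple choice among many ways of filling in missing values.

```python
from statistics import mean, pstdev


def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    m = mean(observed)
    return [m if v is None else v for v in values]


def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no spread
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


# Hypothetical tank-level readings with one missing value.
raw = [10.0, None, 30.0, 20.0]
clean = fill_missing(raw)  # the gap is filled with the mean of 10, 30 and 20

if __name__ == "__main__":
    print(clean)
    print(min_max(clean))
    print(z_score(clean))
```

Min-Max scaling keeps the shape of the distribution but is sensitive to outliers, whereas z-scoring centres the data and expresses each value in standard deviations; which one fits better depends on the downstream algorithm.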
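The micro-batch idea can be illustrated without a cluster. The sketch below is plain Python, not the Spark API: it splits an incoming sequence into small batches and processes each batch independently. Spark Streaming forms its batches by time interval, whereas this sketch uses a fixed item count for simplicity; the readings are hypothetical.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Yield consecutive lists of up to batch_size items from an iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def batch_means(stream, batch_size):
    """Process each micro-batch independently: here, simply take its mean."""
    return [sum(b) / len(b) for b in micro_batches(stream, batch_size)]


# Hypothetical flow-rate readings arriving one at a time.
readings = [4.0, 6.0, 5.0, 7.0, 9.0]

if __name__ == "__main__":
    for batch in micro_batches(readings, 2):
        print(batch, "->", sum(batch) / len(batch))
```

Because each batch is handled on its own, the per-batch step could be swapped for any analysis, such as scoring records with a trained model, without changing the batching logic.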

Figure 1. Big data analytics in optimizing the Oil & Gas supply chain
Analytics is, of course, a very wide area; in this section we focus on machine learning, a technology that until recently had not been widely implemented in supply chains, and in particular on how it can be combined with optimization to produce breakthrough results. A natural application of supervised machine learning in supply chain analytics is forecasting. Forecasting models are poised to address important issues in areas such as capacity planning, where downstream capacities are uncertain, and inventory and supply chain management, by reducing uncertainties around material and part availability and by reacting to (or anticipating) changes in market and customer demand.
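As a minimal sketch of supervised forecasting, the example below fits a straight-line trend to past demand by ordinary least squares and extrapolates it. Linear regression is used here only because it is the simplest supervised model; it is an assumption of this sketch, not a method prescribed above, and the demand figures are hypothetical.

```python
def fit_line(ys):
    """Ordinary least squares fit of y = a + b*t for t = 0, 1, ..., n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    ts = range(n)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def forecast(ys, horizon):
    """Extrapolate the fitted trend for the next `horizon` periods."""
    intercept, slope = fit_line(ys)
    n = len(ys)
    return [intercept + slope * (n + h) for h in range(horizon)]


# Hypothetical monthly demand (illustrative numbers only).
demand = [100.0, 110.0, 120.0, 130.0]

if __name__ == "__main__":
    print(forecast(demand, 2))
```

A forecast of this kind could feed an optimization step, for example by setting the demand constraints of an inventory or capacity planning model.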

Conclusion
Supply chain management can play a vital role in promoting business profitability and decreasing costs in every industry. With the recent emergence of a new generation of hardware (radio-frequency identification (RFID), various sensors, the Internet of Things (IoT), tracking technologies, etc.) and of new algorithms and their applications, managing business supply chains could become much easier than before. Owing to globalization, the oil and gas supply chain has become one of the most challenging, because the number of agents and sub-agents involved can generate a huge amount of data. Data storage, processing, and management are critical concerns in this era. In this work, the integration of agent-based modelling and big data has been presented as one solution that could provide data availability, scalability, and performance for this system. In future research, collected data will be examined by implementing the architecture, and the results will be published.