On the Performance of General Cache Networks

The performance evaluation of cache networks has gained considerable attention due to content-oriented delivery technologies. Although general network topologies are more realistic than the hierarchical networks widely studied in the literature, their analysis is significantly more challenging. Existing models mainly focus on trees where content custodians are located at the root and the one-way child-to-parent request forwarding schema is common. In this paper, we consider complex and irregular networks where requests may flow in opposite directions from/to several sources/destinations. Moreover, we assume that caches may run one of the Time-To-Live (TTL)-based policies recently introduced for content-centric networks and the modern Domain Name System [5]. We then derive an analytical framework and a polynomial-time algorithm that accurately approximate performance metrics of arbitrary graph-based and heterogeneous TTL-based cache networks. Simulations show that our simplified methodology can accurately predict metrics of interest on networks of caches running popular replacement algorithms (e.g. LRU, FIFO, or Random), although its scope of application is not restricted to this use case. Unlike existing approaches, ours scales as the network and content catalog sizes increase.


INTRODUCTION
Over the past years, isolated data repositories running popular cache replacement algorithms have received significant attention. Nowadays, interest has shifted from caching systems in isolation to interconnected caches. The benefits of the latter approach come from storing contents in caches that are universally deployed or distributed across the network. This trend is mostly driven by the recent development of content-oriented technologies [1,12]. The purpose is to adapt the network architecture to current content usage patterns (Video-on-Demand, user-generated content, etc.), with the potential to reduce congestion and improve content delivery speed as networks increase in size.
Network performance goals can be achieved in different ways. We focus on content distribution technologies [1] which allow on-demand caching through deployment of caches at various locations. Briefly speaking, when a data item is first requested, it is temporarily stored at some nodes and subsequent requests are served directly from these local copies. Therefore, users may experience a better quality of service. Obviously, the benefits of such in-network caching choices are highly tied to the performance of the underlying cache networks. These performance metrics can be significantly challenging to predict since the multi-cache systems that arise from content-oriented architectures are more complex, irregular, and heterogeneous. More precisely, these networks may have arbitrary topologies and their nodes may show different features (policies, capacities, routing tables, etc.).
When requests of all files share the same routing topology with content custodians of all files located at the same root nodes, bidirectional flows as described in Fig. 1 cannot occur. In this very specific case, requests flow in the same direction from child nodes to parents towards the root(s), and the network topology reduces to all nodes belonging to this routing topology or directed acyclic graph. We refer to this case as the unique routing topology model. Instances of this routing model have been studied in the literature on networks with linear [14], tree [4,13], polytree [6,15], and feed-forward topologies [3,5,8]. However, these approaches rely heavily on the unique routing topology model, which is here a tree/hierarchical structure, and on the one-way child-to-parent forwarding schema of the network. This strong requirement makes these models inapplicable to more general cases. In contrast, Algorithm 1 presented in Section 4 does not suffer from these limitations.
As reported in [12], only the a-NET model introduced in [16] has so far addressed the performance of the tandem of two caches shown in Fig. 1, where requests are routed in opposite directions. This basic network is the atom of arbitrary interconnected caches. Unfortunately, [16,12] report relative errors of 15% and find that the violation of the Independent Reference Model (a.k.a. IRM or Poisson assumption) on miss streams of caches and aggregated request processes in the network is the major source of inaccuracy.
In this paper, we develop an analytical framework to assess performance metrics of cache networks. On the one hand, we leverage the concept of Time-To-Live (TTL)-based models recently introduced in [5]. On the other hand, we apply two-moment matching techniques to fit hyper/shifted-exponential renewal processes to general request processes [17]. Our main results are:
• an analytical and extensible framework that provides computationally efficient two-moment approximations of filtered, split, and merged request streams in the network;
• a polynomial-time algorithm to approximate metrics of interest on large and heterogeneous graph-based networks;
• a Little's Law for caching systems and bounds on the TTL-based cache characteristic time;
• event-driven simulation results showing that our framework is more accurate than the existing model [16].
This paper is organized as follows. In Section 2, we describe the system under analysis and state our problem. In Section 3, we derive a Little's Law-like result and characteristic equations enabling caching systems to be described via TTL-based models. We then present our algorithm to calculate characteristic times of isolated caches and our two-moment matching techniques to characterize filtered, split, and aggregated request streams. We describe our polynomial-time algorithm for network analysis in Section 4, show its accuracy in Section 5, and summarize our findings in Section 6.

MODEL
In this section, we state our problem on our toy network (Fig. 1). Then, we introduce the notation and assumptions that are used in our cache network model.

Problem formulation
Two sets of files are requested on the tandem of two caches of Fig. 1. User $u_n$ requests the $K_n$ files stored permanently at server $s_n$. User $u_1$'s missed requests at cache 2 are routed towards server $s_1$ through cache 1. The overall request arrival process at cache 1 is formed by (well-known) exogenous requests coming from user $u_2$ and unknown miss processes of cache 2, and vice versa. Consequently, the states of both caches become dependent. This would not occur if both servers were located at cache 2, as considered in [3,4,6,13,14,15]. Therefore, these existing models are limited, since the overall request processes at both caches are unknown and the strong assumptions required by their cache-by-cache iterative models (i.e. a tree-based structure with content custodians located at the root and the one-way child-to-parent request forwarding schema in the network) do not hold.

Network model
Let $G = (V, E)$ be the graph representing a general and heterogeneous cache network, $V = \{v_1, \ldots, v_N\}$ the set of caches, and $E \subseteq V \times V$ the set of connections between caches. Additionally, the file catalog is denoted by $F = \{f_1, \ldots, f_K\}$. Each file is stored permanently at one or more public servers attached to nodes in the network, and there is at least one path between each cache-server pair. Requests arrive exogenously and directly from users to some of the nodes $\{v_n \in V\}$ in this system. When a request for file $f_i$ arrives at a cache, it generates a cache hit if the file is located in the cache and a miss otherwise. In the latter case, the request is forwarded to other caches in the network based on the routing table at each cache, until the file is located in a cache or at the server storing the file. Then the file is forwarded along the reverse path taken by the request, and stored at each cache along the way. If the cache is full when a miss occurs, one of the files in the cache is selected based on an eviction policy to make room for the new file.
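The request-forwarding rule described above can be illustrated with a minimal sketch. It assumes the simplest possible setting: a single routing path of LRU caches ending at a server that holds every file. The names `LRUCache` and `request` are hypothetical and only serve this illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache with Least Recently Used eviction (illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()  # keys ordered from least to most recently used

    def lookup(self, file_id):
        """Return True on a hit (and refresh recency), False on a miss."""
        if file_id in self.store:
            self.store.move_to_end(file_id)
            return True
        return False

    def insert(self, file_id):
        """Store a file, evicting the least recently used one if the cache is full."""
        if file_id in self.store:
            self.store.move_to_end(file_id)
            return
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)
        self.store[file_id] = True


def request(path, file_id):
    """Forward a request along `path` (entry cache first, server after the last cache).

    Returns the index of the cache that served the request, or len(path) if the
    request reached the server. The file is then stored at every cache that
    missed, i.e. along the reverse path.
    """
    for depth, cache in enumerate(path):
        if cache.lookup(file_id):
            break
    else:
        depth = len(path)  # every cache missed: the server answers
    for cache in path[:depth]:
        cache.insert(file_id)
    return depth


path = [LRUCache(1), LRUCache(2)]
served_by = [request(path, f) for f in ["a", "a", "b", "a", "b"]]
print(served_by)
```

In a general network each cache would consult its routing table to pick the next hop; the single path used here is only the simplest instance of that rule.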

Workload model
We denote by $R_{n,i} = \{t_k(n,i)\}_{k \ge 0}$ the overall request process of file $f_i$ at cache $v_n$, where $t_k(n,i)$ is the arrival instant of the $(k+1)$-th request. $R_{n,i}$ is formed by the superposition of an exogenous request process $E_{n,i}$ (generated by local users of cache $v_n$), if any, and the endogenous request processes (generated by misses of other caches connected to cache $v_n$). Let $\lambda_{n,i}$ be the rate of exogenous arrivals at cache $v_n$, if any, and $\Lambda_{n,i}$ the intensity of $R_{n,i}$. In this paper, we assume the following.
Assumption 1 (Renewal). Exogenous request processes $\{E_{n,i}, \forall n, i\}$ are renewal processes. Moreover, $\{E_{n,i}, \forall i\}$ are independent at cache $v_n$.
At cache $v_n$ and for file $f_i$, we denote by $X_{n,i}$ the generic inter-arrival time of requests in the process $R_{n,i}$, $F_{n,i}(t) = P(X_{n,i} < t)$ its Cumulative Distribution Function (CDF), $\hat{F}_{n,i}(t)$ the CDF of its survival time, $N_{n,i}(t)$ the counting process, $R_{n,i}(t) = E[N_{n,i}(t)]$ the renewal function associated with $F_{n,i}(t)$, and $F^*_{n,i}(s) = E[e^{-s X_{n,i}}]$ its Laplace-Stieltjes Transform (LST), for all $t \ge 0$ and $s \ge 0$.

Cache model
In this work, nodes of our network are endowed with a cache running a replacement algorithm that can be described with the renewing or non-renewing TTL-based models introduced in [6] and [15] respectively. The renewing TTL-based model assigns a random value to the timer $T_1$ of a file at cache miss instant $t_0$ and later redraws this timer $T_k$ at each cache hit instant $t_{k-1}$, as shown in Fig. 2(a). The non-renewing TTL-based model sets a random value to the timer $T_1$ only at cache miss instant $t_0$, as shown in Fig. 2(b). TTL-based models are used to describe space-driven policies (like LRU as shown in Fig. 2(a), FIFO/RND as shown in Fig. 2(b) [4,10,13]), time-based policies (like DNS [11], modern-DNS [15]), and space-time policies (like Pra-TTL caches [6], Amazon ElastiCache and Squid web cache [3]).

Other considerations
As in previous work, we are interested in the hit probability $H_{n,i}$ (resp. the occupancy $O_{n,i}$), defined as the probability that file $f_i$ is in cache $v_n$ at request instants $\{t_k(n,i)\}_{k \ge 0}$ (resp. at any time instant $t$). The global (or average) hit probability $H_n$ and the Miss Probability Ratio $\mathrm{MPR}_n$ between the predictions of our approximation and the simulation results at cache $v_n$ are also calculated [16].
We assume that all files have the same size [16] or can be divided into small chunks of identical size [10]. Hence we express the cache size in terms of the number of files/chunks it can hold at any given moment. We also assume that files become available at cache $v_n$ at once at miss instants $\{m_{n,l}, l \ge 0\}$.

SINGLE CACHE FRAMEWORK
In this section, we derive new exact results on caches. We then revisit the notion of characteristic time, generalize the concept to caching systems described by TTL-based models, and address operations that modify request streams.

On the steady state of isolated caches
For readability, we omit the subscript $n$ that refers to the cache label. Let $I_i(t)$ be a binary random variable indicating that file $f_i$ is in the cache at time $t$. Assume the limit $I_i = \lim_{t \uparrow \infty} I_i(t)$ exists. Since there is a copy of $f_i$ or not in the cache at any time depending on whether $I_i = 1$ or $0$, the occupancy of file $f_i$ in the cache is calculated as $O_i = P(I_i = 1) = E[I_i]$. Note that $O_i$ also denotes the expected number of copies of file $f_i$ in the cache. Let $I = \sum_{i=1}^{K} I_i$ be the total number of files in the cache at any time and $Q_i$ the sojourn time of file $f_i$. For all caching systems, metrics of interest are related to the sojourn time of the file in the cache as follows.
Lemma 1 (Little's Law). For caching systems, the metrics of interest and expected sojourn time are related by
$$O_i = \Lambda_i (1 - H_i)\, E[Q_i]. \tag{1}$$
Proof. By applying Little's law, (1) is established as follows. First, we note that the expected number of copies of file $f_i$ in a cache is $O_i$. Second, the rate at which file $f_i$ enters the cache is the miss rate $\Lambda_i \times (1 - H_i)$. Third, the expected time that file $f_i$ spends in the cache is $E[Q_i]$.
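As a quick sanity check of Lemma 1, which relates the occupancy $O_i$, the miss rate $\Lambda_i(1 - H_i)$, and the expected sojourn time $E[Q_i]$, consider the following special case. The setting is an assumption made only for this illustration: a single file requested according to a Poisson process of rate $\lambda$, cached with a constant timer $T$ that is reset at every hit (renewing model).

```latex
% Assumed setting: Poisson requests of rate \lambda, constant renewing timer T.
\begin{align*}
H &= P(X \le T) = 1 - e^{-\lambda T}, \\
O &= 1 - e^{-\lambda T}
   \quad \text{(Poisson arrivals see time averages, so } O = H\text{)}, \\
E[Q] &= \frac{O}{\lambda\,(1 - H)}
      = \frac{1 - e^{-\lambda T}}{\lambda\, e^{-\lambda T}}
      = \frac{e^{\lambda T} - 1}{\lambda}.
\end{align*}
```

For $\lambda T \to 0$ the sojourn time tends to $T$ (the file is rarely refreshed before expiring), while it grows exponentially in $\lambda T$ for popular files.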
Unlike capacity-driven policies, the number of files cached using expiration-based policies is not bounded in principle. However, practical considerations require that a certain memory occupation level $C$ not be exceeded. The following result provides conditions for the latter to hold.
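Lemma 2 (Characteristic Equations). For caching systems, the total number of files in the cache at steady state $I$ equals the cache capacity $C$ almost surely if and only if
$$E[I^p] = C^p, \quad p = 1, 2. \tag{2}$$
Proof. Assuming (2), we have $\mathrm{Var}[I] = 0$ and Chebyshev's inequality shows that $P(|I - C| \ge \eta) = 0$ for all $\eta > 0$. Conversely, by calculating the first two moments of $I$ with the condition $\{|I - C| \ge \eta\}$, one can easily show that (2) holds necessarily.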
Remark 1 (Che approximation [4,13]). For capacity-driven policies (such as FIFO, LRU, RND, and variants [13]), the total number of files in the cache at steady state is constant, $I = C$, and $\sum_{i=1}^{K} O_i = C$ is necessary and sufficient.
For general TTL models, there are several ways to choose per-file timer distributions $\{T_i(t) = P(T_i < t), \forall f_i\}$ such that (2) holds. Here are three possible choices:
• Cache Characteristic Distribution. Here, files have the same TTL distribution $T_i(t) = T(t), \forall f_i$, characterized by its mean $T = E[T_i], \forall f_i$, solution of (2) with $p = 1$. Our cache characteristic distribution for general TTL-based models is consistent with Che's approximation [4,10,13] for capacity-driven policies, where all files in the cache have (approximately) the same TTL distribution $T(t)$, which depends on its mean only. [13] showed ingeniously that $T(t)$ is approximately deterministic for LRU/FIFO caches (i.e. $T_i \approx E[T_i] \approx T, \forall f_i$) and exponential for RND caches (i.e. $T_i$ is exponentially distributed with rate $1/T$, so that $E[T_i] \approx T, \forall f_i$).
• Cache Characteristic Moments. In this case, per-file TTL distributions are no longer identical as in the previous case, but they have identical first two moments $E[T_i] = T$ and $\mathrm{Var}[T_i] = \sigma^2, \forall f_i$, solutions of (2) with $p = 1$ and $2$.
• Cache Characteristic Time. This case is stated as follows.
Assumption 2 (Characteristic Time). TTLs have the same mean $E[T_i] = T, \forall f_i$, and (2) holds with $p = 1$.
Assumption 2 allows two different files, say $i$ and $j$, to be cached using two different TTL distributions $T_i(t)$ and $T_j(t)$. Given that only $E[T_i] = T$ is required for all files $f_i$, per-file optimal TTL distributions can be set according to per-file request processes, as proved in [9, Prop. 3.4] and [15, Prop. 4]. Assumption 2 is more general than Che's approximation (which is only a particular capacity-constrained TTL-based model with identically distributed file TTLs). In general, the characteristic time $T$ in TTL-based cache models may be defined even with different per-file TTL distributions $\{T_i(t), \forall f_i\}$. Its value $T$ is bounded as follows.
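Proposition 1 (Bounds). Under Assumption 2, the characteristic time $T = E[T_i], \forall f_i$ is bounded by
$$\frac{C}{\Lambda} \le T \le \frac{C}{\Lambda M}, \tag{3}$$
where $M = \frac{1}{\Lambda} \sum_{i=1}^{K} \Lambda_i (1 - H_i)$ is the average miss probability and $\Lambda = \sum_{i=1}^{K} \Lambda_i$ is the total rate on the cache.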
Proof. From their definitions, $Q_i$ and $T_i$ are stochastically ordered as follows: $Q_i \ge T_i$. By taking the expectation of the latter inequality and then replacing $E[T_i]$ by the same constant $T$, we obtain $E[Q_i] \ge T, \forall f_i$. Using the latter inequality in Lemma 1, we obtain $O_i \ge \Lambda_i (1 - H_i) T, \forall f_i$. Since $\Lambda_i E[T_i]$ is the expected number of copies of file $f_i$ requested within the time interval $E[T_i] = T$ and $O_i$ is the expected number of copies of file $f_i$ in the cache, it follows that $O_i \le \Lambda_i T, \forall f_i$. Applying (2) in Lemma 2 with $p = 1$, the bounds on $T$ are obtained as in (3).
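The two inequalities used in the proof, $O_i \le \Lambda_i T$ and $O_i \ge \Lambda_i (1 - H_i) T$, bracket $T$ between $C/\Lambda$ and $C/(\Lambda M)$, with $M$ the average miss probability. The sketch below checks this numerically under assumptions that are ours, not the paper's general setting: Poisson requests with Zipf popularity and an LRU cache modelled by a constant renewing timer, so that $O_i = H_i = 1 - e^{-\Lambda_i T}$. The function names are hypothetical.

```python
import math


def zipf_rates(K, alpha, total_rate):
    """Per-file request rates following a Zipf law with exponent alpha."""
    weights = [1.0 / (i ** alpha) for i in range(1, K + 1)]
    norm = sum(weights)
    return [total_rate * w / norm for w in weights]


def occupancy_sum(rates, T):
    """Expected number of cached files for a constant renewing timer T (Poisson case)."""
    return sum(1.0 - math.exp(-lam * T) for lam in rates)


def characteristic_time(rates, C, tol=1e-12):
    """Solve occupancy_sum(rates, T) = C by bisection (requires len(rates) > C)."""
    lo, hi = 0.0, C / sum(rates)  # C / Lambda is a lower bound on T
    while occupancy_sum(rates, hi) < C:
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if occupancy_sum(rates, mid) < C:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


rates = zipf_rates(K=1000, alpha=0.7, total_rate=1.0)
C = 100
Lam = sum(rates)
T = characteristic_time(rates, C)
# Average miss probability M = (1 / Lambda) * sum_i Lambda_i * (1 - H_i).
M = sum(lam * math.exp(-lam * T) for lam in rates) / Lam
lower, upper = C / Lam, C / (Lam * M)
print(f"lower = {lower:.2f}, T = {T:.2f}, upper = {upper:.2f}")
```

Bisection is used here only because it is easy to verify; any root finder on the same equation would do.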
Remark 2 (Stationary requests). The proofs of Lemmas 1 and 2 and Proposition 1 do not require Assumption 1, but only stationarity and ergodicity of request processes.

Metrics of interest and characteristic time
In this section, we present metrics of interest and our characteristic time approximation (CTA) on isolated caches. We consider that request streams are described by hyper-exponential or shifted-exponential renewal processes, depending on the value of the squared coefficient of variation ($c_v^2$) of inter-request times [17]. It is known from [6,15] that the metrics of interest $H_i$ and $O_i$ of file $f_i$ in non-renewing (resp. renewing) TTL-based cache models are available in closed form; in the renewing case, for instance, $O_i = E[\hat{F}_i(T_i)]$, where $\hat{F}_i$ denotes the CDF of the survival time of the inter-request time. We derive explicit formulas for hyper/shifted-exponential renewal request processes in our technical report [7, Sect. 4].
The value of $T$ is approximated using a novel technique described in Algorithm 1. Its convergence on isolated caches is studied as follows. Thanks to Lemma 2, the characteristic time $T = E[T_i], \forall f_i$ and the TTLs $\{T_i\}$ satisfy the characteristic equations (4) for non-renewing and renewing TTL-based models respectively. Under Assumption 2, Jensen's inequality applies to (4) for non-renewing TTL models, since $R_i(t)$ is a concave function when $F_i(t)$ is concave. It is easy to check that the hyper/shifted-exponential CDFs are indeed concave functions. For renewing TTL-based models, we rely on similar arguments: thanks to Assumption 2 and Jensen's inequality, $\sum_{i=1}^{K} O_i \le \sum_{i=1}^{K} \hat{F}_i(T)$ since the survival-time CDF $\hat{F}_i(t)$ is a concave function. Combining these inequalities with (4) defines a scalar function $\varphi(T)$ in the case of non-renewing (resp. renewing) TTL-based models, whose zero approximates the characteristic time. The function $\varphi(T)$ is strictly decreasing and twice continuously differentiable, and its root is unique and simple (multiplicity one). Thanks to Proposition 1, the zero of $\varphi(T)$ belongs to $[C/\Lambda; \infty)$. Algorithm 1 implements the secant method to approximate the zero of $\varphi(T)$ with the sequence
$$T^{(0)} = \frac{C}{\Lambda} \quad \text{and} \quad T^{(I+1)} = T^{(I)} + \frac{T^{(I)} - 0}{\varphi(T^{(I)}) - \varphi(0)}\, \varphi(T^{(I)}).$$
Since $T^{(0)}$ is a tight lower bound (and thus a good estimate of $T$) and $\varphi(T)$ is strictly decreasing (and not wiggly) in $[T^{(0)}; \infty)$, the convergence of the secant method (and thus of Algorithm 1) is known to be superlinear, with an order of convergence equal to the golden ratio $\frac{1 + \sqrt{5}}{2}$.
Remark 3 (Quadratic convergence). Algorithm 1 converges quadratically if Newton's method is implemented. This result also holds for general concave CDFs $F_i(t)$.
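To make the root-finding step concrete, here is a minimal sketch of a secant iteration started from the lower bound $C/\Lambda$. Several choices are assumptions made for illustration and are not taken from the paper: requests are Poisson with Zipf popularity, the cache is an LRU modelled by a constant renewing timer, $\varphi(T) = C - \sum_i (1 - e^{-\Lambda_i T})$, the iteration is the textbook two-point secant method, and the second starting point is $1.05 \times C/\Lambda$. All function names are hypothetical.

```python
import math


def zipf_rates(K, alpha, total_rate):
    """Per-file request rates following a Zipf law with exponent alpha."""
    weights = [1.0 / (i ** alpha) for i in range(1, K + 1)]
    norm = sum(weights)
    return [total_rate * w / norm for w in weights]


def phi(T, rates, C):
    """Assumed residual: capacity minus expected occupancy for timer T.

    It is strictly decreasing and convex in T for the Poisson/LRU case.
    """
    return C - sum(1.0 - math.exp(-lam * T) for lam in rates)


def secant_characteristic_time(rates, C, tol=1e-10, max_iter=50):
    """Textbook secant iteration on phi, started at the lower bound C / Lambda."""
    t_prev = C / sum(rates)      # lower bound on the characteristic time
    t_curr = 1.05 * t_prev       # assumption: arbitrary nearby second point
    f_prev, f_curr = phi(t_prev, rates, C), phi(t_curr, rates, C)
    for iteration in range(1, max_iter + 1):
        if f_curr == f_prev:     # flat secant line: cannot improve further
            break
        t_next = t_curr - f_curr * (t_curr - t_prev) / (f_curr - f_prev)
        t_prev, f_prev = t_curr, f_curr
        t_curr, f_curr = t_next, phi(t_next, rates, C)
        if abs(t_curr - t_prev) <= tol * abs(t_curr):
            return t_curr, iteration
    return t_curr, max_iter


rates = zipf_rates(K=1000, alpha=0.7, total_rate=1.0)
C = 100
T, iterations = secant_characteristic_time(rates, C)
print(f"T = {T:.4f} after {iterations} secant steps, phi(T) = {phi(T, rates, C):.2e}")
```

Because the assumed $\varphi$ is convex and decreasing, two starting points below the root produce iterates that increase monotonically towards it, which matches the intuition that TTL sequences are increasing and bounded.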

Cache network operations
Exact characterizations of miss, split, and aggregated request processes exist, but they are too complex and computationally expensive for practical interest [3,5,6,14,15]. Hence, for each cache operation we calculate only the first two moments of inter-request times in the resulting process, and we use a two-moment matching technique to fit them to hyper/shifted-exponential renewal processes [17].
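A two-moment fit of this kind can be sketched as follows. The parametrization is one common choice and is an assumption here, since the text does not spell out which fit is used: a shifted exponential when the squared coefficient of variation (scv) is at most one, and a two-phase hyper-exponential with balanced means otherwise. The names `fit_two_moments` and `moments` are hypothetical.

```python
import math


def fit_two_moments(mean, scv):
    """Fit an inter-arrival distribution with the given mean and scv (both > 0)."""
    if mean <= 0.0 or scv <= 0.0:
        raise ValueError("mean and scv must be positive")
    if scv <= 1.0:
        # Shifted exponential: X = shift + Exp(rate), so Var[X] = 1 / rate^2.
        rate = 1.0 / (mean * math.sqrt(scv))
        shift = mean * (1.0 - math.sqrt(scv))
        return {"kind": "shifted-exp", "shift": shift, "rate": rate}
    # Hyper-exponential H2 with balanced means: p1 / mu1 = p2 / mu2 = mean / 2.
    p1 = 0.5 * (1.0 + math.sqrt((scv - 1.0) / (scv + 1.0)))
    p2 = 1.0 - p1
    return {"kind": "hyper-exp",
            "probs": (p1, p2),
            "rates": (2.0 * p1 / mean, 2.0 * p2 / mean)}


def moments(fit):
    """Return (mean, scv) of a fitted distribution, to check the matching."""
    if fit["kind"] == "shifted-exp":
        d, mu = fit["shift"], fit["rate"]
        m1 = d + 1.0 / mu
        m2 = d * d + 2.0 * d / mu + 2.0 / (mu * mu)
    else:
        (p1, p2), (mu1, mu2) = fit["probs"], fit["rates"]
        m1 = p1 / mu1 + p2 / mu2
        m2 = 2.0 * (p1 / mu1 ** 2 + p2 / mu2 ** 2)
    return m1, m2 / (m1 * m1) - 1.0


for target_mean, target_scv in [(2.0, 0.25), (1.0, 1.0), (0.5, 1.5), (3.0, 4.0)]:
    fit = fit_two_moments(target_mean, target_scv)
    print(fit["kind"], moments(fit))
```

With scv equal to one the shift vanishes and both branches reduce to the exponential distribution, i.e. the Poisson case.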

Filtered request streams (Miss process)
For non-renewing (resp. renewing) TTL-based cache models, the first two moments $E[Y_i]$ and $E[Y_i^2]$ of the inter-miss times of file $f_i$ are expressed through LSTs $L_i^*(s)$. These LSTs take simple expressions when requests are described by the hyper/shifted-exponential renewal processes previously introduced. One can easily check that our closed-form expressions of $E[Y_i]$ and $E[Y_i^2]$ are consistent with the formulas derived by Melazzi et al. [14] in the specific case where TTLs are constant (and not random variables as generally assumed in this paper) and always redrawn at cache hits (and thus limited to the model described in Fig. 2(a)).

Split request streams (Routing departures)
In this section, we consider that a cache may route its missed requests towards $J$ possible caches or destinations. Our aim is to describe (i.e. calculate the first two moments of) the sequence of requests that are forwarded to the $j$-th destination. Existing works [16,9,15,3] consider that this request forwarding operation is memoryless and model the splitting process by a Bernoulli process. However, several destinations are often allowed within more realistic networks to enable load balancing or congestion awareness based on a recent history. We therefore propose to model this request splitting operation by a discrete-time Markov chain $\{\xi_{m_l}, l \ge 0\}$ on the state space $\{1, 2, \ldots, J\}$ such that requests occurring at miss instants $\{m_l, l \ge 0\}$ are sent to the $j$-th destination if $\xi_{m_l} = j$. Further, we assume that the Markov chain $\{\xi_{m_l}, l \ge 0\}$ is lumpable and, without loss of generality, we characterize the process describing the requests sent to the destination (or next cache) labelled 1. First, the $J$ states of $\{\xi_{m_l}, l \ge 0\}$ are lumped into two states $I = \{1\}$ and $O = \{2, \ldots, J\}$. Second, we consider the resulting two-state Markov chain $\{\xi_{m_l} = I \text{ or } O, l \ge 0\}$ and its transition probabilities $r_{I,I} = P(\xi_{m_{l+1}} = I \mid \xi_{m_l} = I)$ and $r_{O,I} = P(\xi_{m_{l+1}} = I \mid \xi_{m_l} = O)$. Its stationary distribution is $r_I = P(\xi_{m_l} = I) = \frac{r_{O,I}}{1 + r_{O,I} - r_{I,I}}$. The first two moments of the inter-request times $Y_{i \to 1}$ of file $f_i$ at destination 1 are related to those of the original process.
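The stationary probability of the lumped two-state chain can be checked with a short sketch. The transition probabilities below are arbitrary illustrative values, and the function names are hypothetical; the simulation only estimates the long-run fraction of misses sent to destination 1.

```python
import random


def lumped_stationary_probability(r_ii, r_oi):
    """Stationary probability of state I: r_I = r_OI / (1 + r_OI - r_II)."""
    denom = 1.0 + r_oi - r_ii
    if denom <= 0.0:
        raise ValueError("chain is reducible (r_II = 1 and r_OI = 0)")
    return r_oi / denom


def simulated_fraction(r_ii, r_oi, n_misses, seed=0):
    """Fraction of misses routed to destination 1 along one simulated trajectory."""
    rng = random.Random(seed)
    in_state_i = True
    count = 0
    for _ in range(n_misses):
        p_to_i = r_ii if in_state_i else r_oi
        in_state_i = rng.random() < p_to_i
        count += in_state_i
    return count / n_misses


# Memoryless (Bernoulli) splitting is the special case r_II = r_OI = p.
print(lumped_stationary_probability(0.3, 0.3))
# A "sticky" router that tends to keep using the same destination.
print(lumped_stationary_probability(0.8, 0.1), simulated_fraction(0.8, 0.1, 200_000))
```

Setting $r_{I,I} = r_{O,I} = p$ recovers the Bernoulli splitting assumed in earlier works, so the Markov model strictly generalizes it.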

Aggregated request streams (Routing arrivals)
In this section, we consider that a cache may be fed by requests originating from several sources (e.g. exogenous requests sent by its local users, and the endogenous missed requests forwarded by other caches). Our goal is to characterize the first two moments of inter-request times in the overall arrival process. It is known from [4,6,15,3] that the exact characterization of this aggregated process may be significantly complex, and [16,13] showed that the Poisson assumption is a source of inaccuracy. Hereafter, we describe a computationally efficient and accurate methodology to approximate this superposed process. Without loss of generality, we consider that requests arrive at a cache from $J$ sources labelled $j = 1, 2, \ldots, J$. Let $E[X_{i,j}]$ and $E[X_{i,j}^2]$ be the first and second moments of inter-request times of source $j$. The calculation of the first moment of inter-arrival times $E[X_i]$ of requests for file $f_i$ is straightforward since rates add up, i.e. $1/E[X_i] = \sum_{j=1}^{J} 1/E[X_{i,j}]$; the second-order characterization relies on a per-file weight $w_i$ (see (5) and (6)). Theoretical details of our aggregated process approximation are provided in our technical report [7, Sect. 5.1.3].

CACHE NETWORK APPROXIMATION
In this section, we extend our single cache framework to approximate performance metrics of heterogeneous cache networks with arbitrary topology. First, we define the notion of routing topology. We then describe our cache network algorithm (CNA) to handle general networks where bidirectional flows may occur in the presence of multiple routing topologies.

Case of multiple routing topologies
The routing topology of a file refers to the subset of nodes that receive/exchange requests of that file. In this paper, we consider that requests of each file are routed along a Directed Acyclic Graph (DAG) built on top of the arbitrary network topology. In the following, the terms DAG and routing topology are used interchangeably.
In Fig. 3 for instance, two classes $F_1$ and $F_2$ of content items are shown in green and blue respectively. Content items of $F_1$ are stored permanently at the server connected to node $v_5$ and their requests are routed on the tree (displayed in blue), while those of class $F_2$ are permanently available at two different servers and their requests are routed on the polytree (displayed in green). Requests for items of classes $F_1$ and $F_2$ move in opposite directions (or equivalently, there exists a bidirectional flow) between nodes $v_3$ and $v_4$. Hence, the states of caches 3 and 4 are dependent. Their TTLs $T_3$ and $T_4$ are coupled and are solutions of a system of equations (cf. Lemma 2). Instances of such systems of equations are available in our technical report [7, Eq. (15)].
Our algorithm to tackle this issue is described as follows.
• Input: network topology $G(V, E)$ of size $N = |V|$, cache neighbours $N(n) \subseteq V$, policies $P_n$ and sizes $C_n$, file catalogue $F$ of size $K = |F|$, routing topologies $\{\mathrm{DAG}_i, f_i \in F\}$, exogenous requests {rate $\lambda_{n,i}$ and scv $c^2_{n,i}$, $v_n \in V$, $f_i \in F$}.
• Output: characteristic time $T_n$, average miss probability $M_n$, per-file metrics of interest {hit probability $H_{n,i}$ and occupancy $O_{n,i}$}, per-file aggregated request process {rate $\Lambda_{n,i}$ and scv $c^2_{X,n,i}$}, per-file miss request process {rate $\nu_{n,i}$ and scv $c^2_{Y,n,i}$} at each cache $v_n \in V$.
• Procedure: Algorithm 1 starts with an initialization step where all caches are assumed to have miss probabilities of one. The consequence is that the miss process of a node is initialized by its aggregated arrival processes. Then the characteristic times are also initialized to their lower bound. After this initialization step, Algorithm 1 updates the miss processes of the caches using the initial value of the TTLs. Then it recalculates the aggregated streams; finally, it updates the TTL values at each cache. Algorithm 1 halts when all TTL values at all nodes of the network have converged.
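The procedure above can be sketched on the tandem of two caches with bidirectional flows. This is a deliberately simplified illustration, not the algorithm itself: it keeps only the first moment of every stream (all streams are treated as Poisson), both caches are LRU modelled by constant renewing timers, and the characteristic times are obtained by bisection. Class-1 files enter at cache 2 and are served behind cache 1; class-2 files enter at cache 1 and are served behind cache 2. All names and parameter values are hypothetical.

```python
import math


def zipf_rates(K, alpha, total_rate):
    """Per-file request rates following a Zipf law with exponent alpha."""
    weights = [1.0 / (i ** alpha) for i in range(1, K + 1)]
    norm = sum(weights)
    return [total_rate * w / norm for w in weights]


def occupancy(rates, T):
    """Expected number of cached files for timer T under Poisson requests."""
    return sum(1.0 - math.exp(-r * T) for r in rates)


def solve_timer(rates, C):
    """Characteristic time: occupancy(rates, T) = C, solved by bisection."""
    lo, hi = 0.0, C / sum(rates)
    while occupancy(rates, hi) < C:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if occupancy(rates, mid) < C:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def misses(rates, T):
    """Per-file miss rates of an LRU cache with timer T (Poisson approximation)."""
    return [r * math.exp(-r * T) for r in rates]


def tandem_fixed_point(lam1, lam2, C1, C2, tol=1e-9, max_iter=1000):
    """Iterate miss streams, aggregated streams and timers until the timers settle."""
    miss_at_2 = list(lam1)  # initialization: miss probabilities equal to one
    T1 = T2 = 0.0
    for iteration in range(1, max_iter + 1):
        new_T1 = solve_timer(lam2 + miss_at_2, C1)   # cache 1: u2 plus misses of cache 2
        miss_at_1 = misses(lam2, new_T1)             # class-2 misses forwarded to cache 2
        new_T2 = solve_timer(lam1 + miss_at_1, C2)   # cache 2: u1 plus misses of cache 1
        miss_at_2 = misses(lam1, new_T2)             # class-1 misses forwarded to cache 1
        done = (abs(new_T1 - T1) <= tol * new_T1 and abs(new_T2 - T2) <= tol * new_T2)
        T1, T2 = new_T1, new_T2
        if done:
            return T1, T2, iteration
    return T1, T2, max_iter


lam1 = zipf_rates(K=200, alpha=0.7, total_rate=1.0)
lam2 = zipf_rates(K=200, alpha=0.7, total_rate=1.0)
T1, T2, n_iter = tandem_fixed_point(lam1, lam2, C1=20, C2=20)
print(f"T1 = {T1:.3f}, T2 = {T2:.3f} after {n_iter} iterations")
```

Starting from miss probabilities of one overestimates every endogenous rate, so the first timers are the smallest ones and later iterations can only increase them, which mirrors the initialization at the lower bound described in the procedure.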

Practical concerns on Algorithm 1
In this section, we show the convergence of our cache network algorithm, its polynomial-time complexity, and its properties for large scale implementations.
On the convergence. Intuitively, Algorithm 1 converges since the sequences of TTL values are increasing and bounded (see Proposition 1). More formally, Algorithm 1 finds the root $\vec{T} = (T_1, \ldots, T_N)$ of the system of equations $\Phi(\vec{\tau}) = (\Phi_1(\vec{\tau}), \ldots, \Phi_N(\vec{\tau})) = 0$, where $\Phi(\cdot) : \mathbb{R}^N \to \mathbb{R}^N$, $\vec{\tau} = (\tau_1, \ldots, \tau_N)$ is the characteristic time vector, and $\Phi_n(\vec{\tau})$ is the characteristic equation (2) with $p = 1$ at cache $v_n$, written with the renewal functions (resp. the CDFs evaluated at $\tau_n$) for non-renewing (resp. renewing) TTL-based models. Note that the dependence of $\Phi_n(\vec{\tau})$ on the characteristic times $\tau_m$ ($m \ne n$, $v_m \in V$) is strongly embedded in the aggregated request arrival process at cache $v_n$ via the CDFs $\{F_{n,i}(\cdot)\}$. Since $\nabla \Phi_n(\vec{\tau}) < 0$ for all $\vec{\tau} > 0$ and $\partial^2 \Phi_n(\vec{\tau}) / \partial \tau_n^2 > 0$, Algorithm 1 calculates updates of the characteristic time $T_n^{(I)}$ at cache $v_n$ using the CDFs $F_{n,i}(\cdot\,; \vec{T}^{(I-1)})$, where $\vec{T}^{(I-1)}$ is the characteristic vector at the previous iteration. Hence, Algorithm 1 converges superlinearly in each component $\Phi_n(\vec{\tau})$. The convergence is quadratic if Newton's method is used instead.
On the complexity. Instructions in bold font are basic operations of Algorithm 1. Its complexity is of order $O(N I K)$, where $I$, the total number of iterations, is bounded by $I \le \Delta_G^2$ with $\Delta_G$ the diameter of the network topology. The equality holds, i.e. $I = N^2$, in the (worst) case of the tandem network of $N$ nodes (similar to Fig. 1) with bidirectional flows through all nodes. In the case of the unique routing topology model, which is widely studied in the literature (e.g. networks with linear [14], tree [4,13], polytree [6,15], and feed-forward topologies [3]), Algorithm 1 needs only $I = \Delta_G$ iterations.
On the implementation. The approximations in [4,14,13,16] are limited by the size and tree-based topology of networks. In contrast, Algorithm 1 scales easily as the network and catalog sizes increase. It can be implemented in distributed MapReduce jobs of the Apache Giraph software [2]. During the initialization phase, each job associated with a node reads the network and routing configurations. It then stores the labels of its neighbours and the identifiers of the files it may receive. During each super-step, each job executes the while-loop of Algorithm 1 once and exchanges the parameters of its node request streams and characteristic time at the end. All jobs halt if the root mean squared error of the TTLs is less than a given threshold.

EVALUATION RESULTS
In this section, we evaluate the accuracy of our model by comparing its predictions of per-file hit probabilities on networks where caches are running space-driven policies like Least Recently Used (LRU), Random replacement (RND), and First-In-First-Out (FIFO). Indeed, the existence of deterministic (resp. exponentially distributed) characteristic times for LRU and FIFO (resp. RND) policies (see [4,10] and [13] for more details) allows us to describe them with TTL-based models as shown in Figure 2. We consider as "exact" the metrics of interest obtained via long event-driven simulations ($\approx 16.77 \times 10^6$ events generated). Exogenous file requests arrive at some nodes of the network with rate $\Lambda = 1$ and Zipf popularity of parameter $\alpha = 0.7$.

Characteristic time approximation (cta)
TTL models of isolated LRU, FIFO, and RND caches produce accurate predictions of per-file hit probabilities [4,13,14]. Hence, our goal here is to validate the accuracy of Algorithm 1 when approximating cache characteristic times. We consider caches of size $C = 100$, a catalogue of size $K = 10^5$, and Poisson request traffic with total request rate $\Lambda = 1$. The characteristic times or TTLs $T^{\mathrm{cta}}_{\mathrm{POLICY}}$, where POLICY is either LRU, RND or FIFO, are calculated by solving (2) with $p = 1$ of Lemma 2. Meanwhile, their approximate values $T^{(n)}_{\mathrm{POLICY}}$ are given by Algorithm 1. Figure 4 shows that our algorithm converges after one iteration, starting from the lower bound $T^{(0)} = C/\Lambda$, to the cache characteristic times. This convergence was also observed with Interrupted Poisson Processes (IPPs) of the same rate $\Lambda = 1$ and scv of inter-request times $c_v^2 = 1.5$.
Thanks to these preliminary experiments, we made two interesting observations not yet reported in the literature.
For both Poisson and IPP request models, we observed that the characteristic times of FIFO and RND caches are approximately equal. Moreover, miss streams of FIFO (resp. RND) caches fed by Poisson traffic are accurately described by shifted-exponential renewal processes.

Performance metrics on cache networks
In this section, exogenous request streams are still described by Poisson processes. Moreover, we call Poisson approximation the specialization of Algorithm 1 that considers the first moment of inter-request times only (by setting $w_i = 0, \forall f_i$ in (5)). Another specialization, which we call the Whitt approximation, is obtained when $w_i = 1, \forall f_i$ in (5). Finally, our general methodology, with $w_i \in [0, 1]$ in (5) as calculated in (6), is called the Hybrid approximation. These approximations were evaluated extensively under the various network configurations used for the a-NET model [16], and our Hybrid approximation shows better accuracy than the others. Due to lack of space, we only report part of the results for some metrics of interest. Additional numerical results can be found in our technical report [7, Sect. 6.2 & 6.3].

Poisson approximation and a-NET model
The Poisson approximation and the IRM traffic model used by the a-NET model [16] can be seen as assuming that all request streams are described by Poisson processes at each node of the network. We evaluate our Poisson approximation and the a-NET model on linear, binary tree, and random (i.e. links are drawn uniformly at random) networks of LRU caches. For these experiments, we consider that Poisson requests arrive exogenously at each cache. Due to lack of space, we only report per-file hit probabilities for the random network at the node directly connected to the servers. On all simulated cache networks (cf. [7, Sect. 6.1]), and in particular for the random network (see Fig. 5), we observed that the Poisson approximation is as accurate as the a-NET model [16]. Hence, only the Poisson approximation is used later in this section on other network configurations for comparison purposes.

Accuracy of our Hybrid approximation
Large tree networks. We compare the Poisson, Whitt, and Hybrid approximations on homogeneous and heterogeneous tree cache networks where requests occur on leaf nodes only. In the former case, we consider a ternary tree network of depth seven having 1093 LRU caches; in the latter case, we consider a 4-ary tree of depth five having 341 caches, with capacities chosen uniformly at random within the interval [50, 150] and replacement policies selected among the FIFO, RND, and LRU policies. The catalogue size is $K = 10^4$. In both cases (see Figs. 6 and 7), the Poisson approximation overestimates the metrics of interest while the Whitt approximation underestimates them. However, the Hybrid approximation outperforms the others for predicting per-file hit probabilities and produces miss probability ratios close to one. Other network settings, such as random networks where requests arrive on edge nodes only, can be found in [7].
Large random networks. Finally, we consider a large random network of 100 LRU caches with capacities $C_n \in [50, 150]$ and a catalogue of size $K = 10^4$. We consider exogenous Poisson request streams at all caches. As shown in Fig. 8, our Hybrid approximation is still accurate.

CONCLUSION
Performance analysis of general and heterogeneous cache networks was the main goal of this work. Relying on TTL-based models, we developed an analytical framework and an accurate polynomial-time algorithm to address this problem in a quite general scope since existing models were limited either by the IRM assumption, tree topology, one-way

Figure 1: Tandem of two caches with bidirectional flows

Figure 2: Behaviour of a file in the cache.

Algorithm 1: CNA on arbitrary graph G(V, E)
for each node vn ∈ V do
    Initialize per-file miss probabilities to one
    Initialize per-file miss and aggregate processes
    for each file fi ∈ F such that vn ∈ DAGi do
        if vn is an edge of DAGi then
            Set aggregate streams to exogenous process
        else
            Merge the file request streams (Sect. 3.3.3) if multiple sources: ∀vl ∈ DAGi & vn ∈ N(l)
        end
        Set miss streams to aggregated processes
        Split the file miss process (Sect. 3.3.2) if multiple destinations: ∀vm ∈ DAGi ∩ N(n)