Specifying and Monitoring Properties of Stochastic Spatio-Temporal Systems in Signal Temporal Logic

We present an extension of the linear time, time-bounded, Signal Temporal Logic to describe spatio-temporal properties. We consider a discrete location/ patch-based representation of space, with a population of interacting agents evolving in each location and with agents migrating from one patch to another one. We provide both a boolean and a quantitative semantics to this logic. We then present monitoring algorithms to check the validity of a formula, or to compute its satisfaction score, over a spatio-temporal trace, exploititng these routines to do statistical model checking of stochastic models. We illustrate the logic at work on an epidemic example, looking at the diffusion of a cholera infection among communities living along a river.


INTRODUCTION
Many of the scientific and technological challenges we are currently facing deal with the complexity arising from the non-linear interactions between heterogeneous components.Examples span from biological systems, from bacteria up in the ladder of life, to ecological interactions, to socio-technical and cyberphysical systems, in which digital sensors and devices interact with the natural and social environment.Our ability to understand, control and design such systems requires refined mathematical and computational modelling tools.
Formal methods offer a powerful framework in this sense, providing modelling languages, requirements and properties specification languages, and efficient analysis routines.In particular, logical methods based on temporal logics have proved to be very effective both in studying properties of existing systems, like biological ones [21], and in the design of new systems [4].This is due to their ability to easily express complex temporal behavioural patterns, and the availability of efficient model checking and monitoring tools, for many classes of mathematical models, ranging from ODEs [24] to stochastic processes [3,4].
The literature of applications of formal methods to complex systems and temporal logic in particular, has been mainly focused on well-mixed systems, where the spatial nature of interactions is abstracted for the sake of simplicity.However, spatial information is often crucial to properly understand their emerging dynamics.For instance, in bacteria chemotaxis, the motility of cells is realised by the spatial polarisation of small-G-protease signalling proteins [23].Epidemic spreading at the national or worldwide scale depends crucially on the asymmetric population distribution and on the mobility patterns and available air routes [27].The design and deployment of a bike sharing system cannot neglect the spatial distribution of bike stations, reflecting into different timeinhomogeneous request trends [18].All these systems have inherent spatial aspects: they involve a large number of heterogeneous spatial entities that are located and can move on a physical space.
While from the point of view of modelling languages and simulation, some relevant work in spatial modelling has been done in recent years [9,17], the use of modal logic to describe and expecially validate spatio-temporal properties is much less developed.While spatial and spatio-temporal logics have been proposed and theoretically investigated in literature [1], model checking routines have a much more recent history [11,12,20].In particular, we are not aware of any linear-time spatio-temporal logic with monitoring or model checking routines for checking properties of spatially located differential equations (PDE or patch ODEs) or of spatially located stochastic processes [12].
Here we begin tackling this problem from a practical perspective.We start by introducing Signal Spatio-Temporal Logic (SSTL), a spatial extension of Signal Temporal Logic (STL) [14,24].STL is a temporal logic suitable to specify behaviours of real-valued time series generated during the simulation of a (stochastic) dynamical system.It extends the dense-time semantics of Metric Interval Temporal Logic [2] (MITL), parameterising it by a set of numerical predicates playing the role of atomic propositions.A special characteristic of STL is the definition of a quantitative semantics, allowing us to quantify the satisfiability of a property, i.e. to give a value expressing how robustly a property is satisfied.
We extend STL by introducing simple spatial operators, inspired from the modal spatial operators of the Multiprocess Network Logic [26].They permit to describe properties as "there exist a location j at a certain distance from location i where the properties ϕ is satisfied", or "all the locations with a distance no more than d from satisfy the property ϕ".The semantics of such spatial operators is strictly related with the description of space in which the dynamics take place.Space can be logical, i.e. a set of locations, it can be discrete, i.e. a grid or a more general general graph, or it can be continuous, for instance the euclidian space.The first option that we decided to investigate are systems with a discrete space, structured as a weighted graph, called patch-based quantitative population models.These models describe the number of entities in each node, with agents located in nodes and moving from a node to another one.
We first present in detail the boolean and quantitative semantics for a single trace, extending the monitoring algorithms to actually compute the satisfaction over boolean or real-valued signals.We then consider the stochastic extension of this logic, leveraging the monitoring algorithms in a stochastic model checking routine.Finally, we show the logic at work on a model of the spreading of cholera, a waterborne disease, considering the effect of rivers in the diffusion of the epidemics [7,25].
The paper is structured as follows: in Section 2 we introduce the background material, while in section 3 we define the Patch-based population model.In Section 4 we describe the syntax and the semantics of the Signal Spatio-Temporal Logic (SSTL), sketching the monitoring algorithms In Section 5.In Section 6 we present the stochastic extension of SSTL and in Section 7 we discuss the case study and some experimental results.Finally, the conclusions and future works are drawn in Section 8.

BACKGROUND
In this section we give the definition of population models and we present two different types of modelling approaches.

Population Model
A population model intuitively is a system in which a large number of different agents or components interact together and assume, through local transitions, a number of different states.The transitions can be seen as descriptions of events changing the global state of the system.There are many example of population processes, like social systems, biochemical networks, ecological systems and computer networks.

DEFINITION 2.1 (Population model).
A population model M is a tuple M = (S, X, T ), where: -S = {1, ..., n} is the set of states of the agents in the population.
-X = (X1, ⋯, Xn) is the state vector, describing the state of the population model.The variable Xi ∈ R≥0 represents the density, concentration, or number (in which case it takes integer values) of components in the i th state.The domain of X, i.e. the state space of the system, is a subset of R n and is denoted by D.
-T = {τ1, ..., τm} is the set of global transitions of the form τ l = (a l , v l , f l ) where • a l is the label of the transition, • v l ∈ R n is the update vector, giving the net change of each counting variable due to the transition, • f ∶ D → R⩾0 is the rate function, giving the rate of the transition as a function of the global state of the system.
Each transition τ l can be seen as a rule of the form where si a , sj b ∈ S are states of the system, and qi, rj are the stoichiometric coefficients, i.e. the amount of components/entities consumed or produced by the transition.The update vector v l condenses the stoichiometric information as ⃗ v l = ∑ b≤h r b ⃗ ej b − ∑ a≤k qa⃗ ei a , where ⃗ ej equals one in position j and zero elsewhere.
The dynamical evolution of these models can be described in different ways: we can interpret them stochastically as a Markov chains or deterministically as a system of Ordinary Differential Equations (ODEs).
Stochastic dynamics.The most important class of stochastic processes we will consider are Continuous Time Markov Chains (CTMC) [15] that describe population processes (PCTMC).A PCTMC can be represented as a time-indexed family of vectors: (X(t)) t∈T = (X1(t), ..., Xn(t)) t∈T where the vector X(t) corresponds to the state of the system at time t ∈ T, with T ⊆ R⩾0, with each variable Xi counting the number of entities in the i th state at time t.We assume X(t) ∈ D, where D ⊆ N n is the state space of the system.
Given a population model M = (S, X, T ) from the set of transitions T , we can easily derive the formal representation of the PCTMC [10] in terms of its infinitesimal generator matrix as: For each di, dj ∈ D, with i ≠ j, Q(i, j) represents the rate of an exponential distribution, namely the distribution of a random variable modelling the time needed to go from state di to state dj.The diagonal elements of the matrix are equal to and represent the opposite of the exit rate from the state di.For all i, j s.t.∄vτ = di − dj, Q(i, j) = 0.For more details about PCTMC, see for instance [10].
Deterministic dynamics.Due to state space explosion, population models are often intractable when interpreted as CTMC, but they can frequently be approximated by a set of non-linear Ordinary Differential Equations (ODEs).This is the so called fluid approximation [10].The fluid ODEs can be seen as a macroscopic description of the microscopic CTMC dynamics, and, when populations are large, it is a good description of the system behaviour [10].
From a Population model M = (S, X, T ), we can construct a set of ODEs, assuming variables X to be continuous and interpreting each rate as a flow, thus obtaining the vector field defining the ODEs dX dt = F (X).These equations, called also rate equations, can be shown to be a first order approximation of the average transient behaviour of the PCTMC, and, under a suitable rescaling of the variables (dividing by the system size, often the total population), the convergence of the PCTMC to the solution of the ODEs (see [10]) as populations and system size go to infinity can be proved.EXAMPLE 2.1 (SIR MODEL).We will illustrate the previous definition by a simple epidemic scenario involving N individuals.Each individual can be in three different states: susceptible to infection (S), infected (I), and recovered and immune to infection (R).The set of agent's states is then S = {S, I, R} and the counting variables are X = (X S , X I , X R ), where Xi ∈ {0, ⋯, N }.The state space of the system D is a subset of [0, N ] 3 .There are 3 different transitions: ⋆ a susceptible individual can be infected by getting in contact with an infected individual, inf: The population model is then M = ({S, I, R}, T ) with T = {(inf, v inf , f inf )(rec, vrec, frec)(loss, v loss , f loc )}.The stochastic process for this model can be represented by the family of vectors (X(t)) t∈R⩾0 = (X S (t), X I (t), X R (t)) t∈R⩾0 where each Xi(t) ∈ N counts the number of entities in the state i at time t.The fluid approximation of this model, instead, is obtained applying (2) and assuming X ∈ R 3 , and it corresponds to the system of ODEs:

PATCH-BASED POPULATION MODEL
We now adapt the definition and the dynamics of population models to systems embedded in a physical space.
The definition of a spatial population model is strictly related with the choice of the space in which the model is embedded.We decided to work with a discrete space, in particular with weighted graphs ( i.e. graphs where each edge is labeled with a value indicating the cost/weight/distance to pass from a node to the next one).Each node represents a different spatial location (patch), each patch contains a population of agents, described indicating the number of individuals in each state as a classic population process, as explained above.The weight of edges has to be interpreted in a more broader sense than a spatial distance, and as such we will refer to it by the word "cost".The idea is that the cost of an edge can represent more complex information than the sheer distance between to nodes.For instance, in a traffic model, we may want to distinguish two streets with the same distance but different travel times due to the presence of traffic lights or congestion.We now give the formal definition of a patch-based population model.• L is the set of locations (nodes), • w ∶ E → R is the function that returns the cost of each edge.
Furthermore, we denote by E * the set containing all the pairs of connected locations, i.e. the transitive closure of E, extending w to the domain E * as the sum of costs of a shortest path between two different locations1 , -V = {ν1, ..., ν k } is the set of inter-patch transitions, i.e. the transitions that describe the migration of entities between paths, each transition is of the form ν l = (a l , s, g l ), where: • a l is the label of the transition, • s ∈ S is the state of the entity that migrates is the rate function, where D is the state space of the system; g l (X, i, j ) is the rate for the migration of a component in the state s from location i to location j when the global state of the system is X.
The dynamics of these kind of models is given by intra-patch interactions and inter-patch migration of agents resulting in a PCTMC or in a ODEs system where the variables represent the number/ density of each agent class in each location.In the following, we refer to the description of such a process by a family of of vectors X(t, ) t∈T, ∈L = (X1(t, ), X2(t, ), ..., Xn(t, )) t∈T, ∈L , indexed by time and space.However, for practical purposes, e.g. to numerically solve ODEs, we consider one (time-dependent) variable X i, for each pair of states and locations.

SIGNAL SPATIO-TEMPORAL LOGIC
Signal Spatio-Temporal Logic (SSTL) is an extension of Signal Temporal Logic (STL) [24] with a spatial operator suitable to specify spatial behaviour of complex systems.The new spatial operator is interpreted on patch-based quantitative population models with a discrete space structured as a graph, as defined in the previous section.
Given a system under study whose state is described by the vector X = (X1, ..., Xn), with space state D, the evaluation of formulae is done over real valued traces ⃗ x ∶ T × L → D, generated by the simulation of the model, where T is the time set, L is the set of location and D is the trajectory state space of the model.We write ⃗ x(t, ) = (x1(t, ), ⋯, xn(t, )), where xi is the projection on the i th coordinate.

Syntax
The syntax of SSTL is given by [w 1 ,w 2 ] ϕ where conjunction and negation are the standard boolean connectives, [t1, t2] and [w1, w2] are real positive dense intervals with t1 < t2 and w1 < w2, U [t 1 ,t 2 ] is the until operator and [w 1 ,w 2 ] is the spatial somewhere operator.The atomic predicate µ ∶ D → B is defined as µ(⃗ x) ∶= (f (⃗ x) ⩾ 0), where ⃗ x is the trace, f ∶ D → R is a possibly non-linear real-valued function, and B = {0, 1} are boolean values.
The (bounded) until operator ϕ1 U [a,b] ϕ2 requires ϕ1 to hold from now until, in a time between a and b time units in the future, ϕ2 becomes true.The spatial somewhere operator [w 1 ,w 2 ] ϕ requires ϕ to hold in a location reachable from the current one with a total cost greater than or equal to w1 and less than or equal to w2.The eventually operator F [a,b] and the always operator G [a,b] can be defined as usual: In a similar way, we derive the everywhere spatial operator [w 1 ,w 2 ] ϕ ∶= ¬ [w 1 ,w 2 ] ¬ϕ which requires ϕ to hold in all the locations reachable from the current one with a total cost greater than or equal to w1 and less than or equal to w2.

Semantics
STL has two different semantics: a boolean semantics that returns yes/no depending if the observed trace satisfies or not the STL specification and a quantitative semantics that extends the boolean one by returning a measure of robustness of the specification.We now define both semantics for SSTL logic.x is given by: x satisfies ϕ in location , denoted by (⃗ x, ) ⊧ ϕ, if and only if (⃗ x, 0, ) ⊧ ϕ. x is given by: where ρ is the quantitative satisfaction function, returning a real number ρ(ϕ, ⃗ x, t) quantifying the degree of satisfaction of the property ϕ by the signal ⃗ x at time t.Moreover, ρ(ϕ, ⃗ x, ) ∶= ρ(ϕ, ⃗ x, 0, ).
The definition of this quantitative measure is a reformulation of the robustness degree of [16].The idea is thats the sign of the satisfaction degree corresponds to the truth value of the formula (with positive standing for true), while the absolute value of the score returns the maximum perturbation sustainable by the secondary signals such that the truth value of the formula does not change.For more details about STL and its quantitative semantics see [14].

MONITORING ALGORITHMS
This section is concerned about the monitoring of an SSTL formula; a property monitor is an algorithm for deciding whether a given behaviour or trace ⃗ x satisfies a property ϕ.
We need to provide an algorithm for each semantics that we have defined.We will start from the property monitors introduced in [24] for the boolean semantics and in [14] for the quantitative semantics, extending them with a procedure to check the spatial properties.

Monitoring the Boolean semantic of a STL formula
The algorithm proceeds inductively bottom-up on the parse tree of the formula.Given a formula ϕ, to determine if A central notion in the monitoring algorithm is that of the minimal interval covering Is ψ, consistent with the signal s ψ, .it corresponds to the shortest finite sequence of left-closed right open intervals I1, ..., I h such that ⋃ j Ij is an interval, Ii ⋂ Ij = ∅, ∀i ≠ j, and s ψ, (t) = s ψ, (t ′ ), ∀t, t ′ belonging to the same interval. 2We can derive then the set of positive and negative intervals of s ψ, as: The positive interval I + s ψ, corresponds to the satisfaction set of the formula over the signal s ψ, .Note that any signal can be written as s = s1 ∨ s2 ∨ ⋯ ∨ s k where each si is an unitary signal, meaning that it has a singleton positive interval, i.e.I + s i = {[t1, t2)} for some t1 < t2 ∈ R⩾0.Furthermore, Is ψ, = I + s ψ, ⋃ I − s ψ, and Using these definitions of signals, interval coverings, and satisfaction set, the procedure for the classic operators of STL is equivalent to the one described in paper [24].We briefly recall it in the following.
Atomic Predicates: ψ = µ.The computation of the boolean signal associate with an atomic predicate is a direct application of Definition 4.1: s µ, (t) = µ(⃗ x(t, )).
Negation: ψ = ¬ϕ, then I + s¬ϕ, = I − sϕ, , Disjunction: ψ = ϕ1 ∨ ϕ2, then, given s ϕ 1 , , s ϕ 2 , , let be I the minimal interval covering consistent with both signals.F or each Ii ∈ I, we construct the signal s ψ, (Ii) = s ϕ 1 , (Ii) ∨ s ϕ 2 , (Ii) and we merge adjacent positive intervals to obtain I + ψ, .Until: ψ = ϕ1U [a,b] ϕ2.As we are working with future temporal modalities, we need to shift intervals backwards.This has to be done independently for each unitary signal, then taking the union of the so obtained satisfaction sets.Given two unitary signals p and q, the signal ψ = pU [a,b] q is the unitary signal such that In the general case, let s ϕ 1 , = p1∨⋯∨pn and s ϕ 2 , = q1∨⋯∨qm be signals written as union of unitary signals, then The proof of this result can be found in [24].
We now treat the spatial somewhere operator ψ = [w 1 ,w 2 ] ϕ.As remarked at the beginning of this section, and relying on the fact that we have a finite number of locations, we can process independently each location in the signal.Given the signal s ψ, , for a fixed location , we can rewrite the spatial operator as a disjunction between all signals in locations ′ s.t. ( ′ , ) ∈ E * and w1 ⩽ w( ′ , ) ⩽ w2.This allows us to use the monitoring procedure for disjunction, constructing the minimal interval covering I consistent with all s ϕ, ′ signals s.t.( ′ , ) ∈ E * and w1 ⩽ w( ′ , ) ⩽ w2, and defining, for each Ii ∈ I.
The satisfaction set I + s ψ, is then the union of the positive Ii (i.e.Ii s.t.s ψ, (Ii) = 1), merging adjacent positive intervals.
We stress here that the introduction of the spatial somewhere operator is not merely syntactic sugar, for two reasons.First, its definition can be applied also to countable discrete spaces, and it can be easily generalised to continuous spaces.Secondly, even assuming a finite discrete space, expanding it as a disjunction would produce a blowup of the formula size exponential in the nesting level of spatial operators, and hence an exponential increase in the complexity of the monitoring procedure.

Monitoring the quantitative semantic of a STL formula
Given a formula ϕ, this algorithm computes the quantitative satisfaction function ρ(ϕ, ⃗ x, t, ).Similarly to the Boolean semantics, ρ is computed inductively on the formula structure, proceeding bottom-up along the parse tree of the formula.For each node of the tree, the algorithm maintains a real-valued satisfaction signal associated with the corresponding subformula, processing the real-valued signals of the children nodes.

Secondary signals are obtained from the trace ⃗
x = (x1, ⋯, xn) (also called in this context primary signal) by the pointwise extension in time of the the function f (⃗ x[t]), where µ ≡ f (⃗ x) ≥ 0, as prescribed by Definition 4.2.More specifically, the primary signals xi(t, ) are assumed to be piecewise-affine functions, obtained from a finite number of sample points by linear interpolation.The same approximation applies to the secondary signals, that will be called y (t) = y (x1(t, ), ..., xn(t, )), and which can be seen as piecewise-affine functions with a finite number of segments, so that their derivative changes in a finite sequence of time instants {ti}1<i<n y .We need to assume that y is right-continuous and admits everywhere a right-derivative These secondary signals are thus represented by a sequence (ti, y (ti), dy (ti))i<n y , along with a final or cut-off time tn y .Furthermore, they can be extended to t ∈ [0, ∞) by assuming y (t) = y (t1) for all t < t1 and y (t) = y (tn y ) for all t > tn y .
Note that, in accordance with the boolean semantics, we will treat also in this case the location as a fixed coordinate in the secondary signal.In this way the computation of the signals associated with the negation (¬), disjunction (∨), and until (U [a,b] ) will be the same as in the monitoring algorithm described in [14].Due to the complexity of such algorithm, especially for the temporal operator, we refer the reader to [14] for a comprehensive description.
Here we limit ourselves to describe the monitoring procedure for the spatial somewhere operator [w 1 ,w 2 ] .From Definition 4.2, we know that where the sequences representing the signals ρ(ϕ, ⃗ x, 1) and ρ(ϕ, ⃗ x, 2), which are assumed known by induction.We now build the sequence of time points (ri)i<n z , containing the sorted sampling times of y 1 and y 2 and the time points in which the corresponding piecewise affine functions intersect.Note that nz ⩽ 4 × max{ny 1 , ny 2 } because there can be at most n 1 + n 2 intersections and no more than n 1 + n 2 unique sampling instants.For all i < nz, we let z(ri) = max(y 1 (ri), y 2 (ri)).and dz(ri) = max(dy 1 (ri), dy 2 (ri)).Now, if ∃ ′ ∈ B s.t.′ = 1 and ′ = 2, we apply the same algorithm to (ti, y ′ (ti), dy ′ (ti))i<n y ′ , the sequences representing the robustness signal ρ(ϕ.⃗ x, ′ ) and the sequence (ri, z(ri), dz(ri))i<n z obtained from the previous computation, iterating this process until all the locations in B have been considered.The resulting sequence (ri, z (ri), dz(ri))i<n z represents ρ( [w 1 ,w 2 ] ϕ, ⃗ x, t, ) as a pairwise affine function.

STOCHASTIC SEMANTIC
We discuss now how to extend the definition of the semantics of SSTL to a stochastic system, following [5].Each SSTL formula ϕ is interpreted over a trajectory ⃗ x = ⃗ x(t, ).Let us denote by D the space of all possible trajectories of a CTMC ⃗ X(t, ) associated with a population model M. A standard mathematical construction [8] turns this set into a metric measurable space.The CTMC ⃗ X(t, ), in turn, induces a probability measure on D. Furthermore, we can interpret the boolean and the quantitative semantics of ϕ as functionals on the space D, assigning to each trajectory ⃗ x its corresponding truth value according to the boolean semantics or its satisfaction degree according to the quantitative semantics.In the boolean case, such a functional is Iϕ ∶ D → {0, 1}, such that Iϕ(⃗ x) = 1 if and only if ⃗ x ⊧ ϕ, hence it identifies the subset of trajectories of D that satisfy the formula ϕ.It follows that An analogous construction allows us to extend the robustness of SSTL to PCTMC models: given a trajectory ⃗ x(t, ) of the PCTMC, its robustness ρ(ϕ, ⃗ x, 0) can be seen as a functional Rϕ from the trajectories in D to R. It can be proved that Rϕ is measurable (with respect to the Borel σ-algebra of the standard topology of D) [5].From this assumption, we can define the real-valued random variable Rϕ = Rϕ( ⃗ X) with probability distribution: More details about this definition can be found in [5].Applying this definition to a stochastic model, we obtain a distribution of robustness degrees which can be summarised by the average robustness degree of this distribution, E(Rϕ), giving a measure of how strongly a formula is satisfied.This means that the satisfaction will be more robust when this value will be higher.
Numerical methods to compute the satisfaction probability or the average robustness of a STL formula are not available, hence we resort, as in [5], to statistical estimation, leveraging the monitoring routines presented in the previous section.In particular, we use a Bayesian statistical approach to estimate the satisfaction probability [22], and classic statistical tools for the average robustness degree [5].

CASE STUDY
As a case study, we consider a model of an epidemic of cholera, an infection of the intestine caused by the bacterium Vibrio cholerae and a prominent example of a waterborne disease [7,25].Indeed, typically the infection is transmitted by contaminated food or water.An explicit modelling of the hydrological space where the infection propagates is therefore a crucial aspect to understand and analyse this kind of disease.We represent space as a weighted oriented graph, shown in Figure 1.The nodes represent the human communities and the edges describe the links between water basins of different areas.The idea is to analyse the diffusion of an epidemic along the communities that live close to a river, so that edge directions are chosen according to the direction of the water flow.There are two agent classes: the bacteria and the individuals.The bacteria have only one state (B) but they can be transported to different nodes via the river.An individual, instead, can be in three different states: susceptible (S) infected (I) and recovered (R), but cannot change location.Ignoring human mobility is a simplification justified by the fact that our focus here is to illustrate the logic at work, rather than presenting a fully realistic model.Extension to more complex scenarios, however, are relatively straightforward [25].
The model of this system has state variables X S , X I , X R , X B , counting the number of susceptible, infected, and recovered individuals, and the concentration of the bacteria in each node.Discrete space is described by the graph in Figure 1, which is equipped with a weight function w ∶ E → R that returns the cost of each edge.In this specific case, we will interpret the cost w( i, j ) of an edge between nodes i and j as the distance between them.Recall that if two locations are not directly connected, w is equal to the sum of costs of a shortest path between the two locations; for example, in Figure 1, w( 1, 2) = w( 2, 5) + w( 1, 5) = 9.The movement of bacteria from a location i to a directly connected location j is specified by the inter-patch transitions νmov = (mov, B, gmov)..s Here gmov(X(t, ), i, j ) = lpijX B (t, i) is the rate function, where l is the total water flow (assumed equal for all nodes) and pi,j is the fraction of water flowing out of node i which reaches node j .Such a rate function describes the bacteria mobility as a passive mobility due to the water flow.The sum of the values pij associated with the outgoing edges of each location has to be equal to one, i.e ∑ k ( i , k )∈E p i,k = 1.The other transitions describe the change of the individual state, according to a model similar to the SIR epidemics introduced in Example 2.1, with the notable difference of the infection rate, which now depends on the concentration of bacteria in water and not directly on the number of infected individuals.We also consider transitions describing death or growth of the bacterium population.
The patch population model is then ((S, X, T ), G, V), where S = {S, I, R, B} is the set of states, X = (X S , X I , X R , X B ) is the state vector, G = ({ 1, ..., n}, E, w) is the graph as in Figure 1, V = {(mob, B, g mob )} is the set of inter-patch transition, as described above, and T is the set containg the following intra-patch transitions.More specifically: ⋆ τ Snat is the natality transition: ∅ → S, with rate f Snat (X) = µH( ), where µ is the human mortality rate and H( ) is the size of the community in location , ⋆ τ Smort is the mortality of a susceptible individual: S → ∅, with rate µ ⋆ τ Imort is the mortality of a infected individual: I → ∅, with rate µ + α, where α is the mortality rate due to cholera, ⋆ τrec is the recovery transition: I → R, with rate γ, ⋆ τ Bdeg is the degradation transition: B → ∅, with rate µ B ⋆ τ Bgrowth is the bacterial growth transition: ∅ → B, with rate function f Snat (X) = p W X I , where p is the rate at which bacteria are produced by one infected person and W is the volume of the contaminated water.
We implemented this model in Matlab, both as a set of differential equations and as a stochastic process.In order to compute the boolean and the quantitative semantics of SSTL formulae, we exploited the routines provided by the Breach toolbox [13] to check STL formulae.ODEs described the dynamics of the systems, obtained from the prescriptions of Section 2, have been numerically integrated using a Matlab built-in solver.The stochastic interpretation of the population model, instead, has been analysed by a dedicated implementation combining standard Monte Carlo simulation (by the Gilllespie algorithm [19]) and Bayesian statistical model checking [22].
The first spatio-temporal behaviour that we consider is how the epidemic propagates along the river.The idea is to consider a model that starts with the infection in just one location, 1, and then to check if the infection has propagated at a certain distance from 1 after a certain time.This behaviour can be captured by the SSTL formula: verifying it in location 1.The exact meaning of the formula is: eventually, in less than T inf unit time, the number of infected individual becomes more than c inf in a location with w( 1, ) ∈ [w1, w2], i.e. at a distance from location 1 equal or greater than w1 and equal or less than w2.
We analyse a system with 7 locations, all with the same size, with distance between locations w( i, j ), in agreement with the green integer values labelling the edges in Figure 1.As initial variables we set, for all the locations apart 1, the same number of susceptible individuals, X S = 500, and we start with zero infected individual, X I = 0; instead, for 1 we set 100 infected individuals and 490 susceptibles.There are no recovered individuals at the beginning.We set also the bacterium concentration equal to zero in all the locations except 1.The parameters of the model have been set as µ = 0.0005, H( i) = 500, β = 1, K = 5, γ = 0.2, α = 0.0004, µ B = −0.228,p = 0.2, w = 50, l = 0.5 while the matrix (pi,j)i,j≤7 has as non-zero entries those specified in Figure 1.Note that for the last two locations the sum of probability is not equal to one, to capture the continuation of the river.For the formula parameters, we set w1 = 12 and w2 = 15, c inf = 150 and T inf = 5 unit times.
In Figure 2, we show some results of the monitoring of this formula.In the top panel, we consider the ODE interpretation and show how the robustness score increases as a function of the mobility rate.In the bottom panel, we show the empirical distribution from 1000 runs of the robustness degree from the stochastic model, with vertical lines denoting the average (red lines) and conditional averages on the formula being false/ true (green lines).Changing the mobility rate in the stochastic model from 0.3 to 0.6, the satisfaction probabilitty varies from 0.397 to 0.975, while the average robustness score varies (monotonically) from -16.93 to 52.72.We stress here how the availability of the robustness score both in a deterministic and a stochastic context can lead to tools to design systems, trying to robustly match spatio-temporal behavioral specifications in SSTL, using e.g. the ideas of [5].
In order to illustrate in more detail the expressive power of this logic, we discuss now two additional properties, using the following building blocks: The first one is [0,dmax] (ψ1 → ψ2) , stating that a large infection (with at least c inf individuals) happening at some time t ∈ [0, Tstart] and localised within distance wnear from a given reference point, will spread further away, at a location at distance  between w f ar and dmax, at some time t ′ ∈ [Tstart, Tstart + DT ].Furthermore, this is true for every reference point (say at distance at most dmax from 1).Checking this formula on 1000 runs of the stochastic model, with parameters wnear = 3, w f ar = 13, dmax = 15, Tstart = 1 and DT = 4, we obtain a satisfaction probability of 0.9220 ± 0.0085 (all results reported at 95% confidence), and an average robustness degree of 42 ± 0.8771.The robustness score for the ODE interpretation, instead, is 30.26.
The other formula we consider is ψ2 → ψ1, stating that a high infection level at a far-away location at a late time must have been high at a nearer location some time before, i.e. that the current reference point is close to the epicentre of the epidemic.In this case, with the same parameters of the previous formula, verifying it in location 2, we obtain a satisfaction probability of 0.997 ± 0.0017, an average robustness score of 55.48 ± 0.5677 and a robustness score of the ODE model of 58.08.
In general, we can observe how the ODE interpretation generates robustness scores that are close to the ones of the stochastic model, at a much cheaper computational price.This may be the effect of some convergence theorem at work, and may be exploited for system design purposes, as numerically solving ODEs is in general computationally cheaper than analysing or simulating stochastic models.

DISCUSSION
In this paper we presented Signal Spatio-Temporal Logic, an extension of Signal Temporal Logic with spatial operators.First we formalized the definition of patch-based population model, embedding a "classic" homogeneous population model into a discrete space, represented as a weighted graph, and specifying movements by dedicated inter-patch transitions.Then, we introduced SSTL, gave it both a boolean and a quantitative semantics, and extended the definition and the monitoring algorithms presented in [14,24].As a case study, we discussed an epidemic scenario of the spreading of cholera among neighbouring communities, where space plays a crucial role, showing how our logic is suitable to describe spatiotemporal behaviour as the epidemic propagation along the river.In this specific example we worked only with few patches, but the same type of analysis can easily be applied to more complex spatial structures, even with thousands of nodes.
As for future work, we want to apply our approach to transport networks in the context of smart cities, in particular to bike sharing and bus transport networks.Another direction to investigate is the extension of the semantics over a continuous space.In this case, instead of Patch-CTMC or an ODE system we have to work with spatio-temporal point processes or PDE systems.The main challenge is the design of efficient monitoring algorithms, as we cannot rely anymore on simple extensions of those for STL.Another possible improvement is to consider a richer spatial operator like the spatial until, which describes bounded regions, extending its interpretation with metric constraints [11].Finally, we plan to release an implementation of the monitoring algorithms, extending the system design procedures of [5] to spatio-temporal properties.We will also consider the problem of learning spatio-temporal formulae satisfied by a model with high probability, or effectively discriminating two models, combining ideas of [6] with the spatio-temporal machine learning tools of [28].

3 Figure 1 :
Figure1: The graph of the population spatial distribution.The nodes 1, ..., 7 represent the different communities; the edges represent the connection between the communities through the water basin, with the verse of the rows chosen in agreement with the river flow.The red numbers correspond to the probability pi,j to go from i to j and the green numbers correspond to the values w of the distance between locations..s

Figure 2 :
Figure 2: (a) Dependency of the robustness degree on the mobility rate (for the ODE interpretation of the model).(b) Empirical robustness distribution of the formula 3 with w1 = 12 and w2 = 15 and T inf = 5, obtained from 1000 simulation runs.
we construct, for every sub formula ψ and every location ∈ L, a boolean signal s ψ, [t], i.e. a function from [0, T ] to {0, 1} s.t.s ψ, [t] = 1 if (⃗ x, t, ) ⊧ ψ and 0 otherwise.At the termination of the algorithm, we have the signal s ϕ, [t] whose value at t = 0 determines the satisfiability of ϕ.Note that we are fixing in the signal both the location and the subformula, because the properties can be verified over each location independently, due to Definition 4.1: (⃗ x, ) ⊧ ϕ means "the trace ⃗ x in location satisfies the property ϕ ".