CITE – Content Interaction Time and spacE: a hybrid approach to model man-robot interaction for deployment in museums

In this paper, we present a generic model, CITE (Content-Interaction-Time-spacE), devoted to the development of interactive applications for places of culture such as museums or theaters. The idea is to introduce new kinds of interactions with visitors while preserving the quality of the information they can acquire in the museum. We first present two previous works involving, respectively, a bottom-up model (LINK H/R) in the context of science-art interaction, and a top-down model (PLUG), which was an interactive game in a museum. The CITE model is then presented as a trade-off between those approaches: details are given concerning Content and Interaction management, and an extension is considered concerning Time and spacE handling.
Received on 01 September 2016; accepted on 03 September 2017; published on 09 October 2017


Introduction
Computer science and robotics have long been kept apart from cultural issues since they were considered as prototypes of dehumanization. Paradoxically, representations of computers and robots have always been very present in literature and movies. It is well known, for instance, that the term robot itself was introduced in a play by Karel Čapek in the 1920s. And there is no need to mention HAL 9000, the mad computer in Kubrick's "2001: A Space Odyssey", or R2D2 and C3PO, the star robots of "Star Wars".
Step by step, computer science has emerged in the cultural field and has become a new tool to facilitate the creation of works of art. In music, computers are nowadays as common as instruments as violins were in the 17th century. More recently, robots have begun to be introduced in choreography, such as in the "Robot!" show choreographed by Blanca Li.
But culture does not only mean art: it has a wider sense that implies a larger and heterogeneous public. Museums, for instance, are trying to give access to culture to a large audience, but they also convey the will to transmit knowledge.
* Corresponding author. Email: armelle.prigent@univ-lr.fr
In this article, we consider what kind of robotic and computer models should be developed to allow the introduction of robots in museums, temples of culture, bringing new interactions with visitors, especially young ones, without degrading the quality of the knowledge they can acquire.
Traditionally, two kinds of approaches are proposed for elaborating design models: bottom-up and top-down. In the first part of this article we present two previous experiments related to arts and culture: the first, Link H/R, adopts a bottom-up approach; the second, PLUG, follows top-down considerations. Both encountered difficulties in fulfilling their goals entirely. In the second part of this article, we propose the CITE model, which adopts a mixed approach as a trade-off between the bottom-up and top-down approaches.

Link H/R: a bottom-up approach
Context. Link Human/Robot is the result of an interdisciplinary collaboration between an artist, a dancer, a musician, a drawer and a roboticist. This project crosses different angles and expertise from those different fields. It is interdisciplinary by essence. The dialogue between the different disciplines develops three dimensions: social, scientific and artistic. The premise of the project was the common interest of the artist and the roboticist in the work of Varela [21] and the concept of autopoiesis.
A system is "autopoietic" if it can reproduce and maintain itself. Varela claims that autopoiesis, autonomy and cognition are intrinsically linked with each other. Originally developed by Maturana and Varela in the field of biology, the concept has been extended to artificial intelligence (especially what is called "nouvelle AI"; see for example [1,4,18]) and embedded robotics (for instance [17]).
The project itself is a choreographic performance where a dancer and the humanoid robot Nao, developed by Aldebaran Robotics, face each other. Based on a turn-taking imitation process inspired by the seminal work of J. Nadel [11], each agent builds its own choreographic motor repertoire through an accumulation of gestures. In that sense the robot (respectively the dancer) follows an accelerated developmental process equivalent to the psychological development of children [16].
Initially, we intended to develop a process where shared choreographic phrases and kinds of mutual interpretations and reciprocal improvisations would emerge. The robot and the dancer would not mime each other: they would react, interpret, and take the choreographic material from each other. This would have been a kind of dialogue with mutual influence.
In practice, the constraints imposed by the necessity to present the choreographic performance in front of a public led us to "guide" the development of the robot and to artificially sequence the different steps of the choreography. Therefore, the original robotic control architecture, which we would have liked to follow Varela's autopoietic principles, turned into a series of individual behavioral components manually triggered by a human being in order to respect the timing imposed by the choreography.
Control architecture. Initially, we intended to use an adapted version of the PerAc (Perception-Action) control architecture [6] to pilot the robot. This biologically inspired architecture was proposed by Gaussier and Zrehen as a generic neural building block for robotic controllers. In a bottom-up perspective, the PerAc architecture is composed of two data streams corresponding to perception and action flows. The lower layer is a kind of "reflex" mechanism directly controlling the actions thanks to the raw information extracted from the perception input. The upper layer uses a learning mechanism to perform recognition of processed elements extracted from the perception flow. It therefore allows learning associations between the recognition of a specific perceptual element and a particular action, and thus extends the reflex behaviors. The main principle of the PerAc architecture, in a dynamical system perspective, is that the system evolves due to the dynamic interaction between the robot and its environment and can reach a behavioral equilibrium [3]. This control architecture has been used in many domains of robotics, such as indoor and outdoor navigation and planning [5,7,14], object recognition [10] and, in particular, imitation [2,15], which is the closest to our concern.
Ideally, we would have liked to develop a single neural architecture able to learn by itself how to interact with the dancer and to learn from her a new action repertoire in order to progress and exchange via body communication. In practice, however, this would have required considerable technical development time to integrate all the behaviors in the controller, as well as robotic developmental time for the robot to incrementally learn the movements from the dancer (a long and uncertain process).
For scenaristic and narrative reasons, we were then forced to chain the different behaviors composing the whole choreography. All the behaviors were thus conceived as small independent PerAc blocks with their own timing and autonomy. However, the orchestration of the blocks was delegated to a human supervisor triggering the behaviors when needed. This is of course not adequate, since the human performs a major part of what the control architecture should have done: planning and scheduling the actions. The main drawback is that no overview of the current situation is available to trigger the appropriate behavior in a "top-down" approach.

A top-down approach: PLUG
PLUG was an interdisciplinary project that aimed to study technologies for ubiquitous games and their sociocultural, economic and industrial acceptability.
Context. Two iterations were needed to develop this game. In the first one, among the 4000 items of the collection of the museum (Musée des arts et métiers, Paris), a set of 16 items was selected to compose the content of the game. These objects were split into four collections and virtual playing cards were associated with them. These virtual playing cards were stored in RFID terminals. The players used mobile phones to read, exchange or drop cards. The game objective was to gain as many points as possible by completing the collection (exploring the terminals, exchanging cards with other players, and answering quizzes).
The second prototype of the project aimed at including more pervasiveness and ubiquity in the game. For that purpose, players were provided with equipment such as sensors to measure physiological data. These data were integrated into the game to change its course according to the bio-physical condition of the players. In comparison with the first iteration, the game design was directed more towards the acquisition of scientific content on the objects than towards a simple card collection.
The goal of the game for the players was to discover the historical and technical "memory" of the objects:
• Historical, societal memory: to whom is a particular invention due? Why did this innovation appear at that time?
• Technical memory of the objects: what were the problems solved by this innovation?
At the end of the game session, players must have understood the technical aspects but also the reasons that led to the creation of the object, such as its relevance to the aspirations of the society contemporary to it. The game is composed of a set of enigmas for players: a set of intermediate elements of discovery (ID) associated with a target object (TO). First, the search begins with a location enigma (the team has to find the room in which the ID is located). When the team has validated its position on entering the room, another enigma is provided on the iPhone and the team has to seek and find the object in the room. Then the team validates the results of its search before looking for another ID.
Adaptive execution. The proposed adaptive engine is based on an architecture dedicated to performing real-time observation of the players and making decisions on the scenario in order to offer the best experience based on their observed behavior. This architecture is based on a model of this experience, including both the representation of the behavior of the players (through the situations they can be in) and the decisions to be implemented to adapt the experience to their particular cases (through a set of adaptation rules). Here, the challenge is to integrate the observation-decision mechanism in the game server to which players connect. First, we have to insert observation points into the game engine on the low-level events to monitor the behavior of the players (for example, we observe events like "enter a room", "open a new enigma", ...). Then, we store them in a database that is used by the observation module to build, in real time, the symbolic state that represents each player. Interactivity should be considered as a feedback loop in which each involved entity adapts itself to the behavior of the other. Maintaining a player in a given narrative framework can be complex, particularly in the context of a pervasive multiplayer game. The following behaviors are to be supervised because they can endanger the experiment.
• Progression: slow progress, lack of motivation, misunderstanding of the game rules, blocking
• Social attitude: generous or competitive player (acceptance or refusal of systematic exchanges ...), independence (requests for help, too many or too few), proximity with another team
• Strategy: imbalance in the strategies employed (the player favors one way of playing over another).
Then, the adaptive engine performed observation and proposed adaptive events. A first step was to receive events from the "client game logic" (the player's behavior in the game) together with the sensor signals. These were transmitted to a server. The latter proposed real-time modifications of the game execution, according to its interpretation of the data. A game master selected the adaptations to be performed among these proposals to preserve the game dynamics.
Symbolic states and interaction model. The proposed symbolic states are vectors of states built from parameters of the experiment. Thus, based on the observations, the manager of symbolic states builds a state $S_p = \langle PS, PL, St, So \rangle$ for each player $p$ with:
• PS: the speed of progression (blocked, slow, normal or fast)
• PL: the level of progression (low, normal or high)
• St: the level of stress (quiet, calm or excited)
• So: the level of social interactions (competitive, social or generous)
While it is necessary to know the state in which the player is, it is also important to analyze his sequences of actions. Indeed, in the PLUG project, we would have liked to be able to know the past actions of the player and his potential future actions. The second advantage of such a model lies in the ability to authorize or inhibit an interaction in the light of the observed state.
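As an illustration, the state vector described above can be written down as a small typed record. The following Python sketch is only one possible encoding: the class names, the `allowed_interactions` rule and the interaction names are hypothetical and are not taken from the PLUG implementation; only the four components and their values come from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Speed(Enum):      # PS: speed of progression
    BLOCKED = "blocked"
    SLOW = "slow"
    NORMAL = "normal"
    FAST = "fast"


class Level(Enum):      # PL: level of progression
    LOW = "low"
    NORMAL = "normal"
    HIGH = "high"


class Stress(Enum):     # St: level of stress
    QUIET = "quiet"
    CALM = "calm"
    EXCITED = "excited"


class Social(Enum):     # So: level of social interactions
    COMPETITIVE = "competitive"
    SOCIAL = "social"
    GENEROUS = "generous"


@dataclass(frozen=True)
class SymbolicState:
    """Symbolic state S_p = <PS, PL, St, So> of one player."""
    ps: Speed
    pl: Level
    st: Stress
    so: Social


def allowed_interactions(state):
    """Hypothetical rule: authorize or inhibit interactions from the state."""
    allowed = {"open_enigma"}
    # Assumption: a hint is only offered to players who progress slowly.
    if state.ps in (Speed.BLOCKED, Speed.SLOW):
        allowed.add("offer_hint")
    # Assumption: card exchanges are inhibited for competitive players.
    if state.so is not Social.COMPETITIVE:
        allowed.add("exchange_card")
    return allowed
```

With this encoding, a blocked and competitive player would be offered a hint but no card exchange, which is the kind of authorization or inhibition decision the text refers to.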
In order to model the user's actions, we proposed a first approach that consists in representing all the possible actions (whether those permitted to the user or those executed by the system) through automata networks. This method is based on a reachability analysis performed with the model-checking tool UPPAAL to generate the optimal path to an expected state [12]. Although this method of adaptation has proven effective, it becomes difficult to use in the case of a system that handles a large number of states and interactions. Indeed, building a complete system can be tedious for the designer.
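To make the idea of reachability analysis concrete, the sketch below searches a tiny hand-written automaton for the shortest sequence of actions leading to an expected state. It is a plain breadth-first search in Python, not UPPAAL and not the method of [12]; the automaton `GAME`, its state names and its action names are hypothetical and only loosely follow the enigma sequence described earlier.

```python
from collections import deque


def shortest_path(transitions, start, goal):
    """Return the fewest actions leading from start to goal, or None."""
    queue = deque([start])
    # parent maps a state to (previous state, action); None marks the start.
    parent = {start: None}
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while parent[state] is not None:
                previous, action = parent[state]
                path.append(action)
                state = previous
            return list(reversed(path))
        for action, target in transitions.get(state, []):
            if target not in parent:  # visit each state only once
                parent[target] = (state, action)
                queue.append(target)
    return None


# Hypothetical automaton: state -> list of (action, next state).
GAME = {
    "lobby": [("read_location_enigma", "searching_room")],
    "searching_room": [("validate_position", "in_room"), ("ask_help", "lobby")],
    "in_room": [("read_object_enigma", "searching_object")],
    "searching_object": [("validate_object", "id_found")],
}

path_to_goal = shortest_path(GAME, "lobby", "id_found")
```

Even on this toy model every state and transition has to be listed by hand, which hints at why the approach becomes tedious for the designer when the number of states and interactions grows.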
The major difficulty lies in the representation of all the possible interactions between the user and the narrative entities manipulated by the system. The model must guarantee a high level of modularity and reusability, and needs to be scalable enough to meet potential changes. In addition, supervision requires a formal model from which it is possible to extract the most relevant paths.

Hybrid approach: the CITE architecture
In this work, we propose a multidimensional approach: the CITE model (Content, Interaction, Time, spacE) represents the different dimensions of the experience. The first one represents all the contents handled in different formats (text shown on screens, sounds, images, videos ...). The second dimension concerns the interaction, represented as a protocol based on the work conducted by [13] (timed automata networks).
Time, taken into account in the third dimension of the model, is multiple. First there is the time of the actors (an actor can be the visitor or one of the interaction elements), then the time of each narrative situation and finally the overall time of the experience. The last dimension concerns space and can model the various places where content is disseminated or interactions are executed.
This model has two advantages. It will be the basis for qualitative analysis and improvement of the scenario. It will also move towards a generic methodology for deploying such an experience in different heritage sites.
In the rest of this section, we first present the already implemented CI part of the model, then give an overview of our hybrid approach, and finally give perspectives on how the model should be extended to take Time and spacE into consideration.

CI model: A situation-based approach for interactive storytelling
For the two dimensions C and I, we have proposed a high-level model for the representation of the player's states and possible actions in each state. Inspired by narratology, this situational model is implemented on three levels.
We proposed an approach based on an extension of I/O automata networks [9,20]. The main idea of this model is to represent narrative situations that are automatically computed from the behaviors of each entity acting in the situation. This model is inspired by narratology theory and uses a part of the vocabulary of [8]. The principle is based on the definition of modular behaviors for the entities of the system, the automatic synthesis of the automaton representing entity behavior and finally the definition of narrative situations involving these entities. The global system model is constructed by instantiation of the generic entities defined in an abstract layer toward an implementation layer (Actant → Actor; Situation → Scene) and the definition of scenes in which these entities operate. We also provide a supervision model, allowing the designer to specify the execution order of the previously defined scenes by applying a set of conditions. The last layer represents the possible sequences of scene execution through a directed graph of scene implementations called "plots".
Abstraction layer. In our three-step approach, the abstraction layer is the one that helps the designer describe a set of narrative entities that will be involved in situations. It defines the types of roles (actants) and generic situations. We consider here the tuple $(A, S, V_G)$ composed of $A$, a set of actants, $S$, a set of situations, and $V_G$, a set of global integer variables that can be manipulated by all the actants. Let $V^e_g$ be a valuation of all global variables in the abstraction layer (here called the state vector).
Each actant is described through local variables and actions. An action is an abstract concept that is specialized as a reflexive action, an opposite action or an interaction. This extension allows particular patterns of automata to be represented. An action may involve only the actant from which it originates, or trigger a behavior in another actant: the latter is called an interaction between actants. This offers the possibility of adding a synchronization character to another behavior.
Implementation layer. Entities defined in the abstraction layer are generic. This second layer, called the implementation layer, is used to turn the generic entities defined in the abstraction layer into concrete entities. Thus, actants are instantiated by actors and situations by scenes. This instantiation mechanism allows several actors to be created from a single actant model and allows multiple inheritance (an actor implements the behavior of several actants). We define the instantiation layer by the tuple $(A_{cr}, S_c)$ composed of a set of actors and a set of scenes. An actor can implement the role of one or more actants. Thanks to inheritance, the actor possesses the behaviors it then implements. Multiple inheritance allows the designer to specialize an actor according to the behaviors of the different actants it implements.
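The relation between the two layers can be illustrated with a few lines of Python. This is a minimal sketch under simplifying assumptions: behaviors are reduced to sets of action names rather than automata, and all entity names (`guide`, `storyteller`, `visitor`, `robot`, ...) are hypothetical examples, not entities of the actual system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Actant:
    """Generic role defined in the abstraction layer."""
    name: str
    actions: frozenset


@dataclass
class Actor:
    """Concrete entity of the implementation layer."""
    name: str
    actants: tuple  # one or more actants implemented by this actor

    @property
    def actions(self):
        # The actor inherits the union of the behaviors of its actants.
        inherited = set()
        for actant in self.actants:
            inherited |= actant.actions
        return inherited


@dataclass
class Scene:
    """Instantiation of a generic situation with concrete actors."""
    name: str
    situation: str
    actors: list = field(default_factory=list)


# Abstraction layer: hypothetical roles.
GUIDE = Actant("guide", frozenset({"greet", "ask_question"}))
STORYTELLER = Actant("storyteller", frozenset({"tell_story"}))
VISITOR = Actant("visitor", frozenset({"answer"}))

# Implementation layer: one actor with multiple inheritance,
# and two actors created from the same actant model.
robot = Actor("robot", (GUIDE, STORYTELLER))
teen_a = Actor("teen_a", (VISITOR,))
teen_b = Actor("teen_b", (VISITOR,))

quiz_scene = Scene("quiz_in_hall", situation="quiz", actors=[robot, teen_a, teen_b])
```

Here `robot` combines the behaviors of two actants, while `teen_a` and `teen_b` reuse a single actant model, which mirrors the two instantiation properties stated above.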

Dynamic layer. This layer defines the execution scheduling of the scenes declared in the instantiation layer. Here, the scenes are encapsulated in an entity: the plot. This wrapper specifies the elements necessary for the realization of the scene. The plot is defined by the tuple $(sc, G, S_{sup})$. Here $sc \in S_c$ is a scene of the instantiation layer, $G$ is a set of constraints that impose a value on the state vector of a specific actor before the completion of the scene, and $S_{sup}$ is a set of successor plots.
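A plot can be pictured as a node of a guarded graph. The Python sketch below is an illustration only: it assumes that the constraints G are simple equalities on named variables, checked as a precondition before a plot is entered, and the scene and variable names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Plot:
    """One node of the plot graph, following the tuple (sc, G, S_sup)."""
    scene: str                                      # sc: the wrapped scene
    guards: dict = field(default_factory=dict)      # G: variable -> required value
    successors: list = field(default_factory=list)  # S_sup: successor plots

    def enabled(self, state_vector):
        # Assumption: every constraint is an equality on one variable.
        return all(state_vector.get(var) == value
                   for var, value in self.guards.items())


def next_plots(plot, state_vector):
    """Successor plots whose constraints hold for the given state vector."""
    return [p for p in plot.successors if p.enabled(state_vector)]


# Hypothetical plot graph: intro -> quiz (only if greeted) -> farewell.
farewell = Plot("farewell")
quiz = Plot("quiz", guards={"greeted": 1}, successors=[farewell])
intro = Plot("intro", successors=[quiz, farewell])
```

Calling `next_plots(intro, {"greeted": 0})` would leave only the farewell plot available, which shows how the constraints restrict the sequences of scenes the engine may unfold.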
This method presents two advantages. First, the design is based on three layers, which helps modularity and reuse. The abstraction layer helps the designer construct generic entities (actants) and a generic way of combining them. The second one (the instantiation layer) is devoted to the implementation of generic entities and supports modularity through multiple inheritance of actants into concrete actors and combinations of actors in a concrete situation of execution (called a scene). Finally, the dynamic layer helps organize those concrete scenes into execution schemes. This approach provides strong support for extensible systems and for the reuse of entities in modeling. The second contribution concerns the dynamic construction of the system's global finite state automaton from behaviors. Here, the designer only has to describe atomic behaviors for actants. The automaton executing the situations is constructed automatically.

A hybrid approach
The CITE model is an evolution of the situation-based model that takes into account the specificities of ubiquitous experience, but it can still be improved through a hybrid approach built on the common problem addressed by both projects, Link H/R and PLUG. In both cases, the question is firstly to dynamically acquire and analyze navigation data and behavior to extract relevant information on users and their interests (observation/analysis), and secondly to provide dynamic adaptation rules (or recommendations) for a particular purpose (leading the user to add products to the cart on web sites, improving the visitor experience) with respect to the knowledge constructed on the user and context.
We wish here to combine both approaches (top-down and bottom-up) to build a high-level architecture based on a model jointly developed by the designer and by a machine-learning process performed on the real behavior of the user. The proposed hybrid approach consists in sharing the automata generated from the abstraction layer with the PerAc architecture, whose purpose is to analyze and modify the experience unfolding in parallel with the execution engine. Indeed, the designer will first represent the whole experience through the classical top-down approach (modeling each actant and situation within the abstraction layer and implementing them in the implementation and dynamic layers). The execution engine will unfold narrative plots with respect to the graph. Each implemented actant has its corresponding PerAc representation in the machine learning layer (the automaton is shared from the abstraction layer) and, while executing, it analyzes its own representation in order to improve its action automaton. When the automaton has substantially changed, the improved automaton is shared again with the abstraction layer and the whole experience model is recomputed. This pertinence cycle will be iterated during the whole experience execution.

Extensions: Time and spacE
Such modeling needs to be extended in the context of ubiquitous applications. Indeed, while it is important to represent the interactions and content, the specificity of ubiquitous systems lies in the fact that users move in space to perform the actions and that time should be taken into account in a specific way, since the session can be launched asynchronously and is often not limited in time.
Moreover, in this type of experiment scalability is very important: adding new contents or spaces can be difficult. Having a proper model and an authoring tool for scripting and quickly integrating new elements into the experience is necessary. This further promotes the reproducibility of the experience in a new place with new contents.
Thus, the specificity of adaptation approaches for ubiquitous and transmedia experiments is that they pass through the following four layers: interaction, content, space and time. While there are models for traditional approaches, such as gameplay or web documentary [19], no narrative model solves the problems specific to transmedia or ubiquitous systems.
To add the two dimensions related to time and space, we propose here some modifications to the model.
The spacE E. The dimension that concerns space is the easiest to add to the model. We wish to represent and easily modify the places in which the narrative situations unfold. Thus, the abstraction layer will need to embed a link between situations and places. Here, places are represented in a generic way and will eventually be implemented with the actual constraints in the implementation layer. For example, in the abstraction layer, a given space will be "outdoors" (which can be a constraint for the implementation of the action, as it is necessary to know the GPS position of users). Every place of type garden, courtyard or street can implement this place and be used as a basis for the situation.
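The link between generic and concrete places can be sketched in the same spirit as the actant/actor relation. The Python fragment below is a hypothetical illustration of this proposed extension, not an existing part of the model: the class names, the `needs_gps` flag and the example places are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbstractPlace:
    """Generic place attached to a situation in the abstraction layer."""
    name: str
    needs_gps: bool = False  # assumption: constraint carried by the place


@dataclass(frozen=True)
class ConcretePlace:
    """Actual location declared in the implementation layer."""
    name: str
    kind: str                  # e.g. "garden", "courtyard", "street"
    implements: AbstractPlace


OUTDOORS = AbstractPlace("outdoors", needs_gps=True)
INDOORS = AbstractPlace("indoors")

# Hypothetical places of one museum.
PLACES = [
    ConcretePlace("north garden", "garden", OUTDOORS),
    ConcretePlace("main courtyard", "courtyard", OUTDOORS),
    ConcretePlace("hall 2", "room", INDOORS),
]


def candidates(abstract_place, places):
    """Names of the concrete places that can host a situation."""
    return [p.name for p in places if p.implements == abstract_place]
```

Moving the experience to another museum would then only require a new `PLACES` list, while the situations keep referring to `OUTDOORS` or `INDOORS`.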
Moreover, this kind of mechanism is useful for setting up content, which is, for the moment, a limitation of our model. Indeed, currently, the contents are directly represented in the interactions associated with each actant (through variables). The reproducibility of the experience requires the ability to easily change the content (either exchange or simply modify it). Another aspect is that a simple interaction sequence can easily be reused in another context (for example, ask a question and expect an answer). It is therefore necessary to reduce the encapsulation of the content in the actants and to create a mapping between the interactions and externalized contents that will be easily editable by the designer.
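One way to picture such a mapping is to let an interaction declare the content slots it needs and to bind them to an external table at run time. The Python sketch below is a hypothetical illustration: the slot names, the helper functions and the sample questions are invented for the example and do not describe the implemented system.

```python
# Content slots required by a reusable "ask a question, expect an answer"
# interaction (slot names are assumptions).
ASK_AND_EXPECT = ("question", "expected_answer")


def instantiate(slots, content):
    """Bind the slots of an interaction to externalized content."""
    missing = [slot for slot in slots if slot not in content]
    if missing:
        raise KeyError(f"missing content for slots: {missing}")
    return {slot: content[slot] for slot in slots}


def check_answer(bound, answer):
    """Compare a visitor's answer with the bound content, ignoring case."""
    return answer.strip().lower() == bound["expected_answer"].strip().lower()


# Externalized, designer-editable content for two hypothetical museums.
MUSEUM_A = {"question": "Which object in this room measures time?",
            "expected_answer": "clock"}
MUSEUM_B = {"question": "Which animal in this room can fly?",
            "expected_answer": "owl"}

bound_a = instantiate(ASK_AND_EXPECT, MUSEUM_A)
bound_b = instantiate(ASK_AND_EXPECT, MUSEUM_B)
```

The same interaction is reused unchanged with both content tables, which is the reproducibility property the paragraph above argues for.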
The time T. The structure of the representation model facilitates the insertion of temporal constraints in the various layers. We can consider several levels for this time representation.
• Time for the actor: we need here to represent the duration of and the constraints on the actors' actions. For that purpose, time management consists in adding to each actor a clock representing the time of the actor's execution, together with the clocks required to represent the time of each interaction. These actors can be users or interactive entities of the experiment. Practically, since the transitions representing the actions of these entities are an extension of I/O automata, we only need to add the time management elements inspired by timed automata (clock, guard, invariant) to the representation of interactions. The situation automaton resulting from the presence of the actors (with timed interactions) in that situation and generated by the synchronous product will therefore be a timed automaton.
• Time of the situation: the presence of the actors (and their time constraints) in the situation induces de facto a timed management of the situation, and we need to include a mechanism to follow the overall time in the situation. An additional clock monitoring the time flowing in the situation, and an invariant on this clock, are therefore necessary. This invariant will guarantee, for example, that the situation executes for no more than a specific duration.
• Overall time: the last layer of the model on time will be the one that integrates the temporal constraints associated with the sequence of scenes. The same semantics of timed automata can be used to represent time management on the whole experience. Indeed, the challenge here is to add a global clock that will represent the time of the whole experience and constrain the plot graph described within the dynamic layer. This plot graph will then be a timed automaton and will pass through plots with respect to guards and invariants on the global clock.
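The three ingredients borrowed from timed automata (clock, guard, invariant) can be illustrated on a very small plot graph. The Python sketch below is a simplified, hypothetical illustration of this proposed extension: it uses one local clock reset on every transition plus one global clock, lower-bound guards and upper-bound invariants only, and invented plot names and durations.

```python
from dataclasses import dataclass


@dataclass
class TimedLocation:
    name: str
    max_duration: float  # invariant: local clock <= max_duration


@dataclass
class TimedTransition:
    source: str
    target: str
    min_elapsed: float   # guard: local clock >= min_elapsed


class TimedPlotGraph:
    """Plot graph with a local clock per plot and a global clock."""

    def __init__(self, locations, transitions, start):
        self.locations = {loc.name: loc for loc in locations}
        self.transitions = transitions
        self.current = start
        self.clock = 0.0         # time spent in the current plot
        self.global_clock = 0.0  # time of the whole experience

    def delay(self, dt):
        """Let time pass, refusing delays that violate the invariant."""
        if self.clock + dt > self.locations[self.current].max_duration:
            return False
        self.clock += dt
        self.global_clock += dt
        return True

    def fire(self, target):
        """Take a transition to target if its guard holds."""
        for t in self.transitions:
            if (t.source == self.current and t.target == target
                    and self.clock >= t.min_elapsed):
                self.current = target
                self.clock = 0.0  # reset the local clock on entering a plot
                return True
        return False


# Hypothetical experience: at least 10 s of intro, at most 60 s.
graph = TimedPlotGraph(
    [TimedLocation("intro", 60.0), TimedLocation("quiz", 300.0)],
    [TimedTransition("intro", "quiz", 10.0)],
    start="intro",
)
```

A real implementation would rely on the full timed automata semantics (several clocks, resets chosen per transition, guards on the global clock); the sketch only shows where the guard and the invariant intervene.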

Conclusion
In this article, we were interested in developing a generic model easing the design of software or robotic applications that could be deployed in museums. Thanks to our previous experience in the design of interactions between a robot and a dancer (LINK H/R) and of gamification in a museum (PLUG), we proposed a model called "CITE" (which stands for Content-Interaction-Time-spacE), based on a hybrid bottom-up/top-down approach.
To this point, only the CI (Content-Interaction) part of the architecture has been developed and tested. In particular, a first experiment embedding this model in a robot has been performed at the "Muséum d'histoire naturelle" of La Rochelle in collaboration with researchers interested in marketing. The idea was to analyze how the introduction of a robot into a museum interferes with the visitors' representation of knowledge, especially with young populations (teenagers).
The TE part should be developed in the future: we intend to create a new experiment in which two museums and an interdisciplinary team (marketing, computer science, robotics, anthropology, ergonomics) are involved. The interest of the CITE architecture would be to facilitate both the design of an application by non-specialists (marketing, anthropology) and the adaptation to several places and contexts (two museums with different collections).

Figure 4. The situation-based approach.

EAI Endorsed Transactions on Creative Technologies, 07 2017 - 10 2017 | Volume 4 | Issue 13 | e5

Figure 6. C, I and E dimensions in the architecture for the abstraction layer.