Exploiting Data-Centric Social Context in Phone Call Prediction: A Machine Learning based Study

Context-awareness in phone call prediction can help us build many intelligent applications that assist mobile phone users in their daily lives. Social context, particularly the interpersonal relationship between individuals, is one of the key contexts for modeling and predicting mobile users' phone call activities. An individual's call activities, such as making a phone call to a particular person or responding to an incoming call, are not the same for every contact; they may differ from person to person based on interpersonal relationships such as family, friend, or colleague. However, it is very difficult to make a device understand such semantic relationships for phone call prediction. Thus, in this paper, we explore the data-centric social relational context generated from mobile phone data, which can play a significant role in achieving this goal. To show the effectiveness of such contextual information in a prediction model, we conduct our study using popular machine learning classification techniques, namely logistic regression, decision tree, and support vector machine, utilizing individuals' mobile phone data.
Received on 01 November 2018; accepted on 01 February 2019; published on 19 February 2019


Introduction
Nowadays, mobile phones have become an essential part of our daily life. The number of mobile cellular subscriptions is almost equal to the number of people on the planet [9]. According to [23], cellular network coverage has reached 96.8% of the world population, and this figure reaches 100% in developed countries. This explosive growth has made cell phones the most ubiquitous communication devices, and they are considered to be "always on, always connected". However, mobile phone users are not always attentive and responsive to incoming communication in real life. Incoming calls sometimes interrupt both the user and the people nearby. Such interruptions may create embarrassing situations not only in a formal environment, e.g., a meeting, but also during other activities such as a doctor examining patients or a person driving a vehicle. These kinds of interruptions may reduce worker performance and increase errors and stress in a working environment [9]. According to the Basex BusinessEdge report [21], interruptions consume 28% of the knowledge worker's day; this estimate is based on surveys and interviews conducted by Basex over an 18-month period, covering high-level knowledge workers, senior executives at end-user organizations, and executives at companies that produce Collaborative Business Knowledge tools. This translates into 28 billion lost man-hours per annum for companies in the United States alone, which amounts to a loss of $700 billion, assuming an average salary of $25/hour for a knowledge worker according to the Bureau of Labor Statistics [1]. In another study on the execution time of primary tasks, Bailey et al. [2] have shown that, when interrupted, users require from 3% to 27% more time to complete the tasks and commit twice the number of errors across tasks.
Modeling and predicting user phone call activities based on social context, particularly the interpersonal relationship between individuals, is important, as it can be used to develop various real-life contact-specific applications that intelligently assist end users. For instance, an "intelligent phone call interruption management system" could be a real-life application based on the relevant social contexts, which handles incoming phone calls automatically according to the activities of an individual user. Another real-life application could be a "smart call reminder system" that intelligently searches for the desired contact in a large contact list and reminds a user to make a phone call to a particular person in a particular context, according to their past calling history. In this work, we mainly focus on exploring the social relational context in a machine learning based call prediction model, with the aim of making such contact-specific applications more effective and intelligent.
Let us consider an example of a mobile phone user, Alice. She works in a corporate office as an executive officer. On Monday morning, she attends a regular meeting (an event that happens at some specific time and place [10]) at her office. Typically, she rejects incoming phone calls during that time period, as she does not want to be interrupted during the meeting. The reason is that such interruptions may disturb not only her but also the other people in the meeting. However, if the phone call comes from her boss, she wants to answer it, as it is likely to be important to her, even though she is in a meeting. Hence, the interpersonal relationship 'social relationship → boss' influences her phone call decision. Similarly, other semantic social relationships such as family, friend, colleague, unknown, or others may influence an individual's call handling decisions.
A number of researchers [7] [5] [25] have studied the phone call activities of users. However, these approaches are not data-driven, and the prediction rules used in the applications are not generated automatically. In contrast, calling activity records in device logs are a rich resource for modeling an individual's phone call activity. Recently, Sarker et al. [15] have highlighted that phone call log data recorded by mobile phones, which contain various contextual information, can be used as a context source to model mobile user activity, i.e., when a user accepts, rejects, or misses incoming phone calls. Such phone call activities may vary across contexts, such as temporal, spatial, or social contexts. For instance, Sarker et al. [14] [16] have focused on individuals' phone call activities based on the temporal context by analyzing time-series mobile phone data. Besides the temporal context, the spatial context, e.g., user location [18], may also influence the prediction of an individual's phone call activity. In addition to this spatio-temporal context, the social context, or interpersonal relationship between individuals, can play a significant role in effectively modeling and predicting individuals' diverse phone call activities, which is the focus of this paper. To achieve our goal, we explore the data-centric social context, i.e., how the phone numbers available in the dataset represent interpersonal relationships between individuals, in order to model and predict their phone call activities using machine learning techniques. The importance of this data-centric social context in mobile applications has been highlighted in our earlier paper [13]. As the activities of different individuals are not identical in the real world, predictions using machine learning techniques based on such contexts may differ from user to user according to their unique activity patterns.
The rest of the paper is organized as follows. Section 2 reviews the related work. In Section 3, we provide an overview of various social relational contexts in real-world life. In Section 4, we discuss machine learning based modeling, including the generation of the data-centric social context. We report the experimental results in Section 5. We discuss two real-world contact-specific applications based on social relational context in Section 6, and finally Section 7 concludes this paper.

Related Work
A significant amount of research has been done on context-aware applications relevant to phone call activities. Khalil et al. [8] have shown the usefulness of an interruption management system by conducting a user survey. In their survey, they investigate context disclosure and sharing patterns for context-aware telephony, with the aim of decreasing interruptions and enhancing agreement between callers and receivers. They found a low availability rate of the participants for receiving cell phone calls (only 53% of the time). Toninelli et al. [22] have reported a survey of activity-based responses to incoming calls by different users and show that most users do not want to be interrupted while in a meeting, working in a team, or engaged in activities such as driving or sleeping, and that they ignore incoming calls in these situations.
A number of authors have studied systems that can manage these interruptions. However, such research does not use rules produced by automatically analyzing call log information. For example, Khalil et al. [7] use calendar information to infer the user's activity and to automatically configure cell phones accordingly to manage interruptions. Dekel et al. [5] design an application to minimize mobile phone disruptions. In [19], the authors propose PYP (Personalize Your Phone), a context-aware configuration manager for smartphones. PYP can decide to block a phone call without bothering the user by using rules. However, the rules must be predefined manually in their system. Another intelligent interruption management system is proposed in [25], which uses a decision tree to make decisions intelligently. However, this approach also relies on rules predefined by users rather than rules derived automatically from log data.
Besides these approaches, the author of [20] presents a new approach to smartphone interruptions that maintains the quality of mitigation under concept drift with long-term usability. The approach uses online machine learning and gathers labels for interrupt-causing events (e.g., incoming calls) using implicit experience sampling, without imposing extra cognitive load on the user. Another line of research is quite different and is based on the UI (User Interface). The authors of [3] present a multiplex UI for handling incoming calls on smartphones. This design solution tackles the problem that calls can interrupt concurrent application use. They extend the options for handling incoming phone calls and present considerations for postponing calls and multiplexing the call notification with the concurrent app. Pejovic et al. [9] design and implement an interruption management library for Android smartphones. Their library represents a good starting point for identifying opportune moments for interruption, but it uses a number of sensors such as GPS, Bluetooth, Wi-Fi, and accelerometer. These studies do not take into account the social context, particularly the data-centric social context, for the purpose of building context-aware intelligent applications. Unlike these works, in this paper we focus on the importance and usefulness of such social context for modeling and predicting individuals' diverse phone call activities based on their interpersonal social relationships, using various machine learning techniques.

Social Relational Context: An Overview
Human beings are naturally social creatures. They build interpersonal relationships among themselves by nature. Thus, interpersonal relationships can be treated as social associations, connections, or affiliations between two or more people. Based on the discussion in [13], there are several categories of interpersonal relationships that might have an impact on individuals' phone call activities. These are:
• Family Relationship: According to [13], a family is defined as a domestic group of people connected by either a legal bond or a blood bond. A legal bond in a family relates to marriages, adoptions, and guardianships, including the rights, duties, and obligations of those legal contracts. A blood bond, on the other hand, includes both close and distant relatives such as siblings, parents, grandparents, aunts, uncles, nieces, nephews, and cousins. The phone call activities of an individual may differ from person to person within a family. For instance, one's phone call activities with her mother may not be similar to those with her cousin, even though both belong to the same family.
• Friendship Relationship: Friendship is another interpersonal relationship in our societies. This is a social relationship with no specific formalities. Individuals may form friendships according to their own choices or interests, and can enjoy one another's presence [13]. In the real world, the phone call activities of an individual may differ from friend to friend. For instance, one's phone call activities with her close friend may not be similar to those with another friend.
• Love or Romantic Relationship: If an interpersonal relationship between individuals is characterized by passion, intimacy, trust, and respect, then we call it a love or romantic relationship [13]. In the real world, individuals in a romantic relationship are deeply attached to each other and share a special bond in their life, such as a boyfriend, girlfriend, or other significant person. Thus, the phone call activities may differ accordingly based on their relationships.
• Professional or Work Relationship: Building professional relationships in an organization is a crucial part of our daily activities [13]. An individual may have professional relationships with many people. However, the phone call activities of an individual may differ from colleague to colleague. For instance, one's phone call activities with her boss may not be similar to those with another colleague.
• Unknown or Others: The people in this category have no significant value in terms of interpersonal relationship; however, this category can account for a major part of the population [13]. Thus, individuals in this category might have an impact on phone call activities in real life. The reason is that, by using a particular phone number, anyone can easily communicate with anyone else in the world. For instance, one may answer a phone call from an unknown person or a person with whom there is no specific relationship. Thus, the phone call activities with unknown people may not be similar to those with people having one of the interpersonal relationships discussed above.
The various real-life interpersonal relationships discussed above are relevant to predicting individuals' diverse phone call activities. However, as mentioned earlier, it is very difficult to make a device understand such semantic relationships for phone call prediction. Thus, in this paper, we explore the data-centric social relational context that is generated from the available mobile phone data, and use this social context to build a machine learning based prediction model.

Machine Learning based Model
In the area of mining mobile phone data, classification is a supervised learning method that can be used to model mobile phone user behavior based on contexts [12]. It plays a crucial role in predicting user behavior based on relevant contexts from a given dataset. In this work, we employ three popular machine learning classification techniques for the purpose of our study: Logistic Regression (LR), C4.5 Decision Tree, and Support Vector Machine (SVM) [24]. Although our main focus is on exploring the data-centric social context, we also take into account the temporal context [16] and the spatial context [18] [6] that are available in the datasets to model user activities using the above-mentioned machine learning techniques. For the temporal context, we pre-process the time-series data using our earlier behavior-oriented time segmentation technique [16]. For the spatial context, we use the nominal values of the user's location available in the dataset [6]. In addition to this spatio-temporal context, in the following we discuss the data-centric social relational context in order to build an effective machine learning based model.
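One possible way to present nominal temporal, spatial, and social context values to classifiers such as LR or SVM is to one-hot encode them. The sketch below illustrates this under that assumption only; the encoding choice, the field names (`time`, `location`, `relation`), and the sample values are hypothetical and are not taken from the datasets or from the pre-processing used in this study.

```python
def one_hot_encode(records, fields):
    """Turn nominal context values into 0/1 feature vectors.

    `records` is a list of dicts; `fields` lists the context keys to encode.
    Returns the per-field vocabulary and one vector per record.
    """
    # Sorted set of distinct values observed for each context field.
    vocab = {f: sorted({r[f] for r in records}) for f in fields}
    vectors = []
    for r in records:
        vec = []
        for f in fields:
            # One position per known value; 1 marks the value of this record.
            vec.extend(1 if r[f] == v else 0 for v in vocab[f])
        vectors.append(vec)
    return vocab, vectors


# Hypothetical call records (illustrative values only).
records = [
    {"time": "Mon[09:00-10:00]", "location": "office", "relation": "Rel_1"},
    {"time": "Mon[09:00-10:00]", "location": "office", "relation": "Rel_2"},
    {"time": "Sat[20:00-21:00]", "location": "home", "relation": "Rel_1"},
]
vocab, vectors = one_hot_encode(records, ["time", "location", "relation"])
```

Here the first two records share the same temporal and spatial context and differ only in the positions that encode the relation, which is the information the social context adds to the model input.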

Data-Centric Social Relational Context
In the real world, the semantic relationships discussed in the earlier section may vary not only from person to person but also from one geographical region to another. For instance, an individual user in one region may call her mother 'mom', while an individual in another region may call her 'mammy'. Thus, the label of a specific relationship may differ semantically depending on the culture of the society [13]. Moreover, a particular relationship may have different categories. For instance, the relationship 'friend' can fall into different categories, such as close friend, best friend, or others. The phone call activity of an individual user may not be the same with all of them, even though they share the same relationship 'friend'. Thus, such semantic relationships may not always be useful for making phone call decisions in the real world. As individuals' phone call decisions may vary from user to user, we take into account a unique user identifier for exploring the data-centric social context in this work.
In the real world, a mobile phone number serves as a unique user identifier [13]. For instance, one person's number differs from another's, even if only by a single digit, in order to avoid conflicts. Mobile phones automatically record the phone number in their phone log for each phone call activity. In this work, we generate the value of the social context from the phone call data and call it the data-centric social context, which represents a one-to-one relationship based on an individual's mobile phone number. Thus, according to [13], the main principle for identifying the value of such social context is "each unique mobile phone number represents a particular one-to-one relationship". This has been set out in Algorithm 1.

Algorithm 1: Data-Centric Relationship Generation
The generated values of the data-centric social context shown in Table 1 can be used in various context-aware mobile applications in order to assist mobile phone users in their diverse phone call activities, according to their person-to-person relationships.
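The principle that each unique mobile phone number represents a particular one-to-one relationship can be sketched in a few lines. This is a minimal illustration rather than the exact procedure of Algorithm 1: the record layout, the function name `generate_relations`, and the `Rel_1`, `Rel_2`, ... labelling scheme are assumptions made for the example.

```python
def generate_relations(call_log):
    """Assign one relation label per unique phone number.

    Labels are given in order of first appearance in the log
    (hypothetical scheme: Rel_1, Rel_2, ...).
    """
    relations = {}  # phone number -> data-centric relation label
    for record in call_log:
        number = record["number"]
        if number not in relations:  # uniqueness check
            # A number seen for the first time gets a new relation value.
            relations[number] = f"Rel_{len(relations) + 1}"
    return relations


# Hypothetical log entries using masked numbers of the kind shown in the text.
call_log = [
    {"number": "047XXXX231", "activity": "accept"},
    {"number": "047XXXX232", "activity": "reject"},
    {"number": "047XXXX231", "activity": "accept"},
]
relations = generate_relations(call_log)
```

Repeated calls from the same number map to the same relation value, so the label can be used directly as a nominal feature alongside the temporal and spatial contexts.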

Experimental Results
We have conducted experiments on real mobile phone datasets of individual mobile phone users. In the following, we briefly present the experimental results and discussion.

Datasets and Evaluation Metric
We have conducted experiments on five phone log datasets collected by the Massachusetts Institute of Technology (MIT) for their Reality Mining project [6]. These are denoted D1, D2, ..., D5 in our experiments. The datasets were collected over a period of 9 months and include individuals' diverse phone call activities, such as accepting an incoming call, rejecting an incoming call, missing a call, and making an outgoing call, together with the corresponding contextual information used in our machine learning based model. In order to measure the prediction accuracy of the machine learning based model using the data-centric social context, we compare the predicted response with the actual response (i.e., the ground truth) and compute the accuracy in terms of:
• Precision: the ratio between the number of phone call activities that are correctly predicted and the total number of activities that are predicted (both correctly and incorrectly). If TP and FP denote true positives and false positives, then the formal definition of precision is [24]:
\[ \mathrm{Precision} = \frac{TP}{TP + FP} \]
• Recall: the ratio between the number of phone call activities that are correctly predicted and the total number of activities that are relevant. If TP and FN denote true positives and false negatives, then the formal definition of recall is [24]:
\[ \mathrm{Recall} = \frac{TP}{TP + FN} \]
• F-Score: a measure that combines precision and recall as their harmonic mean. The formal definition of the F-measure is [24]:
\[ \mathrm{F\text{-}measure} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \]
• ROC curve: The purpose of the ROC (Receiver Operating Characteristic) curve is to examine the performance of the machine learning classifier by plotting the true positive rate against the false positive rate at every classification threshold [24].
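The precision, recall, and F-score described above can be computed directly from the counts of true positives, false positives, and false negatives. The following is a small sketch; the function name and the example counts are illustrative assumptions and do not correspond to any reported result.

```python
def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F-score) from raw counts.

    Each ratio is defined as 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # Harmonic mean of precision and recall.
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Illustrative counts only: 8 correct predictions, 2 false alarms, 2 misses.
p, r, f = precision_recall_f(tp=8, fp=2, fn=2)
```

With these counts, precision and recall are both 8/10, so their harmonic mean is also 0.8.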

Prediction Results
In this experiment, we show the prediction results of various machine learning based techniques when the data-centric social context is taken into account in the model. The prediction results in terms of precision, recall, F-score, and ROC value, as defined above, are shown for Support Vector Machine (SVM), Logistic Regression (LR), and Decision Tree (C4.5) in Table 2, Table 3, and Table 4, respectively. The results show that the generated data-centric social context makes it possible to predict individuals' phone call activities effectively with various machine learning based models, according to the activity patterns of individuals.

Impact of Data-Centric Social Context in Prediction Model
In this experiment, we show the impact of the data-centric social context on the prediction model. For this, we compare the results before and after incorporating this context in the model. Figure 1 and Figure 2 show the prediction results in terms of F-score values for all three machine learning techniques. The results show that using the data-centric social context in the model improves the prediction results for all of these classification techniques. In addition to the comparison of F-score values, Figure 3 and Figure 4 show the prediction results in terms of ROC value for all three machine learning techniques. These results also show that the classification techniques give better prediction results when the data-centric social context is used in the model.
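The intuition behind this comparison can be illustrated with a deliberately simple stand-in predictor. The sketch below does not use LR, SVM, or C4.5; it uses a majority-vote lookup over context tuples, and the records are made up to echo the meeting example from the introduction. All names and values are hypothetical, and the numbers it yields are not experimental results.

```python
from collections import Counter, defaultdict


def fit_majority(records, fields):
    """Map each observed context tuple to its most frequent activity."""
    counts = defaultdict(Counter)
    for r in records:
        counts[tuple(r[f] for f in fields)][r["activity"]] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def accuracy(model, records, fields, default="accept"):
    """Fraction of records whose activity matches the model's prediction."""
    hits = sum(
        model.get(tuple(r[f] for f in fields), default) == r["activity"]
        for r in records
    )
    return hits / len(records)


# Made-up training and held-out records (illustration only).
train = [
    {"time": "meeting", "location": "office", "relation": "Rel_boss", "activity": "accept"},
    {"time": "meeting", "location": "office", "relation": "Rel_friend", "activity": "reject"},
    {"time": "meeting", "location": "office", "relation": "Rel_friend", "activity": "reject"},
    {"time": "evening", "location": "home", "relation": "Rel_friend", "activity": "accept"},
]
held_out = [
    {"time": "meeting", "location": "office", "relation": "Rel_boss", "activity": "accept"},
    {"time": "meeting", "location": "office", "relation": "Rel_friend", "activity": "reject"},
]

without_social = ["time", "location"]
with_social = ["time", "location", "relation"]
acc_without = accuracy(fit_majority(train, without_social), held_out, without_social)
acc_with = accuracy(fit_majority(train, with_social), held_out, with_social)
```

Without the relation field, every call during the meeting is mapped to the same decision, so the call from the boss is mispredicted; adding the relation lets the lookup separate the two cases.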
Based on the prediction results for different datasets, shown in Figure 1, Figure 2, Figure 3, and Figure 4, it is concluded that data-centric social context can be used

Examples of Real-world Applications
In this section, we discuss how the data-centric social relational context can be used in personalized mobile applications. The following are two real-life personalized applications for mobile phone users, related to their phone call activities, in which the data-centric social relational context plays a role.

Application 1: Context-Aware Intelligent Call Interruption Management System
Mobile phones are considered to be 'always on, always connected' devices, but mobile users are not always attentive and responsive to incoming communication [4]. For this reason, people are often interrupted by incoming phone calls, which disturb not only the phone users but also the people nearby. Such interruptions may create embarrassing situations not only in a formal environment, e.g., a meeting or a lecture, but also during other activities such as a doctor examining patients or a person driving a vehicle. These kinds of interruptions may also reduce worker performance and increase errors and stress in a working environment [9]. To circumvent this problem, people resort to switching their devices off altogether in environments such as meetings, or to configuring a silent mode that provides unobtrusive alerts of incoming calls, which can be ignored if desired. However, these manual solutions have some shortcomings. First, missed calls are common, as the caller has no way of knowing before dialing whether the other party is available and willing to talk. Second, the recipient of the call typically has very limited information on which to judge the importance of incoming calls. Thus, an automated context-aware intelligent system is needed to handle incoming phone calls automatically.
In the current state-of-the-art approaches, users need to define and maintain the social relationships and the corresponding mobile phone configuration rules manually for their applications. These rules are static, i.e., they are not automatically discovered from the data. In general, users may not have the time, inclination, expertise, or interest to maintain these rules manually in the real world [17]. In contrast, the data-centric social relational context can be discovered from individuals' calling activity records in device logs, e.g., phone call logs, and the corresponding mobile phone configuration rules can be generated using data mining techniques on such mobile phone data. Rules based on the data-centric context can be used to build a smart call interruption management system that adjusts the modality of the cell phone configuration, such as switching from ringing to silent/vibrate or vice versa. In such a system, the mobile phone behaves differently from person to person according to the data-centric social relational context, which could be more effective in providing personalized services intelligently.

Application 2: Context-Aware Intelligent Call Reminder System
Forgetting to make a phone call, i.e., an outgoing call, is also a common problem in an individual's daily life [11]. The forgotten call could be event-based, such as making a phone call to plan a meeting or seminar, or calling someone on a birthday or a marriage anniversary. It could also be non-event-based, such as making a phone call to parents or significant persons at night or on weekends, calling a girlfriend or boyfriend during a lunch break, at dinner, or in particular time periods on weekdays or weekends, or calling someone from a particular location, e.g., calling only close friends from a particular playground or a specific restaurant. Thus, a context-aware call reminder system is needed to intelligently remind users to make a phone call to a particular person.
The data-centric social relational context can play a significant role in providing personalized services in such a context-aware system. The social relational context can be discovered from individuals' calling activity records in device logs, e.g., phone call logs, and the corresponding call reminder rules can be generated using data mining techniques on such mobile phone data. Reminder rules based on the data-centric context can be used to build a smart call reminder that intelligently searches for the desired contact in a large contact list and reminds a user to make a phone call to a particular person according to that context. Thus, the mobile phone behaves differently from person to person according to the data-centric social relational context, which could be more effective in providing personalized services intelligently.
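As a rough illustration of how such a reminder could select a contact, the sketch below ranks data-centric relation values by how often the user placed outgoing calls to them in a given context. It is a simple frequency count, not a rule-mining technique from this paper; the function name `suggest_contacts`, the field names, and the log entries are all hypothetical.

```python
from collections import Counter


def suggest_contacts(outgoing_log, time_slot, location, top_k=1):
    """Return the relations most often called in the given context."""
    counts = Counter(
        r["relation"]
        for r in outgoing_log
        if r["time"] == time_slot and r["location"] == location
    )
    return [rel for rel, _ in counts.most_common(top_k)]


# Made-up outgoing call history (illustration only).
outgoing_log = [
    {"time": "weekend-night", "location": "home", "relation": "Rel_1"},
    {"time": "weekend-night", "location": "home", "relation": "Rel_1"},
    {"time": "weekend-night", "location": "home", "relation": "Rel_3"},
    {"time": "lunch-break", "location": "office", "relation": "Rel_2"},
]
suggestions = suggest_contacts(outgoing_log, "weekend-night", "home")
```

A reminder application could then map the suggested relation value back to the stored phone number and prompt the user when the same context occurs again.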

Conclusion
In this paper, we have explored the data-centric social context that is generated from mobile phone data. We have conducted our study using popular machine learning classification techniques, namely logistic regression, decision tree, and support vector machine, utilizing individuals' real-life mobile phone data, in order to show the effectiveness of such contextual information in a prediction model. We believe that the data-centric social context can play a significant role for application developers in building contact-specific real-life applications that provide users with various context-aware personalized services in their daily activities.

Figure 1. Prediction results of machine learning techniques in terms of F-Score for different datasets based on temporal and spatial contexts.

Figure 2. Prediction results of machine learning techniques in terms of F-Score for different datasets based on temporal, spatial and data-centric social contexts.

Figure 3. Prediction results of machine learning techniques in terms of ROC value for different datasets based on temporal and spatial contexts.

Figure 4. Prediction results of machine learning techniques in terms of ROC value for different datasets based on temporal, spatial and data-centric social contexts.

EAI Endorsed Transactions on Scalable Information Systems, 12 2018 - 03 2019 | Volume 6 | Issue 20 | e8

Algorithm 1: Data-Centric Relationship Generation
Data: ..., X_n // including mobile phone numbers
Result: A list of generated values Rel_list
...
5 res ← checkUnique(new) // check the uniqueness
6 if res is true then
7     rel ← assignRel(num) // assign relationship
8     add rel to the list Rel_list

Table 1 shows a number of mobile phone numbers and the corresponding data-centric social relational values generated according to Algorithm 1. In Table 1, we show the values of the data-centric relationship (social context) as {Rel_A, Rel_B, ..., Rel_E}, according to the uniqueness of the given mobile phone numbers. For instance, the mother's phone number (047XXXX231) represents one relation (Rel_A), while a friend's phone number (047XXXX232) represents another relation (Rel_B), etc. This data-centric value is even able to distinguish the particular type of friendship. For [...] Rel_D), which are shown in Table 1. The corresponding sample semantic relationships are also shown in Table 1 for human understanding.

Table 1. Sample examples of individuals' data-centric social relational contexts and corresponding semantic social relationships

Table 2. The prediction results for Support Vector Machine (SVM)

Table 3. The prediction results for Logistic Regression (LR)