The analysis of contextual factors on the use of smartphones applications

The development of methodologies and techniques to evaluate smartphone usability is an emerging topic in the scientific community and triggers discussions about which methodology is most appropriate. The lack of consensus is due to the inherent difficulty of capturing context data in the scenarios where the experiments take place and of relating them to the results found. This work aims to correlate potential usability problems in mobile applications with contextual factors that may occur during users’ interactions on different devices, such as environment luminosity, device screen resolution, and the user’s activity while interacting with the application. The following methodology was applied to carry out a field experiment: (1) identification of contextual factors that may influence users’ interaction; (2) use of the UXEProject infrastructure to support the automatic capture of applications’ context data, by monitoring and storing quantitative, subjective and contextual data from applications’ use; (3) implementation of experiments with real users with different profiles, using three different mobile applications over a one-year period. In this paper, we present and discuss the results obtained during this study.


Introduction
With the continuous advance of wireless networks and the great proliferation of smartphones, many applications are launched in the market every day. Nowadays, the requirements of ubiquitous computing, which envisions software as part of people's daily life and available, transparently, "at anytime, anywhere and from any device" [1], have been increasingly explored to build these applications. This ubiquity is achieved by automatically monitoring contextual information related to the use of these applications. A context can be defined as a set of information that affects an application's execution, related to the people, objects, places, time and space in which the application is used [2].
Collecting data from smartphone users' experiences and associating them with the context where the interactions occur is a great challenge for the Human-Computer Interaction (HCI) area. Situations change, and test results are highly dependent on the context. A person interacting with a mobile application while sitting on the sofa at home, for instance, faces different external interferences than when performing the same task while walking on the street.
As noted by J. Hansen [3], it is very important to relate the influence of context to the users' interactions with mobile applications. To conduct studies with such a broad reach, it is necessary to use methodologies and techniques that allow experiments to collect data contextualized with the scenarios where the interactions take place [4]. This fact provokes several discussions regarding the place where experiments are conducted (in the field or in the laboratory) [5], as well as the techniques used.
The objectives of this work are:
• To identify the main evaluation approaches used to assess smartphone applications.
• To use the UXEProject infrastructure, a new approach created with the potential to extract and relate quantitative, contextual and subjective data.
• To present the results of experiments conducted in the field, relating contextual factors to usability metrics for smartphone applications.
The remainder of this paper is organized as follows. Section 2 presents the state of the art concerning evaluations of smartphone usability, encompassing the investigation of the approaches used to carry out the usability experiments. Section 3 introduces the UXEProject infrastructure adopted to facilitate the experiment presented in this work. Section 4 describes the methodology used for the execution of the experiment. Section 5 summarizes and discusses the experimental results obtained. Finally, conclusions and future prospects are presented in Section 6.

The State of the Art
The relationship between context and usability is an issue widely discussed by the scientific community which studies the influence of scenarios on the interaction with smartphones. According to Mallick [7], experience shows that human beings usually interact with systems in unusual ways. Thus, user interaction tests conducted in real scenarios are essential to delineate the users' preferences and the consequent adaptation of the products addressed to them [5]. Kawalek et al. [8] suggest evaluation methods that encompass different observation angles in the experiments done in this area, such as quantitative data (usability metrics), subjective evaluation (users' feelings) and contextual data (for example, environmental conditions and the devices' characteristics). The main problem is the lack of literature covering approaches that support these three requirements combined in a single experiment. Generally, only one or two of them are related.
Coursaris and Kim [9] carried out a systematic data survey, from 2000 to 2010, which allowed them to identify that 47% of the works that evaluate mobile devices are done in the laboratory, 21% in the field, 10% used both scenarios and 22% are conducted without the participation of users. A point to be observed is that many studies don't consider the mobile nature of such devices, applying traditional evaluation methods. Another fact which calls attention in the results presented is that 47% of the studies evaluate individual and out-of-context tasks, 46% are based on the technology used and only 14% consider context variables and the users' characteristics.
With the objective of identifying the current reality of the usability investigations related to smartphones, a study was conducted between 2008 and 2014, encompassing works that describe empirical experiments and investigate at least one of the following usability attributes: efficiency, effectiveness, satisfaction, learning, operability, accessibility, flexibility, usefulness and ease of use. The publication venues investigated were ACM, IEEE, Springer, and Google Scholar. Thirty-one works were selected, and they are listed in Table 1 along with the investigation techniques used.
Table 1 (partial): [12] Observation and Interview; Chin and Salomaa (2009) [13] Logging and Survey; Lai et al. (2009) [14] Survey and Interview
The results of this study are detailed as follows. One of the main aspects to be highlighted is that only 3 experiments investigated contextual data, and were conducted in the laboratory, meeting the expectations and desires of a great number of researchers. The last issue to be pointed out is that none of the approaches captures the users' impressions concerning the application usability during their interactions, which could provide the correlation between the subjective data in the evaluations.
The main observation of the systematic study conducted was that, in the majority of the experiments, surveys are used to collect data, which might complicate the correlation between different kinds of information in order to identify usability problems [41]. Furthermore, in most cases, contextual factors are not investigated, an issue which is defended by many researchers as a primary factor for advances in the usability evaluation area [3][4][9].
In order to address some of these issues related to usability evaluation, the next section presents the UXEProject infrastructure. This infrastructure was conceived to support the extraction and correlation of quantitative, contextual and subjective data, for a better understanding of the real usage behaviour of the analysed applications.

The UXEProject Infrastructure
The UXEProject infrastructure was built to support usability evaluation based on the analysis of data captured directly from the devices. The formal model from which the infrastructure originated can be found in full in [42].
The UXEProject infrastructure is conceptually divided into three units, which comprise: (1) mapping of the tasks that will be investigated; (2) combination of traceability metrics, which enables the capture of contextual data, usability statistics and subjective information regarding the experiences provided to users; and (3) storage and analysis of the data captured during the experiments.
The infrastructure model uses a component-based architecture to support the reuse of the implemented resources and their redefinition according to the evaluators' needs. Figure 1 presents the three high-level components that represent the infrastructure and their relations: Mapping Unit, Traceability Unit and Assessment Unit. Arrows indicate information transfer between the components.

Figure 1. Main components of the infrastructure
To enable the automatic capture of data concerning the user's interaction and the use of the sensors present in smartphones, a Metrics Library, developed with Aspect-Oriented Programming (AOP), is used.
In the infrastructure, the mapping of tasks is built through the capture of the methods executed in the application that will be evaluated. The Evaluation Team is responsible for the choice and mapping of tasks, as well as the creation of data capture metrics. It is important to emphasize that it is not necessary to have programming experience to carry out these activities.
The following subsections describe the infrastructure and the tools used to implement the components specified in the three infrastructure units.

Mapping Unit
This unit is subdivided into three components responsible for providing the task-mapping functionalities. Figure 2 shows the component diagram, with its interfaces and ports for data transfer from one component to another.

Figure 2. Mapping Unit Architectural Overview
Initially, the application's source code must be made available. The objective of this action is to allow the Source Code Analyser component to identify which classes handle the users' interactions.
These classes are identified and provided as input to the Mapping Code Generator component so that it can attach to the original source code the information that enables the identification of tasks.
The result of this component's action is the instrumentation of the application's original code, thereby allowing the tasks to be mapped.
After the source code is prepared for mapping, it must be forwarded to the Mapping Device component, where the tasks will be mapped. This device must enable the mapping of tasks and make them available to be used by the other units in the model.
In Figure 3, the diagram portrays the sequence that enables the exchange of messages and data between the Mapping Unit components.
The first tool developed in the infrastructure covers the preparation of the source code to enable the mapping of the tasks provided in the applications. This tool was named Mapping Aspect Generator (MAG).
The MAG tool imports the source code of the application to be mapped and creates an Aspect that inserts the method onUserInteraction in the classes that belong to the interaction layer. This process allows the users' actions to be detected. To have the application ready to be mapped, it is necessary to compile the application source code with the generated Aspect. After that, it is enough to install the application on a smartphone to perform the interactions. To let the Evaluation Team map the tasks, another tool, named Automatic Task Description (ATD), was developed. The ATD should be installed on a device and executed simultaneously with the application that will be mapped. Thus, as the Evaluation Team interacts with the application, the executed methods are automatically captured to be used as the steps for the conclusion of a task.
The ATD method consists of the use of a filter that identifies when there is a user interaction. The filter identifies which classes, methods and parameters of the application were used. This information is stored in an XML file, which will be sent to the server to be used in the creation of metrics.
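As an illustration of the kind of file the ATD could produce, the sketch below serializes a list of captured steps (class, method, parameters) into XML. The element and attribute names (task, step, cls, param), the function steps_to_xml and the sample class names are assumptions made for this example; the actual schema used by the UXEProject tools is not described in this paper.

```python
import xml.etree.ElementTree as ET


def steps_to_xml(task_name, steps):
    """Serialize captured interaction steps into an XML string.

    Each step is a (class_name, method_name, parameters) tuple.
    Element and attribute names are hypothetical.
    """
    task = ET.Element("task", name=task_name)
    for order, (class_name, method_name, params) in enumerate(steps, start=1):
        step = ET.SubElement(
            task, "step",
            order=str(order), cls=class_name, method=method_name,
        )
        for value in params:
            ET.SubElement(step, "param").text = str(value)
    return ET.tostring(task, encoding="unicode")


# Hypothetical steps captured while an evaluator performs one task.
captured = [
    ("FillUpActivity", "onClick", ["saveButton"]),
    ("HistoryActivity", "onItemSelected", [3]),
]
xml_text = steps_to_xml("add_fill_up", captured)
print(xml_text)
```

Keeping the step order explicit in the file lets a later stage compare the sequence a user actually performed against the mapped sequence.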

Traceability Unit
This unit is responsible for collecting different types of information: (i) user profile (e.g. education and age); (ii) data referring to the user's interaction, such as hits and time to complete a task; (iii) contextual data, such as luminosity and noise; and (iv) subjective data, related to the users' feelings regarding the applications' usage.
To instrument the source code with the code for capture metrics and data collection, the Traceability Unit was divided into three internal components, as can be seen in Figure 4.

Figure 4. Traceability Unit Architectural Overview
The Metrics Library is the component that provides the structure for all capture metrics. It uses the mapped tasks to bring together the structure of the metrics with information about the application source code. As a result, it generates the metrics' structures adapted to the application under evaluation. In the development of the Metrics Library, the incorporation of three types of structure was expected: (i) the ones used to capture quantitative data from users' interactions; (ii) the ones responsible for the subjective data (direct interactions with users); and (iii) the metrics that allow instrumentation of the available sensors.
The Metrics Generation component receives as input the application's source code and the adapted metrics structure to be attached to this code. As a result, it provides a new source code with the inserted metrics, the Traceability Code.
The Interaction Device component requires the installation of the application's new code containing the metrics that will be used for data capture. This device should enable the user's interaction while allowing data to be captured.
In Figure 5, the diagram presents the sequence of actions that permits the exchange of messages and data between the Traceability Unit components.
The tool designed to allow the instrumentation of applications and to enable the data capture was named UXE Metrics Generation. This tool contains a library which has the structures of metrics to perform the measurements.
Initially, the tool takes as input the XML file generated in the Mapping Unit. Then, the methods listed in the XML file are connected to the Metrics Library available in the tool, allowing the creation of Aspects responsible for the capture, transmission and persistence of data. Finally, it is sufficient to compile the application's source code along with the generated Aspects and to install the application on a device that will be used by a user.
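The interception idea behind the generated Aspects can be illustrated outside AspectJ. The following minimal sketch is a Python analogue, not the UXE Metrics Generation tool itself: it wraps the methods named in a mapping so that each call is logged with its duration. The class PlayerScreen, the function instrument and the log fields are hypothetical.

```python
import functools
import time


def instrument(cls, method_names, log):
    """Wrap the named methods of cls so that each call is appended to log."""
    for name in method_names:
        original = getattr(cls, name)

        def make_wrapper(func, method_name):
            @functools.wraps(func)
            def wrapper(self, *args, **kwargs):
                start = time.perf_counter()
                try:
                    return func(self, *args, **kwargs)
                finally:
                    # Record the call even if the method raises.
                    log.append({
                        "class": cls.__name__,
                        "method": method_name,
                        "elapsed_s": time.perf_counter() - start,
                    })
            return wrapper

        setattr(cls, name, make_wrapper(original, name))
    return cls


class PlayerScreen:  # hypothetical application class
    def play(self, track):
        return f"playing {track}"

    def stop(self):
        return "stopped"


events = []
instrument(PlayerScreen, ["play"], events)  # only "play" is mapped
screen = PlayerScreen()
result = screen.play("song.mp3")
screen.stop()  # not mapped, so not recorded
```

As with an Aspect, the application class itself is left unmodified in source form; the capture logic is attached from the outside, driven by the list of mapped methods.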

Figure 5. Steps for the instrumentation of an application with the metrics for capturing data
To cover the data collection, three types of metrics were defined. The usability and context metrics use the Logging technique [10][11], and the subjective metrics use the Experience Sampling Method (ESM) [43] technique.
The efficiency and effectiveness of the users' interactions are measured through the usability metrics described in Table 3. All the measurements take into consideration the mapping of a task, in which all the steps to its conclusion are described. Thus, an error, for instance, is identified every time the user performs an interaction with the application that is not in the previous mapping. The sensors used to capture the contextual data were the accelerometer (to capture the horizontal or vertical position), the GPS (to capture movement), the luminosity sensor (to capture the environment's luminosity), and the microphone (to capture noise in the environment).
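The error rule described above can be sketched as follows, assuming that a task is mapped as an ordered list of step identifiers and that an interaction counts as an error when it is not among the mapped steps. The function name count_errors and the step names are hypothetical, and the actual metrics of Table 3 may be defined differently.

```python
def count_errors(mapped_steps, observed_steps):
    """Count observed interactions that are not part of the mapped task.

    Returns (errors, completed), where completed is True when every
    mapped step was observed in the mapped order.
    """
    allowed = set(mapped_steps)
    errors = 0
    next_index = 0  # position of the next mapped step still expected
    for step in observed_steps:
        if step not in allowed:
            errors += 1
        elif next_index < len(mapped_steps) and step == mapped_steps[next_index]:
            next_index += 1
    return errors, next_index == len(mapped_steps)


mapped = ["openMenu", "selectSong", "play"]
with_detour = count_errors(mapped, ["openMenu", "openSettings", "selectSong", "play"])
abandoned = count_errors(mapped, ["openMenu", "play"])
```

In this sketch a task completed with a detour yields one error and is still marked as completed, which matches the notion of "tasks completed with errors" used later in the analysis.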
It is important to stress that the infrastructure allows new metrics to be incorporated in the Metrics Library, which extends its use to different scenarios and contexts, and to other sensors. For example, specific metrics could be incorporated in an application for the spatial orientation of people. Such metrics could then be associated with the data provided by a mobile device's GPS, enabling the comparison of the user's interactions with information regarding positioning, speed, and route taken. In that case, the usability analysis would take into account not only how a user performs a task but also how the environment influences the user's interactions.
The subjective metrics are used to measure the emotional state of the users during their experiences with an application. The ESM [43] technique was chosen because of the following aspects: (i) it is appropriate for use on devices with relatively small screens; (ii) it is intuitive and doesn't take much mental effort to be interpreted; and (iii) it can be answered with the entry modes provided by different mobile devices.
ESM measures two dimensions: the kind of emotion (positive or negative) and the intensity of the emotion. To do this, a group of pictures indicating emotional states is displayed, associated with a question, as can be seen in Figure 6. The sequence of the pictures represents varying degrees of emotional intensity and can be interpreted from left to right as: very displeased, displeased, indifferent, pleased, very pleased. These questions are defined by the evaluators during the execution of the Usability Metrics Generation Tool. So that the data related to the experiments could be transmitted and stored in a database, a micro instance of the Amazon EC2 ‡ service was used.
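Returning to the five-picture ESM scale, one possible way to encode an answer in its two dimensions is sketched below, assuming the pictures are numbered 0 to 4 from left to right. The numeric encoding and the function name decode_esm are assumptions made for illustration; the paper does not state how answers are stored.

```python
ESM_LABELS = [
    "very displeased", "displeased", "indifferent", "pleased", "very pleased",
]


def decode_esm(position):
    """Map a picture position (0 = leftmost) to (label, valence, intensity)."""
    if not 0 <= position < len(ESM_LABELS):
        raise ValueError("position must be between 0 and 4")
    offset = position - 2  # distance from the neutral middle picture
    if offset > 0:
        valence = "positive"
    elif offset < 0:
        valence = "negative"
    else:
        valence = "neutral"
    return ESM_LABELS[position], valence, abs(offset)


answer = decode_esm(3)
```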

Assessment Unit
This unit was structured in four different components that are responsible for the infrastructure where the data to be evaluated will be stored over time. Figure 7 presents an architectural overview of its components. To better illustrate its behaviour, the sequence diagram in Figure 8 presents the exchange of messages and data between the components.
‡ Available at http://aws.amazon.com/ec2/

Figure 8. Actions performed by the components of the Assessment Unit
To implement the components defined in the Assessment Unit, the following processes were performed: (i) create and set up an FTP server and a database (DB) server and make them available on the Internet; (ii) carry out the modelling of a DB and a Data Warehouse (DW) to store and enable the analysis of the information captured during the experiments; (iii) create tools to detect the presence of new files in the FTP server, populate the DB and load the DW; and (iv) choose an OLAP tool to support the data analysis.
The Database Management System selected to store the data was the MySQL Community Server §. To load the data into the DB, a tool named Data Load was developed. The steps executed by this tool are: detect the arrival of new files in the FTP server, extract the data and load them into the DB.
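The three steps of the Data Load tool can be sketched as below. This is a simplified stand-in rather than the tool itself: a local directory plays the role of the FTP server, SQLite replaces MySQL so that the example is self-contained, and the CSV layout, the table name interaction and the function load_new_files are hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def load_new_files(inbox, conn, seen):
    """Load every CSV file in inbox that has not been processed yet.

    seen is a set of already-loaded file names and is updated in place.
    Returns the number of rows inserted.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS interaction "
        "(user_id TEXT, task TEXT, duration_s REAL, errors INTEGER)"
    )
    inserted = 0
    for path in sorted(Path(inbox).glob("*.csv")):
        if path.name in seen:
            continue  # step 1: only newly arrived files are handled
        with path.open(newline="") as handle:
            rows = [  # step 2: extract the data
                (r["user_id"], r["task"], float(r["duration_s"]), int(r["errors"]))
                for r in csv.DictReader(handle)
            ]
        conn.executemany(  # step 3: load the rows into the DB
            "INSERT INTO interaction VALUES (?, ?, ?, ?)", rows
        )
        conn.commit()
        seen.add(path.name)
        inserted += len(rows)
    return inserted


# Usage with a temporary directory standing in for the upload area.
with tempfile.TemporaryDirectory() as inbox:
    Path(inbox, "session1.csv").write_text(
        "user_id,task,duration_s,errors\nu1,play,4.2,0\nu1,stop,1.0,1\n"
    )
    conn = sqlite3.connect(":memory:")
    seen = set()
    first_pass = load_new_files(inbox, conn, seen)
    second_pass = load_new_files(inbox, conn, seen)  # nothing new to load
    total_rows = conn.execute("SELECT COUNT(*) FROM interaction").fetchone()[0]
```

Tracking processed file names keeps the loader idempotent, so running it repeatedly over the same upload area does not duplicate records.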
The last tool designed (ETL Maker) extracts, transforms and loads the data, transferring them from the DB to the DW. To make the data analysis easier, the OLAP tool Pentaho Analysis Services** was chosen.
The next section describes the case studies performed to evaluate the functionalities and identify the potential of the UXEProject infrastructure with the tools that support its implementation. The results of the experiment were presented in [44].

Experiment Conducted
The experiment reported in this article was divided into six different phases, based on the directives proposed in the DECIDE framework [45], which guided the specification of the steps during all phases of the experiment.

Determine the goals
The main focus of the experiment is to obtain information relating different kinds of data, with a major interest in the contextual factors which can interfere in the usability of the applications that will be analysed.

Explore the questions
Based on the objective to be reached, a set of questions was defined to direct the experiments and the data generation and analysis:
• How does the luminosity of the interaction scenario affect the performance of smartphone application users?
• How does the user's movement affect his performance when interacting with the applications?
• Which tasks are more affected by the position of the smartphone at the moment of the interactions?
• What is the difference in users' performance due to the smartphone setup?
• What kind of information can the context provide to improve the usability analysis?
§ Available at www.mysql.com
** Available at www.pentaho.com

Choose the evaluation paradigm and techniques
The evaluation approach to be used in this work should meet the following conditions:
• The experiment has to be conducted in the field.
• Without supervision.
• For a long period of time.
• Data will be collected automatically.
• No restrictions concerning the number of users.
• No need to know how the applications were developed.
• Potential to be applied to any application for the Android platform.
• No need for the Evaluation Team to write program code.
• Possibility to analyse different kinds of data.
• Possibility to specify the tasks to be analysed.
Given the listed conditions, the UXEProject infrastructure was chosen, as it supports all of the requirements.

Identify the practical issues
In this phase, a great number of prerequisites were raised, among which the following stand out: (i) the selection of the applications to be evaluated; (ii) the definition of the investigated tasks; (iii) the definition of the group of users to participate in the experiment; and (iv) the data to be considered.
The first action in this phase was to conduct exploratory research aiming to find applications that have attractive functionalities and that could become part of people's daily life. The choice of applications considered the following prerequisites:
• The application must have been developed using the Java language and for the Android platform.
• The application source code must be available and have explicit rights of use.
• The application must have been built using good programming techniques, showing a good modularization of its functionalities, to allow the source code to be instrumented with AOP.

Description of Applications and Evaluated Tasks
The first application, called Mileage, aims to help users control their spending on fuel and other automobile maintenance services, such as oil changes and brake pad changes, among others. The left side of Figure 9 presents the application interface, and the right side, the tasks investigated in the experiment. The second application selected was ^3 (Cubed), a music and video clip manager. On its main menu, it is possible to select songs or videos and play them. The Cubed interface and the tasks that have been mapped in this application are presented in Figure 10.

Figure 10 panels: ^3 (Cubed) Interface; Instrumented Tasks
The last application chosen for the experiment was Shuffle, whose interface is shown in Figure 11. It is an activity-scheduling application that allows tasks to be linked to dates and times, and also to be associated with projects and contexts (for example, home or work).

Figure 11 panels: Shuffle Interface; Instrumented Tasks

Selection of participants
Another action completed in this phase was to define the group of users to participate in the experiments. The selection considered the profiles that were under analysis and their smartphones' features. Twenty-one participants were selected, taking into consideration the age, educational grade, education level, occupation and purchasing power.

Data Considered in the Experiment
The set of data used in the experiment was defined according to the capture strategies provided by the UXEProject infrastructure. Thus, the usability data were considered in relation to the mapped tasks, the users' profile, the smartphones' features and the contextual data obtained through sensors. The smartphones' features considered to compose the interaction context were the screen size and resolution, whose ranges of values are in Table 4.
To contextualize the environment where the interactions take place, the data are captured considering the degree of luminosity, the device position during the interactions and the speed at which the user moves. These context data are captured directly from the devices' sensors and their reference values are described in Table 5.
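To make the idea of reference values concrete, the sketch below turns raw sensor readings into the context categories used in the analysis (luminosity level, movement, device position). All cut-off values are placeholders chosen for this example and are not the reference values of Table 5; the function names are likewise hypothetical.

```python
# Placeholder cut-offs for illustration only (NOT the values of Table 5).
LUX_LOW, LUX_HIGH = 100.0, 10_000.0   # lux
STATIONARY_MAX_KMH = 0.5              # km/h
WALKING_MAX_KMH = 7.0                 # km/h


def classify_luminosity(lux):
    if lux < LUX_LOW:
        return "low"
    if lux > LUX_HIGH:
        return "high"
    return "normal"


def classify_movement(speed_kmh):
    if speed_kmh < STATIONARY_MAX_KMH:
        return "stationary"
    if speed_kmh <= WALKING_MAX_KMH:
        return "walking"
    return "transport"


def classify_position(orientations):
    """orientations: 'vertical'/'horizontal' samples taken during one task."""
    distinct = set(orientations)
    if not distinct:
        raise ValueError("at least one orientation sample is required")
    # A task that starts in one position and ends in another is "mixed".
    return distinct.pop() if len(distinct) == 1 else "mixed"


context = (
    classify_luminosity(500.0),
    classify_movement(4.0),
    classify_position(["vertical", "horizontal"]),
)
```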

Decide how to deal with ethical issues
A website was built to conduct the experiment. It provides explanations concerning the research and the terms of use of the applications. To be able to download the applications, the user must explicitly agree to participate in the experiment.

Evaluate, Interpret, and present the data
The next section presents the results of the evaluations made during the experiment. The data collection took place from 12/01/2011 to 11/30/2012 (month/day/year).

Luminosity Influence Analysis
Initially, the percentage of tasks completed with errors in each application was observed in relation to the luminosity variation.
The objective is to identify the possible influence of this contextual variable on the interactions. To conduct the analysis, the luminosity was isolated and related to the percentage of tasks completed with errors in each application, as presented in Figure 12.

Figure 12. Errors rate due to luminosity
As can be seen, for all applications the highest rates of errors in completed tasks occur when the luminosity is either too high or too low, that is, when the conditions of the interaction scenario are not within the parameters considered standard. This indicates that luminosity influences the users' performance.
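The measure behind this analysis, the percentage of completed tasks with at least one error per application and luminosity level, can be computed as in the sketch below. The records are made-up sample data, not results from the experiment, and the field and function names are assumptions.

```python
from collections import defaultdict


def error_rate_by(records, *keys):
    """Percentage of tasks with at least one error, grouped by the given keys."""
    totals = defaultdict(int)
    with_errors = defaultdict(int)
    for rec in records:
        group = tuple(rec[k] for k in keys)
        totals[group] += 1
        if rec["errors"] > 0:
            with_errors[group] += 1
    return {g: 100.0 * with_errors[g] / totals[g] for g in totals}


# Illustrative records only.
sample = [
    {"app": "Cubed", "luminosity": "low", "errors": 1},
    {"app": "Cubed", "luminosity": "low", "errors": 0},
    {"app": "Cubed", "luminosity": "normal", "errors": 0},
    {"app": "Cubed", "luminosity": "normal", "errors": 0},
    {"app": "Mileage", "luminosity": "high", "errors": 2},
]
rates = error_rate_by(sample, "app", "luminosity")
```

Passing other keys (for example movement or device position) reuses the same grouping for the analyses in the following subsections.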

Movement Speed Analysis
The next evaluation refers to the speed at which users move when performing the interactions. The speed usually varies according to three possibilities: the user is walking, stationary, or in some means of transportation (Figure 13).
As seen in Figure 13, in all applications the actions performed with no movement show a lower error rate than the ones performed while moving.

Position of Interaction Analysis
The next analysis regards the smartphone position during the interactions. The aim is to find usability problems in specific tasks related to the interaction position (vertical, horizontal or mixed). Table 6 lists the tasks which had an error rate over 10% related to the position of interaction. This sort of information is useful for application developers, as in future versions of the applications the interactions in positions with a high error rate can be inhibited. Table 6 shows that more than 50% of the problems occur when the tasks are done in a mixed position, that is, when they are started in one position and ended in another.
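The selection rule used for Table 6 (tasks whose error rate exceeds 10% for a given position) can be sketched as follows. The counts and task names are invented for the example and do not reproduce Table 6; the function name tasks_over_threshold is an assumption.

```python
def tasks_over_threshold(counts, threshold_pct=10.0):
    """Return the (task, position) pairs whose error rate exceeds the threshold.

    counts maps (task, position) -> (executions, executions_with_error).
    """
    flagged = {}
    for (task, position), (total, failed) in counts.items():
        if total == 0:
            continue  # no executions, no rate to report
        rate = 100.0 * failed / total
        if rate > threshold_pct:
            flagged[(task, position)] = round(rate, 1)
    return flagged


# Illustrative counts only.
counts = {
    ("play_song", "vertical"): (200, 8),
    ("play_song", "mixed"): (40, 9),
    ("add_fill_up", "horizontal"): (50, 5),  # exactly 10 %, not over it
}
flagged = tasks_over_threshold(counts)
```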

Smartphones' physical characteristics analysis
The following analysis verifies whether contextual variables related to smartphone characteristics, such as screen resolution and size, interfere with the interactions. To carry out this evaluation, the task executions were investigated considering the smartphones' characteristics. The data presented in Figure 14 show that the screen resolution significantly influences the task execution speed, that is, the higher the resolution, the faster the tasks are concluded. For the Cubed application, high resolution increases the task speed by 26.03% on average when compared to low resolution. In the Mileage application, this difference is 19.66% and, in Shuffle, 17.17%.
The same analysis was then applied to verify the influence of the screen size on the users' interactions. Figure 15 shows that the screen size is another contextual variable which influences the performance of users. In the Cubed application, the average task execution time decreases by around 4.1 seconds when compared to the use of small-screen smartphones. In the Mileage application, this difference is around 6.9 seconds and, in Shuffle, 10.9 seconds.

Figure 15. Tasks execution speed due to the screen size (in seconds)
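A fact observed in the smartphone market is that, normally, phones with smaller screens also have lower resolution. Thus, the users' performance was observed considering the two variables simultaneously. The metric used to measure the performance was the percentage of tasks completed with errors. The graph in Figure 16 shows that the smaller the screen and the lower its resolution, the more errors are found in the executed tasks. The difference between the extremes, that is, big screens with high resolution compared to small screens with low resolution, is 9.3% of tasks executed with errors.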

Profile of participants analysis
When analysing the rate of tasks executed with errors along with the profile of participants, an intriguing fact was observed. The occurrence of errors in the low social class is greater than in the other classes. To search for an explanation for this result, the kind of device used in the experiment by these participants was investigated. The conclusion was that the error rate was related not to the users' purchasing power but to the low resolution of the device's screen. As the majority of the people with low purchasing power used low-resolution smartphones, an isolated analysis of the social class can lead to wrong conclusions.

Figure 17. Relationship between the error rates due to purchasing power X screen resolution
In Figure 17, it is observed that, regardless of the purchasing power, errors are more frequent when low-resolution smartphones are used. This analysis illustrates one of the strengths of the UXEProject infrastructure, as it allows different contextual factors to be associated in a single evaluation, thereby decreasing the possibility of wrong conclusions.
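The reasoning above, where an apparent effect of purchasing power disappears once screen resolution is taken into account, can be reproduced with a small stratified computation. The records below are synthetic and were constructed only to show the effect; they are not the experimental data, and the helper names are hypothetical.

```python
from collections import defaultdict


def error_rates(records, keys):
    """Percentage of records flagged as errors, grouped by the given keys."""
    totals, failed = defaultdict(int), defaultdict(int)
    for rec in records:
        group = tuple(rec[k] for k in keys)
        totals[group] += 1
        failed[group] += rec["error"]
    return {g: 100.0 * failed[g] / totals[g] for g in totals}


def make(power, resolution, n, n_err):
    """Build n synthetic task records, the first n_err of them with an error."""
    return [
        {"power": power, "resolution": resolution, "error": int(i < n_err)}
        for i in range(n)
    ]


# Synthetic data: low purchasing power mostly coincides with low resolution.
records = (
    make("low", "low", 80, 16)
    + make("low", "high", 20, 2)
    + make("high", "low", 20, 4)
    + make("high", "high", 80, 8)
)
by_power = error_rates(records, ["power"])
by_both = error_rates(records, ["power", "resolution"])
```

Grouped by purchasing power alone, the low group appears worse (18% against 12%); grouped by both variables, the rate depends only on resolution (20% for low, 10% for high) in either purchasing-power group.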

Conclusions and Prospects
Based on the data presented in Section 2 of this article, the conclusion is that the majority of the experiments conducted to evaluate the usability of smartphone applications use surveys to collect the data and do not correlate the contextual variables with the usability problems observed. This fact is contrary to the expectations of many researchers in this area.
The results obtained in the experiment showed that the UXEProject infrastructure is a good solution for the investigation of usability problems associated with different types of data, in particular the data collected using the smartphones' sensors.
The experiments' results show that approximately 70% of the interactions occur when users are stationary, holding the device in a single position and under normal environment luminosity. However, when these contextual factors change, the users make more mistakes and take longer to execute the tasks. This information suggests that the applications should, for example: (i) make interactions impractical in positions which offer a higher probability of errors, forcing users to interact in an appropriate position; (ii) detect the external luminosity and try to balance the luminosity radiated by the device in order to guarantee good visualization; and (iii) identify the user's movement and only enable the most usual functionalities, decreasing the visual pollution.
Another important observation concerns the interference of the smartphones' setup in the users' performance. Furthermore, it was shown that correlating different kinds of information is important for drawing conclusions, as seen in the relationship between the error rate and participants with low purchasing power.
As prospects for the future, it is intended to incorporate other sensors into the UXEProject infrastructure, aiming to conduct new investigations involving different contextual factors.

Figure 6. Example of ESM form used in the experiments.

Figure 9. Mileage Application Interface and instrumented tasks to be evaluated

Figure 10. ^3 (Cubed) Application Interface and the instrumented tasks to be evaluated

Figure 11. Shuffle application interface and instrumented tasks to be evaluated

Figure 13. Error rate due to the movement speed

Figure 14. Tasks Execution Speed due to screen resolution (in seconds)

Figure 16. Relationship between the screen size and resolution and the errors percentage

Table 1. Works that investigate the usability of applications for smartphones

Table 2. Amount of times each usability attribute was investigated
Columns: Attributes; Number of times investigated
One of the main aspects to be highlighted is that only 3

Table 5. Scale of values for environment data

Table 6. Error/failure rate due to the position of interaction
more errors are found in the executed tasks. The difference between the extremes, that is, big screens with high resolution compared to small screens with low resolution, is 9.3% of tasks executed with errors.