Testing Software Using Swarm Intelligence: A Bee Colony Optimization Approach

Software testing is a critical activity for increasing our confidence in a system under test and improving its quality. The key goal of testing a software application is to minimize the number of faults remaining in the system. Software verification through testing is a crucial step in the application's development life cycle. This process is expensive and laborious, so its automation is valuable. We propose a multi-objective search-based test generation technique that draws on both functional and structural testing. Our Search Based Software Testing (SBST) technique is based on a bee colony optimization algorithm that integrates adaptive random testing from the functional side and condition/decision and multiple condition coverage from the structural side. The constructive approach that the bee colony algorithm uses for solution generation allows our SBST technique to address the limitations of previous approaches that rely on fully random initial solutions and single-objective evaluation. We perform extensive experimental testing to demonstrate the effectiveness of our approach.


INTRODUCTION
It is estimated that software testing accounts for 30% to 50% of a project's budget [4]. Generating test cases is a very challenging problem because it is infeasible to find a suite of test cases that fully evaluates a program, as the input domain of most programs is nearly infinite. Thus, human testers spend a lot of time striving to find a high quality set of test cases that will allow them to detect a large percentage of the faults in a software system. Automated testing research attempts to make this process more efficient by generating the test cases automatically.
A lot of research has been done in the area of search based methods for automated test case generation. Some of the SBST research has focused on local search methods such as hill climbing, simulated annealing and tabu search. In [2] a hill climbing approach was used to generate test cases, while in [24] the authors experimented with simulated annealing. In [7] a tabu search metaheuristic algorithm was used to generate tests for structured software. Another local search technique used for test case generation is the alternating variable method [13], [14]. The main problem with local search techniques is that, because they only consider the neighborhood of a high quality solution, they often get stuck in local optima. Advanced local search techniques like simulated annealing and tabu search address this shortcoming by either restarting the search with random values or temporarily accepting low quality solutions.
Due to the limitations of local search techniques, various research efforts have been dedicated to global search methods that are less prone to getting stuck in local optima. Among the most popular of these are evolutionary algorithms such as the Genetic Algorithm (GA). Various work has been done using GA techniques; [12], [20] and [3] are some of the early examples. One of the problems with global search methods like genetic algorithms when applied to test generation is that they tend to alter the solutions too much. Operators like crossover and mutation drastically change the solution and make it harder to improve it in a focused way that relates to aspects of the test cases. In addition, these changes might generate invalid or illegal input.
To overcome this problem many recent approaches have used a hybrid method that combines genetic algorithms with other approaches, for example a local search method. One such work is the augmentation of the GA approach with constraint-based testing [17]. In addition, the authors in [8] use a memetic algorithm, while in [21] an approach that combines genetic algorithms and tabu search is applied. Another limitation of genetic algorithm approaches for software testing is having a candidate solution represent a test case and the entire population represent a single test suite. This choice of representation strips the GA approach of one of its main features, which is working on multiple candidate solutions instead of one. In addition, genetic operators like crossover and mutation tend not to be very suitable when the chromosomes are test cases. Also, the use of randomness when selecting an initial population of solutions is not appropriate for test case generation; it seems more appropriate to base the initial population on a partial set of functional tests [1].
In our research we make three main contributions. First, our implementation is based on a Bee Colony Optimization (BCO) algorithm, a very recent heuristic proposed in [16] and used successfully to solve many different problems [22], [6], [15]. We believe BCO is well suited to this problem because of its constructive approach to solution generation. A test suite can be generated adaptively using these techniques, as opposed to generating completely random solutions as in GA. The second contribution is encoding a suite of test cases as a candidate solution instead of an independent test case; some recent research seems to be following this approach too [8]. We find this approach to be more natural and to have a one-to-one correspondence with how software testers construct a test suite. We seldom see them directly think of the entire test suite. Instead, they constructively modify the test suite by adding more test cases to it in order to improve the test coverage. We believe that genetic algorithms, due to the limitations mentioned above, would not be particularly suitable for this approach. Our third contribution lies in the objective function used. Previous work in search based testing focused on a single objective function, which is mainly decision coverage [10,11]. Our fitness function is multi-objective: the focus is on achieving condition and decision coverage, while multi-condition coverage is considered a bonus objective.
This paper is organized as follows: in Section 2 we give an introduction to BCO, and in Section 3 we describe our implementation using BCO for automated test case generation. In Section 4 we perform experiments on our approach using a known set of test case samples. Finally, in Section 5 we present our observations and conclusions.

BEE COLONY OPTIMIZATION (BCO)
Colonies of social insects such as ants and bees have highly organized behavior that enables them to work collectively to solve problems and thus perform much more efficiently than if each member worked individually [22]. This interaction and collective behavior of the decentralized agents or members of the colony constitutes swarm intelligence. In a bee colony an organized collaborative effort is used to find flowers that are potential food sources and exploit them, allowing bees to harvest nectar from different food sources separated by long distances. Two groups of bees are formed as part of this strategy: scouts and workers. The bees working as scouts constantly search the environment for new, potentially promising nectar sources. Any scout that finds a promising source returns to the hive and communicates the information to its peers by performing a special dance called a waggle dance. Other bees in the colony that are initially idle observe the waggle dances performed by the scout bees and become worker bees on one of the advertised food sources. The number of worker bees that join in exploiting a particular food source is directly proportional to the quality of that source. Only a small percentage of the bees in the colony assume the role of scouts at any given time, leaving the majority of the workforce concentrated on exploiting nectar sources. As long as sources are still deemed profitable, the bees working on them continue to advertise them, thus focusing the colony's workforce on the best areas.
When using bee colony optimization (BCO), the search space of all possible solutions is represented by the field the bees are exploring. Each possible partial solution is a point in the field, that is, a potential source of nectar. The quality of the source is determined by an objective function that evaluates how good the solution is at solving the problem. There are two stages in BCO, the forward phase and the backward phase. The forward phase represents the bees flying to search for food sources; every source found becomes a new partial solution to be analyzed. During the backward phase bees return to the hive and scout bees use the waggle dance to advertise those partial solutions. At that stage, some bees join the workforce on each of the partial solutions that show promise, while other bees assume the role of scouts and search for new sources.
The effectiveness of the bee algorithm relies on the use of both intensification and diversification strategies when constructing solutions. While a large number of the bees perform a local search by exploring existing solutions (intensification), a small number of bees make sure the algorithm does not get stuck in a local optimum by always exploring new areas of the search space (diversification). This combination of local and global search, together with the fact that solutions are constructed dynamically rather than randomly generated, is one of the most important features of BCO.

AUTOMATED TEST GENERATION USING BCO
In this section we describe the Bee Colony Optimization (BCO) approach to unit testing. Our testing strategy is a white-box one, where we rely on the source code to generate the test cases. Therefore the Control Flow Graph (CFG) of the function under test is used to determine the coverage criteria and to guide the testing process. Adaptive random testing, a black-box approach, is also integrated into BCO to diversify the input values.
Our proposed work brings several novelties to the techniques currently used by evolutionary search methods. Firstly, we address some of the limitations of known approaches, such as random initial solutions [1], fixed test suite size, and evolving operators that alter solutions too much [1]. We do that by encoding a population of test suites. The reason we chose BCO over other evolutionary algorithms is that BCO is inherently a constructive algorithm. While other approaches start with fully random solutions, BCO starts with an empty solution and constructs it gradually through an iterative process.
Another feature of our approach is that it imposes no limitations on the input domain, in contrast to previous works like [18], which constrain the input domain of test variables to make the search space more manageable, and [9], which limit the range of values for bloat control. A final innovation lies in how we evaluate the quality of the tests. We use a multi-objective fitness function, contrary to the prevalent use of a single objective function in most previous work.

Solution Encoding
The problem of automated test case generation is to come up with a suite of tests that exercises the component under test. In this paper we target standalone functions; therefore the output of our BCO algorithm is a test suite that is suitable for efficiently evaluating the function under test. BCO is a population based algorithm in which multiple solutions are constructed at the same time. Each solution stands for a full test suite. An individual test suite is a set of test cases, where each test case is a tuple of values that stands for the input variables of the function under test. We focus in this paper on integer values, but the approach can be easily modified to handle other data types.
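The encoding described above can be sketched as follows. This is a minimal illustration, not the authors' implementation; the names `TestCase`, `TestSuite`, `add_test` and `population` are hypothetical.

```python
from typing import List, Tuple

# A test case is a tuple with one integer per input variable of the function under test.
TestCase = Tuple[int, ...]
# A candidate solution (one bee's path) is a whole test suite.
TestSuite = List[TestCase]


def add_test(suite: TestSuite, test: TestCase) -> TestSuite:
    """Return a new suite with `test` appended; tests already in the suite never change."""
    return suite + [test]


# The population holds one test suite per bee; every bee starts from an empty suite.
num_bees = 4
population: List[TestSuite] = [[] for _ in range(num_bees)]

# Bee 0 grows its suite by one test case per forward phase.
population[0] = add_test(population[0], (3, 4, 5))
population[0] = add_test(population[0], (2, 2, 2))
```

Because a suite only ever grows by appending, coverage information computed for earlier test cases remains valid when a new one is added.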
The constructive nature of the BCO algorithm supports our choice of encoding. Once a test case is added to the test suite, the test case will not change. This simplifies the calculation of the coverage criteria and makes it incremental. Other approaches, such as the GA based ones, have to recalculate the coverage criteria for the entire test suite every time the crossover and mutation operators are applied. This is time consuming and has a detrimental effect on the performance of the automated test generation process. Our encoding also addresses other limitations of recent evolutionary approaches. Operators that function at the test case level modify the solution excessively and reduce the effectiveness of the search. Because these approaches encode a solution as a single test case, even a slight mutation alters the solution too much. Using a test suite as the solution addresses these problems, because operators alter an individual test case within the larger solution rather than all of it.
Using the BCO algorithm as the search engine together with the full test suite encoding described above gives a straightforward and natural approach to automatic software test generation that parallels how testers think about the problem. Our approach works by building a test suite one test case at a time. In other words, we start with an empty test suite and then try to add new test cases. The test added in each iteration should be the most suitable and profitable one for efficiently testing the System Under Test (SUT), given the tests already added to the suite. The technique we use to generate the tests takes many coverage criteria into consideration and is an integral part of our approach; we discuss the coverage criteria later in this section.

Fitness Function
When evaluating the quality of a given solution we are trying to determine how thoroughly the set of tests composing that solution exercises the SUT. This is where the fitness function plays a role: it indicates how close a solution is to the optimal test suite. The fitness function is calculated by taking the following coverage criteria into consideration:

Condition/Decision Coverage:
Also called branch/condition coverage. The test cases should guarantee that both condition and decision coverage are satisfied.

Branch Coverage:
Also known as decision coverage. The test cases should guarantee that each branch in a control flow graph is exercised at least once. In other words, all decisions (whether they are simple or compound) should be evaluated to true and false. For example, the entire condition (x>8 && x<50) should be exercised when it is true and when it is false.

Condition Coverage:
The test cases should guarantee that each simple condition in the program is evaluated as true and false at least once. For example, to cover the condition (x>8 && x<50) we need to evaluate x>8 and x<50 as both true and false; the TF and FT combinations are enough.

Multiple Condition Coverage:
The test cases should guarantee that all true-false combinations of simple conditions in a compound predicate are evaluated at least once. In other words, to cover (x>8 && x<50) (which is composed of two simple conditions) we need the TT, TF, FT and FF combinations. This coverage criterion is not tested in most previous works.
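To make the three criteria concrete, the following sketch checks them for the running example (x>8 && x<50) over a set of input values. It is an illustration only; the function names are hypothetical, and both simple conditions are evaluated without short-circuiting so that every combination is recorded.

```python
def outcomes(x):
    """Return ((c1, c2), decision) for the predicate (x > 8 && x < 50)."""
    c1 = x > 8
    c2 = x < 50
    return (c1, c2), (c1 and c2)


def coverage(inputs):
    """Return (decision, condition, multiple-condition) coverage flags for the inputs."""
    combos = {outcomes(x)[0] for x in inputs}
    decisions = {outcomes(x)[1] for x in inputs}
    # Decision (branch) coverage: the whole predicate was both true and false.
    decision_cov = decisions == {True, False}
    # Condition coverage: each simple condition was both true and false.
    condition_cov = all({combo[i] for combo in combos} == {True, False} for i in range(2))
    # Multiple condition coverage: all four truth combinations were observed.
    multiple_cov = len(combos) == 4
    return decision_cov, condition_cov, multiple_cov


print(coverage([5, 60]))      # FT and TF only
print(coverage([5, 60, 20]))  # FT, TF and TT
```

The inputs 5 and 60 yield the FT and TF combinations, which satisfy condition coverage but leave the decision always false; adding 20 (TT) completes condition/decision coverage. For this particular predicate the FF combination cannot be produced by any input, since no value is both at most 8 and at least 50, so full multiple condition coverage is only reachable for predicates whose conditions are independent.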
We consider the minimum optimal solution to be a test suite that achieves complete condition/decision coverage with the least number of test cases. A higher fitness value than the minimal optimal solutions indicates better multi-condition coverage. The value of the fitness function for a test suite constructed by a single bee bi (the i-th bee) is determined by a formula (Eq. 1) in which mc stands for the number of multiple conditions that were not covered, cc for the number of simple conditions, and cd for the number of conditions and decisions that were not covered. Since the value of the fitness is higher when the coverage is lower, this fitness function should be used with a bee algorithm that seeks to minimize the fitness.
The DistCost function is used to determine how far the solution is from covering a compound condition (decision) in the program. This function helps in distinguishing test suites that are closer to covering a condition from solutions that are farther away. It is applied to all compound conditions in the function under test and their total sum is added to the fitness value. The distance cost is evaluated by a formula (Eq. 2) in which the value of dis(Ci) depends on the type of the compound predicate and is computed as described in Table 1, similar to the authors' work in [23]. We decided to target the coverage criteria at the source code level instead of the byte code level, as was done in [5,19]. Byte code instructions have a simpler decision structure, where compound decision statements are transformed into simple nested decision statements. Our goal is to determine the capability of our algorithm on complex decision structures.
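The idea of a distance cost can be illustrated with a commonly used branch-distance convention from the search based testing literature. This sketch is an assumption and is not the paper's Table 1: the constant `K`, the helper names, and the rules for combining conditions (sum for AND, minimum for OR) are all hypothetical here, and only the distance to making the predicate true is shown.

```python
K = 1  # assumed penalty added when a relational predicate is not yet satisfied


def dist_gt(a, b):
    """Distance to making (a > b) true; 0 when it already holds."""
    return 0 if a > b else (b - a) + K


def dist_lt(a, b):
    """Distance to making (a < b) true; 0 when it already holds."""
    return 0 if a < b else (a - b) + K


def dist_and(*distances):
    """Both operands must hold, so their distances add up."""
    return sum(distances)


def dist_or(*distances):
    """Either operand suffices, so the closest one counts."""
    return min(distances)


def dist_cost(x):
    """Distance of input x from making (x > 8 && x < 50) true."""
    return dist_and(dist_gt(x, 8), dist_lt(x, 50))


print(dist_cost(20), dist_cost(5), dist_cost(0))
```

Under these assumptions an input of 5 is closer to covering the true branch than an input of 0, which is the kind of gradient that lets the search prefer one uncovering test suite over another.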

BCO Algorithm Process
Our bee colony algorithm follows the general format described in [16]. In this approach we start with a total of n bees, k of which are recruiter bees, where k < n and n > 1. These two values are tuning parameters that can be adjusted during the testing phase; we normally set the number of recruiter bees to 25% of the total number of bees. These tuning parameters are referred to hereafter as numBees and numRecruiters. The experimental testing and parameter tuning process is discussed in detail in Section 4.
The algorithm keeps alternating between a forward phase and a backward phase. Every iteration of the process consists of those two phases, and the algorithm continues iterating until a satisfaction criterion is fulfilled. Each bee is assigned to work on exactly one solution at a time: many bees may be working on the same solution, but no bee can work on more than one solution at the same time. As described before, each solution consists of a full test suite. Figure 1 illustrates the general process of the algorithm.
During the forward phase all bees work on their current solutions by generating a new test to be added to the test suite they are constructing. During every forward phase, each bee generates and adds only one test to its solution. The process used to generate the tests is explained later in this section. During the backward phase all bees return to the hive; a number of them equal to numRecruiters are selected, and those bees advertise their solutions to try to convince other bees to follow them. When a bee follows another, it abandons the solution it was working on and shifts its work to the solution of the recruiter. The way the recruiters are selected depends on the quality of the solutions they are working on: the higher the quality of a solution, the higher the probability of its bee becoming a recruiter. The quality of a given solution is determined using the fitness function described in Section 3.2. Before determining which bees will be recruiters in the backward phase of a given iteration, the fitness values of all solutions are first normalized.

Figure 1. The general process of our BCO algorithm
For each fitness value we compute the normalized fitness using the following formula, where F(b_i) denotes the fitness of the solution built by bee b_i:
$$\mathit{NormF}(b_i) = 1 - \frac{F(b_i)}{\max_j F(b_j)} \qquad (3)$$
We also compute the fitness probability using the following formula:
$$P(b_i) = \frac{\mathit{NormF}(b_i)}{\sum_j \mathit{NormF}(b_j)} \qquad (4)$$
In Eq. 3 we normalize the fitness of a particular solution i over the fitness of the bee with the maximum fitness value. We then subtract this value from one to turn the problem from a minimization into a maximization one, since the fitness function gives lower values to higher quality solutions, as described before. In Eq. 4 we divide the normalized fitness of solution i by the sum of the normalized fitness values of all solutions, and selection is based on the resulting probability.
After the normalization and the calculation of the fitness probability, the selection process continues as follows. The first recruiter is selected using elitism; that is, the bee that has the highest quality solution is always chosen first. The remaining recruiters are selected using roulette wheel selection based on the fitness probabilities of their solutions.
For each non-recruiter bee we decide whether it will be loyal or not. A loyal bee continues working on its current solution during that iteration, while a non-loyal bee follows one of the recruiter bees. The recruiter that a non-loyal bee is assigned to is picked randomly from the existing recruiters. To determine whether a bee is loyal, we compute a loyalty probability and compare it with a random value: if the loyalty probability of that bee is higher, the bee is considered loyal; otherwise it is considered non-loyal. The loyalty probability of bee bi is computed by a formula (Eq. 5) in which u stands for the number of forward phases the algorithm has gone through and NormFmax stands for the maximum normalized fitness value among all bees. Parameter u is introduced to allow the bees to easily change the solution they are working on at early stages of the algorithm while making it harder in later stages.
Figure 2 gives an example of the forward phase. Here we have four bees working on the solution. In the first forward phase, each bee selects one test case from the set of candidate test cases (8 possible test cases in this example). The bee selects a test case based on the probability of this test case improving the bee's fitness function. Here we can see that both bee 1 and bee 2 select Candidate Test Case (CTC) one as the first test in their respective paths. On the other hand, the third and fourth bees select CTC5 and CTC7 respectively. After the backward phase, which is not shown in the figure, the next forward phase commences. Each bee will again add a single test case to its test suite (path).
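The normalization and recruiter selection steps can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, the handling of degenerate cases (all fitness values zero) is an assumption, and recruiters are drawn without replacement, which the text does not state explicitly.

```python
import random


def normalized_fitness(fitness):
    """One minus each fitness divided by the maximum fitness (lower raw fitness is better)."""
    f_max = max(fitness)
    if f_max == 0:  # assumed convention: every suite is already optimal
        return [1.0] * len(fitness)
    return [1.0 - f / f_max for f in fitness]


def selection_probabilities(norm):
    """Each normalized fitness divided by the sum of all normalized fitness values."""
    total = sum(norm)
    if total == 0:  # assumed convention: fall back to a uniform distribution
        return [1.0 / len(norm)] * len(norm)
    return [n / total for n in norm]


def select_recruiters(fitness, num_recruiters, rng):
    """Pick the best bee first (elitism), then fill up with roulette wheel selection."""
    norm = normalized_fitness(fitness)
    probs = selection_probabilities(norm)
    recruiters = [max(range(len(norm)), key=norm.__getitem__)]
    while len(recruiters) < num_recruiters:
        candidates = [i for i in range(len(norm)) if i not in recruiters]
        weights = [probs[i] for i in candidates]
        if sum(weights) == 0:
            pick = rng.choice(candidates)
        else:
            pick = rng.choices(candidates, weights=weights, k=1)[0]
        recruiters.append(pick)
    return recruiters


raw = [2, 4, 1]  # raw fitness of three bees; bee 2 has the best (lowest) value
print(normalized_fitness(raw))
print(select_recruiters(raw, 2, random.Random(0)))
```

Note that the bee with the worst raw fitness always receives a normalized fitness of zero, so under this scheme it can only become a recruiter when every other candidate also has zero weight.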

Figure 2. The forward phase of our BCO implementation
Figure 3 shows an example of the backward phase. After the second forward phase, the four bees return to the hive to evaluate their constructed paths (their partial solutions). The first step is to determine the recruiter bees. In this example we have a single recruiter, which is the fourth bee. The next step is to determine the loyalty of the rest of the bees. The first and second bees decide to be loyal to their paths, while the third bee abandons its path. In phase 3 the third bee will have the same two test cases as the fourth bee but will choose a new test case on its own.
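The loyalty decision of the backward phase can be sketched as below. The loyalty formula used here, exp(-(NormFmax - NormF(b_i)) / u), is the form commonly used in the BCO literature and is an assumption in this sketch; it is chosen because it matches the described roles of u and NormFmax. The function names are hypothetical.

```python
import math
import random


def loyalty_probability(norm_i, norm_max, u):
    """Assumed loyalty formula: closer to the best solution and later phases mean more loyalty."""
    return math.exp(-(norm_max - norm_i) / u)


def backward_phase(paths, norm, recruiters, u, rng):
    """Return the path each bee continues with after the loyalty decision."""
    norm_max = max(norm)
    new_paths = []
    for i, path in enumerate(paths):
        loyal = loyalty_probability(norm[i], norm_max, u) > rng.random()
        if i in recruiters or loyal:
            new_paths.append(list(path))            # keep working on its own test suite
        else:
            leader = rng.choice(recruiters)         # follow a randomly chosen recruiter
            new_paths.append(list(paths[leader]))   # copy the recruiter's test suite
    return new_paths


paths = [[(1, 1, 1)], [(3, 4, 5)]]
print(backward_phase(paths, [0.2, 1.0], recruiters=[1], u=1, rng=random.Random(0)))
```

With this formula the loyalty probability grows toward one as u increases, so bees switch solutions easily in early forward phases and rarely in later ones.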

Test Generation
As mentioned before, during every forward phase each bee independently generates a new test and incorporates it into its current solution. When generating these new tests we integrate three main features: graph coverage, local search and randomness. We generate many candidate tests using different techniques, including a focus on branch and condition coverage, local search, pure randomness and adaptive techniques. In other words, we generate n test cases, evaluate the fitness function of the whole test suite, and check for each candidate how much it would improve the overall solution. The bee then uses roulette wheel selection, weighted by that improvement, to choose the test case that is added to its path in that phase.
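The selection step can be sketched as follows, assuming a fitness function that is minimized. The name `pick_test`, the toy fitness function and the uniform fallback when no candidate improves the suite are all assumptions made for illustration.

```python
import random


def pick_test(suite, candidates, suite_fitness, rng):
    """Roulette wheel selection weighted by how much each candidate lowers the suite fitness."""
    base = suite_fitness(suite)
    gains = [max(0.0, base - suite_fitness(suite + [c])) for c in candidates]
    if sum(gains) == 0:
        return rng.choice(candidates)  # assumed fallback: no candidate helps, pick uniformly
    return rng.choices(candidates, weights=gains, k=1)[0]


def toy_fitness(suite):
    """Toy stand-in: number of uncovered outcomes of the decision (x > 8 && x < 50)."""
    return 2 - len({8 < test[0] < 50 for test in suite})


suite = [(20,)]                 # covers only the true outcome
candidates = [(30,), (5,)]      # (5,) would cover the false outcome
print(pick_test(suite, candidates, toy_fitness, random.Random(0)))
```

Because the weights are the improvements themselves, a candidate that adds no coverage is never chosen while a better one is available, yet several useful candidates still compete probabilistically.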

Figure 3. The backward phase of our BCO implementation
For graph coverage, we first generate one test case that targets a specific simple condition in the control flow graph. The test case generation here makes use of the concept of boundary value analysis. For each simple condition with a relational operator, a test case is randomly generated by choosing between three options: on the boundary, or above or below the boundary by a certain percentage. For example, for the simple condition x = 9, the input is generated to cause the variable x to be 9, 9+δ or 9-δ, where δ is a value within 0 to 30% of the value of x, in this case 9.
Hence we will have one test case for each control flow condition. That is, the particular test case will be partially random, with the only requirement being that when the test is performed that particular condition in the program flow graph will be either covered or δ covered.
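The boundary-based generation for a single condition can be sketched as follows. The helper name and the rounding of δ to an integer are assumptions (the paper works with integer inputs but does not say how δ is discretized).

```python
import random


def boundary_value(boundary, rng, max_fraction=0.30):
    """Return the boundary itself, or a value up to 30% of the boundary above or below it."""
    delta = round(rng.uniform(0, max_fraction) * abs(boundary))  # assumed integer rounding
    return rng.choice([boundary, boundary + delta, boundary - delta])


rng = random.Random(0)
print([boundary_value(9, rng) for _ in range(5)])
```

For the boundary 9, δ is at most 2.7 before rounding, so every generated value lies between 6 and 12 inclusive.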
Then we generate a number of test cases equal to the number of input variables that the SUT has; these test cases are slight variations of the last test case in the bee's path (test suite). Each of these test cases is assigned to a specific input parameter and is modified, based on a probability constant, by adding a small change to the input variable that corresponds to it. This amounts to a local search being performed with those test cases.
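A minimal sketch of this local search step is shown below. The probability constant, the maximum step size and the function name are hypothetical values chosen for illustration.

```python
import random


def local_variations(last_test, rng, change_probability=0.5, max_step=3):
    """Create one variant per input variable; variant i may only differ in position i."""
    variants = []
    for i in range(len(last_test)):
        values = list(last_test)
        if rng.random() < change_probability:
            # Apply a small positive or negative change to the i-th input only.
            values[i] += rng.choice([-1, 1]) * rng.randint(1, max_step)
        variants.append(tuple(values))
    return variants


print(local_variations((3, 4, 5), random.Random(1)))
```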
Then we also generate one completely random test case and one adaptive random test case [5]. Adaptive random testing usually outperforms ordinary random testing since the input is more evenly distributed. The adaptive random test case is generated as follows: for a given bee we create a random test case that is the farthest from all test cases used before by that bee. To do this, we first generate ten random test cases, calculate the Euclidean distance between each of them and all the test cases used by the bee, and then choose the random test case that is farthest away.
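The adaptive random step can be sketched as follows. Two details are assumptions: "farthest" is interpreted as maximizing the distance to the nearest existing test case (a common convention in adaptive random testing), and the sampling range `low`/`high` is a placeholder, since a sketch needs some finite range even though the approach itself does not restrict the input domain.

```python
import math
import random


def farthest(candidates, suite):
    """Return the candidate whose nearest neighbour in the suite is farthest away."""
    return max(candidates, key=lambda c: min(math.dist(c, t) for t in suite))


def adaptive_random_test(suite, arity, rng, num_candidates=10, low=-1000, high=1000):
    """Generate ten random candidates and keep the one farthest from the bee's tests."""
    candidates = [
        tuple(rng.randint(low, high) for _ in range(arity))
        for _ in range(num_candidates)
    ]
    if not suite:  # nothing to compare against yet
        return candidates[0]
    return farthest(candidates, suite)


print(adaptive_random_test([(3, 4, 5), (2, 2, 2)], arity=3, rng=random.Random(0)))
```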
The test generation step does not concentrate on one test strategy, whether black box or white box. We do not want our approach to mimic boundary value testing or random testing; we want the algorithm to select either one as required. The bee will choose the test case type that benefits it most during a particular forward phase, which gives the approach adaptability and flexibility.

EVALUATION
The evaluation of search based testing approaches is a challenging process since there is no general agreement on the best evaluation strategy. It is usually very difficult to compare one approach to previous approaches, since each approach has different constraints such as the range of values used, the component under test (whether it is a standalone function or a class), the number of test cases generated or the size of the test suite. In addition, most SBST approaches are randomized algorithms and therefore produce different results on multiple runs. On the other hand, comparing SBST techniques that are white box strategies with black box testing strategies such as random testing does not really help, since each strategy has its own merits.
We opted to evaluate our approach through the use of mutation testing. After all, the ultimate question is whether the generated test suite is adequate. We ran our algorithm until it reached 100% condition/decision coverage or completed a fixed number of forward phases. The case study used here is the triangle program, which is a very popular example and a benchmark in the software testing literature. The program receives the lengths of the three sides of a triangle as input. The value of each side is an integer. The output of the function is a string that indicates either that the input does not form a triangle or the type of the triangle: isosceles, equilateral or scalene. The code, which is based on [19], is shown in Figure 4, while Figure 5 shows its corresponding control flow diagram. The optimal test suite size for this problem is 5.
We generated 46 different mutants from the original triangle program. Twenty-eight of them are first order mutants, where a single fault is introduced into the original program using one of the mutation operators. The rest of the mutants are higher order, where each mutant contains two to three different mutations. An example of the generated mutants can be seen in Figure 6; the highlighted code shows two mutations. We then ran our BCO algorithm several times and obtained thirteen different solutions. As described before, each solution consists of a complete test suite. Table 2 describes each of the generated test suites, including the number of test cases comprising the solution, the range of the input values used, the number of bees, the number of recruiters and the code coverage. We varied the values of the first four parameters to determine their effect on the quality of the test suite.

Figure 4. The triangle type program
We then evaluated each test suite solution against all the generated mutants to see whether the set of tests can detect all the software defects. Table 3 shows the results of our experiment for all tests and all mutants. The letter Y denotes that the given test suite killed the mutant, while an N indicates that the mutant was not killed. It is interesting to notice that most mutants were killed by all the test suites. Higher order mutants, as expected, were more challenging than first order mutants. Seven of the mutants were not killed, one of which (Figure 6) is an equivalent mutant. The remaining live mutants contain on-the-boundary faults, such as if (a > b) mutated into if (a >= b); these types of faults are hard to kill. One final thing to notice is that although all test suites have the same coverage percentage, some of them were unable to kill a mutant that others killed. For example, the 2nd and 11th test suites were unable to detect the 46th mutant, which contains an on-the-boundary fault. The rest of the test suites were capable of killing this mutant.
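The difficulty of on-the-boundary mutants can be illustrated with a small sketch. The functions below are hypothetical stand-ins, not the triangle program from Figure 4; they only isolate the if (a > b) to if (a >= b) mutation mentioned above.

```python
def original(a, b):
    """Hypothetical function under test containing the decision (a > b)."""
    return "greater" if a > b else "not greater"


def mutant(a, b):
    """On-the-boundary mutant: (a > b) replaced by (a >= b)."""
    return "greater" if a >= b else "not greater"


def kills(suite, program, mutated):
    """A suite kills a mutant when some test case makes the two outputs differ."""
    return any(program(*test) != mutated(*test) for test in suite)


suite_a = [(5, 3), (2, 7)]  # exercises both branches, but never with a == b
suite_b = [(5, 3), (4, 4)]  # also exercises both branches, and hits the boundary

print(kills(suite_a, original, mutant), kills(suite_b, original, mutant))
```

Both suites achieve the same decision coverage of the original function, yet only the one that contains a test case exactly on the boundary distinguishes the mutant. This is consistent with the observation that test suites with identical coverage can differ in the mutants they kill.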

CONCLUSIONS
In this paper we used a bee colony optimization technique to perform automated test generation. This is one of the first works that applies this kind of swarm intelligence technique to the problem of software testing. Our work improves on previous approaches by not limiting the input domain, by avoiding random initial solutions through a dynamic constructive approach, by using test suites as solutions rather than independent tests, and by taking multiple condition coverage into consideration.

Table 2. Summary of the test suites used
Table 3. Mutation Results
We evaluated our approach using the popular triangle type program and a set of mutants generated from it. We then verified how well the test suites generated by our approach were able to detect the defects in the mutated programs. The results are promising and encouraging, as almost 90% of the software defects were found.
In the future we plan to improve our approach by experimenting with different selection techniques. In this paper we used roulette wheel selection, and we plan to try alternatives like tournament selection. We will try to test our BCO approach on software systems that contain more

Figure 5. CFG for the triangle program

Figure 6. An example of a mutant of the triangle program: String triangleType(int b, int a, int c) { String type; if (a > b) ...