Sometimes we are not only interested in fuzzing as many diverse program inputs as possible, but in deriving *specific* test inputs that achieve some objective, such as reaching specific statements in a program. When we have an idea of what we are looking for, then we can *search* for it. Search algorithms are at the core of computer science, but applying classic algorithms like breadth-first or depth-first search to search for tests is unrealistic, because these algorithms would potentially require us to look at all possible inputs. However, domain knowledge can be used to overcome this problem. For example, if we can estimate which of several program inputs is closest to the one we are looking for, then this information can guide us to reach the target quicker; such an estimate is known as a *heuristic*. The way heuristics are applied systematically is captured in *meta-heuristic* search algorithms. The "meta" denotes that these algorithms are generic and can be instantiated for different problems. Meta-heuristics often take inspiration from processes observed in nature; for example, there are algorithms mimicking evolutionary processes, swarm intelligence, or chemical reactions. In general, they are much more efficient than exhaustive search, so they can be applied to vast search spaces, even search spaces as vast as the domain of program inputs.

**Prerequisites**

- You should know how code coverage works, e.g. from the chapter on coverage.

If we want to apply a meta-heuristic search algorithm to generate test data for a program, then we have to make several choices. First, we need to decide what exactly our *search space* is. The search space is defined by how we *represent* what we are looking for. Are we looking for single integer values? Tuples of values? Objects? XML documents?

The representation is highly dependent on the particular testing problem we are solving --- we know which program we are testing, so the representation needs to encode whatever constitutes an input to our target program. Let's consider the example function `test_me()` as our function under test:

In [4]:

```
def test_me(x, y):
    if x == 2 * (y + 1):
        return True
    else:
        return False
```

The `test_me()` function has two input parameters and returns `True` or `False` depending on how the two relate to each other. A test input to `test_me()` consists of a pair of values, one for `x` and one for `y`. For example:

In [5]:

```
test_me(0, 0)
```

Out[5]:

False

In [6]:

```
test_me(4, 2)
```

Out[6]:

False

In [7]:

```
test_me(22, 10)
```

Out[7]:

True

Our search space is only concerned with inputs, so a simple representation for test data would be input tuples `(x, y)`. Each point in this input space has eight *neighbors*:

`(x-1, y-1)`, `(x-1, y)`, `(x-1, y+1)`, `(x, y+1)`, `(x+1, y+1)`, `(x+1, y)`, `(x+1, y-1)`, `(x, y-1)`

To keep things simple, let's restrict the size of our search space to start with (we will change this later). For example, let's assume we only want values in the range of -1000 to 1000:

In [8]:

```
MAX = 1000
MIN = -MAX
```

With these boundaries, we can define a function `neighbors()`, which implements a basic Moore neighborhood. That is, we look at all 8 immediate neighbors, while respecting the boundaries we just defined with `MAX` and `MIN`:

In [9]:

```
def neighbors(x, y):
    return [(x + dx, y + dy) for dx in [-1, 0, 1]
            for dy in [-1, 0, 1]
            if (dx != 0 or dy != 0)
            and ((MIN <= x + dx <= MAX)
                 and (MIN <= y + dy <= MAX))]
```

In [10]:

```
print(neighbors(10, 10))
```

[(9, 9), (9, 10), (9, 11), (10, 9), (10, 11), (11, 9), (11, 10), (11, 11)]

All meta-heuristics are based on a heuristic function that estimates how good a given candidate solution is; this "goodness" is typically called the *fitness* of an individual, and the heuristic that estimates it is the *fitness function*. The fitness function maps any point in the search space to a numerical value, the fitness value. The closer a candidate solution in the search space is to an optimal solution, the better its fitness value. Thus, if you plot each point in the search space with its fitness value as the height, you get a landscape in which the optimal solution is the highest peak.

The fitness function depends on the objective one would like to achieve when generating the test data. Suppose that we are interested in covering the true branch of the if-condition in the `test_me()` function, i.e. `x == 2 * (y + 1)`.

How close is a given input tuple for this function to reaching the target branch? Let's consider an arbitrary point in the search space, e.g. `(274, 153)`. The if-condition compares the following values:

In [11]:

```
x = 274
y = 153
x, 2 * (y + 1)
```

Out[11]:

(274, 308)

To make the condition true, the two sides of the comparison need to be equal; the closer they are to each other, the closer we are to covering the target branch. A natural estimate is therefore the difference between `x` and `2 * (y + 1)`. Thus, we can calculate this distance as `abs(x - 2 * (y + 1))`:

In [12]:

```
def calculate_distance(x, y):
    return abs(x - 2 * (y + 1))
```

In [13]:

```
calculate_distance(274, 153)
```

Out[13]:

34

We can use this distance value as our fitness function, since we can nicely measure how close we are to an optimal solution. Note, however, that "better" doesn't mean "bigger" in this case; the smaller the distance the better. This is not a problem, since any algorithm that can maximize a value can also be made to minimize it instead.
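To see this concretely, here is a small sketch (the candidate tuples are arbitrary illustrative values, and `calculate_distance()` is repeated for self-containment): maximizing the negated distance selects the same input as minimizing the distance itself.

```python
def calculate_distance(x, y):
    return abs(x - 2 * (y + 1))

# A few sample inputs (arbitrary choices for illustration)
candidates = [(274, 153), (275, 152), (22, 10)]

# Minimizing the distance and maximizing its negation select the same input
best_min = min(candidates, key=lambda c: calculate_distance(*c))
best_max = max(candidates, key=lambda c: -calculate_distance(*c))
assert best_min == best_max == (22, 10)
```

Only the sign convention changes; the candidate that is chosen stays the same.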

For each value in the search space of integer tuples, this distance value defines the elevation in our search landscape. Since our example search space is two-dimensional, the search landscape is three-dimensional and we can plot it to see what it looks like:

In [16]:

```
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

xx = np.outer(np.linspace(-10, 10, 30), np.ones(30))
yy = xx.copy().T
zz = calculate_distance(xx, yy)

fig = plt.figure()
ax = plt.axes(projection='3d')
ax.plot_surface(xx, yy, zz, cmap=plt.cm.jet, rstride=1, cstride=1, linewidth=0);
```

The fitness function should calculate the distance value for concrete test executions. That is, we want to run the program, and then learn the distance value of this execution. However, the branching condition is hidden within the source code of the target function, and its value may in principle be the result of various calculations along the execution path reaching it. Even though in our example the condition is an equation using the input values of the function directly, this may not be the case in general; it might as well be derived values. Thus, the values we need to calculate our distance metric need to be observed directly at the conditional statement.

This is typically done with *instrumentation*: We add new code immediately before or after the branching condition to keep track of the values observed and calculate the distance using these values. The following is an instrumented version of our program under test, which prints out the distance values as it is executed:

In [17]:

```
def test_me_instrumented(x, y):
    print("Instrumentation: Input = (%d, %d), distance = %d" %
          (x, y, calculate_distance(x, y)))
    if x == 2 * (y + 1):
        return True
    else:
        return False
```

Let's try this out for a couple of example values:

In [18]:

```
test_me_instrumented(0, 0)
```

Instrumentation: Input = (0, 0), distance = 2

Out[18]:

False

In [19]:

```
test_me_instrumented(5, 2)
```

Instrumentation: Input = (5, 2), distance = 1

Out[19]:

False

In [20]:

```
test_me_instrumented(22, 10)
```

Instrumentation: Input = (22, 10), distance = 0

Out[20]:

True

Printing the distance is fine for illustration, but during a search we need to access the value programmatically. A simple way is to store it in a global variable, which the instrumented function updates on every execution:

In [21]:

```
distance = 0
```

In [22]:

```
def test_me_instrumented(x, y):  # type: ignore
    global distance
    distance = calculate_distance(x, y)
    if x == 2 * (y + 1):
        return True
    else:
        return False
```

Using this instrumented version of `test_me()`, we can now finally define our fitness function, which simply runs the instrumented `test_me_instrumented()` function and then retrieves the value of the global `distance` variable:

In [23]:

```
def get_fitness(x, y):
    global distance
    test_me_instrumented(x, y)
    fitness = distance
    return fitness
```

Let's try this on some example inputs:

In [24]:

```
get_fitness(0, 0)
```

Out[24]:

2

In [25]:

```
get_fitness(1, 2)
```

Out[25]:

5

In [26]:

```
get_fitness(22, 10)
```

Out[26]:

0


Having decided on a representation (2-tuples of integers) and a fitness function (the distance to the target branch), we can now finally go ahead and implement our search algorithm. Let's explore this search space using the simplest possible meta-heuristic algorithm: hillclimbing. The metaphor aptly captures what is happening: the algorithm tries to climb a hill in the search space defined by our representation. Except that in our search landscape the best values are not high up but down low, so technically we are descending into valleys.

The hillclimbing algorithm itself is very simple:

- Take a random starting point
- Determine fitness value of all neighbors
- Move to neighbor with the best fitness value
- If solution is not found, continue with step 2

The search starts at a random point in the search space, i.e., with random values for `x` and `y`. For any pair of random integer numbers, the chances of them satisfying the condition `x == 2 * (y + 1)` are rather slim. Suppose the random values are `(274, 153)`. The right-hand side of the equation, `2 * (y + 1)`, evaluates to 308, so the condition is clearly false. Where should the hillclimber go now? Let's look at the fitness values of this test input and its neighbors:

In [27]:

```
x, y = 274, 153
print("Origin %d, %d has fitness %d" % (x, y, get_fitness(x, y)))
for nx, ny in neighbors(x, y):
    print("neighbor %d, %d has fitness %d" % (nx, ny, get_fitness(nx, ny)))
```

Increasing `y` by one increases the value of the right-hand side of the equation to `310`. Thus, the value on the left-hand side differs *even more* from the value on the right-hand side than it did before the increase! So, increasing `y` does not seem like a good idea. On the other hand, increasing `x` by one improves things: the left-hand side and the right-hand side of the equation become more similar; they are "less unequal". Thus, out of the eight possible neighbors of `(274, 153)`, the neighbor that increases `x` and decreases `y` (`(275, 152)`) intuitively seems best --- the outcome of the condition is still false, but it is "less so" than for the original value.

Let's now implement the hillclimbing algorithm.

In [29]:

```
LOG_VALUES = 20 # Number of values to log
```

In [30]:

```
import random

def hillclimber():
    # Create and evaluate starting point
    x, y = random.randint(MIN, MAX), random.randint(MIN, MAX)
    fitness = get_fitness(x, y)
    print("Initial value: %d, %d at fitness %.4f" % (x, y, fitness))
    iterations = 0
    logs = 0

    # Stop once we have found an optimal solution
    while fitness > 0:
        iterations += 1

        # Move to first neighbor with a better fitness
        for (nextx, nexty) in neighbors(x, y):
            new_fitness = get_fitness(nextx, nexty)

            # Smaller fitness values are better
            if new_fitness < fitness:
                x, y = nextx, nexty
                fitness = new_fitness
                if logs < LOG_VALUES:
                    print("New value: %d, %d at fitness %.4f" % (x, y, fitness))
                elif logs == LOG_VALUES:
                    print("...")
                logs += 1
                break

    print("Found optimum after %d iterations at %d, %d" % (iterations, x, y))
```

In [31]:

```
hillclimber()
```

The hillclimber starts with random values for `x` and `y`. We use low values in the range of `-1000`--`1000` (which we defined `MIN` and `MAX` to be earlier) to reduce the time the search takes when playing with the example. Then, we determine the fitness value of this starting point by calling `get_fitness()`. Recall that we are trying to find the smallest possible fitness value, therefore we now loop until we have found a fitness value of `0` (i.e., an optimal value).

Within this loop, we consider all neighboring inputs (as defined by `neighbors()`), and evaluate the fitness value of each of them. As soon as we have found a neighbor with a better (smaller) fitness, the hillclimber exits the loop and uses this as the new starting point. An alternative variant of this simple hillclimbing algorithm is to remove the `break` statement: by doing so, *all* neighbors are evaluated, and the best neighbor is chosen. This is known as *steepest ascent hillclimbing*. You will see that the number of iterations necessary to reach the optimum is lower, although for each iteration more tests are executed.

In [32]:

```
def steepest_ascent_hillclimber():
    # Create and evaluate starting point
    x, y = random.randint(MIN, MAX), random.randint(MIN, MAX)
    fitness = get_fitness(x, y)
    print("Initial value: %d, %d at fitness %.4f" % (x, y, fitness))
    iterations = 0
    logs = 0

    # Stop once we have found an optimal solution
    while fitness > 0:
        iterations += 1

        # Evaluate all neighbors; keep the one with the best fitness
        for (nextx, nexty) in neighbors(x, y):
            new_fitness = get_fitness(nextx, nexty)
            if new_fitness < fitness:
                x, y = nextx, nexty
                fitness = new_fitness
                if logs < LOG_VALUES:
                    print("New value: %d, %d at fitness %.4f" % (x, y, fitness))
                elif logs == LOG_VALUES:
                    print("...")
                logs += 1

    print("Found optimum after %d iterations at %d, %d" % (iterations, x, y))
```

In [33]:

```
steepest_ascent_hillclimber()
```

To observe how the fitness evolves during the search, let's create a variant of the hillclimber that records the fitness value of every step, so that we can plot them afterwards:

In [34]:

```
def plotting_hillclimber(fitness_function):
    data = []

    # Create and evaluate starting point
    x, y = random.randint(MIN, MAX), random.randint(MIN, MAX)
    fitness = fitness_function(x, y)
    data += [fitness]
    iterations = 0

    # Stop once we have found an optimal solution
    while fitness > 0:
        iterations += 1

        # Move to first neighbor with a better fitness
        for (nextx, nexty) in neighbors(x, y):
            new_fitness = fitness_function(nextx, nexty)
            if new_fitness < fitness:
                x, y = nextx, nexty
                fitness = new_fitness
                data += [fitness]
                break

    print("Found optimum after %d iterations at %d, %d" % (iterations, x, y))
    return data
```

In [35]:

```
data = plotting_hillclimber(get_fitness)
```

Found optimum after 429 iterations at -1000, -501

In [36]:

```
fig = plt.figure()
ax = plt.axes()
xs = range(len(data))
ax.plot(xs, data);
```

The hillclimber worked nicely on our example function. But what happens if *none* of the neighbors actually has a better fitness value? Consider the following function `test_me2`:

In [37]:

```
def test_me2(x, y):
    if (x * x == y * y * (x % 20)):
        return True
    else:
        return False
```

If we aim to cover the true branch of the if-condition in `test_me2`, then we can calculate the distance in the same way as previously, i.e., by calculating the difference between the two sides of the comparison. Let's instrument the `test_me2()` function:

In [38]:

```
def test_me2_instrumented(x, y):
    global distance
    distance = abs(x * x - y * y * (x % 20))
    if (x * x == y * y * (x % 20)):
        return True
    else:
        return False
```

As before, the fitness function simply runs the instrumented function and retrieves the value of the global `distance` variable:

In [39]:

```
def bad_fitness(x, y):
    global distance
    test_me2_instrumented(x, y)
    fitness = distance
    return fitness
```

Before running a search on this function, let's look at its fitness landscape:

In [40]:

```
xx = np.outer(np.linspace(-10, 10, 30), np.ones(30))
yy = xx.copy().T
zz = abs(xx * xx - yy * yy * (xx % 20)) # type: ignore
```

In [41]:

```
fig = plt.figure()
ax = plt.axes(projection='3d')
ax.plot_surface(xx, yy, zz, cmap=plt.cm.jet, rstride=1, cstride=1, linewidth=0); # type: ignore
```

The landscape contains points from which no neighbor offers an improvement, even though the point is not the global optimum; such a point is called a *local optimum*. If one reaches such a point, the easiest way out is to give up and restart from a new random point. This is what we do in our hillclimber with *random restarts*:

In [42]:

```
def restarting_hillclimber(fitness_function):
    data = []

    # Create and evaluate starting point
    x, y = random.randint(MIN, MAX), random.randint(MIN, MAX)
    fitness = fitness_function(x, y)
    data += [fitness]
    print("Initial value: %d, %d at fitness %.4f" % (x, y, fitness))
    iterations = 0

    # Stop once we have found an optimal solution
    while fitness > 0:
        changed = False
        iterations += 1

        # Move to first neighbor with a better fitness
        for (nextx, nexty) in neighbors(x, y):
            new_fitness = fitness_function(nextx, nexty)
            if new_fitness < fitness:
                x, y = nextx, nexty
                fitness = new_fitness
                data += [fitness]
                changed = True
                break

        # If no neighbor was better, restart from a new random point
        if not changed:
            x, y = random.randint(MIN, MAX), random.randint(MIN, MAX)
            fitness = fitness_function(x, y)
            data += [fitness]

    print("Found optimum after %d iterations at %d, %d" % (iterations, x, y))
    return data
```

Let's run the restarting hillclimber with `bad_fitness` and plot the fitness values we observe:

In [43]:

```
MAX = 1000
MIN = -MAX
```

In [44]:

```
data = restarting_hillclimber(bad_fitness)
```

Initial value: 333, 231 at fitness 582804.0000

Found optimum after 165 iterations at 521, 521

In [46]:

```
fig = plt.figure()
ax = plt.axes()
xs = range(len(data))
ax.plot(xs, data);
```

Note that we restricted the values of `x` and `y` to the rather small range of `[MIN, MAX]`. This is a common trick in test generation: in *most* cases, solutions tend to consist of small values, and starting the search with small values often makes the search quicker. However, what if the solution we need is at a completely different place in our search space? Our bias towards smaller solutions would mean that the hillclimber takes very long to find the solution, and given a fixed search budget it would thus be less likely to actually find one. To see what effect this has, we can simply replace the `1000` with, say, `1000000` or more. We can play around with the range to see the performance we get for our simple search problems.

In [47]:

```
MAX = 100000
MIN = -MAX
```

In [49]:

```
with Timer() as t:
    restarting_hillclimber(get_fitness)
print("Search time: %.2fs" % t.elapsed_time())
```

Let's turn to a slightly more complex program: The CGI decoder you already know from the Coverage chapter.

In [50]:

```
def cgi_decode(s):
    """Decode the CGI-encoded string `s`:
       * replace "+" by " "
       * replace "%xx" by the character with hex number xx.
       Return the decoded string. Raise `ValueError` for invalid inputs."""

    # Mapping of hex digits to their integer values
    hex_values = {
        '0': 0, '1': 1, '2': 2, '3': 3, '4': 4,
        '5': 5, '6': 6, '7': 7, '8': 8, '9': 9,
        'a': 10, 'b': 11, 'c': 12, 'd': 13, 'e': 14, 'f': 15,
        'A': 10, 'B': 11, 'C': 12, 'D': 13, 'E': 14, 'F': 15,
    }

    t = ""
    i = 0
    while i < len(s):
        c = s[i]
        if c == '+':
            t += ' '
        elif c == '%':
            digit_high, digit_low = s[i + 1], s[i + 2]
            i += 2
            if digit_high in hex_values and digit_low in hex_values:
                v = hex_values[digit_high] * 16 + hex_values[digit_low]
                t += chr(v)
            else:
                raise ValueError("Invalid encoding")
        else:
            t += c
        i += 1
    return t
```

The `cgi_decode()` function has one input of type string, and one possible way to define the neighborhood of a string is the set of all strings that have an edit distance of 1. For example, the string `test` would have two neighbors for each of its four characters:

`uest`, `tfst`, `tett`, `tesu`, `sest`, `tdst`, `tert`, `tess`

In addition, prepending any character or appending any character would also have an edit distance of 1 and could be considered neighbors. To keep things simple, let's keep the length of our input strings fixed to a reasonable value (e.g. 10). In this case, each individual has 20 neighbors (i.e., each character has two neighbors).

Let's implement a new `neighbor_strings()` function that produces these neighboring strings:

In [51]:

```
def neighbor_strings(x):
    n = []
    for pos in range(len(x)):
        c = ord(x[pos])
        if c < 126:
            n += [x[:pos] + chr(c + 1) + x[pos + 1:]]
        if c > 32:
            n += [x[:pos] + chr(c - 1) + x[pos + 1:]]
    return n
```

The `neighbor_strings()` function takes the numerical value of each character in the input string, and creates two new strings, one with the character replaced by its predecessor and one with its successor. To start with, we only consider printable ASCII characters, which are in the range 32--126.

In [52]:

```
print(neighbor_strings("Hello"))
```

['Iello', 'Gello', 'Hfllo', 'Hdllo', 'Hemlo', 'Heklo', 'Helmo', 'Helko', 'Hellp', 'Helln']

With this, we have defined the neighborhood of the search space for the `cgi_decode()` function. The next ingredient to searching for individuals in this search space is a fitness function.

The `test_me()` function consisted of a single if-condition, in which two integer numbers were compared. In the `cgi_decode()` function we have three if-conditions and one while loop, and they all compare characters. Fortunately, as we have already seen, we can treat characters like numbers, so we can use the same distance estimate as in the `test_me()` example. However, there are also two conditions which check whether a character is contained in a set, e.g. `digit_high in hex_values`. How close is a value to being contained in a set? An obvious solution is to take the distance to the closest value in the set as the estimate:

In [54]:

```
import sys

def distance_character(target, values):
    # Initialize with a very large value so that any comparison is better
    minimum = sys.maxsize
    for elem in values:
        distance = abs(target - elem)
        if distance < minimum:
            minimum = distance
    return minimum
```

In [55]:

```
distance_character(10, [1, 5, 12, 100])
```

Out[55]:

2

In [56]:

```
distance_character(10, [0, 50, 80, 200])
```

Out[56]:

10

The distance we have considered so far estimates how close a condition is to evaluating to *true*. Sometimes, however, we want a condition to evaluate to *false*, for which we need a corresponding false distance. For an equality comparison `a == b`, if `a` is smaller or greater than `b`, then the condition is already false, so the false distance is `0` by definition.

But what is the distance of `a == b` being false when `a` equals `b`? Any change to either `a` or `b` would make the condition evaluate to false, so we can define the distance simply as 1 in this case.

More generally, there can be other types of comparisons, for example using relational operators. Consider the loop condition in `cgi_decode()`: `i < len(s)`, which uses a less-than comparison. It is quite straightforward to extend our notion of branch distance to cover different types of comparisons, and to calculate true and false distances. The following table shows how to calculate the distance for different types of comparisons:

Condition | Distance True | Distance False
---|---|---
a == b | abs(a - b) | 1
a != b | 1 | abs(a - b)
a < b | a - b + 1 | b - a
a <= b | a - b | b - a + 1
a > b | b - a + 1 | a - b

Note that several of the calculations add a constant `1`. The reason for this is quite simple: suppose we want `a < b` to evaluate to true, and let `a = 27` and `b = 27`. The condition is not true, but simply taking the difference would give us a result of `0`. To avoid this, we have to add a constant value. It is not important that this value is exactly `1` --- any positive constant works.
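As a sketch of how the table translates into code, the following helper computes both distances for a comparison. The name `branch_distance()` is our own, not part of the chapter's code, and we clamp negative raw differences at zero, since the side of the comparison that already holds has distance 0:

```python
def branch_distance(a, b, op):
    """Return (true distance, false distance) for the comparison `a op b`,
    following the table above. Negative raw differences mean the
    corresponding outcome already holds, so they are clamped to zero."""
    if op == "==":
        return (abs(a - b), 0) if a != b else (0, 1)
    if op == "!=":
        return (1, 0) if a == b else (0, abs(a - b))
    if op == "<":
        return (max(a - b + 1, 0), max(b - a, 0))
    if op == "<=":
        return (max(a - b, 0), max(b - a + 1, 0))
    if op == ">":
        return (max(b - a + 1, 0), max(a - b, 0))
    raise ValueError("Unknown operator: %s" % op)
```

For instance, `branch_distance(27, 27, "<")` yields `(1, 0)`: the condition is false, and the `+ 1` ensures the true distance does not collapse to zero.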

In the `cgi_decode()` function, we can also find a somewhat more complex predicate, which consists of two conditions joined by a logical `and`:

`if digit_high in hex_values and digit_low in hex_values:`

In principle, the branch distance is defined such that the distance to make a conjunction `A and B` true equals the sum of the branch distances for `A` and `B`, as both conditions need to be true. Similarly, the branch distance to make `A or B` true is the minimum of the two branch distances of `A` and `B`, as it suffices for one of the two conditions to be true to make the entire expression true.
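In code, this combination rule is a one-liner each. A minimal sketch, assuming `dist_a` and `dist_b` are the already-computed true distances of the two subconditions (and ignoring, for the moment, the short-circuit caveats discussed next):

```python
def conjunction_distance(dist_a, dist_b):
    # A and B: both subconditions must become true, so distances add up
    return dist_a + dist_b

def disjunction_distance(dist_a, dist_b):
    # A or B: one true subcondition suffices, so take the closer one
    return min(dist_a, dist_b)
```

For example, with true distances 3 and 4 for the two subconditions, the conjunction has distance 7, while the disjunction has distance 3.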

However, in practice it is not quite as easy as that: predicates can consist of nested conditions and negations, and one would need to convert the expression to canonical form before applying this calculation. Furthermore, most modern programming languages use *short-circuit evaluation*: if there is a condition `A or B` and `A` is true, then `B` is never evaluated. If `B` is an expression with side effects, then calculating the branch distance of `B` even though short-circuit evaluation would have skipped it changes the program behavior (by invoking a side effect that would normally not occur), and that is not acceptable.
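One way to sidestep this problem (a sketch of our own, not the chapter's approach; `lazy_or` and `side_effect` are hypothetical helpers) is to pass the operands as zero-argument callables, so that the second operand is only evaluated when the first does not already decide the outcome:

```python
def lazy_or(eval_a, eval_b):
    """Evaluate `a or b` with operands wrapped as zero-argument callables,
    so the second operand's side effects occur only if the first operand
    is falsy, mirroring Python's own short-circuit evaluation."""
    if eval_a():
        return True
    return bool(eval_b())

calls = []

def side_effect(name, value):
    calls.append(name)  # record that this operand was actually evaluated
    return value

result = lazy_or(lambda: side_effect("A", True),
                 lambda: side_effect("B", True))
assert result is True
assert calls == ["A"]  # "B" was never evaluated: no spurious side effect
```

An instrumentation scheme based on such wrappers could compute distances for the operands that actually ran, without triggering side effects that normal execution would have skipped.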

Furthermore, what if the branching condition itself has side effects? For example, suppose that the branching condition were `x == 2 * foo(y)`, where `foo()` is a function that takes an integer as input. Naively instrumenting would lead to the following code:

```
distance = abs(x - 2 * foo(y))
if x == 2 * foo(y):
    ...
```

Thus, the instrumentation would lead to `foo()` being executed *twice*. Suppose `foo()` changes the state of the system (e.g., by printing something, accessing the file system, or changing some state variables); then clearly invoking `foo()` a second time is a bad idea. One way to overcome this problem is to *transform* the conditions, rather than *adding* tracing calls. For example, one can create temporary variables that hold the values necessary for the distance calculation and then use these in the branching condition:

```
tmp1 = x
tmp2 = 2 * foo(y)
distance = abs(tmp1 - tmp2)
if tmp1 == tmp2:
    ...
```

Of course, we would not want to transform every condition in the `cgi_decode()` function by hand. Rather, we should be looking at how to *automatically* instrument programs so that they contain the statements necessary to calculate fitness values.

An alternative approach to using global and temporary variables is to replace the actual comparison with a call to a helper function, in which the original expressions are evaluated as arguments, and the operator is an additional argument. Assume we have a function `evaluate_condition()` which takes four parameters:

- `num` is a unique id that identifies the condition;
- `op` is the operator of the comparison;
- `lhs` and `rhs` are the operands.

The function calculates two distances for the condition: the distance to the condition evaluating to true, and the distance to the condition evaluating to false. One of the two outcomes will always hold, and thus one of the distances will always be `0`. Since the function replaces the original comparison, it returns true or false, depending on which distance is `0`. That means the example expression

```
if x == 2 * foo(y)
```

would be replaced by

```
if evaluate_condition(0, "Eq", x, 2 * foo(y))
```

such that the arguments are only evaluated once, and side effects are thus handled correctly. Here is what the `evaluate_condition()` function looks like:

In [57]:

```
def evaluate_condition(num, op, lhs, rhs):
    distance_true = 0
    distance_false = 0
    if op == "Eq":
        if lhs == rhs:
            distance_false = 1
        else:
            distance_true = abs(lhs - rhs)

    # ... code for other types of conditions

    if distance_true == 0:
        return True
    else:
        return False
```

The function initializes both `distance_true` and `distance_false` with `0`, and only one of them is then overwritten: if `lhs` equals `rhs`, then `distance_true` remains 0, and vice versa.

In [58]:

```
evaluate_condition(1, "Eq", 10, 20)
```

Out[58]:

False

In [59]:

```
evaluate_condition(2, "Eq", 20, 20)
```

Out[59]:

True

What the `evaluate_condition()` function does not yet do is store the distances observed. Obviously, we need to store the values somewhere so that we can access them from our fitness function. Since the `cgi_decode()` program consists of several conditions, and for each condition we might be interested in the true and the false distance, we simply use two global dictionaries, `distances_true` and `distances_false`, and define a helper function that stores observed distance values in the corresponding dictionary:

In [60]:

```
def update_maps(condition_num, d_true, d_false):
    global distances_true, distances_false

    if condition_num in distances_true.keys():
        distances_true[condition_num] = min(
            distances_true[condition_num], d_true)
    else:
        distances_true[condition_num] = d_true

    if condition_num in distances_false.keys():
        distances_false[condition_num] = min(
            distances_false[condition_num], d_false)
    else:
        distances_false[condition_num] = d_false
```

Here, `condition_num` is the unique ID of the condition that we've just evaluated. If this is the first time that we have executed this particular condition, then the true and false distances are simply stored in the corresponding dictionaries. However, the same test may execute a condition multiple times; for example, the loop condition `i < len(s)` in the `cgi_decode()` function is evaluated before every single loop iteration. In the end, however, we want a single fitness value per test. As covering a branch only requires that at least one of the executions reaches it, we consider only the closest one. Therefore, if the `distances_true` and `distances_false` dictionaries already contain a distance from a previous execution, we replace it only if the new execution was closer to reaching the branch; this is implemented using the `min()` function.
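The effect of `min()` can be seen in a small standalone sketch; `record()` is a condensed stand-in for the `update_maps()` helper, tracking only the true distance for brevity:

```python
# Repeated executions of the same condition keep only the minimum
# (closest) distance observed, just as in update_maps() above.
distances_true = {}

def record(condition_num, d_true):
    distances_true[condition_num] = min(
        distances_true.get(condition_num, d_true), d_true)

record(7, 10)  # first execution of condition 7: distance 10 stored
record(7, 3)   # a closer execution overwrites the stored distance
record(7, 8)   # a farther execution does not
assert distances_true[7] == 3
```
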

Now we can extend `evaluate_condition()` to store the observed distances via `update_maps()`. Let's also add the distance calculations for the `in` operator and the `<` comparison, since we need both of them for the `cgi_decode()` example. Furthermore, `cgi_decode()` actually compares characters and numbers; to make sure we use the correct types, we first have to convert the characters to numbers when calculating distances. This is done using Python's `ord()` function.

In [61]:

```
def evaluate_condition(num, op, lhs, rhs):  # type: ignore
    distance_true = 0
    distance_false = 0

    # Make sure the distance can be calculated on number and character
    # comparisons
    if isinstance(lhs, str):
        lhs = ord(lhs)
    if isinstance(rhs, str):
        rhs = ord(rhs)

    if op == "Eq":
        if lhs == rhs:
            distance_false = 1
        else:
            distance_true = abs(lhs - rhs)
    elif op == "Lt":
        if lhs < rhs:
            distance_false = rhs - lhs
        else:
            distance_true = lhs - rhs + 1
    # ...
    # handle other comparison operators
    # ...
    elif op == "In":
        minimum = sys.maxsize
        for elem in rhs.keys():
            distance = abs(lhs - ord(elem))
            if distance < minimum:
                minimum = distance
        distance_true = minimum
        if distance_true == 0:
            distance_false = 1

    update_maps(num, distance_true, distance_false)

    if distance_true == 0:
        return True
    else:
        return False
```

The following shows how the conjunction from `cgi_decode()` is instrumented to make use of the `evaluate_condition()` function. There are two calls to `evaluate_condition()` corresponding to the two conditions, and the operator `and` with which they are conjoined ensures that the original short-circuiting behavior is preserved:

`if (evaluate_condition(4, 'In', digit_high, hex_values) and evaluate_condition(5, 'In', digit_low, hex_values))`

Of course we would like to automatically produce this instrumented version.

Replacing comparisons automatically is actually quite easy in Python, using the abstract syntax tree (AST) of the program. In the AST, a comparison is typically a tree node with an operator attribute and two children for the left-hand and right-hand operands. To replace such comparisons with a call to `evaluate_condition()`, one simply needs to replace the comparison node in the AST with a function call node, and this is what the `BranchTransformer` class does, using a `NodeTransformer` from Python's `ast` module:

In [63]:

```
class BranchTransformer(ast.NodeTransformer):
    branch_num = 0

    def visit_FunctionDef(self, node):
        node.name = node.name + "_instrumented"
        return self.generic_visit(node)

    def visit_Compare(self, node):
        # Skip operators we do not handle; note that `node.ops` holds
        # operator *instances*, so we need an isinstance check here
        if isinstance(node.ops[0], (ast.Is, ast.IsNot, ast.NotIn)):
            return node

        self.branch_num += 1
        return ast.Call(func=ast.Name("evaluate_condition", ast.Load()),
                        args=[ast.Constant(self.branch_num),
                              ast.Constant(node.ops[0].__class__.__name__),
                              node.left,
                              node.comparators[0]],
                        keywords=[])
```

The `BranchTransformer` operates on an AST, as produced from a target Python program by the built-in parser `ast.parse()`. Python provides an API to traverse and modify this AST. To replace the comparison with a function call we use an `ast.NodeTransformer`, which uses the visitor pattern and offers one `visit_*` method for each type of node in the AST. As we are interested in replacing comparisons, we override `visit_Compare`, where instead of the original comparison node we return a new node of type `ast.Call`, which represents a function call. The first parameter of this node is the name of the function `evaluate_condition()`, and the arguments are the four arguments that our `evaluate_condition()` function expects:

- the branch number (for which we keep a count in `branch_num`),
- the operator (for which we simply use the class name),
- the left-hand side, and
- the right-hand side.

Note that Python allows comparisons of multiple expressions (e.g. `1 < x < 10`); to keep the code simple we only deal with individual comparisons here, but it would be straightforward to extend the code by treating each comparison with an individual call to `evaluate_condition()`. You will notice that we also override `visit_FunctionDef`; this simply renames the instrumented function by appending `_instrumented`, so that we can continue to use the original function alongside the instrumented one.

The following code parses the source of the `cgi_decode()` function to an AST, then transforms it, and prints it out again (using the `unparse()` function from the `ast` module):

In [66]:

```
source = inspect.getsource(cgi_decode)
node = ast.parse(source)
BranchTransformer().visit(node)
# Make sure the line numbers are ok before printing
node = ast.fix_missing_locations(node)
print_content(ast.unparse(node), '.py')
```

To execute the instrumented code, we compile the transformed AST using Python's `compile()` function. We then need to make the compiled function accessible: we first retrieve the current module from `sys.modules`, and then add the compiled code of the instrumented function to the module's namespace using `exec`. After this, the `cgi_decode_instrumented()` function can be accessed.

In [68]:

```
import inspect
from typing import cast

def create_instrumented_function(f):
    source = inspect.getsource(f)
    node = ast.parse(source)
    node = BranchTransformer().visit(node)

    # Make sure the line numbers are ok so that it compiles
    node = ast.fix_missing_locations(node)

    # Compile and add the instrumented function to the current module
    current_module = sys.modules[__name__]
    code = compile(cast(ast.Module, node), filename="<ast>", mode="exec")
    exec(code, current_module.__dict__)
```

In [69]:

```
from typing import Dict

# Set up the global maps
distances_true: Dict[int, int] = {}
distances_false: Dict[int, int] = {}
```

In [70]:

```
# ignore
def cgi_decode_instrumented(s: str) -> str:
    return ""  # make mypy happy
```

In [71]: