Concolic Fuzzing

In the chapter on information flow, we have seen how one can use dynamic taints to produce more intelligent test cases than simply looking for program crashes. We have also seen how one can use the taints to update the grammar, and hence focus more on the dangerous methods.

While taints are helpful, uninterpreted strings are only one of the attack vectors. Can we say anything more about the properties of variables at any point in the execution? For example, can we say for sure that a function will always receive buffers of the correct length?

Concolic execution offers a solution here. The idea of concolic execution over a function is as follows: We start with a sample input for the function, and execute the function under trace. At each point the execution passes through a conditional, we save the conditional encountered in the form of relations between symbolic variables. Here, a symbolic variable can be thought of as a sort of placeholder for the real variable, sort of like the x in solving for x in Algebra. The symbolic variables can be used to specify relations without actually solving them.

With concolic execution, one can collect the constraints that an execution path encounters, and use them to answer questions about the program behavior at any point we prefer along the program execution path. We can further use concolic execution to enhance fuzzing.

In this chapter, we explore in depth how to execute a Python function concolically, and how concolic execution can be used to enhance fuzzing.

Prerequisites

  • You should have read the chapter on coverage.
  • You should have read the chapter on information flow.
  • A familiarity with the basic idea of SMT solvers would be useful.

Tracking Constraints

In the chapter on information flow, we have seen how dynamic taints can be used to direct fuzzing by indicating which part of input reached interesting places. However, dynamic taint tracking is limited in the information that it can propagate. For example, we might want to explore what happens when certain properties of the input change.

For example, say we have a function factorial() that returns the factorial value of its input.

In [4]:
def factorial(n):
    if n < 0:
        return None

    if n == 0:
        return 1

    if n == 1:
        return 1

    v = 1
    while n != 0:
        v = v * n
        n = n - 1

    return v

We exercise the function with a value of 5.

In [5]:
factorial(5)
Out[5]:
120

Is this sufficient to explore all the features of the function? How do we know? One way to verify that we have explored all features is to look at the coverage obtained. First we need to extend the Coverage class from the chapter on coverage to provide us with coverage arcs.

In [8]:
class ArcCoverage(Coverage):
    def traceit(self, frame, event, args):
        if event != 'return':
            f = inspect.getframeinfo(frame)
            self._trace.append((f.function, f.lineno))
        return self.traceit

    def arcs(self):
        t = [i for f, i in self._trace]
        return list(zip(t, t[1:]))
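
To see what arcs() computes, here is a minimal standalone sketch. The trace is hand-written for illustration (the line numbers one would expect factorial(1) to visit); it is an assumption, not output captured from ArcCoverage.

```python
# Hand-written trace: line numbers assumed to be visited by factorial(1)
trace = [1, 2, 5, 8, 9]

# An arc pairs each executed line with the line executed next
arcs = list(zip(trace, trace[1:]))
print(arcs)
```

Each pair is one edge in the control-flow graph; an edge that never shows up in any trace is a branch that no input has exercised yet.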

Next, we use ArcCoverage to obtain the coverage arcs.

In [9]:
with ArcCoverage() as cov:
    factorial(5)

We can now use the coverage arcs to visualize the coverage obtained.

In [11]:
to_graph(gen_cfg(inspect.getsource(factorial)), arcs=cov.arcs())
Out[11]:
[Figure: control-flow graph of factorial(), with nodes for lines 1, 2, 3, 5, 6, 8, 9, 11, 12, 13, 14 and 16 and the arcs covered by factorial(5) highlighted.]

We see that the path [1, 2, 5, 8, 11, 12, 13, 14] is covered (green) but sub-paths such as [2, 3], [5, 6] and [8, 9] are unexplored (red). What we need is the ability to generate inputs such that the True branch is taken at 2. How do we do that?

Concolic Execution

One way to cover additional branches is to look at the execution path being taken, and collect the conditional constraints that the path encounters. Then we can try to produce inputs that lead us to taking the non-traversed path.

First, let us step through the function.

In [12]:
lines = [i[1] for i in cov._trace if i[0] == 'factorial']
src = {i + 1: s for i, s in enumerate(
    inspect.getsource(factorial).split('\n'))}
  • The line (1) is simply the entry point of the function. We know that the input is n, which is an integer.
In [13]:
src[1]
Out[13]:
'def factorial(n):'
  • The line (2) is a predicate n < 0. Since the next line taken is line (5), we know that at this point in the execution path, the predicate was false.
In [14]:
src[2], src[3], src[4], src[5]
Out[14]:
('    if n < 0:', '        return None', '', '    if n == 0:')

We notice that this is one of the predicates where the true branch was not taken. How do we generate a value that takes the true branch here? One way is to use symbolic variables to represent the input, encode the constraint, and use an SMT Solver to solve the negation of the constraint.

As we mentioned in the introduction to the chapter, a symbolic variable can be thought of as a sort of placeholder for the real variable, sort of like the x in solving for x in Algebra. These variables can be used to encode constraints placed on the variables in the program. We identify what constraints the variable is supposed to obey, and finally produce a value that obeys all constraints imposed.

Solving Constraints

To solve these constraints, one can use a Satisfiability Modulo Theories (SMT) solver. An SMT solver is built on top of a satisfiability (SAT) solver. A SAT solver checks whether a Boolean formula in propositional logic (e.g. (a | b) & (~a | ~b)) can be satisfied by some assignment to its variables (e.g. a = true, b = false). An SMT solver extends SAT solving to specific background theories, for example the theory of integers or the theory of strings. That is, given a string constraint expressed as a formula with string variables (e.g. h + t == 'hello,world'), an SMT solver that understands the theory of strings can check whether that constraint can be satisfied and, if so, provide concrete values for the variables used in the formula (e.g. h = 'hello,', t = 'world').
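
To make "satisfiable" concrete, the following sketch checks the example formula by trying every assignment. This brute-force enumeration is only an illustration; it is not how a SAT or SMT solver works internally, and the name formula() is made up for this example.

```python
from itertools import product

def formula(a, b):
    # (a | b) & (~a | ~b): exactly one of a and b is true
    return (a or b) and (not a or not b)

# Try every assignment of truth values to (a, b)
satisfying = [(a, b) for a, b in product([False, True], repeat=2)
              if formula(a, b)]
print(satisfying)
```

The formula is satisfiable because the list is non-empty; the assignment a = true, b = false mentioned above is one of its entries.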

We use the SMT solver Z3 in this chapter.

In [16]:
z3_ver = z3.get_version()
In [17]:
print(z3_ver)
(4, 10, 2, 0)
In [18]:
assert z3_ver >= (4, 8, 13, 0), \
    f"Please install z3-solver 4.8.13.0 or later - you have {z3_ver}"

Let us set up Z3 first. The option that selects the z3str3 string solver is shown below but commented out, so Z3's default string solver is used. Further, we set the timeout for Z3 computations to 30 seconds.

In [19]:
# z3.set_option('smt.string_solver', 'z3str3')
z3.set_option('timeout', 30 * 1000)  # milliseconds

To encode constraints, we need symbolic variables. Here, we make zn a placeholder for the Z3 symbolic integer variable n.

In [20]:
zn = z3.Int('n')

Remember the constraint (n < 0) from line 2 in factorial()? We can now encode the constraint as follows.

In [21]:
zn < 0
Out[21]:
n < 0

We previously traced factorial(5). We saw that with input 5, the execution took the else branch on the predicate n < 0. We can express this observation as follows.

In [22]:
z3.Not(zn < 0)
Out[22]:
¬(n < 0)

Let us now solve constraints. The z3.solve() function checks if the constraints are satisfiable; if they are, it also provides values for variables such that the constraints are satisfied. For example, we can ask Z3 for an input that will take the else branch as follows:

In [23]:
z3.solve(z3.Not(zn < 0))
[n = 0]

This is a solution (albeit a trivial one). SMT solvers can be used to solve much harder problems. For example, here is how one can solve a quadratic equation.

In [24]:
x = z3.Real('x')
eqn = (2 * x**2 - 11 * x + 5 == 0)
z3.solve(eqn)
[x = 5]

Again, this is one solution. We can ask z3 to give us another solution as follows.

In [25]:
z3.solve(x != 5, eqn)
[x = 1/2]

Indeed, both x = 5 and x = 1/2 are solutions to the quadratic equation $2x^2 -11x + 5 = 0$.
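
Both roots can be double-checked without a solver. The sketch below plugs them into the left-hand side using exact rational arithmetic; the helper name lhs() is introduced here for illustration only.

```python
from fractions import Fraction

def lhs(x):
    # Left-hand side of 2x^2 - 11x + 5 = 0
    return 2 * x**2 - 11 * x + 5

roots = [Fraction(5), Fraction(1, 2)]
residuals = [lhs(r) for r in roots]
print(residuals)
```

Using Fraction rather than float avoids rounding, so a residual of exactly zero confirms each root.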

Similarly, we can ask Z3 for an input that satisfies the constraint encoded in line 2 of factorial() so that we take the if branch.

In [26]:
z3.solve(zn < 0)
[n = -1]

That is, if one uses -1 as an input to factorial(), it is guaranteed to take the if branch in line 2 during execution.

Let us try using that with our coverage. Here, the -1 is the solution from above.

In [27]:
with cov as cov:
    factorial(-1)
In [28]:
to_graph(gen_cfg(inspect.getsource(factorial)), arcs=cov.arcs())
Out[28]:
[Figure: control-flow graph of factorial(), now also showing the arcs covered by factorial(-1).]

Ok, so we have managed to cover a little more of the graph. Let us continue with our original input of factorial(5):

  • In line (5) we encounter a new predicate n == 0, for which we again took the false branch.
In [29]:
src[5]
Out[29]:
'    if n == 0:'

The predicates required to follow the path until this point are as follows.

In [30]:
predicates = [z3.Not(zn < 0), z3.Not(zn == 0)]
  • If we continue to line (8), we encounter another predicate, for which, again, we took the false branch.
In [31]:
src[8]
Out[31]:
'    if n == 1:'

The predicates encountered so far are as follows.

In [32]:
predicates = [z3.Not(zn < 0), z3.Not(zn == 0), z3.Not(zn == 1)]

To take the branch at (9), we essentially have to obey the predicates until that point, but invert the last predicate.

In [33]:
last = len(predicates) - 1
z3.solve(predicates[0:-1] + [z3.Not(predicates[-1])])
[n = 1]
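
The same "keep the prefix, invert one predicate" step can be mimicked without Z3. The sketch below is a brute-force stand-in under stated assumptions: predicates are plain Python lambdas, the search space is a small hand-picked range, and flip() is a hypothetical helper, not part of this chapter's code.

```python
# The path taken by factorial(5), as (description, check) pairs
path = [
    ("Not(n < 0)", lambda n: not n < 0),
    ("Not(n == 0)", lambda n: not n == 0),
    ("Not(n == 1)", lambda n: not n == 1),
]

def flip(path, i, candidates):
    """Find an n that obeys path[:i] but violates path[i], or None."""
    for n in candidates:
        if all(check(n) for _, check in path[:i]) and not path[i][1](n):
            return n
    return None

# Small search space, tried in order of increasing magnitude
candidates = sorted(range(-10, 11), key=abs)
new_inputs = [flip(path, i, candidates) for i in range(len(path))]
print(new_inputs)
```

Flipping each of the three predicates in turn yields one input per untaken branch. A solver replaces the enumeration with reasoning, which matters once the input space is too large to search.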

What we are doing here is tracing the execution corresponding to a particular input factorial(5), using concrete values, and along with it, keeping symbolic shadow variables that enable us to capture the constraints. As we mentioned in the introduction, this particular method of execution where one tracks concrete execution using symbolic variables is called Concolic Execution.

How do we automate this process? One method is to use a similar infrastructure as that of the chapter on information flow, and use Python inheritance to create symbolic proxy objects that can track the concrete execution.
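
As a rough idea of what such a proxy could look like, here is a minimal sketch. SymInt is a hypothetical class written for this illustration; it records conditions as plain strings rather than Z3 expressions, supports only the few operations factorial() needs, and is not the implementation used in this chapter.

```python
class SymInt(int):
    """Hypothetical proxy: behaves like an int, but logs every comparison."""

    def __new__(cls, value, expr, path):
        obj = super().__new__(cls, value)
        obj.expr = expr   # symbolic expression, kept as a plain string
        obj.path = path   # shared list of recorded path conditions
        return obj

    def _compare(self, op, other, outcome):
        cond = f"{self.expr} {op} {getattr(other, 'expr', other)}"
        self.path.append(cond if outcome else f"Not({cond})")
        return outcome

    def __lt__(self, other):
        return self._compare("<", other, int(self) < int(other))

    def __eq__(self, other):
        return self._compare("==", other, int(self) == int(other))

    def __ne__(self, other):
        return self._compare("!=", other, int(self) != int(other))

    __hash__ = int.__hash__  # defining __eq__ would otherwise drop hashing

    def __sub__(self, other):
        expr = f"({self.expr} - {getattr(other, 'expr', other)})"
        return SymInt(int(self) - int(other), expr, self.path)


def factorial(n):
    if n < 0:
        return None
    if n == 0:
        return 1
    if n == 1:
        return 1
    v = 1
    while n != 0:
        v = v * n
        n = n - 1
    return v


path = []
result = factorial(SymInt(5, "n", path))
print(result, path[:3])
```

The function under test is unchanged; only its argument is wrapped. The concrete value drives execution, while each comparison appends the condition (or its negation) to the shared path list.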

A Concolic Tracer

Let us now define a class to collect symbolic variables and path conditions during an execution. The idea is to have a ConcolicTracer class that is invoked in a with block. To execute a function while tracing its path conditions, we need to transform its arguments, which we do by invoking functions through a [] item access.

This is a typical usage of a ConcolicTracer:

with ConcolicTracer() as _:
    _[function](args, ...)

After execution, we can access the symbolic variables in the decls attribute:

_.decls

whereas the path attribute lists the path conditions encountered:

_.path

The context attribute contains a pair of declarations and paths:

_.context

If you read this for the first time, skip the implementation and head right to the examples.

Example: Triangle

We previously showed how to run triangle() under ConcolicTracer.

In [203]:
with ConcolicTracer() as _:
    print(_[triangle](1, 2, 3))
scalene

The symbolic variables are as follows:

In [204]:
_.decls
Out[204]:
{'triangle_a_int_1': 'Int',
 'triangle_b_int_2': 'Int',
 'triangle_c_int_3': 'Int'}

The predicates are as follows:

In [205]:
_.path
Out[205]:
[Not(triangle_a_int_1 == triangle_b_int_2),
 Not(triangle_b_int_2 == triangle_c_int_3),
 Not(triangle_a_int_1 == triangle_c_int_3)]

Using zeval(), we solve these path conditions and obtain a solution. We find that Z3 gives us three distinct integer values:

In [206]:
_.zeval()
Out[206]:
('sat',
 {'a': ('0', 'Int'), 'b': (['-', '2'], 'Int'), 'c': (['-', '1'], 'Int')})

(Note that some values may be negative. Indeed, triangle() works with negative length values, too, even if real triangles only have positive lengths.)

If we invoke triangle() with these very values, we take the exact same path as the original input:

In [207]:
triangle(0, -2, -1)
Out[207]:
'scalene'

We can have z3 negate individual conditions – and thus take different paths. First, we retrieve the symbolic variables.

In [208]:
za, zb, zc = [z3.Int(s) for s in _.decls.keys()]
In [209]:
za, zb, zc
Out[209]:
(triangle_a_int_1, triangle_b_int_2, triangle_c_int_3)

Then, we pass a negated predicate to zeval(). The key (here: 1) determines which predicate the new predicate will replace.

In [210]:
_.zeval({1: zb == zc})
Out[210]:
('sat', {'a': ('1', 'Int'), 'b': ('0', 'Int'), 'c': ('0', 'Int')})
In [211]:
triangle(1, 0, 0)
Out[211]:
'isosceles'

The updated predicate returns isosceles as expected. By negating further conditions, we can systematically explore all branches in triangle().

Example: Decoding CGI Strings

Let us apply ConcolicTracer on our example program cgi_decode() from the chapter on coverage. Note that we need to rewrite its code slightly, as the hash lookups in hex_values cannot be used for transferring constraints yet.

In [212]:
def cgi_decode(s):
    """Decode the CGI-encoded string `s`:
       * replace "+" by " "
       * replace "%xx" by the character with hex number xx.
       Return the decoded string.  Raise `ValueError` for invalid inputs."""

    # Mapping of hex digits to their integer values
    hex_values = {
        '0': 0, '1': 1, '2': 2, '3': 3, '4': 4,
        '5': 5, '6': 6, '7': 7, '8': 8, '9': 9,
        'a': 10, 'b': 11, 'c': 12, 'd': 13, 'e': 14, 'f': 15,
        'A': 10, 'B': 11, 'C': 12, 'D': 13, 'E': 14, 'F': 15,
    }

    t = ''
    i = 0
    while i < s.length():
        c = s[i]
        if c == '+':
            t += ' '
        elif c == '%':
            digit_high, digit_low = s[i + 1], s[i + 2]
            i = i + 2
            found = 0
            v = 0
            for key in hex_values:
                if key == digit_high:
                    found = found + 1
                    v = hex_values[key] * 16
                    break
            for key in hex_values:
                if key == digit_low:
                    found = found + 1
                    v = v + hex_values[key]
                    break
            if found == 2:
                if v >= 128:
                    # z3.StringVal(urllib.parse.unquote('%80')) <-- bug in z3
                    raise ValueError("Invalid encoding")
                t = t + chr(v)
            else:
                raise ValueError("Invalid encoding")
        else:
            t = t + c
        i = i + 1
    return t
In [213]:
with ConcolicTracer() as _:
    _[cgi_decode]('')
In [214]:
_.context
Out[214]:
({'cgi_decode_s_str_1': 'String'}, [Not(0 < Length(cgi_decode_s_str_1))])
In [215]:
with ConcolicTracer() as _:
    _[cgi_decode]('a%20d')

Once executed, we can retrieve the symbolic variables in the decls attribute. This is a mapping of symbolic variables to types.

In [216]:
_.decls
Out[216]:
{'cgi_decode_s_str_1': 'String'}

The extracted path conditions can be found in the path attribute:

In [217]:
_.path
Out[217]:
[0 < Length(cgi_decode_s_str_1),
 Not(str.substr(cgi_decode_s_str_1, 0, 1) == "+"),
 Not(str.substr(cgi_decode_s_str_1, 0, 1) == "%"),
 1 < Length(cgi_decode_s_str_1),
 Not(str.substr(cgi_decode_s_str_1, 1, 1) == "+"),
 str.substr(cgi_decode_s_str_1, 1, 1) == "%",
 Not(str.substr(cgi_decode_s_str_1, 2, 1) == "0"),
 Not(str.substr(cgi_decode_s_str_1, 2, 1) == "1"),
 str.substr(cgi_decode_s_str_1, 2, 1) == "2",
 str.substr(cgi_decode_s_str_1, 3, 1) == "0",
 4 < Length(cgi_decode_s_str_1),
 Not(str.substr(cgi_decode_s_str_1, 4, 1) == "+"),
 Not(str.substr(cgi_decode_s_str_1, 4, 1) == "%"),
 Not(5 < Length(cgi_decode_s_str_1))]

The context attribute holds a pair of decls and path attributes; this is useful for passing it into the ConcolicTracer constructor.

In [218]:
assert _.context == (_.decls, _.path)

We can solve these constraints to obtain a value for the function parameters that follow the same path as the original (traced) invocation:

In [219]:
_.zeval()
Out[219]:
('sat', {'s': ('A%20B', 'String')})

Negating some of these constraints will yield different paths taken, and thus greater code coverage. This is what our concolic fuzzers (see later) do. Let us negate the second constraint (index 1), namely that the first character should not be a + character:

In [220]:
_.path[1]
Out[220]:
Not(str.substr(cgi_decode_s_str_1, 0, 1) == "+")

To express the negated condition, we have to construct it from Z3 primitives:

In [221]:
zs = z3.String('cgi_decode_s_str_1')
In [222]:
z3.SubString(zs, 0, 1) == z3.StringVal('a')
Out[222]:
str.substr(cgi_decode_s_str_1, 0, 1) = "a"

Invoking zeval() with the path condition to be changed obtains a new input that satisfies the negated predicate:

In [223]:
(result, new_vars) = _.zeval({1: z3.SubString(zs, 0, 1) == z3.StringVal('+')})
In [224]:
new_vars
Out[224]:
{'s': ('+%20A', 'String')}
In [225]:
(new_s, new_s_type) = new_vars['s']
In [226]:
new_s
Out[226]:
'+%20A'

We can validate that new_s indeed takes the new path by re-running the tracer with new_s as input:

In [227]:
with ConcolicTracer() as _:
    _[cgi_decode](new_s)
In [228]:
_.path
Out[228]:
[0 < Length(cgi_decode_s_str_1),
 str.substr(cgi_decode_s_str_1, 0, 1) == "+",
 1 < Length(cgi_decode_s_str_1),
 Not(str.substr(cgi_decode_s_str_1, 1, 1) == "+"),
 str.substr(cgi_decode_s_str_1, 1, 1) == "%",
 Not(str.substr(cgi_decode_s_str_1, 2, 1) == "0"),
 Not(str.substr(cgi_decode_s_str_1, 2, 1) == "1"),
 str.substr(cgi_decode_s_str_1, 2, 1) == "2",
 str.substr(cgi_decode_s_str_1, 3, 1) == "0",
 4 < Length(cgi_decode_s_str_1),
 Not(str.substr(cgi_decode_s_str_1, 4, 1) == "+"),
 Not(str.substr(cgi_decode_s_str_1, 4, 1) == "%"),
 Not(5 < Length(cgi_decode_s_str_1))]

By negating further conditions, we can explore more and more code.

Example: Round

Here is a function that rounds its argument up to the nearest multiple of ten.

In [229]:
def round10(r):
    while r % 10 != 0:
        r += 1
    return r

As before, we execute the function under the ConcolicTracer context.

In [230]:
with ConcolicTracer() as _:
    r = _[round10](1)

We verify that we were able to capture all the predicates:

In [231]:
_.context
Out[231]:
({'round10_r_int_1': 'Int'},
 [0 != round10_r_int_1%10,
  0 != (round10_r_int_1 + 1)%10,
  0 != (round10_r_int_1 + 1 + 1)%10,
  0 != (round10_r_int_1 + 1 + 1 + 1)%10,
  0 != (round10_r_int_1 + 1 + 1 + 1 + 1)%10,
  0 != (round10_r_int_1 + 1 + 1 + 1 + 1 + 1)%10,
  0 != (round10_r_int_1 + 1 + 1 + 1 + 1 + 1 + 1)%10,
  0 != (round10_r_int_1 + 1 + 1 + 1 + 1 + 1 + 1 + 1)%10,
  0 != (round10_r_int_1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1)%10,
  Not(0 !=
      (round10_r_int_1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1)%10)])

We use zeval() to obtain more inputs that take the same path.

In [232]:
_.zeval()
Out[232]:
('sat', {'r': (['-', '9'], 'Int')})
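
We can confirm by hand that -9 takes the same path as the traced input 1. The sketch below reimplements round10() with an added step counter; round10_steps() is a helper introduced only for this check.

```python
def round10_steps(r):
    """Run round10() on r and count how often the loop body executes."""
    steps = 0
    while r % 10 != 0:
        r += 1
        steps += 1
    return r, steps

original = round10_steps(1)    # the traced input
solved = round10_steps(-9)     # the value reported by zeval()
print(original, solved)
```

Both inputs run the loop body nine times before the loop condition fails, matching the nine positive predicates and one negated predicate in the path. (Python's % yields a non-negative remainder for a positive divisor, so -9 % 10 is 1.)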

Example: Absolute Maximum

Do our concolic proxies work across functions? Say we have a function abs_value() as below.

In [233]:
def abs_value(a):
    if a > 0:
        return a
    else:
        return -a

It is called by another function abs_max().

In [234]:
def abs_max(a, b):
    a1 = abs_value(a)
    b1 = abs_value(b)
    if a1 > b1:
        c = a1
    else:
        c = b1
    return c

We use the ConcolicTracer context on abs_max().

In [235]:
with ConcolicTracer() as _:
    _[abs_max](2, 1)

As expected, we have the predicates across functions.

In [236]:
_.context
Out[236]:
({'abs_max_a_int_1': 'Int', 'abs_max_b_int_2': 'Int'},
 [0 < abs_max_a_int_1, 0 < abs_max_b_int_2, abs_max_a_int_1 > abs_max_b_int_2])
In [237]:
_.zeval()
Out[237]:
('sat', {'a': ('2', 'Int'), 'b': ('1', 'Int')})

Solving the predicates works as expected.

Next, we use negative numbers as arguments so that a different branch is taken in abs_value().

In [238]:
with ConcolicTracer() as _:
    _[abs_max](-2, -1)
In [239]:
_.context
Out[239]:
({'abs_max_a_int_1': 'Int', 'abs_max_b_int_2': 'Int'},
 [Not(0 < abs_max_a_int_1),
  Not(0 < abs_max_b_int_2),
  -abs_max_a_int_1 > -abs_max_b_int_2])
In [240]:
_.zeval()
Out[240]:
('sat', {'a': (['-', '1'], 'Int'), 'b': ('0', 'Int')})

The solution reflects our predicates. (We used a > 0 in abs_value()).
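
To check that the solved values really follow the traced path, one can list the outcome of each conditional by hand. The helper abs_max_decisions() below is hypothetical and written only for this comparison; it mirrors the three conditions that abs_max() passes through.

```python
def abs_max_decisions(a, b):
    """Return the outcomes of the three conditionals abs_max(a, b) passes."""
    first = a > 0                       # condition in abs_value(a)
    second = b > 0                      # condition in abs_value(b)
    a1 = a if first else -a
    b1 = b if second else -b
    third = a1 > b1                     # comparison in abs_max()
    return (first, second, third)

traced = abs_max_decisions(-2, -1)      # the traced input
solved = abs_max_decisions(-1, 0)       # the values reported by zeval()
print(traced, solved)
```

Both inputs produce the same three outcomes. Note that b = 0 is acceptable here precisely because abs_value() tests a > 0, so zero takes the else branch.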

Example: Binomial Coefficient

For a larger example that uses different kinds of variables, say we want to compute the binomial coefficient by the following formulas:

$$ ^nP_k=\frac{n!}{(n-k)!} $$

$$ \binom nk=\,^nC_k=\frac{^nP_k}{k!} $$

We define the functions as follows.

In [241]:
def factorial(n):  # type: ignore
    v = 1
    while n != 0:
        v *= n
        n -= 1

    return v
In [242]:
def permutation(n, k):
    return factorial(n) / factorial(n - k)
In [243]:
def combination(n, k):
    return permutation(n, k) / factorial(k)
In [244]:
def binomial(n, k):
    if n < 0 or k < 0 or n < k:
        raise Exception('Invalid values')
    return combination(n, k)

As before, we run the function under ConcolicTracer.

In [245]:
with ConcolicTracer() as _:
    v = _[binomial](4, 2)

Then call zeval() to evaluate.

In [246]:
_.zeval()
Out[246]:
('sat', {'n': ('4', 'Int'), 'k': ('2', 'Int')})
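
As a sanity check on the formulas, the sketch below repeats the four definitions and compares binomial() against math.comb() from the standard library (Python 3.8 or later) for small arguments. The comparison loop is an addition for illustration.

```python
import math

def factorial(n):
    v = 1
    while n != 0:
        v *= n
        n -= 1
    return v

def permutation(n, k):
    return factorial(n) / factorial(n - k)

def combination(n, k):
    return permutation(n, k) / factorial(k)

def binomial(n, k):
    if n < 0 or k < 0 or n < k:
        raise Exception('Invalid values')
    return combination(n, k)

# Compare against the standard library for all valid pairs with n < 8
agrees = all(binomial(n, k) == math.comb(n, k)
             for n in range(8) for k in range(n + 1))
print(binomial(4, 2), agrees)
```

Because the definitions use /, the results are floats (6.0 rather than 6); they still compare equal to the integer results for these small values.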

Example: Database

For a larger example using the Concolic String class zstr, we use the DB class from the chapter on information flow.

In [247]:
if __name__ == '__main__':
    if z3.get_version() > (4, 8, 7, 0):
        print("""Note: The following example may not work with your Z3 version;
see https://github.com/Z3Prover/z3/issues/5763 for details.
Consider `pip install z3-solver==4.8.7.0` as a workaround.""")
Note: The following example may not work with your Z3 version;
see https://github.com/Z3Prover/z3/issues/5763 for details.
Consider `pip install z3-solver==4.8.7.0` as a workaround.

We first populate our database.

In [250]:
db = sample_db()
for V in VEHICLES:
    update_inventory(db, V)
In [251]:
db.db
Out[251]:
{'inventory': ({'year': int, 'kind': str, 'company': str, 'model': str},
  [{'year': 1997, 'kind': 'van', 'company': 'Ford', 'model': 'E350'},
   {'year': 2000, 'kind': 'car', 'company': 'Mercury', 'model': 'Cougar'},
   {'year': 1999, 'kind': 'car', 'company': 'Chevy', 'model': 'Venture'}])}

We are now ready to fuzz our DB class. Hash functions are difficult to handle directly (because they rely on internal C functions). Hence we modify table() slightly.

In [252]:
class ConcolicDB(DB):
    def table(self, t_name):
        for k, v in self.db:
            if t_name == k:
                return v
        raise SQLException('Table (%s) was not found' % repr(t_name))

    def column(self, decl, c_name):
        for k in decl:
            if c_name == k:
                return decl[k]
        raise SQLException('Column (%s) was not found' % repr(c_name))

To make it easy, we define a single function db_select() that directly invokes db.sql().

In [253]:
def db_select(s):
    my_db = ConcolicDB()
    my_db.db = [(k, v) for (k, v) in db.db.items()]
    r = my_db.sql(s)
    return r

We now want to run SQL statements under our ConcolicTracer, and collect predicates obtained.

In [254]:
with ConcolicTracer() as _:
    _[db_select]('select kind from inventory')

The predicates encountered during the execution are as follows:

In [255]:
_.path
Out[255]:
[0 == IndexOf(db_select_s_str_1, "select ", 0),
 0 == IndexOf(db_select_s_str_1, "select ", 0),
 Not(0 >
     IndexOf(str.substr(db_select_s_str_1, 7, 19),
             " from ",
             0)),
 Not(Or(0 <
        IndexOf(str.substr(db_select_s_str_1, 7, 19),
                " where ",
                0),
        0 ==
        IndexOf(str.substr(db_select_s_str_1, 7, 19),
                " where ",
                0))),
 str.substr(str.substr(db_select_s_str_1, 7, 19), 10, 9) ==
 "inventory"]

We can use zeval() as before to try to solve the constraints. Here, Z3 gives up, as anticipated in the note above.

In [256]:
_.zeval()
Out[256]:
('Gave up', None)

Fuzzing with Constraints

The SimpleConcolicFuzzer class starts with a sample input generated by some other fuzzer. It then runs the function being tested under ConcolicTracer, and collects the path predicates. It then negates random predicates within the path and solves the result with Z3 to produce a new input that is guaranteed to take a different path than the original.

As with ConcolicTracer, above, please first look at the examples before digging into the implementation.

To illustrate SimpleConcolicFuzzer, let us apply it on our example program cgi_decode() from the Coverage chapter. Note that we cannot use the original version directly, as the hash lookups in hex_values cannot be used for transferring constraints yet.

In [293]:
with ConcolicTracer() as _:
    _[cgi_decode]('a+c')
In [294]:
_.path
Out[294]:
[0 < Length(cgi_decode_s_str_1),
 Not(str.substr(cgi_decode_s_str_1, 0, 1) == "+"),
 Not(str.substr(cgi_decode_s_str_1, 0, 1) == "%"),
 1 < Length(cgi_decode_s_str_1),
 str.substr(cgi_decode_s_str_1, 1, 1) == "+",
 2 < Length(cgi_decode_s_str_1),
 Not(str.substr(cgi_decode_s_str_1, 2, 1) == "+"),
 Not(str.substr(cgi_decode_s_str_1, 2, 1) == "%"),
 Not(3 < Length(cgi_decode_s_str_1))]
In [295]:
scf = SimpleConcolicFuzzer()
scf.add_trace(_, 'a+c')

The trace tree shows the path conditions encountered so far. Any blue edge towards a "?" implies that there is a path not yet taken.

In [296]:
display_trace_tree(scf.ct.root)
Out[296]:
[Figure: trace tree after adding 'a+c'. The nine inner nodes are the path conditions (0) to (8) from the path above; each has one explored branch and one unexplored branch ending in "?", and the leaf reached by the input is marked "* a+c".]
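
The idea behind such a trace tree can be sketched with nested dictionaries. This is a simplified, hypothetical data structure (predicates are plain strings, and add_trace() and unexplored() are illustrative helpers that merely share a name with the fuzzer's method), not the tree used by SimpleConcolicFuzzer.

```python
def add_trace(tree, decisions, test_input):
    """Record one run: a list of (predicate, outcome) pairs plus its input."""
    node = tree
    for predicate, outcome in decisions:
        node = node.setdefault(predicate, {}).setdefault(outcome, {})
    node["input"] = test_input

def unexplored(tree, prefix=()):
    """Yield decision sequences that end in a branch no run has taken yet."""
    for predicate, branches in tree.items():
        if predicate == "input":
            continue
        for outcome in (True, False):
            path = prefix + ((predicate, outcome),)
            if outcome in branches:
                yield from unexplored(branches[outcome], path)
            else:
                yield path

tree = {}
add_trace(tree, [("len(s) > 0", True),
                 ("s[0] == '+'", False),
                 ("s[0] == '%'", False),
                 ("len(s) > 1", False)], "a")
todo = list(unexplored(tree))
print(len(todo))
```

Every entry in todo corresponds to one "?" leaf: a prefix of decisions already seen, followed by the one outcome that has not been observed. Handing such a sequence to a solver is what produces the next input.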

So, we fuzz to get a new path that is not empty.

In [297]:
v = scf.fuzz()
print(v)
A+

We can now obtain the new trace as before.

In [298]:
with ExpectError():
    with ConcolicTracer() as _:
        _[cgi_decode](v)

The new trace is added to our fuzzer using add_trace()

In [299]:
scf.add_trace(_, v)

The updated binary tree is as follows. Note that one formerly unexplored branch now ends in a leaf for the new input.

In [300]:
display_trace_tree(scf.ct.root)
Out[300]:
[Figure: updated trace tree. It matches the previous tree, except that the formerly unexplored branch of the condition Length(cgi_decode_s_str_1) <= 2 now ends in the leaf "* A+".]

A complete fuzzer run is as follows:

In [301]:
scf = SimpleConcolicFuzzer()
for i in range(10):
    v = scf.fuzz()
    print(repr(v))
    if v is None:
        continue
    with ConcolicTracer() as _:
        with ExpectError(print_traceback=False):
            # z3.StringVal(urllib.parse.unquote('%80')) <-- bug in z3
            _[cgi_decode](v)
    scf.add_trace(_, v)
' '
''
'+'
'%'
'+A'
'++'
'AB'
'++A'
'A%'
'+AB'
IndexError: string index out of range (expected)
IndexError: string index out of range (expected)
In [302]:
display_trace_tree(scf.ct.root)
Out[302]:
[Figure: trace tree after the complete fuzzer run, with leaves for the inputs '', ' ', '%', '+', 'A%', 'AB', '+A', '++', '+AB' and '++A', and the remaining unexplored branches marked "?".]

Note. Our concolic tracer is limited in that it does not track changes in the string length. This leads it to treat every string with the same prefix as the same string.

The SimpleConcolicFuzzer is reasonably efficient at exploring paths near the path followed by a given sample input. However, it is not very intelligent when it comes to choosing which paths to follow. We look at another fuzzer that lifts the predicates obtained to the grammar and achieves better fuzzing.

Concolic Grammar Fuzzing

The concolic framework can be used directly in grammar-based fuzzing. We implement a class ConcolicGrammarFuzzer which does this.

The ConcolicGrammarFuzzer is used as follows.

In [336]:
cgf = ConcolicGrammarFuzzer(INVENTORY_GRAMMAR)
cgf.prune_tokens(prune_tokens)
for i in range(10):
    query = cgf.fuzz()
    print(query)
    with ConcolicTracer() as _:
        with ExpectError(print_traceback=False):
            try:
                res = _[db_select](query)
                print(repr(res))
            except SQLException as e:
                print(e)
        cgf.update_grammar(_)
        print()
select Qq6L,(X) from LYg0 where ((x<w))!=(A)
Table ('LYg0') was not found

update a set P3=_ where p/h-g-Z<l-Q(U)
Table ('a') was not found

select W,H,s from vehicles where I+N/S+k/R!=G2

insert into months (S,q1i) values (7.3,'3[s=K=','e')
Column ('S') was not found

delete from vehicles where v-f*r/s/q>h-K-m(n,X)
Invalid WHERE ('v-f*r/s/q>h-K-m(n,X)')

select C*R*Y(A)/Z<J,(q)!=:(R),D from C
Table ('C') was not found

delete from months where K-t/W(E)-Y+A<H+I*U+w
Invalid WHERE ('K-t/W(E)-Y+A<H+I*U+w')

select e*L*G-A/_ from _3 where (G)==B(F,H)
Table ('_3') was not found

select S(Y)<c,PF(j),h,s,_ from vehicles
Invalid WHERE ('(S(Y)<c,PF(j),h,s,_)')

update e set m=:LMG where 6.48!=A+C-l+c<K(_)*f/o+h==H
Table ('e') was not found

TypeError: 'NotImplementedType' object is not callable (expected)

As can be seen, the fuzzer starts with no knowledge of the tables vehicles, months and years, but identifies them from the concolic execution, and lifts them to the grammar. This allows us to improve the effectiveness of fuzzing.
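
What "lifting to the grammar" might mean can be sketched on a toy grammar. Everything below is an assumption made for illustration: TOY_GRAMMAR is not INVENTORY_GRAMMAR, lift() is a hypothetical helper, and the observed strings are written by hand rather than extracted from a trace. It is not the ConcolicGrammarFuzzer implementation.

```python
import copy

# Hypothetical toy grammar in dict form; not INVENTORY_GRAMMAR itself
TOY_GRAMMAR = {
    '<start>': ['select <column> from <table>'],
    '<table>': ['<word>'],
    '<column>': ['<word>'],
    '<word>': ['a', 'b', 'c'],
}

def lift(grammar, nonterminal, observed):
    """Return a copy of `grammar` where `nonterminal` expands to `observed`."""
    new_grammar = copy.deepcopy(grammar)
    if observed:
        new_grammar[nonterminal] = sorted(set(observed))
    return new_grammar

# Strings the table name was compared against during an (imagined) traced run
compared_with = ['inventory', 'inventory']
lifted = lift(TOY_GRAMMAR, '<table>', compared_with)
print(lifted['<table>'])
```

Once the grammar only offers table names that the program actually compared against, generated queries are far more likely to get past the table lookup and reach deeper code.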

Limitations

As with dynamic taint analysis, implicit control flow can obscure the predicates encountered during concolic execution. However, this limitation could be overcome to some extent by wrapping any constants in the source with their respective proxy objects. Similarly, calls to internal C functions can cause the symbolic information to be discarded, and only partial information may be obtained.

Synopsis

This chapter defines two main classes: SimpleConcolicFuzzer and ConcolicGrammarFuzzer. The SimpleConcolicFuzzer first uses a sample input to collect predicates encountered. The fuzzer then negates random predicates to generate new input constraints. These, when solved, produce inputs that explore paths that are close to the original path.

ConcolicTracer

At the heart of both fuzzers lies the concept of a concolic tracer, capturing symbolic variables and path conditions as a program gets executed.

ConcolicTracer is used in a with block; the syntax tracer[function] executes function within the tracer while capturing conditions. Here is an example for the cgi_decode() function:

In [337]:
with ConcolicTracer() as _:
    _[cgi_decode]('a%20d')

Once executed, we can retrieve the symbolic variables in the decls attribute. This is a mapping of symbolic variables to types.

In [338]:
_.decls
Out[338]:
{'cgi_decode_s_str_1': 'String'}

The extracted path conditions can be found in the path attribute:

In [339]:
_.path
Out[339]:
[0 < Length(cgi_decode_s_str_1),
 Not(str.substr(cgi_decode_s_str_1, 0, 1) == "+"),
 Not(str.substr(cgi_decode_s_str_1, 0, 1) == "%"),
 1 < Length(cgi_decode_s_str_1),
 Not(str.substr(cgi_decode_s_str_1, 1, 1) == "+"),
 str.substr(cgi_decode_s_str_1, 1, 1) == "%",
 Not(str.substr(cgi_decode_s_str_1, 2, 1) == "0"),
 Not(str.substr(cgi_decode_s_str_1, 2, 1) == "1"),
 str.substr(cgi_decode_s_str_1, 2, 1) == "2",
 str.substr(cgi_decode_s_str_1, 3, 1) == "0",
 4 < Length(cgi_decode_s_str_1),
 Not(str.substr(cgi_decode_s_str_1, 4, 1) == "+"),
 Not(str.substr(cgi_decode_s_str_1, 4, 1) == "%"),
 Not(5 < Length(cgi_decode_s_str_1))]

The context attribute holds a pair of decls and path attributes; this is useful for passing it into the ConcolicTracer constructor.

In [340]:
assert _.context == (_.decls, _.path)

We can solve these constraints to obtain values for the function parameters that follow the same path as the original (traced) invocation:

In [341]:
_.zeval()
Out[341]:
('sat', {'s': ('A%20B', 'String')})
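To see why 'A%20B' is an acceptable answer, the path conditions listed above can be transcribed by hand into ordinary Python, reading str.substr(s, i, 1) as s[i:i+1]. This is only a sketch for checking inputs against the recorded path; the function name follows_traced_path() is ours.

```python
def follows_traced_path(s):
    """Hand transcription of the path conditions recorded for 'a%20d'."""
    return (
        0 < len(s)
        and s[0:1] != "+"
        and s[0:1] != "%"
        and 1 < len(s)
        and s[1:2] != "+"
        and s[1:2] == "%"
        and s[2:3] != "0"
        and s[2:3] != "1"
        and s[2:3] == "2"
        and s[3:4] == "0"
        and 4 < len(s)
        and s[4:5] != "+"
        and s[4:5] != "%"
        and not 5 < len(s)
    )


for candidate in ["a%20d", "A%20B", "a+b", "a%20de"]:
    print(repr(candidate), follows_traced_path(candidate))
```

Both the traced input and the solver's answer satisfy every condition, whereas an input with a '+' in second position or with a sixth character does not.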

The zeval() function also allows passing alternate or negated constraints. See the chapter for examples.

In [342]:
# ignore
from ClassDiagram import display_class_hierarchy
display_class_hierarchy(ConcolicTracer)
Out[342]:
[Class diagram: ConcolicTracer, with methods __call__(), __init__(), zeval(), __enter__(), __exit__(), __getitem__(), concolic(), smt_expr().]

SimpleConcolicFuzzer¶

The constraints obtained from ConcolicTracer are added to the concolic fuzzer as follows:

In [343]:
scf = SimpleConcolicFuzzer()
scf.add_trace(_, 'a%20d')

The concolic fuzzer then uses the constraints added to guide its fuzzing as follows:

In [344]:
scf = SimpleConcolicFuzzer()
for i in range(20):
    v = scf.fuzz()
    if v is None:
        break
    print(repr(v))
    with ExpectError(print_traceback=False):
        with ConcolicTracer() as _:
            _[cgi_decode](v)
    scf.add_trace(_, v)
' '
'%'
'AB'
''
'ABC'
'A'
'AB+'
'AB'
'ABCD'
IndexError: string index out of range (expected)
'ABC+'
'A'
'ABC'
'ABC%'
'A%'
'ABC+DE'
'AB'
'AB+'
IndexError: string index out of range (expected)
IndexError: string index out of range (expected)
'A'
'ABCD'
'A'

We see how the additional inputs generated explore additional paths.

In [345]:
# ignore
display_class_hierarchy(SimpleConcolicFuzzer)
Out[345]:
[Class diagram: SimpleConcolicFuzzer (methods __init__(), fuzz(), add_trace(), get_newpath(), next_choice()) inherits from Fuzzer (methods __init__(), fuzz(), run(), runs()).]

ConcolicGrammarFuzzer¶

The SimpleConcolicFuzzer simply explores paths near the original path traversed by the sample input. It uses a simple mechanism to explore paths close to those it already knows and, apart from code paths, knows nothing about the input.

The ConcolicGrammarFuzzer, on the other hand, knows about the input grammar and can collect feedback from the subject under fuzzing. It can lift some of the constraints encountered to the grammar, enabling deeper fuzzing. It is used as follows:

In [347]:
cgf = ConcolicGrammarFuzzer(INVENTORY_GRAMMAR)
cgf.prune_tokens(prune_tokens)
for i in range(10):
    query = cgf.fuzz()
    print(query)
    with ConcolicTracer() as _:
        with ExpectError(print_traceback=False):
            try:
                res = _[db_select](query)
                print(repr(res))
            except SQLException as e:
                print(e)
        cgf.update_grammar(_)
        print()
insert into W (Ru_2,.Wj186518W8) values ('@','}','h')
Table ('W') was not found

select S>R(j),A from C3 where U4==9249
Table ('C3') was not found

select I/I*U/n1(M),T/E*d(S) from vehicles
Invalid WHERE ('(I/I*U/n1(M),T/E*d(S))')

select (v==X),t,h,E from months where r8(w)<D-e

select e!=K,X from a25i where G/S-y<h/P
Table ('a25i') was not found

select C,: from months where s*u!=W(Y)>B/P(g)

select x/z+.(L)-h from months where -9!=Y>G(A)

delete from h4OB60J where K-w/M<t*N/A*S
Table ('h4OB60J') was not found

delete from months where r/v+z*Y+A-k<(q<h)+y
Invalid WHERE ('r/v+z*Y+A-k<(q<h)+y')

select (V==b),(C>A) from vehicles where B(e,R)>D

TypeError: 'NotImplementedType' object is not callable (expected)
TypeError: 'NotImplementedType' object is not callable (expected)
TypeError: 'NotImplementedType' object is not callable (expected)
TypeError: 'NotImplementedType' object is not callable (expected)
In [348]:
# ignore
display_class_hierarchy(ConcolicGrammarFuzzer)
Out[348]:
[Class diagram: ConcolicGrammarFuzzer (methods fuzz(), coalesce(), prune_tokens(), prune_tree(), tree_to_string(), update_grammar()) inherits from GrammarFuzzer (methods __init__(), check_grammar(), choose_node_expansion(), choose_tree_expansion(), expand_node_randomly(), expand_tree(), expand_tree_once(), expand_tree_with_strategy(), fuzz(), fuzz_tree(), log_tree(), process_chosen_children(), supported_opts()), which inherits from Fuzzer (methods __init__(), fuzz(), run(), runs()).]

Lessons Learned¶

  • Concolic execution can often provide more information than taint analysis with respect to the program behavior. However, this comes at a much larger runtime cost. Hence, unlike taint analysis, real-time analysis is often not possible.

  • Similar to taint analysis, concolic execution also suffers from limitations such as indirect control flow and internal function calls.

  • Predicates from concolic execution can be used in conjunction with fuzzing to provide an even more robust indication of incorrect behavior than taints, and can be used to create grammars that are better at producing valid inputs.

Next Steps¶

A costlier but stronger alternative to concolic fuzzing is symbolic fuzzing. Similarly, search-based fuzzing can often provide a cheaper exploration strategy than relying on SMT solvers to provide inputs slightly different from the current path.

Background¶

The technique of concolic execution was originally used to inform and expand the scope of symbolic execution \cite{king1976symbolic}, a static analysis technique for program analysis. Larson et al. \cite{Larson2003} were the first to use the concolic execution technique.

The idea of using proxy objects for collecting constraints was pioneered by Cadar et al. \cite{cadar2005execution}. The concolic execution technique for Python programs used in this chapter was pioneered by PeerCheck \cite{PeerCheck}, and Python Error Finder \cite{Barsotti2018}.

Exercises¶

Exercise 1: Implement a Concolic Float Proxy Class¶

While implementing the zint binary operators, we asserted that the results were int. However, that need not be the case. For example, division can result in float. Hence, we need proxy objects for float. Can you implement a similar proxy object for float and fix the zint binary operator definition?
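The problem is easy to reproduce with plain Python integers. The short check below only illustrates which integer operations leave the int type; it does not involve the proxy classes.

```python
true_quotient = 4 / 2      # true division always yields a float
floor_quotient = 4 // 2    # floor division of two ints stays an int
negative_power = 2 ** -1   # a negative exponent also yields a float

for name, value in [
    ("4 / 2", true_quotient),
    ("4 // 2", floor_quotient),
    ("2 ** -1", negative_power),
]:
    print(name, "=", value, type(value).__name__)
```

Any wrapper that asserts an int result would therefore fail on `/`, and on `**` with a negative exponent, even when both operands are integers.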

Solution. The solution is as follows.

As in the case of zint, we first open up zfloat for extension.

In [349]:
class zfloat(float):
    def __new__(cls, context, zn, v, *args, **kw):
        return float.__new__(cls, v, *args, **kw)

We then implement the initialization methods.

In [350]:
class zfloat(zfloat):
    @classmethod
    def create(cls, context, zn, v=None):
        return zproxy_create(cls, 'Real', z3.Real, context, zn, v)

    def __init__(self, context, z, v=None):
        self.z, self.v = z, v
        self.context = context

Next, we define a helper for the case where one of the arguments in a binary operation is not a zfloat.

In [351]:
class zfloat(zfloat):
    def _zv(self, o):
        return (o.z, o.v) if isinstance(o, zfloat) else (z3.RealVal(o), o)

Coerce float into bool value for use in conditionals.

In [352]:
class zfloat(zfloat):
    def __bool__(self):
        # force registering boolean condition
        if self != 0.0:
            return True
        return False

We define the common proxy method for the comparison methods.

In [353]:
def make_float_bool_wrapper(fname, fun, zfun):
    def proxy(self, other):
        z, v = self._zv(other)
        z_ = zfun(self.z, z)
        v_ = fun(self.v, v)
        return zbool(self.context, z_, v_)

    return proxy

We apply the comparison methods on the defined zfloat class.

In [354]:
FLOAT_BOOL_OPS = [
    '__eq__',
    # '__req__',
    '__ne__',
    # '__rne__',
    '__gt__',
    '__lt__',
    '__le__',
    '__ge__',
]
In [355]:
for fname in FLOAT_BOOL_OPS:
    fun = getattr(float, fname)
    zfun = getattr(z3.ArithRef, fname)
    setattr(zfloat, fname, make_float_bool_wrapper(fname, fun, zfun))
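The factory-plus-setattr pattern used here can be tried in isolation. The sketch below uses a hypothetical LoggedFloat class that only records which comparison method was called; it does not build any z3 expressions.

```python
calls = []  # names of the comparison methods that were invoked


class LoggedFloat(float):
    """Toy stand-in for zfloat (hypothetical): logs comparisons only."""


def make_logging_wrapper(fname, fun):
    # fname and fun are bound per call, so each wrapper keeps its own pair.
    def proxy(self, other):
        calls.append(fname)
        return fun(float(self), float(other))

    return proxy


for fname in ["__lt__", "__gt__", "__le__", "__ge__"]:
    fun = getattr(float, fname)
    setattr(LoggedFloat, fname, make_logging_wrapper(fname, fun))

a = LoggedFloat(1.5)
results = [a < 2.0, a > 2.0, a <= 1.5, a >= 3.0]
print(results)
print(calls)
```

Going through a factory function matters: defining proxy directly inside the loop would make every wrapper refer to the loop variables, which hold the last operator once the loop has finished.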

Similarly, we define the common proxy method for binary operators.

In [356]:
def make_float_binary_wrapper(fname, fun, zfun):
    def proxy(self, other):
        z, v = self._zv(other)
        z_ = zfun(self.z, z)
        v_ = fun(self.v, v)
        return zfloat(self.context, z_, v_)

    return proxy

We then apply them to zfloat.

In [357]:
FLOAT_BINARY_OPS = [
    '__add__',
    '__sub__',
    '__mul__',
    '__truediv__',
    # '__div__',
    '__mod__',
    # '__divmod__',
    '__pow__',
    # '__lshift__',
    # '__rshift__',
    # '__and__',
    # '__xor__',
    # '__or__',
    '__radd__',
    '__rsub__',
    '__rmul__',
    '__rtruediv__',
    # '__rdiv__',
    '__rmod__',
    # '__rdivmod__',
    '__rpow__',
    # '__rlshift__',
    # '__rrshift__',
    # '__rand__',
    # '__rxor__',
    # '__ror__',
]
In [358]:
for fname in FLOAT_BINARY_OPS:
    fun = getattr(float, fname)
    zfun = getattr(z3.ArithRef, fname)
    setattr(zfloat, fname, make_float_binary_wrapper(fname, fun, zfun))

These are used as follows.

In [359]:
with ConcolicTracer() as _:
    za = zfloat.create(_.context, 'float_a', 1.0)
    zb = zfloat.create(_.context, 'float_b', 0.0)
    if za * zb:
        print(1)
In [360]:
_.context
Out[360]:
({'float_a': 'Real', 'float_b': 'Real'}, [Not(float_a*float_b != 0)])

Finally, we fix the zint binary wrapper to correctly create zfloat when needed.

In [361]:
def make_int_binary_wrapper(fname, fun, zfun):  # type: ignore
    def proxy(self, other):
        z, v = self._zv(other)
        z_ = zfun(self.z, z)
        v_ = fun(self.v, v)
        if isinstance(v_, float):
            return zfloat(self.context, z_, v_)
        elif isinstance(v_, int):
            return zint(self.context, z_, v_)
        else:
            assert False

    return proxy
In [362]:
for fname in INT_BINARY_OPS:
    fun = getattr(int, fname)
    zfun = getattr(z3.ArithRef, fname)
    setattr(zint, fname, make_int_binary_wrapper(fname, fun, zfun))

We check whether it works as expected.

In [363]:
with ConcolicTracer() as _:
    v = _[binomial](4, 2)
In [364]:
_.zeval()
Out[364]:
('sat', {'n': ('4', 'Int'), 'k': ('2', 'Int')})

Exercise 2: Bit Manipulation¶

Similar to floats, implementing the bit manipulation functions such as xor involves converting int values to their bit vector equivalents, performing operations on them, and converting the results back to the original type. Can you implement the bit manipulation operations for zint?

Solution. The solution is as follows.

We first define the proxy method as before.

In [365]:
def make_int_bit_wrapper(fname, fun, zfun):
    def proxy(self, other):
        z, v = self._zv(other)
        z_ = z3.BV2Int(
            zfun(
                z3.Int2BV(
                    self.z, num_bits=64), z3.Int2BV(
                    z, num_bits=64)))
        v_ = fun(self.v, v)
        return zint(self.context, z_, v_)

    return proxy

It is then applied to the zint class.

In [366]:
BIT_OPS = [
    '__lshift__',
    '__rshift__',
    '__and__',
    '__xor__',
    '__or__',
    '__rlshift__',
    '__rrshift__',
    '__rand__',
    '__rxor__',
    '__ror__',
]
In [367]:
def init_concolic_4():
    for fname in BIT_OPS:
        fun = getattr(int, fname)
        zfun = getattr(z3.BitVecRef, fname)
        setattr(zint, fname, make_int_bit_wrapper(fname, fun, zfun))
In [368]:
INITIALIZER_LIST.append(init_concolic_4)
In [369]:
init_concolic_4()

Invert is the only unary bit manipulation method.

In [370]:
class zint(zint):
    def __invert__(self):
        return zint(self.context, z3.BV2Int(
            ~z3.Int2BV(self.z, num_bits=64)), ~self.v)

The function my_fn() computes the xor of its arguments using |, & and ~, and returns True if the result is non-zero.

In [371]:
def my_fn(a, b):
    o_ = (a | b)
    a_ = (a & b)
    if o_ & ~a_:
        return True
    else:
        return False
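That (a | b) & ~(a & b) is indeed the xor of a and b can be checked exhaustively on a small range of plain Python integers. The helper name xor_via_or_and() is ours and is used only for this check.

```python
def xor_via_or_and(a, b):
    """The expression tested by my_fn(), returned as a value."""
    return (a | b) & ~(a & b)


# Compare against Python's built-in xor on a small range, negatives included.
mismatches = [
    (a, b)
    for a in range(-8, 9)
    for b in range(-8, 9)
    if xor_via_or_and(a, b) != a ^ b
]
print(mismatches)
```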

We use it under ConcolicTracer.

In [372]:
with ConcolicTracer() as _:
    print(_[my_fn](2, 1))
True

We log the computed SMT expression to verify that everything went well.

In [373]:
_.zeval(log=True)
Predicates in path:
0 0 !=
BV2Int(int2bv(BV2Int(int2bv(my_fn_a_int_1) |
                     int2bv(my_fn_b_int_2))) &
       int2bv(BV2Int(~int2bv(BV2Int(int2bv(my_fn_a_int_1) &
                                    int2bv(my_fn_b_int_2))))))

(declare-const my_fn_a_int_1 Int)
(declare-const my_fn_b_int_2 Int)
(assert (let ((a!1 (bvnot (bvor (bvnot ((_ int2bv 64) my_fn_a_int_1))
                        (bvnot ((_ int2bv 64) my_fn_b_int_2))))))
(let ((a!2 (bvor (bvnot (bvor ((_ int2bv 64) my_fn_a_int_1)
                              ((_ int2bv 64) my_fn_b_int_2)))
                 a!1)))
  (not (= 0 (bv2int (bvnot a!2)))))))
(check-sat)
(get-model)

z3 -t:6000 /var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/tmp350zoite.smt
sat
(
  (define-fun my_fn_a_int_1 () Int
    (- 1))
  (define-fun my_fn_b_int_2 () Int
    (- 9223372036854775809))
)
Out[373]:
('sat', {'a': (['-', '1'], 'Int'), 'b': (['-', '9223372036854775809'], 'Int')})

We can confirm from the formulas generated that the bit manipulation functions worked correctly.
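The model returned by the solver looks odd at first, since b lies just outside the signed 64-bit range. A small model of the conversions in plain Python shows what happens. This is a sketch under two assumptions: Int2BV keeps the low 64 bits, and BV2Int reads the bits as an unsigned number, which we believe is z3's default. The function names are ours.

```python
MASK = (1 << 64) - 1  # 64 bits, matching num_bits=64 above


def int2bv(n):
    """Model of Int2BV: keep the low 64 bits (two's complement wraparound)."""
    return n & MASK


def bv2int(bits):
    """Model of BV2Int under the unsigned reading: bits are already >= 0."""
    return bits


def my_fn_bits(a, b):
    """my_fn() evaluated on the 64-bit model instead of Python ints."""
    o_ = int2bv(a) | int2bv(b)
    a_ = int2bv(a) & int2bv(b)
    return bv2int(o_ & (~a_ & MASK)) != 0


a, b = -1, -9223372036854775809  # the model reported by the solver
print(hex(int2bv(a)), hex(int2bv(b)), my_fn_bits(a, b))

# Python's ~0 is -1, while the unsigned 64-bit model yields 2**64 - 1.
unsigned_not_zero = bv2int(~int2bv(0) & MASK)
print(~0, unsigned_not_zero)
```

Under these assumptions, the solver's b wraps around to 0x7fffffffffffffff, and the path condition holds. The last two lines hint at a caveat: for negative results, the symbolic value obtained through an unsigned conversion can differ from the concrete Python value.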

Exercise 3: String Translation Functions¶

We have seen how to define upper() and lower(). Can you define the capitalize(), title(), and swapcase() methods?
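As a starting point, it helps to recall the concrete behavior that such proxies would need to mirror. The snippet below uses plain Python strings only; it is not a solution to the exercise and defines no symbolic counterpart.

```python
samples = ["hello world", "hELLO wORLD", "they're here"]

# Map each sample to its (capitalize, title, swapcase) results.
results = {s: (s.capitalize(), s.title(), s.swapcase()) for s in samples}

for s, (capitalized, titled, swapped) in results.items():
    print(repr(s), "->", repr(capitalized), repr(titled), repr(swapped))
```

Note that capitalize() lowercases everything after the first character, and that title() treats any non-letter, including an apostrophe, as a word boundary. A symbolic encoding would have to reproduce both details.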

Solution. Solution not yet available.