Fuzzing APIs

So far, we have always generated system input, i.e. data that the program as a whole obtains via its input channels. However, we can also generate inputs that go directly into individual functions, gaining flexibility and speed in the process. In this chapter, we explore the use of grammars to synthesize code for function calls, which allows you to generate program code that very efficiently invokes functions directly.


Fuzzing a Function

Let us start with our first problem: How do we fuzz a given function? For an interpreted language like Python, this is pretty straightforward. All we need to do is generate calls to the function(s) we want to test. This is something we can easily do with a grammar.

As an example, consider the urlparse() function from the Python library. urlparse() takes a URL and decomposes it into its individual components.

from urllib.parse import urlparse
urlparse("https://www.fuzzingbook.com/html/APIFuzzer.html")
ParseResult(scheme='https', netloc='www.fuzzingbook.com', path='/html/APIFuzzer.html', params='', query='', fragment='')

You see how the individual elements of the URL – the scheme ("https"), the network location ("www.fuzzingbook.com"), or the path ("/html/APIFuzzer.html") – are all properly identified. Other elements (like params, query, or fragment) are empty, because they were not part of our input.

To test urlparse(), we'd want to feed it a large set of different URLs. We can obtain these from the URL grammar we had defined in the "Grammars" chapter.

from Grammars import URL_GRAMMAR, is_valid_grammar, START_SYMBOL, new_symbol, opts, extend_grammar
from GrammarFuzzer import GrammarFuzzer, display_tree, all_terminals
url_fuzzer = GrammarFuzzer(URL_GRAMMAR)
for i in range(10):
    url = url_fuzzer.fuzz()
    print(urlparse(url))
ParseResult(scheme='https', netloc='user:password@cispa.saarland:8080', path='/', params='', query='', fragment='')
ParseResult(scheme='http', netloc='cispa.saarland:1', path='/', params='', query='', fragment='')
ParseResult(scheme='https', netloc='fuzzingbook.com:7', path='', params='', query='', fragment='')
ParseResult(scheme='https', netloc='user:password@cispa.saarland:80', path='', params='', query='', fragment='')
ParseResult(scheme='ftps', netloc='user:password@fuzzingbook.com', path='', params='', query='', fragment='')
ParseResult(scheme='ftp', netloc='fuzzingbook.com', path='/abc', params='', query='abc=x31&def=x20', fragment='')
ParseResult(scheme='ftp', netloc='user:password@fuzzingbook.com', path='', params='', query='', fragment='')
ParseResult(scheme='https', netloc='www.google.com:80', path='/', params='', query='', fragment='')
ParseResult(scheme='http', netloc='fuzzingbook.com:52', path='/', params='', query='', fragment='')
ParseResult(scheme='ftps', netloc='user:password@cispa.saarland', path='', params='', query='', fragment='')
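
For readers without the book's modules at hand, the scaffold above can be sketched with the standard library only. The grammar `TINY_URL_GRAMMAR` and the helpers `expand()` and `fuzz_function()` are assumptions made for this illustration; they are far simpler than `URL_GRAMMAR` and `GrammarFuzzer`.

```python
import random
import re
from urllib.parse import urlparse

# Hypothetical, reduced URL grammar for illustration (not the book's URL_GRAMMAR)
TINY_URL_GRAMMAR = {
    "<start>": ["<scheme>://<host><path>"],
    "<scheme>": ["http", "https", "ftp"],
    "<host>": ["cispa.saarland", "www.google.com", "fuzzingbook.com"],
    "<path>": ["", "/", "/abc", "/def"],
}


def expand(grammar, symbol="<start>", rng=random):
    """Pick a random alternative for `symbol` and expand its nonterminals."""
    expansion = rng.choice(grammar[symbol])
    return re.sub(r"<[^<> ]+>",
                  lambda match: expand(grammar, match.group(0), rng),
                  expansion)


def fuzz_function(function, grammar, trials=10, seed=0):
    """Call `function` on generated inputs; return (input, result) pairs."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(trials):
        arg = expand(grammar, rng=rng)
        # Any exception raised by `function` propagates and ends the run
        outcomes.append((arg, function(arg)))
    return outcomes


for url, result in fuzz_function(urlparse, TINY_URL_GRAMMAR, trials=3):
    print(url, "->", result)
```

Note that generation and execution happen in the same loop, which is exactly the coupling discussed next.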

This way, we can easily test any Python function – by setting up a scaffold that runs it. How would we proceed, though, if we wanted to have a test that can be re-run again and again, without having to generate new calls every time?

Synthesizing Code

The "scaffolding" method, as sketched above, has an important downside: It couples test generation and test execution into a single unit, so the two cannot run at different times or in different languages. To decouple them, we take another approach: Rather than generating inputs and immediately feeding them into a function, we synthesize code that invokes functions with a given input.

For instance, if we generate the string

call = "urlparse('http://www.example.com/')"

we can execute this string as a whole (and thus run the test) at any time:

eval(call)
ParseResult(scheme='http', netloc='www.example.com', path='/', params='', query='', fragment='')
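
To make the decoupling concrete, here is a minimal sketch that first assembles a standalone test script from synthesized calls and only later executes it. The list `calls` and the `results` collector are assumptions for illustration; in practice, the calls would come from a grammar fuzzer and the script could be written to a file.

```python
# Hypothetical list of synthesized calls, e.g. produced earlier by a fuzzer
calls = [
    "urlparse('http://www.example.com/')",
    "urlparse('https://www.example.com/path?x=1')",
]

# "Generation time": combine the calls into a self-contained script
script = "from urllib.parse import urlparse\n" + "".join(
    f"results.append({call})\n" for call in calls)

# "Execution time": possibly much later, possibly on another machine
namespace = {"results": []}
exec(script, namespace)
results = namespace["results"]

for call, result in zip(calls, results):
    print(call, "=", result)
```

Since `eval()` and `exec()` run arbitrary code, they should only be applied to strings that come from a trusted generator.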

To systematically generate such calls, we can again use a grammar:


URLPARSE_GRAMMAR = {
    "<call>": ['urlparse("<url>")']
}

# Import definitions from URL_GRAMMAR
URLPARSE_GRAMMAR.update(URL_GRAMMAR)
URLPARSE_GRAMMAR["<start>"] = ["<call>"]

assert is_valid_grammar(URLPARSE_GRAMMAR)

This grammar creates calls in the form urlparse(<url>), where <url> comes from the "imported" URL grammar. The idea is to create many of these calls and to feed them into the Python interpreter.

{'<call>': ['urlparse("<url>")'],
 '<start>': ['<call>'],
 '<url>': ['<scheme>://<authority><path><query>'],
 '<scheme>': ['http', 'https', 'ftp', 'ftps'],
 '<authority>': ['<host>', '<host>:<port>', '<userinfo>@<host>', '<userinfo>@<host>:<port>'],
 '<host>': ['cispa.saarland', 'www.google.com', 'fuzzingbook.com'],
 '<port>': ['80', '8080', '<nat>'],
 '<nat>': ['<digit>', '<digit><digit>'],
 '<digit>': ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'],
 '<userinfo>': ['user:password'],
 '<path>': ['', '/', '/<id>'],
 '<id>': ['abc', 'def', 'x<digit><digit>'],
 '<query>': ['', '?<params>'],
 '<params>': ['<param>', '<param>&<params>'],
 '<param>': ['<id>=<id>', '<id>=<nat>']}

We can now use this grammar for fuzzing and synthesizing calls to urlparse():

urlparse_fuzzer = GrammarFuzzer(URLPARSE_GRAMMAR)

Just as above, we can immediately execute these calls. To better see what is happening, we define a small helper function:

# Evaluate a call given as a string; print and return its result
def do_call(call_string):
    result = eval(call_string)
    print("\t= " + repr(result))
    return result
call = urlparse_fuzzer.fuzz()
do_call(call)
	= ParseResult(scheme='http', netloc='www.google.com', path='', params='', query='abc=def', fragment='')
ParseResult(scheme='http', netloc='www.google.com', path='', params='', query='abc=def', fragment='')

The same approach also works for other languages. If urlparse() were a C function, for instance, we could embed its call into some (also generated) C function:

URLPARSE_C_GRAMMAR = {
    "<cfile>": ["<cheader><cfunction>"],
    "<cheader>": ['#include "urlparse.h"\n\n'],
    "<cfunction>": ["void test() {\n<calls>}\n"],
    "<calls>": ["<call>", "<calls><call>"],
    "<call>": ['    urlparse("<url>");\n']
}
# Import definitions from URL_GRAMMAR
URLPARSE_C_GRAMMAR.update(URL_GRAMMAR)
URLPARSE_C_GRAMMAR["<start>"] = ["<cfile>"]
assert is_valid_grammar(URLPARSE_C_GRAMMAR)
urlparse_fuzzer = GrammarFuzzer(URLPARSE_C_GRAMMAR)
print(urlparse_fuzzer.fuzz())
#include "urlparse.h"

void test() {

Synthesizing Oracles

In our urlparse() example, both the Python and the C variants only check for generic errors in urlparse(); that is, they only detect fatal errors and exceptions. For a full test, we also need to set up a specific oracle that checks whether the result is valid.

Our plan is to check whether specific parts of the URL reappear in the result – that is, if the scheme is http, then the ParseResult returned should also contain http as its scheme. As discussed in the chapter on fuzzing with generators, equalities of strings such as http across two symbols cannot be expressed in a context-free grammar. We can, however, use a generator function (also introduced in the chapter on fuzzing with generators) to automatically enforce such equalities.

Here is an example. Invoking geturl() on a urlparse() result should return the URL as originally passed to urlparse().

from GeneratorGrammarFuzzer import GeneratorGrammarFuzzer, ProbabilisticGeneratorGrammarFuzzer
URLPARSE_ORACLE_GRAMMAR = extend_grammar(URLPARSE_GRAMMAR,
{
     "<call>": [("assert urlparse('<url>').geturl() == '<url>'",
                 opts(post=lambda url_1, url_2: [None, url_1]))]
})
urlparse_oracle_fuzzer = GeneratorGrammarFuzzer(URLPARSE_ORACLE_GRAMMAR)
test = urlparse_oracle_fuzzer.fuzz()
print(test)
exec(test)
assert urlparse('https://user:password@cispa.saarland/abc?abc=abc').geturl() == 'https://user:password@cispa.saarland/abc?abc=abc'
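
The essence of this oracle is that one generated URL appears in two places of the synthesized assertion. A minimal standard-library sketch of that idea follows; the lists `SCHEMES`, `HOSTS`, `PATHS`, and `QUERIES` and the helper `synthesize_roundtrip_test()` are assumptions for illustration and do not use the grammar machinery above.

```python
import random
from urllib.parse import urlparse

# Hypothetical URL building blocks, loosely following the URL grammar
SCHEMES = ["http", "https", "ftp", "ftps"]
HOSTS = ["cispa.saarland", "www.google.com", "fuzzingbook.com"]
PATHS = ["", "/", "/abc"]
QUERIES = ["", "?abc=def", "?abc=x31&def=20"]


def synthesize_roundtrip_test(rng):
    """Generate one URL and use it as both argument and expected result."""
    url = (rng.choice(SCHEMES) + "://" + rng.choice(HOSTS)
           + rng.choice(PATHS) + rng.choice(QUERIES))
    return f"assert urlparse({url!r}).geturl() == {url!r}"


rng = random.Random(42)
tests = [synthesize_roundtrip_test(rng) for _ in range(20)]
print(tests[0])

# Run the synthesized tests; a failing oracle raises AssertionError
for test in tests:
    exec(test, {"urlparse": urlparse})
```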

In a similar way, we can also check individual components of the result:

URLPARSE_ORACLE_GRAMMAR = extend_grammar(URLPARSE_GRAMMAR,
{
     "<call>": [("result = urlparse('<scheme>://<host><path>?<params>')\n"
                 # + "print(result)\n"
                 + "assert result.scheme == '<scheme>'\n"
                 + "assert result.netloc == '<host>'\n"
                 + "assert result.path == '<path>'\n"
                 + "assert result.query == '<params>'",
                 opts(post=lambda scheme_1, authority_1, path_1, params_1,
                      scheme_2, authority_2, path_2, params_2:
                      [None, None, None, None,
                       scheme_1, authority_1, path_1, params_1]))]
})

# Get rid of symbols that are no longer reachable from <start>
del URLPARSE_ORACLE_GRAMMAR["<url>"]
del URLPARSE_ORACLE_GRAMMAR["<query>"]
del URLPARSE_ORACLE_GRAMMAR["<authority>"]
del URLPARSE_ORACLE_GRAMMAR["<userinfo>"]
del URLPARSE_ORACLE_GRAMMAR["<port>"]
urlparse_oracle_fuzzer = GeneratorGrammarFuzzer(URLPARSE_ORACLE_GRAMMAR)
test = urlparse_oracle_fuzzer.fuzz()
print(test)
exec(test)
result = urlparse('https://www.google.com/?def=18&abc=abc')
assert result.scheme == 'https'
assert result.netloc == 'www.google.com'
assert result.path == '/'
assert result.query == 'def=18&abc=abc'
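
The effect of the post function above, namely that the second occurrence of each symbol repeats the value of the first, can be imitated in a few lines. The following sketch is an assumption-laden stand-in, not how GeneratorGrammarFuzzer works internally: the reduced `GRAMMAR` and the helpers `expand()` and `instantiate()` are hypothetical.

```python
import random
import re
from urllib.parse import urlparse

NONTERMINAL = re.compile(r"<[^<> ]+>")

# Hypothetical, reduced grammar for illustration
GRAMMAR = {
    "<scheme>": ["http", "https", "ftp", "ftps"],
    "<host>": ["cispa.saarland", "www.google.com", "fuzzingbook.com"],
    "<path>": ["", "/", "/<id>"],
    "<id>": ["abc", "def"],
    "<params>": ["<param>", "<param>&<params>"],
    "<param>": ["<id>=<id>"],
}

TEMPLATE = (
    "result = urlparse('<scheme>://<host><path>?<params>')\n"
    "assert result.scheme == '<scheme>'\n"
    "assert result.netloc == '<host>'\n"
    "assert result.path == '<path>'\n"
    "assert result.query == '<params>'"
)


def expand(symbol, rng):
    """Randomly expand `symbol` into a string of terminals."""
    return NONTERMINAL.sub(lambda m: expand(m.group(0), rng),
                           rng.choice(GRAMMAR[symbol]))


def instantiate(template, rng):
    """Expand each distinct symbol of `template` once; reuse it afterwards."""
    bindings = {}

    def bind(match):
        symbol = match.group(0)
        if symbol not in bindings:
            bindings[symbol] = expand(symbol, rng)
        return bindings[symbol]

    return NONTERMINAL.sub(bind, template)


rng = random.Random(7)
test = instantiate(TEMPLATE, rng)
print(test)
exec(test, {"urlparse": urlparse})
```

Binding by symbol name is simpler than the positional lambda, but it cannot express cases where two occurrences of the same symbol should differ.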

The use of generator functions may feel a bit cumbersome. Indeed, if we stick exclusively to Python, we could also create a unit test that directly invokes the fuzzer to generate individual parts:

def fuzzed_url_element(symbol):
    return GrammarFuzzer(URLPARSE_GRAMMAR, start_symbol=symbol).fuzz()
scheme = fuzzed_url_element("<scheme>")
authority = fuzzed_url_element("<authority>")
path = fuzzed_url_element("<path>")
query = fuzzed_url_element("<params>")
url = "%s://%s%s?%s" % (scheme, authority, path, query)
result = urlparse(url)
# print(result)
assert result.geturl() == url
assert result.scheme == scheme
assert result.path == path
assert result.query == query

Using such a unit test makes it easier to express oracles. However, we lose the ability to systematically cover individual URL elements and alternatives as with GrammarCoverageFuzzer as well as the ability to guide generation towards specific elements as with ProbabilisticGrammarFuzzer. Furthermore, a grammar allows us to generate tests for arbitrary programming languages and APIs.

Synthesizing Data

For urlparse(), we have used a very specific grammar for creating a very specific argument. Many functions take basic data types as (some) arguments, though; we therefore define grammars that generate precisely those arguments. Even better, we can define functions that generate grammars tailored towards our specific needs, returning values in a particular range, for instance.


We introduce a simple grammar to produce integers.

from Grammars import convert_ebnf_grammar, crange
from ProbabilisticGrammarFuzzer import ProbabilisticGrammarFuzzer
INT_EBNF_GRAMMAR = {
    "<start>": ["<int>"],
    "<int>": ["<_int>"],
    "<_int>": ["(-)?<leaddigit><digit>*"],
    "<leaddigit>": crange('1', '9'),
    "<digit>": crange('0', '9')
}

assert is_valid_grammar(INT_EBNF_GRAMMAR)
INT_GRAMMAR = convert_ebnf_grammar(INT_EBNF_GRAMMAR)
{'<start>': ['<int>'],
 '<int>': ['<_int>'],
 '<_int>': ['<symbol-1><leaddigit><digit-1>'],
 '<leaddigit>': ['1', '2', '3', '4', '5', '6', '7', '8', '9'],
 '<digit>': ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'],
 '<symbol>': ['-'],
 '<symbol-1>': ['', '<symbol>'],
 '<digit-1>': ['', '<digit><digit-1>']}
int_fuzzer = GrammarFuzzer(INT_GRAMMAR)
print([int_fuzzer.fuzz() for i in range(10)])
['699', '-44', '321', '-7', '-6', '67', '9', '87', '32', '1']
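
As a quick sanity check, the converted grammar printed above can be expanded with a few lines of standard-library code and compared against a regular expression. The helpers `expand()` and `is_canonical_int()` are assumptions for this sketch; the grammar itself is copied from the output above.

```python
import random
import re

NONTERMINAL = re.compile(r"<[^<> ]+>")

# Copied from the converted INT_GRAMMAR shown above
INT_GRAMMAR = {
    "<start>": ["<int>"],
    "<int>": ["<_int>"],
    "<_int>": ["<symbol-1><leaddigit><digit-1>"],
    "<leaddigit>": list("123456789"),
    "<digit>": list("0123456789"),
    "<symbol>": ["-"],
    "<symbol-1>": ["", "<symbol>"],
    "<digit-1>": ["", "<digit><digit-1>"],
}


def expand(grammar, symbol, rng):
    """Randomly expand `symbol` into a string of terminals."""
    return NONTERMINAL.sub(lambda m: expand(grammar, m.group(0), rng),
                           rng.choice(grammar[symbol]))


def is_canonical_int(text):
    """True if `text` is an optionally negated integer without leading zeros."""
    return re.fullmatch(r"-?[1-9][0-9]*", text) is not None


rng = random.Random(0)
samples = [expand(INT_GRAMMAR, "<start>", rng) for _ in range(1000)]
print(samples[:10])
print(all(is_canonical_int(s) for s in samples))
```

One observation from the regular expression: since `<leaddigit>` excludes 0, the grammar as shown never produces the value 0.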

If we need integers in a specific range, we can add a generator function that does just that:

from Grammars import set_opts
import random
def int_grammar_with_range(start, end):
    int_grammar = extend_grammar(INT_GRAMMAR)
    set_opts(int_grammar, "<int>", "<_int>",
        opts(pre=lambda: random.randint(start, end)))
    return int_grammar
int_fuzzer = GeneratorGrammarFuzzer(int_grammar_with_range(900, 1000))
[int_fuzzer.fuzz() for i in range(10)]
['930', '967', '938', '987', '959', '969', '984', '900', '941', '999']
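
To illustrate what the pre function does here, the following sketch imitates its effect in a tiny expander: if an alternative carries a "pre" option, the function's result replaces the expansion. This assumes a simplified model of GeneratorGrammarFuzzer; `expand()` and `ranged_int_grammar()` are hypothetical names, and options are plain dictionaries rather than the result of `opts()`.

```python
import random
import re

NONTERMINAL = re.compile(r"<[^<> ]+>")


def expand(grammar, symbol, rng):
    """Expand `symbol`; a "pre" function, if present, supplies the result."""
    alternative = rng.choice(grammar[symbol])
    options = {}
    if isinstance(alternative, tuple):
        alternative, options = alternative
    if "pre" in options:
        # The generator's value is used instead of expanding the alternative
        return str(options["pre"]())
    return NONTERMINAL.sub(lambda m: expand(grammar, m.group(0), rng),
                           alternative)


def ranged_int_grammar(start, end, rng):
    """Hypothetical counterpart of int_grammar_with_range() for this sketch."""
    return {
        "<start>": ["<int>"],
        "<int>": [("<_int>", {"pre": lambda: rng.randint(start, end)})],
        "<_int>": ["<leaddigit><digit>"],
        "<leaddigit>": list("123456789"),
        "<digit>": list("0123456789"),
    }


rng = random.Random(1)
grammar = ranged_int_grammar(900, 1000, rng)
values = [expand(grammar, "<start>", rng) for _ in range(10)]
print(values)
```

Without the "pre" option, this reduced grammar could only produce two-digit numbers; with it, all values fall into the requested range.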


The grammar for floating-point values closely resembles the integer grammar.

FLOAT_EBNF_GRAMMAR = {
    "<start>": ["<float>"],
    "<float>": [("<_float>", opts(prob=0.9)), "inf", "NaN"],
    "<_float>": ["<int>(.<digit>+)?<exp>?"],
    "<exp>": ["e<int>"]
}
# Import definitions from INT_EBNF_GRAMMAR
FLOAT_EBNF_GRAMMAR.update(INT_EBNF_GRAMMAR)
FLOAT_EBNF_GRAMMAR["<start>"] = ["<float>"]

assert is_valid_grammar(FLOAT_EBNF_GRAMMAR)
FLOAT_GRAMMAR = convert_ebnf_grammar(FLOAT_EBNF_GRAMMAR)
{'<start>': ['<float>'],
 '<float>': [('<_float>', {'prob': 0.9}), 'inf', 'NaN'],
 '<_float>': ['<int><symbol-2><exp-1>'],
 '<exp>': ['e<int>'],
 '<int>': ['<_int>'],
 '<_int>': ['<symbol-1-1><leaddigit><digit-1>'],
 '<leaddigit>': ['1', '2', '3', '4', '5', '6', '7', '8', '9'],
 '<digit>': ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'],
 '<symbol>': ['.<digit-2>'],
 '<symbol-1>': ['-'],
 '<symbol-2>': ['', '<symbol>'],
 '<exp-1>': ['', '<exp>'],
 '<symbol-1-1>': ['', '<symbol-1>'],
 '<digit-1>': ['', '<digit><digit-1>'],
 '<digit-2>': ['<digit>', '<digit><digit-2>']}
float_fuzzer = ProbabilisticGrammarFuzzer(FLOAT_GRAMMAR)
print([float_fuzzer.fuzz() for i in range(10)])
['880.32e-1017', '-7174.0e7132', '9e313', '8', '-579', '-65.420287', '-87118', '11.6', '-3.5e7', '2865']
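
The prob=0.9 annotation on <_float> explains why inf and NaN are rare in the output above. A minimal sketch of how such annotations could be interpreted follows, assuming that the probability mass left over by annotated alternatives is split evenly among the unannotated ones. The helpers `alternative_weights()` and `choose()` are hypothetical, and options are plain dictionaries rather than the result of `opts()`.

```python
import random


def alternative_weights(alternatives):
    """Return one probability per alternative.

    Alternatives are strings or (string, {"prob": p}) pairs. Annotated
    alternatives keep their probability; the remaining mass is split
    evenly among the unannotated ones.
    """
    given = [alt[1].get("prob") if isinstance(alt, tuple) else None
             for alt in alternatives]
    remaining = 1.0 - sum(p for p in given if p is not None)
    unspecified = given.count(None)
    return [p if p is not None else remaining / unspecified for p in given]


def choose(alternatives, rng=random):
    """Pick one alternative according to its (possibly implied) probability."""
    weights = alternative_weights(alternatives)
    chosen = rng.choices(alternatives, weights=weights, k=1)[0]
    return chosen[0] if isinstance(chosen, tuple) else chosen


FLOAT_ALTERNATIVES = [("<_float>", {"prob": 0.9}), "inf", "NaN"]
print(alternative_weights(FLOAT_ALTERNATIVES))

rng = random.Random(0)
picks = [choose(FLOAT_ALTERNATIVES, rng) for _ in range(1000)]
print(picks.count("<_float>"), picks.count("inf"), picks.count("NaN"))
```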
def float_grammar_with_range(start, end):
    float_grammar = extend_grammar(FLOAT_GRAMMAR)
    set_opts(float_grammar, "<float>", "<_float>", opts(
        pre=lambda: start + random.random() * (end - start)))
    return float_grammar
float_fuzzer = ProbabilisticGeneratorGrammarFuzzer(
    float_grammar_with_range(900.0, 900.9))
[float_fuzzer.fuzz() for i in range(10)]


Finally, we introduce a grammar for producing strings.

ASCII_STRING_EBNF_GRAMMAR = {
    "<start>": ["<ascii-string>"],
    "<ascii-string>": ['"<ascii-chars>"'],
    "<ascii-chars>": [
        ("", opts(prob=0.05)),
        "<ascii-chars><ascii-char>"
    ],
    "<ascii-char>": crange(" ", "!") + [r'\"'] + crange("#", "~")
}

assert is_valid_grammar(ASCII_STRING_EBNF_GRAMMAR)
ASCII_STRING_GRAMMAR = convert_ebnf_grammar(ASCII_STRING_EBNF_GRAMMAR)
string_fuzzer = ProbabilisticGrammarFuzzer(ASCII_STRING_GRAMMAR)
print([string_fuzzer.fuzz() for i in range(10)])
['"kx/0%A/%c:\']bR5xU.cxA39D5[x"', '"1"', '"L~4;@|KR2kv]Q!bc^>+"', '"P5yJ3*YW8z\\"acpWtiK8HaJ]bb4B]C9iHB(Pr.p9[*p7F\\=ywovM<=tww8KU`BA49"', '"zZ0\\"d=D+#yMMT](Ps]){J:}F`_JkuiU2M1NRxZ}#OskP2&f)w0"', '"gp:wsmYM`i\'\\"e*"', '"ScV$!Q_y~}X?YBa[3lwFppwDs}:o<2"', '"%JFvjr"', '"u{"', '"EC"']

Synthesizing Composite Data

From basic data, as discussed above, we can also produce composite data in data structures such as sets or lists. We illustrate such generation on lists.

LIST_EBNF_GRAMMAR = {
    "<start>": ["<list>"],
    "<list>": [
        ("[]", opts(prob=0.05)),
        "[<list-objects>]"
    ],
    "<list-objects>": [
        ("<list-object>", opts(prob=0.2)),
        "<list-object>, <list-objects>"
    ],
    "<list-object>": ["0"],
}

assert is_valid_grammar(LIST_EBNF_GRAMMAR)
LIST_GRAMMAR = convert_ebnf_grammar(LIST_EBNF_GRAMMAR)

Our list generator takes a grammar that produces objects; it then instantiates a list grammar with the objects from this grammar.

def list_grammar(object_grammar, list_object_symbol=None):
    obj_list_grammar = extend_grammar(LIST_GRAMMAR)
    if list_object_symbol is None:
        # Default: Use the first expansion of <start> as list symbol
        list_object_symbol = object_grammar[START_SYMBOL][0]

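    # Import the object definitions so that list_object_symbol is defined
    obj_list_grammar.update(object_grammar)
    obj_list_grammar[START_SYMBOL] = ["<list>"]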
    obj_list_grammar["<list-object>"] = [list_object_symbol]

    assert is_valid_grammar(obj_list_grammar)

    return obj_list_grammar
int_list_fuzzer = ProbabilisticGrammarFuzzer(list_grammar(INT_GRAMMAR))
[int_list_fuzzer.fuzz() for i in range(10)]
['[3, 9]',
 '[-6, 47, -19, 5, -3, 1, -5, -413, 76, -4, 2, -98, 909, -509, 1]',
 '[-24, 89, 4, 6, 5, 5, 6, 1, 8]',
 '[-3, 403, -85]',
 '[2, 14]',
 '[4, -6, 3, 7, -9, 5, 1, -5, -2, 27991, 7, -7, 3, 9, 3, 3]',
 '[49782, 1]',
 '[447, 7, 1, 3, 7, 8]',
 '[8, 6]']
string_list_fuzzer = ProbabilisticGrammarFuzzer(
    list_grammar(ASCII_STRING_GRAMMAR))
[string_list_fuzzer.fuzz() for i in range(10)]
['["r<rTOUC?X3Bn\\"OGt", "k1+wSqAeZai5pt`#uuU43X0\\"fxt`", "kB0:._Tg=x(A2({R\\",YDYf/IR@3Rd@1#\\pn*Lu/rW!TO!Uw5>q\'o:k"]',
 '["6r8Ci)#R: b^Bkb"]',
 '["7Kz", "1NJ1[taT_", "ge", "", "B", "", ""]',
 '["F;&4", "U&.", "M", "", "", ""]',
 '["2u4zz;d@j", ">h89R", "", "", ""]',
 '["", "EOd:", "zDt", "", "", ""]',
 '[">M)IQ", "UGdry %?1", "FUe", ">", "\\+"]',
 '["D1OQR tzep7:", "vgj`06Ft>+", ""]']
float_list_fuzzer = ProbabilisticGeneratorGrammarFuzzer(list_grammar(
    float_grammar_with_range(900.0, 900.9)))
[float_list_fuzzer.fuzz() for i in range(10)]
 '[900.6586938009385, 900.1464400151588]',
 '[900.5870984230792, NaN, 900.5992368058849]',
 '[900.7332185853884, 900.2330155096994, NaN, 900.3530406943618, 900.0391486744439]',
 '[900.3370542547488, 900.2617017382523, 900.2901034835673, 900.8889938351865, 900.2615386955117]',
 '[900.7128480393702, 900.2940932685525, 900.7256545239846, 900.2506315919088, 900.0564194116812, 900.6233377268288, inf, 900.0525066602173, 900.509096059324, 900.3377227393102, 900.5408047702748]',
 '[900.6506944163228, 900.4525626177693, 900.1865789425581, 900.1679928125902, 900.4515187503573, 900.0134415093579, 900.1959225955225, 900.2506600261744]',
 '[900.3556635906978, 900.8876446522335, 900.6444332077054]',
 '[900.6046932854488, 900.8789879882188, 900.5045363707462, 900.7545910175455, 900.4785616451219, 900.897948782847, 900.3745033546562, 900.2364202989435, 900.0731307485091, 900.8437453825687, NaN, 900.5044988370635, 900.5921575649608, 900.1068994359208]',

Generators for dictionaries, sets, etc. can be defined in a similar fashion. By plugging together grammar generators, we can produce data structures with arbitrary elements.
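
As one possible illustration, a dictionary generator could follow the same pattern as the list generator. The sketch below is standard-library only and rests on assumptions: `DICT_GRAMMAR`, `dict_grammar()`, and `expand()` are hypothetical names, keys are drawn from a fixed set, and values are given as a plain list of alternatives rather than as a full object grammar.

```python
import ast
import random
import re

NONTERMINAL = re.compile(r"<[^<> ]+>")

# Hypothetical dictionary grammar, modeled on the list grammar above
DICT_GRAMMAR = {
    "<start>": ["<dict>"],
    "<dict>": ["{}", "{<dict-items>}"],
    "<dict-items>": ["<dict-item>", "<dict-item>, <dict-items>"],
    "<dict-item>": ["<dict-key>: <dict-value>"],
    "<dict-key>": ['"a"', '"b"', '"c"'],
    "<dict-value>": ["0"],  # placeholder, replaced by dict_grammar()
}


def dict_grammar(value_alternatives):
    """Return a copy of DICT_GRAMMAR whose values come from the given list."""
    grammar = dict(DICT_GRAMMAR)
    grammar["<dict-value>"] = list(value_alternatives)
    return grammar


def expand(grammar, symbol, rng):
    """Randomly expand `symbol` into a string of terminals."""
    return NONTERMINAL.sub(lambda m: expand(grammar, m.group(0), rng),
                           rng.choice(grammar[symbol]))


rng = random.Random(3)
grammar = dict_grammar(["1", "-44", "[3, 9]", '"abc"'])
samples = [expand(grammar, "<start>", rng) for _ in range(5)]
for sample in samples:
    # literal_eval parses the generated text into an actual dictionary
    print(sample, "->", ast.literal_eval(sample))
```

Generated dictionaries may repeat a key; as with any Python dictionary literal, the last value then wins.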

Lessons Learned

  • To fuzz individual functions, one can easily set up grammars that produce function calls.
  • Fuzzing at the API level can be much faster than fuzzing at the system level, but brings the risk of false alarms by violating implicit preconditions.
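
The risk of false alarms can be seen in a small sketch. The function `average()` and the helper `fuzz_api()` are hypothetical examples: `average()` silently assumes a non-empty list, so feeding it an empty list raises an exception that would never occur if all its real callers respect that assumption.

```python
def average(values):
    """Implicit precondition: `values` is a non-empty list of numbers."""
    return sum(values) / len(values)


def fuzz_api(function, inputs, precondition=lambda arg: True):
    """Call `function` on each input satisfying `precondition`; collect alarms."""
    alarms = []
    for arg in inputs:
        if not precondition(arg):
            continue  # skip inputs the function is never meant to see
        try:
            function(arg)
        except Exception as exc:
            alarms.append((arg, type(exc).__name__))
    return alarms


inputs = [[1, 2, 3], [], [900.5], []]

# Without the precondition, empty lists trigger (false) alarms
print(fuzz_api(average, inputs))

# Making the precondition explicit filters them out
print(fuzz_api(average, inputs, precondition=bool))
```

Whether such an alarm is a false alarm or a genuine robustness problem depends on how the function is used in its application context.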

Next Steps

This chapter was all about manually writing tests and controlling which data gets generated. In the next chapter, we will introduce a much higher level of automation:

  • Carving automatically records function calls and arguments from program executions.
  • We can turn these into grammars, allowing us to test these functions with various combinations of recorded values.

With these techniques, we automatically obtain grammars that already invoke functions in application contexts, making our work of specifying them much easier.


The idea of using generator functions to generate input structures was first explored in QuickCheck [Claessen et al., 2000]. A very nice implementation for Python is the hypothesis package, which allows one to write and combine data structure generators for testing APIs.


The exercises for this chapter combine the above techniques with fuzzing techniques introduced earlier.

Exercise 1: Deep Arguments

In the example generating oracles for urlparse(), important elements such as authority or port are not checked. Enrich URLPARSE_ORACLE_GRAMMAR with post-expansion functions that store the generated elements in a symbol table, such that they can be accessed when generating the assertions.

Exercise 2: Covering Argument Combinations

In the chapter on configuration testing, we also discussed combinatorial testing – that is, systematic coverage of sets of configuration elements. Implement a scheme that, by changing the grammar, allows all pairs of argument values to be covered.

Exercise 3: Mutating Arguments

To widen the range of arguments to be used during testing, apply the mutation schemes introduced in mutation fuzzing – for instance, flip individual bytes or delete characters from strings. Apply this either during grammar inference or as a separate step when invoking functions.

The content of this project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The source code that is part of the content, as well as the source code used to format and display that content, is licensed under the MIT License. Last change: 2019-01-15 18:40:00+01:00