Fuzzing in the Large¶

In the past chapters, we have always looked at fuzzing taking place on one machine for a few seconds only. In the real world, however, fuzzers are run on dozens or even thousands of machines; for hours, days and weeks; for one program or dozens of programs. In such contexts, one needs an infrastructure to collect failure data from the individual fuzzer runs, and to aggregate such data in a central repository. In this chapter, we will examine such an infrastructure, the FuzzManager framework from Mozilla.

Prerequisites

  • This chapter requires basic knowledge of testing, e.g. from the Introduction to testing.
  • This chapter requires basic knowledge of how fuzzers work, e.g. from the Introduction to fuzzing.

Synopsis¶

To use the code provided in this chapter, write

>>> from fuzzingbook.FuzzingInTheLarge import <identifier>

and then make use of the following features.

The Python FuzzManager package allows for programmatic submission of failures from a large number of (fuzzed) programs. One can query crashes and their details, collect them into buckets to ensure they will be treated the same, and also retrieve coverage information for debugging both programs and their tests.

Collecting Crashes from Multiple Fuzzers¶

So far, all our fuzzing scenarios have been one fuzzer on one machine testing one program. Failures would be shown immediately, and diagnosed quickly by the same person who started the fuzzer. Alas, testing in the real world is different. Fuzzing is still fully automated; but now, we are talking about multiple fuzzers running on multiple machines testing multiple programs (and versions thereof), producing multiple failures that have to be handled by multiple people. This raises the question of how to manage all these activities and their interplay.

A common means to coordinate several fuzzers is to have a central repository that collects all crashes as well as their crash information. Whenever a fuzzer detects a failure, it connects via the network to a crash server, which then stores the crash information in a database.

In [3]:
# ignore
from graphviz import Digraph
In [4]:
# ignore
g = Digraph()
server = 'Crash Server'
g.node('Crash Database', shape='cylinder')
for i in range(1, 7):
    g.edge('Fuzzer ' + repr(i), server)
g.edge(server, 'Crash Database')
g
Out[4]:
[Figure: Fuzzer 1 through Fuzzer 6 each report to the Crash Server, which stores the crashes in the Crash Database.]
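To make this architecture concrete, here is a minimal in-process sketch of the idea; it is not how FuzzManager is implemented. The `ToyCrashServer` class, its `submit()` and `crashes_for()` methods, and the table layout are all hypothetical names chosen for illustration, and a real crash server would accept reports over the network rather than through direct method calls.

```python
import sqlite3


class ToyCrashServer:
    """Hypothetical stand-in for a crash server backed by a crash database."""

    def __init__(self) -> None:
        # An in-memory SQLite database plays the role of the crash database.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE crashes ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "fuzzer TEXT, product TEXT, stderr TEXT)"
        )

    def submit(self, fuzzer: str, product: str, stderr: str) -> int:
        """Store one crash report and return its database ID."""
        cursor = self.db.execute(
            "INSERT INTO crashes (fuzzer, product, stderr) VALUES (?, ?, ?)",
            (fuzzer, product, stderr),
        )
        self.db.commit()
        return cursor.lastrowid

    def crashes_for(self, product: str) -> list:
        """Return (id, fuzzer) pairs of all crashes reported for a product."""
        rows = self.db.execute(
            "SELECT id, fuzzer FROM crashes WHERE product = ? ORDER BY id",
            (product,),
        )
        return rows.fetchall()


server = ToyCrashServer()
# Six fuzzers report to the same server (the product name is made up).
ids = [
    server.submit(f"Fuzzer {i}", "my-program", "SEGV on unknown address")
    for i in range(1, 7)
]
print(ids)
print(server.crashes_for("my-program")[:2])
```

Since every report ends up in one table, a query such as `crashes_for()` sees the failures from all fuzzers at once, which is the point of centralizing them.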

The resulting crash database can be queried to find out which failures have occurred – typically, using a Web interface. It can also be integrated with other process activities. Most importantly, entries in the crash database can be linked to the bug database, and vice versa, such that bugs (= crashes) can be assigned to individual developers.

In such an infrastructure, collecting crashes is not limited to fuzzers. Crashes and failures occurring in the wild can also be automatically reported to the crash server. In industry, it is not uncommon to have crash databases collecting thousands of crashes from production runs – especially if the software in question is used by millions of people every day.

What information is stored in such a database?

  • Most important is the identifier of the product – that is, the product name, version information as well as the platform and the operating system. Without this information, there is no way developers can tell whether the bug is still around in the latest version, or whether it already has been fixed.

  • For debugging, the most helpful information for developers is the steps to reproduce – in a fuzzing scenario, this would be the input to the program in question. (In a production scenario, the user's input is not collected for obvious privacy reasons.)

  • Second most helpful for debugging is a stack trace, which lets developers inspect which internal functionality was active at the moment of the failure. A coverage map also comes in handy, since developers can query which functions were executed and which were not.

  • If general failures are collected, developers also need to know what the expected behavior was; for crashes, this is simple, as users do not expect their software to crash.

All of this information can be collected automatically if the fuzzer (or the program in question) is set up accordingly.
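As a rough illustration of the items above, such a database entry could be modeled as a simple record. This is only a sketch: the `CrashRecord` class, its field names, and `is_reproducible()` are hypothetical and do not reflect the schema of any particular crash database.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CrashRecord:
    """Hypothetical record mirroring the kinds of data listed above."""

    # Product identifier: name, version, platform, and operating system.
    product: str
    product_version: str
    platform: str
    os: str
    # Steps to reproduce: for a fuzzer, the failure-inducing input.
    testcase: Optional[bytes] = None
    # Stack trace, innermost frame first.
    stack_trace: List[str] = field(default_factory=list)
    # Only needed for general failures; crashes are never expected.
    expected_behavior: Optional[str] = None

    def is_reproducible(self) -> bool:
        """In this sketch, a record is reproducible if it carries a test case."""
        return self.testcase is not None


# A crash found by a fuzzer comes with its input ...
fuzzer_crash = CrashRecord(
    product="my-program", product_version="1.0",
    platform="x86-64", os="linux",
    testcase=b"\x00\xff", stack_trace=["crash", "main"],
)
# ... whereas a crash from production typically does not.
field_crash = CrashRecord(
    product="my-program", product_version="1.0",
    platform="x86-64", os="linux",
    stack_trace=["crash", "main"],
)
print(fuzzer_crash.is_reproducible(), field_crash.is_reproducible())
```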

In this chapter, we will explore a platform that automates all these steps. The FuzzManager platform allows us to

  1. collect failure data from failing runs,
  2. enter this data into a centralized server, and
  3. query the server via a Web interface.

In this chapter, we will show how to conduct basic steps with FuzzManager, including crash submission and triage as well as coverage measurement tasks.

Running a Crash Server¶

FuzzManager is a tool chain for managing large-scale fuzzing processes. It is modular in the sense that you can make use of those parts you need; it is versatile in the sense that it does not impose a particular process. It consists of a server whose task is to collect crash data, as well as of various collector utilities that collect crash data to send it to the server.

Logging In¶

Now that the server is up and running, FuzzManager can be reached on the local host using this URL.

In [29]:
fuzzmanager_url = "http://127.0.0.1:8000"

To log in, use the username demo and the password demo. In this notebook, we do this programmatically, using the Selenium interface introduced in the chapter on GUI fuzzing.

For an interactive session, set headless to False; then you can interact with FuzzManager at the same time you are interacting with this notebook.

In [33]:
gui_driver = start_webdriver(headless=True, zoom=1.2)
In [34]:
gui_driver.set_window_size(1400, 600)
In [35]:
gui_driver.get(fuzzmanager_url)

This is the starting screen of FuzzManager:

In [36]:
# ignore
Image(gui_driver.get_screenshot_as_png())
Out[36]:

We now log in by sending demo both as username and password, and then click on the Login button.

In [37]:
# ignore
from selenium.webdriver.common.by import By
In [38]:
# ignore
username = gui_driver.find_element(By.NAME, "username")
username.send_keys("demo")
In [39]:
# ignore
password = gui_driver.find_element(By.NAME, "password")
password.send_keys("demo")
In [40]:
# ignore
login = gui_driver.find_element(By.TAG_NAME, "button")
login.click()
time.sleep(1)

After login, we find an empty database. This is where crashes will appear, once we have collected them.

In [41]:
# ignore
Image(gui_driver.get_screenshot_as_png())
Out[41]:

Collecting Crashes¶

To fill our database, we need some crashes. Let us take a look at simply-buggy, an example repository containing trivial C++ programs for illustration purposes.

In [42]:
!git clone https://github.com/uds-se/simply-buggy
Cloning into 'simply-buggy'...
remote: Enumerating objects: 22, done.
remote: Total 22 (delta 0), reused 0 (delta 0), pack-reused 22
Receiving objects: 100% (22/22), 4.90 KiB | 4.90 MiB/s, done.
Resolving deltas: 100% (9/9), done.

The make command compiles our target programs, including our first target, the simple-crash example. Alongside each program, a configuration file is also generated.

In [43]:
!(cd simply-buggy && make)
clang++ -fsanitize=address -g -o maze maze.cpp
clang++ -fsanitize=address -g -o out-of-bounds out-of-bounds.cpp
clang++ -fsanitize=address -g -o simple-crash simple-crash.cpp

Let's take a look at the simple-crash source code in simple-crash.cpp. As you can see, the source code is fairly simple: it forces a crash by writing to a (near-)NULL pointer. This should immediately crash on most machines.

In [44]:
# ignore
from bookutils import print_file
In [45]:
# ignore
print_file("simply-buggy/simple-crash.cpp")
/*
 * simple-crash - A simple NULL crash.
 *
 * WARNING: This program neither makes sense nor should you code like it is
 *          done in this program. It is purely for demo purposes and uses
 *          bad and meaningless coding habits on purpose.
 */

int crash() {
  int* p = (int*)0x1;
  *p = 0xDEADBEEF;
  return *p;
}

int main(int argc, char** argv) {
  return crash();
}

The configuration file simple-crash.fuzzmanagerconf generated for the binary also contains some straightforward information, like the version of the program and other metadata that is required or at least useful later on when submitting crashes.

In [46]:
# ignore
print_file("simply-buggy/simple-crash.fuzzmanagerconf", lexer=IniLexer())
[Main]
platform = x86-64
product = simple-crash-simple-crash
product_version = 83038f74e812529d0fc172a718946fbec385403e
os = linux

[Metadata]
pathPrefix = /Users/zeller/Projects/fuzzingbook/notebooks/simply-buggy/
buildFlags = -fsanitize=address -g
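Since the file follows the INI layout, its contents can also be read with Python's standard `configparser` module. The sketch below is for illustration only and is not how FuzzManager itself loads the file; the `CONF_TEXT` sample copies the structure shown above, with a placeholder value for `pathPrefix`.

```python
import configparser

# Sample content shaped like the generated file; the pathPrefix value
# is a placeholder, not a real path.
CONF_TEXT = """\
[Main]
platform = x86-64
product = simple-crash-simple-crash
product_version = 83038f74e812529d0fc172a718946fbec385403e
os = linux

[Metadata]
pathPrefix = /path/to/simply-buggy/
buildFlags = -fsanitize=address -g
"""

parser = configparser.ConfigParser()
# By default, configparser lowercases option names; keep "pathPrefix" as is.
parser.optionxform = str
parser.read_string(CONF_TEXT)

main = dict(parser["Main"])
metadata = dict(parser["Metadata"])
print(main["product"], main["platform"])
print(metadata["buildFlags"])
```

Note that only the first `=` on a line separates key and value, so the `=` inside `-fsanitize=address` remains part of the value.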

Let us run the program! We immediately get a crash trace as expected:

In [47]:
!simply-buggy/simple-crash
AddressSanitizer:DEADLYSIGNAL
=================================================================
==35049==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000001 (pc 0x000104c4fee0 bp 0x00016b1b27b0 sp 0x00016b1b2770 T0)
==35049==The signal is caused by a UNKNOWN memory access.
==35049==Hint: address points to the zero page.
    #0 0x104c4fee0 in crash() simple-crash.cpp:11
    #1 0x104c4ff5c in main simple-crash.cpp:16
    #2 0x1a1c3fe4c  (<unknown module>)

==35049==Register values:
 x[0] = 0x0000000000000001   x[1] = 0x000000016b1b2ae8   x[2] = 0x000000016b1b2af8   x[3] = 0x000000016b1b2e50  
 x[4] = 0x000000016b1b22f8   x[5] = 0x00000001a1c68c2c   x[6] = 0x00000001feccacf0   x[7] = 0x0000000000000000  
 x[8] = 0x0000007000020000   x[9] = 0x00000000deadbeef  x[10] = 0x0000000000000001  x[11] = 0x0000000000000002  
x[12] = 0x000000016b1b278a  x[13] = 0x0000000000000001  x[14] = 0x0000000000000001  x[15] = 0xfffffffffffffffe  
x[16] = 0x000000000000004a  x[17] = 0x82ca0001feccad28  x[18] = 0x0000000000000000  x[19] = 0x0000000104c59ca0  
x[20] = 0x0000000104c4ff44  x[21] = 0x0000000104c59cc0  x[22] = 0x000000016b1b2970  x[23] = 0x00000001a1cb5000  
x[24] = 0x00000001fd7d7340  x[25] = 0x0000000000000000  x[26] = 0x0000000000000000  x[27] = 0x0000000000000000  
x[28] = 0x0000000000000000     fp = 0x000000016b1b27b0     lr = 0x0000000104c4ff60     sp = 0x000000016b1b2770  
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV simple-crash.cpp:11 in crash()
==35049==ABORTING

Now, what we would actually like to do is to run this binary from Python instead, detect that it crashed, collect the trace and submit it to the server. Let's start with a simple script that would just run the program we give it and detect the presence of the ASan trace:

In [49]:
import subprocess

cmd = ["simply-buggy/simple-crash"]
In [50]:
result = subprocess.run(cmd, stderr=subprocess.PIPE)
stderr = result.stderr.decode().splitlines()
crashed = False

for line in stderr:
    if "ERROR: AddressSanitizer" in line:
        crashed = True
        break

if crashed:
    print("Yay, we crashed!")
else:
    print("Move along, nothing to see...")
Yay, we crashed!
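The same check can be packaged as a reusable function. The following is a sketch under a few assumptions: the name `crashed_with_asan()` is made up, and the two stand-in commands merely imitate a crashing and a non-crashing program, so that the example runs without the compiled binary.

```python
import subprocess
import sys
from typing import List

ASAN_MARKER = "ERROR: AddressSanitizer"


def crashed_with_asan(cmd: List[str]) -> bool:
    """Run cmd and report whether its stderr contains an ASan error line."""
    result = subprocess.run(cmd, stderr=subprocess.PIPE)
    stderr_lines = result.stderr.decode(errors="replace").splitlines()
    return any(ASAN_MARKER in line for line in stderr_lines)


# Stand-in commands: a Python child process that writes a fake ASan line
# to stderr, and one that exits silently.
fake_crash = [
    sys.executable, "-c",
    "import sys; sys.stderr.write('==1==ERROR: AddressSanitizer: SEGV\\n')",
]
no_crash = [sys.executable, "-c", "pass"]

print(crashed_with_asan(fake_crash))
print(crashed_with_asan(no_crash))
```

With the compiled target available, one would call `crashed_with_asan(["simply-buggy/simple-crash"])` instead.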

With this script, we can now run the binary and indeed detect that it crashed. But how do we send this information to the crash server now? Let's add a few features from the FuzzManager toolbox.

Program Configurations¶

A ProgramConfiguration is largely a container class storing various properties of the program, e.g. product name, the platform, version and runtime options. By default, it reads the information from the .fuzzmanagerconf file created for the program under test.

In [51]:
import sys
sys.path.append('FuzzManager')
In [53]:
from FTB.ProgramConfiguration import ProgramConfiguration

configuration = ProgramConfiguration.fromBinary('simply-buggy/simple-crash')
(configuration.product, configuration.platform)
Out[53]:
('simple-crash-simple-crash', 'x86-64')

Crash Info¶

A CrashInfo object stores all the necessary data about a crash, including

  • the stdout output of your program
  • the stderr output of your program
  • crash information as produced by GDB or AddressSanitizer
  • a ProgramConfiguration instance

Let's collect the information for the run of simple-crash:

In [55]:
cmd = ["simply-buggy/simple-crash"]
result = subprocess.run(cmd, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
In [56]:
stderr = result.stderr.decode().splitlines()
stderr[0:3]
Out[56]:
['AddressSanitizer:DEADLYSIGNAL',
 '=================================================================',
 '==35056==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000001 (pc 0x00010448bee0 bp 0x00016b976850 sp 0x00016b976810 T0)']
In [57]:
stdout = result.stdout.decode().splitlines()
stdout
Out[57]:
[]

The method CrashInfo.fromRawCrashData() reads and parses our ASan trace into a more generic format, returning a CrashInfo object that we can inspect and/or submit to the server:

In [58]:
from FTB.Signatures.CrashInfo import CrashInfo

crashInfo = CrashInfo.fromRawCrashData(stdout, stderr, configuration)
print(crashInfo)
Crash trace:

# 00    crash
# 01    main
# 02    <unknow

Crash address: 0x1

Last 5 lines on stderr:
x[24] = 0x00000001fd7d7340  x[25] = 0x0000000000000000  x[26] = 0x0000000000000000  x[27] = 0x0000000000000000  
x[28] = 0x0000000000000000     fp = 0x000000016b976850     lr = 0x000000010448bf60     sp = 0x000000016b976810  
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV simple-crash.cpp:11 in crash()
==35056==ABORTING
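To get a feeling for what parsing a trace into a generic format involves, here is a much simplified sketch that extracts function names and the crash address from ASan output with regular expressions. It is not FuzzManager's parser: `parse_asan()` and both patterns are hypothetical, handle only SEGV reports, and, unlike the output above, skip frames that name no function.

```python
import re
from typing import List, Optional, Tuple

# Matches ASan frames that name a function, e.g.
#   "    #0 0x104c4fee0 in crash() simple-crash.cpp:11"
FRAME_RE = re.compile(r"^\s*#(\d+)\s+0x[0-9a-fA-F]+\s+in\s+([^\s(]+)")
ADDRESS_RE = re.compile(
    r"AddressSanitizer: SEGV on unknown address (0x[0-9a-fA-F]+)"
)


def parse_asan(stderr_lines: List[str]) -> Tuple[List[str], Optional[int]]:
    """Return the function names (innermost first) and the crash address."""
    frames: List[str] = []
    address: Optional[int] = None
    for line in stderr_lines:
        frame = FRAME_RE.match(line)
        if frame:
            frames.append(frame.group(2))
            continue
        addr = ADDRESS_RE.search(line)
        if addr:
            address = int(addr.group(1), 16)
    return frames, address


# Lines taken from the trace of simple-crash shown earlier.
sample = [
    "==35049==ERROR: AddressSanitizer: SEGV on unknown address "
    "0x000000000001 (pc 0x000104c4fee0 bp 0x00016b1b27b0 "
    "sp 0x00016b1b2770 T0)",
    "    #0 0x104c4fee0 in crash() simple-crash.cpp:11",
    "    #1 0x104c4ff5c in main simple-crash.cpp:16",
    "    #2 0x1a1c3fe4c  (<unknown module>)",
]
frames, address = parse_asan(sample)
print(frames, hex(address))
```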

Collector¶

The last step is to send the crash info to our crash manager. A Collector is a client for communicating with a CrashManager server. It provides a simple interface for submitting crashes as well as for downloading and matching existing signatures, which avoids reporting frequent issues repeatedly.

We instantiate the collector instance; this will be our entry point for talking to the server.

In [60]:
from Collector.Collector import Collector

collector = Collector()

To submit the crash info, we use the collector's submit() method:

In [61]:
collector.submit(crashInfo)
Out[61]:
{'rawStdout': '',
 'rawStderr': 'AddressSanitizer:DEADLYSIGNAL\n=================================================================\n==35056==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000001 (pc 0x00010448bee0 bp 0x00016b976850 sp 0x00016b976810 T0)\n==35056==The signal is caused by a UNKNOWN memory access.\n==35056==Hint: address points to the zero page.\n    #0 0x10448bee0 in crash() simple-crash.cpp:11\n    #1 0x10448bf5c in main simple-crash.cpp:16\n    #2 0x1a1c3fe4c  (<unknown module>)\n\n==35056==Register values:\n x[0] = 0x0000000000000001   x[1] = 0x000000016b976b80   x[2] = 0x000000016b976b90   x[3] = 0x000000016b976ed8  \n x[4] = 0x000000016b976398   x[5] = 0x00000001a1c68c2c   x[6] = 0x00000001feccacf0   x[7] = 0x0000000000000000  \n x[8] = 0x0000007000020000   x[9] = 0x00000000deadbeef  x[10] = 0x0000000000000001  x[11] = 0x0000000000000002  \nx[12] = 0x000000016b97682a  x[13] = 0x0000000000000001  x[14] = 0x0000000000000001  x[15] = 0xfffffffffffffffe  \nx[16] = 0x000000000000004a  x[17] = 0x82ca0001feccad28  x[18] = 0x0000000000000000  x[19] = 0x0000000104495ca0  \nx[20] = 0x000000010448bf44  x[21] = 0x0000000104495cc0  x[22] = 0x000000016b976a10  x[23] = 0x00000001a1cb5000  \nx[24] = 0x00000001fd7d7340  x[25] = 0x0000000000000000  x[26] = 0x0000000000000000  x[27] = 0x0000000000000000  \nx[28] = 0x0000000000000000     fp = 0x000000016b976850     lr = 0x000000010448bf60     sp = 0x000000016b976810  \nAddressSanitizer can not provide additional info.\nSUMMARY: AddressSanitizer: SEGV simple-crash.cpp:11 in crash()\n==35056==ABORTING',
 'rawCrashData': '',
 'metadata': '{"pathPrefix": "/Users/zeller/Projects/fuzzingbook/notebooks/simply-buggy/", "buildFlags": "-fsanitize=address -g"}',
 'testcase_size': 0,
 'testcase_quality': 0,
 'testcase_isbinary': False,
 'platform': 'x86-64',
 'product': 'simple-crash-simple-crash',
 'product_version': '83038f74e812529d0fc172a718946fbec385403e',
 'os': 'linux',
 'client': 'Braeburn.fritz.box',
 'tool': 'fuzzingbook',
 'env': '',
 'args': '',
 'bucket': None,
 'id': 1,
 'shortSignature': '[@ crash]',
 'crashAddress': '0x1'}
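The returned value is a plain dictionary, so its fields can be processed further. As a small sketch, the following uses an abbreviated copy of the response above (only a few keys, with a placeholder for `pathPrefix`) to decode the metadata, which is itself a JSON-encoded string, and to check whether the crash has been assigned to a bucket.

```python
import json

# Abbreviated copy of the response shown above; only a few keys are kept
# and the pathPrefix value is a placeholder.
response = {
    "metadata": '{"pathPrefix": "/path/to/simply-buggy/", '
                '"buildFlags": "-fsanitize=address -g"}',
    "product": "simple-crash-simple-crash",
    "bucket": None,
    "id": 1,
    "shortSignature": "[@ crash]",
    "crashAddress": "0x1",
}

# The metadata field needs a second decoding step.
metadata = json.loads(response["metadata"])
# The crash address is reported as a hexadecimal string.
crash_address = int(response["crashAddress"], 16)
# A freshly submitted crash has no bucket yet.
is_bucketed = response["bucket"] is not None

print(metadata["buildFlags"])
print(crash_address, is_bucketed)
```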

Inspecting Crashes¶

We have now submitted a crash to our local FuzzManager demo instance. If you run the crash server on your local machine, go to http://127.0.0.1:8000/crashmanager/crashes/ to see the crash info just submitted. There, you can look up the product, version, operating system, and further crash details.

In [62]:
# ignore
gui_driver.refresh()
In [63]:
# ignore
Image(gui_driver.get_screenshot_as_png())
Out[63]:

If you click on the crash ID, you can further inspect the submitted data.

In [64]:
# ignore
crash = gui_driver.find_element(By.XPATH, '//td/a[contains(@href,"/crashmanager/crashes/")]')
crash.click()
time.sleep(1)
In [65]:
# ignore
Image(gui_driver.get_screenshot_as_png())
Out[65]: