In the chapter on grammars, we have seen how to use *grammars* for very effective and efficient testing. In this chapter, we refine the previous *string-based* algorithm into a *tree-based* algorithm, which is much faster and allows for much more control over the production of fuzz inputs.

In the previous chapter, we have introduced the `simple_grammar_fuzzer()` function, which takes a grammar and automatically produces a syntactically valid string from it. However, `simple_grammar_fuzzer()` is just what its name suggests – simple. To illustrate the problem, let us get back to the `expr_grammar` we created from `EXPR_EBNF_GRAMMAR` in the chapter on grammars:

In [7]:

```
expr_grammar = convert_ebnf_grammar(EXPR_EBNF_GRAMMAR)
expr_grammar
```

Out[7]:

{'<start>': ['<expr>'], '<expr>': ['<term> + <expr>', '<term> - <expr>', '<term>'], '<term>': ['<factor> * <term>', '<factor> / <term>', '<factor>'], '<factor>': ['<sign-1><factor>', '(<expr>)', '<integer><symbol-1>'], '<sign>': ['+', '-'], '<integer>': ['<digit-1>'], '<digit>': ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], '<symbol>': ['.<integer>'], '<sign-1>': ['', '<sign>'], '<symbol-1>': ['', '<symbol>'], '<digit-1>': ['<digit>', '<digit><digit-1>']}

`expr_grammar` has an interesting property. If we feed it into `simple_grammar_fuzzer()`, the function gets stuck:

In [9]:

```
with ExpectTimeout(1):
    simple_grammar_fuzzer(grammar=expr_grammar, max_nonterminals=3)
```

To find out why, run `simple_grammar_fuzzer()` with the `log=True` argument to see the expansions.

In [10]:

```
quiz("Why does `simple_grammar_fuzzer()` hang?",
[
"It produces an infinite number of additions",
"It produces an infinite number of digits",
"It produces an infinite number of parentheses",
"It produces an infinite number of signs",
], '(3 * 3 * 3) ** (3 / (3 * 3))')
```

Out[10]:

Why does `simple_grammar_fuzzer()` hang?

Indeed! The problem is in this rule:

In [11]:

```
expr_grammar['<factor>']
```

Out[11]:

['<sign-1><factor>', '(<expr>)', '<integer><symbol-1>']

Here, any choice except for `(<expr>)` increases the number of symbols, even if only temporarily. Since we place a hard limit on the number of symbols to expand, the only choice left for expanding `<factor>` is `(<expr>)`, which leads to an *infinite addition of parentheses.*

The problem of potentially infinite expansion is only one of the problems with `simple_grammar_fuzzer()`. More problems include:

- *It is inefficient.* With each iteration, this fuzzer searches the string produced so far for symbols to expand. This becomes inefficient as the production string grows.
- *It is hard to control.* Even while limiting the number of symbols, it is still possible to obtain very long strings – and even infinitely long ones, as discussed above.

Let us illustrate both problems by plotting the time required for strings of different lengths.

In [16]:

```
trials = 50
xs = []
ys = []
for i in range(trials):
    with Timer() as t:
        s = simple_grammar_fuzzer(EXPR_GRAMMAR, max_nonterminals=15)
    xs.append(len(s))
    ys.append(t.elapsed_time())
    print(i, end=" ")
print()
```

In [17]:

```
average_time = sum(ys) / trials
print("Average time:", average_time)
```

Average time: 0.15491187840088969

In [18]:

```
%matplotlib inline
import matplotlib.pyplot as plt
plt.scatter(xs, ys)
plt.title('Time required for generating an output');
```

To address these problems, we need a *smarter algorithm* – one that is more efficient, that gets us better control over expansions, and that is able to foresee in `expr_grammar` that the `(<expr>)` alternative yields a potentially infinite expansion, in contrast to the other two.

To both obtain a more efficient algorithm *and* exercise better control over expansions, we will use a special representation for the strings that our grammar produces. The general idea is to use a *tree* structure that will be subsequently expanded – a so-called *derivation tree*. This representation allows us to always keep track of our expansion status – answering questions such as which elements have been expanded into which others, and which symbols still need to be expanded. Furthermore, adding new elements to a tree is far more efficient than replacing strings again and again.

A derivation tree (also known as *parse tree* or *concrete syntax tree*) consists of *nodes* which have other nodes (called *child nodes*) as their *children*. The tree starts with one node that has no parent; this is called the *root node*; a node without children is called a *leaf*.

We start with a single node as the root of the tree, representing the *start symbol* – in our case `<start>`.

In [21]:

```
# ignore
tree
```

Out[21]:

To expand the tree, we look for a nonterminal symbol without children. For `<start>`, the only expansion is `<expr>`, so we add it as a child.

In [23]:

```
# ignore
tree
```

Out[23]:

To construct the produced string from a derivation tree, we traverse the tree in order and collect the symbols at the leaves of the tree. In the case above, we obtain the string `"<expr>"`.

To further expand the tree, we choose another symbol to expand, and add its expansion as new children. This would get us the `<expr>` symbol, which gets expanded into `<expr> + <term>`, adding three children.

In [25]:

```
# ignore
tree
```

Out[25]:

We repeat the expansion until there are no symbols left to expand:

In [27]:

```
# ignore
tree
```

Out[27]:

We now have a representation for the string `2 + 2`. In contrast to the string alone, though, the derivation tree records *the entire structure* (and production history, or *derivation* history) of the produced string. It also allows for simple comparison and manipulation – say, replacing one subtree (substructure) with another.

To represent a derivation tree in Python, we use the following format. A node is a pair

```
(SYMBOL_NAME, CHILDREN)
```

where `SYMBOL_NAME` is a string representing the node (e.g. `"<start>"` or `"+"`) and `CHILDREN` is a list of children nodes.

`CHILDREN` can take some special values:

- `None` as a placeholder for future expansion. This means that the node is a *nonterminal symbol* that should be expanded further.
- `[]` (i.e., the empty list) to indicate *no* children. This means that the node is a *terminal symbol* that can no longer be expanded.

The type `DerivationTree` captures this very structure. (`Any` should actually read `DerivationTree`, but the Python static type checker cannot handle recursive types well.)

In [28]:

```
from typing import Any, List, Optional, Tuple
DerivationTree = Tuple[str, Optional[List[Any]]]
```

Let us define a simple derivation tree representing the intermediate step `<expr> + <term>`, above.

In [29]:

```
derivation_tree: DerivationTree = ("<start>",
                   [("<expr>",
                     [("<expr>", None),
                      (" + ", []),
                         ("<term>", None)]
                     )])
```

To inspect the structure of this tree, we use a function `display_tree()` that visualizes it.

This is how our tree is visualized:

In [47]:

```
display_tree(derivation_tree)
```

Out[47]:

In [48]:

```
quiz("And which of these is the internal representation of `derivation_tree`?",
[
"`('<start>', [('<expr>', (['<expr> + <term>']))])`",
"`('<start>', [('<expr>', (['<expr>', ' + ', <term>']))])`",
"`" + repr(derivation_tree) + "`",
"`(" + repr(derivation_tree) + ", None)`"
], len("eleven") - len("one"))
```

Out[48]:

And which of these is the internal representation of `derivation_tree`?

You can check it out yourself:

In [49]:

```
derivation_tree
```

Out[49]:

('<start>', [('<expr>', [('<expr>', None), (' + ', []), ('<term>', None)])])

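There is also a function `display_annotated_tree()`, which allows adding annotations to individual nodes.

If we want to see all the leaf nodes of a tree as a string, the following `all_terminals()` function comes in handy: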

In [52]:

```
def all_terminals(tree: DerivationTree) -> str:
    (symbol, children) = tree
    if children is None:
        # This is a nonterminal symbol not expanded yet
        return symbol

    if len(children) == 0:
        # This is a terminal symbol
        return symbol

    # This is an expanded symbol:
    # Concatenate all terminal symbols from all children
    return ''.join([all_terminals(c) for c in children])
```

In [53]:

```
all_terminals(derivation_tree)
```

Out[53]:

'<expr> + <term>'

The alternative `tree_to_string()` function also converts the tree to a string; however, it replaces nonterminal symbols by empty strings.

In [54]:

```
def tree_to_string(tree: DerivationTree) -> str:
    symbol, children, *_ = tree
    if children:
        return ''.join(tree_to_string(c) for c in children)
    else:
        return '' if is_nonterminal(symbol) else symbol
```

In [55]:

```
tree_to_string(derivation_tree)
```

Out[55]:

' + '

Let us now develop an algorithm that takes a tree with unexpanded symbols (say, `derivation_tree`, above), and expands all these symbols one after the other. As with earlier fuzzers, we create a special subclass of `Fuzzer` – in this case, `GrammarFuzzer`. A `GrammarFuzzer` gets a grammar and a start symbol; the other parameters will be used later to further control creation and to support debugging.

In [57]:

```
class GrammarFuzzer(Fuzzer):
    """Produce strings from grammars efficiently, using derivation trees."""

    def __init__(self,
                 grammar: Grammar,
                 start_symbol: str = START_SYMBOL,
                 min_nonterminals: int = 0,
                 max_nonterminals: int = 10,
                 disp: bool = False,
                 log: Union[bool, int] = False) -> None:
        """Produce strings from `grammar`, starting with `start_symbol`.
        If `min_nonterminals` or `max_nonterminals` is given, use them as limits
        for the number of nonterminals produced.
        If `disp` is set, display the intermediate derivation trees.
        If `log` is set, show intermediate steps as text on standard output."""

        self.grammar = grammar
        self.start_symbol = start_symbol
        self.min_nonterminals = min_nonterminals
        self.max_nonterminals = max_nonterminals
        self.disp = disp
        self.log = log
        self.check_grammar()  # Invokes is_valid_grammar()
```

To add further methods to `GrammarFuzzer`, we use the hack already introduced for the `MutationFuzzer` class. The construct

```
class GrammarFuzzer(GrammarFuzzer):
    def new_method(self, args):
        pass
```

allows us to add a new method `new_method()` to the `GrammarFuzzer` class. (Actually, we get a new `GrammarFuzzer` class that extends the old one, but for all our purposes, this does not matter.)

Let us now define a helper method `init_tree()` that constructs a tree with just the start symbol:

In [59]:

```
class GrammarFuzzer(GrammarFuzzer):
    def init_tree(self) -> DerivationTree:
        return (self.start_symbol, None)
```

In [60]:

```
f = GrammarFuzzer(EXPR_GRAMMAR)
display_tree(f.init_tree())
```

Out[60]:

This is the tree we want to expand.

One of the central methods in `GrammarFuzzer` is `choose_node_expansion()`. This method gets a node (say, the `<start>` node) and a list of possible lists of children to be expanded (one for every possible expansion from the grammar), chooses one of them, and returns its index in the possible children list.

In [61]:

```
class GrammarFuzzer(GrammarFuzzer):
    def choose_node_expansion(self, node: DerivationTree,
                              children_alternatives: List[List[DerivationTree]]) -> int:
        """Return index of expansion in `children_alternatives` to be selected.
           `children_alternatives`: a list of possible children for `node`.
           Defaults to random. To be overloaded in subclasses."""
        return random.randrange(0, len(children_alternatives))
```

To obtain the list of possible children, we need a helper function `expansion_to_children()` that takes an expansion string and decomposes it into a list of derivation trees – one for each symbol (terminal or nonterminal) in the string.

In [63]:

```
expansion_to_children("<term> + <expr>")
```

Out[63]:

[('<term>', None), (' + ', []), ('<expr>', None)]

The case of an *epsilon expansion*, i.e. expanding into an empty string as in `<symbol> ::=`, needs special treatment:

In [64]:

```
expansion_to_children("")
```

Out[64]:

[('', [])]

Just like `nonterminals()` in the chapter on grammars, we provide for future extensions, allowing the expansion to be a tuple with extra data (which will be ignored).

In [65]:

```
expansion_to_children(("+<term>", {"extra_data": 1234}))
```

Out[65]:

[('+', []), ('<term>', None)]
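The three calls above can be reproduced with a small stand-alone function. The following is only a sketch, not the chapter's implementation: the name `sketch_expansion_to_children()` and the regular expression for nonterminals (anything in angle brackets without spaces) are assumptions made for illustration.

```python
import re

# Assumed pattern: a nonterminal is "<...>" without spaces or nested brackets.
# The capturing group makes re.split() keep the nonterminals in its result.
NONTERMINAL_PATTERN = re.compile(r'(<[^<> ]*>)')


def sketch_expansion_to_children(expansion):
    """Split an expansion into a list of (symbol, children) nodes."""
    if isinstance(expansion, tuple):
        expansion = expansion[0]  # ignore extra data, keep the string

    if expansion == "":
        return [("", [])]  # epsilon expansion: a single empty terminal

    parts = NONTERMINAL_PATTERN.split(expansion)
    # Nonterminals get `None` (to be expanded), terminals get `[]` (done)
    return [(part, None) if NONTERMINAL_PATTERN.fullmatch(part) else (part, [])
            for part in parts if part]


print(sketch_expansion_to_children("<term> + <expr>"))
```

Empty strings produced by `re.split()` at the boundaries are filtered out, so that only actual symbols become nodes.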

We realize this helper as a method in `GrammarFuzzer` such that it can be overloaded by subclasses:

In [66]:

```
class GrammarFuzzer(GrammarFuzzer):
    def expansion_to_children(self, expansion: Expansion) -> List[DerivationTree]:
        return expansion_to_children(expansion)
```

With this, we can now

- take some non-expanded node in the tree,
- choose a random expansion, and
- return the new tree.

This is what the method `expand_node_randomly()` does.
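As a rough illustration of these three steps, here is a stand-alone sketch. It is not the `GrammarFuzzer` method itself: the function name, the helper `to_children()`, and the two-rule `TOY_GRAMMAR` are assumptions made up for this example.

```python
import random
import re

NONTERMINAL_PATTERN = re.compile(r'(<[^<> ]*>)')  # assumed nonterminal syntax

# Hypothetical mini-grammar, for illustration only
TOY_GRAMMAR = {
    "<integer>": ["<digit>", "<digit><integer>"],
    "<digit>": ["0", "1"],
}


def to_children(expansion):
    """Turn an expansion string into a list of child nodes."""
    if expansion == "":
        return [("", [])]
    return [(s, None) if NONTERMINAL_PATTERN.fullmatch(s) else (s, [])
            for s in NONTERMINAL_PATTERN.split(expansion) if s]


def sketch_expand_node_randomly(grammar, node):
    """Return a new node with one randomly chosen expansion as children."""
    symbol, children = node
    assert children is None, "node must not be expanded yet"

    # One list of children per alternative in the grammar rule
    alternatives = [to_children(e) for e in grammar[symbol]]
    index = random.randrange(len(alternatives))

    # Build a new node; the node passed in is left untouched
    return (symbol, alternatives[index])


node = ("<integer>", None)
expanded = sketch_expand_node_randomly(TOY_GRAMMAR, node)
print(expanded)
```

Since tuples are immutable, returning a fresh `(symbol, children)` pair is the natural way to express the expansion; the original `node` stays unexpanded.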

This is how `expand_node_randomly()` works:

In [71]:

```
f = GrammarFuzzer(EXPR_GRAMMAR, log=True)
print("Before expand_node_randomly():")
expr_tree = ("<integer>", None)
display_tree(expr_tree)
```

Before expand_node_randomly():

Out[71]:

In [72]:

```
print("After expand_node_randomly():")
expr_tree = f.expand_node_randomly(expr_tree)
display_tree(expr_tree)
```

After expand_node_randomly(): Expanding <integer> randomly

Out[72]:

In [73]:

```
# docassert
assert expr_tree[1][0][0] == '<digit>'
```

In [74]:

```
quiz("What tree do we get if we expand the `<digit>` subtree?",
[
"We get another `<digit>` as new child of `<digit>`",
"We get some digit as child of `<digit>`",
"We get another `<digit>` as second child of `<integer>`",
"The entire tree becomes a single node with a digit"
], 'len("2") + len("2")')
```

Out[74]:

What tree do we get if we expand the `<digit>` subtree?

We can surely put this to the test, right? Here we go:

In [75]:

```
digit_subtree = expr_tree[1][0] # type: ignore
display_tree(digit_subtree)
```

Out[75]:

In [76]:

```
print("After expanding the <digit> subtree:")
digit_subtree = f.expand_node_randomly(digit_subtree)
display_tree(digit_subtree)
```

After expanding the <digit> subtree: Expanding <digit> randomly

Out[76]:

We see that `<digit>` gets expanded again according to the grammar rules – namely, into a single digit.

In [77]:

```
quiz("Is the original `expr_tree` affected by this change?",
[
"No, it is unchanged",
"Yes, it has also gained a new child"
], "1 ** (1 - 1)")
```

Out[77]:

Is the original `expr_tree` affected by this change?

Although we have changed one of the subtrees, the original `expr_tree` is unaffected:

In [78]:

```
display_tree(expr_tree)
```

Out[78]:

`expand_node_randomly()` returns a new (expanded) tree and does not change the tree passed as argument.

Let us now apply our functions for expanding a single node to some node in the tree. To this end, we first need to *search the tree for non-expanded nodes*. `possible_expansions()` counts how many unexpanded symbols there are in a tree:

In [79]:

```
class GrammarFuzzer(GrammarFuzzer):
    def possible_expansions(self, node: DerivationTree) -> int:
        (symbol, children) = node
        if children is None:
            return 1

        return sum(self.possible_expansions(c) for c in children)
```

In [80]:

```
f = GrammarFuzzer(EXPR_GRAMMAR)
print(f.possible_expansions(derivation_tree))
```

2

The method `any_possible_expansions()` returns `True` if the tree has any non-expanded nodes.

In [81]:

```
class GrammarFuzzer(GrammarFuzzer):
    def any_possible_expansions(self, node: DerivationTree) -> bool:
        (symbol, children) = node
        if children is None:
            return True

        return any(self.any_possible_expansions(c) for c in children)
```

In [82]:

```
f = GrammarFuzzer(EXPR_GRAMMAR)
f.any_possible_expansions(derivation_tree)
```

Out[82]:

True

Here comes `expand_tree_once()`, the core method of our tree expansion algorithm. It first checks whether it is currently being applied on a nonterminal symbol without expansion; if so, it invokes `expand_node()` on it, as discussed above.
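A minimal sketch of one such expansion step is shown below. It assumes that, for a node that already has children, the step picks a random child that still contains unexpanded symbols and recurses into it. The names `sketch_expand_tree_once()`, `count_unexpanded()`, and the stand-in `expand_to_digit()` are hypothetical; unlike a method on the fuzzer, the expansion function is passed in as a parameter, and the tree is copied rather than changed in place.

```python
import random


def count_unexpanded(node):
    """Count the nodes whose children are still `None`."""
    symbol, children = node
    if children is None:
        return 1
    return sum(count_unexpanded(c) for c in children)


def sketch_expand_tree_once(tree, expand_node):
    """Expand exactly one unexpanded node somewhere in `tree`."""
    symbol, children = tree
    if children is None:
        # Unexpanded nonterminal: expand it right here
        return expand_node(tree)

    # Indices of children that still contain unexpanded nodes
    open_indices = [i for i, c in enumerate(children)
                    if count_unexpanded(c) > 0]
    if not open_indices:
        return tree  # nothing left to expand

    # Recurse into one randomly chosen child; copy the list of children
    i = random.choice(open_indices)
    new_children = list(children)
    new_children[i] = sketch_expand_tree_once(children[i], expand_node)
    return (symbol, new_children)


def expand_to_digit(node):
    """Stand-in expansion: every nonterminal becomes the terminal '1'."""
    return (node[0], [("1", [])])


tree = ("<start>", [("<expr>", [("<expr>", None), (" + ", []), ("<term>", None)])])
expanded = sketch_expand_tree_once(tree, expand_to_digit)
print(count_unexpanded(tree), count_unexpanded(expanded))
```

Each call closes one open node (and possibly opens new ones, depending on the expansion function), which is why the step is applied repeatedly.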

Let us illustrate how `expand_tree_once()` works. We start with our derivation tree from above...

In [84]:

```
derivation_tree = ("<start>",
                   [("<expr>",
                     [("<expr>", None),
                      (" + ", []),
                         ("<term>", None)]
                     )])
display_tree(derivation_tree)
```

Out[84]:

... and now expand it twice:

In [85]:

```
f = GrammarFuzzer(EXPR_GRAMMAR, log=True)
derivation_tree = f.expand_tree_once(derivation_tree)
display_tree(derivation_tree)
```

Expanding <term> randomly

Out[85]:

In [86]:

```
derivation_tree = f.expand_tree_once(derivation_tree)
display_tree(derivation_tree)
```

Expanding <expr> randomly

Out[86]:

With `expand_tree_once()`, we can keep on expanding the tree – but how do we actually stop? The key idea here, introduced by Luke (Luke 2000), is that after inflating the derivation tree to some maximum size, we *only want to apply expansions that increase the size of the tree by a minimum*. For `<factor>`, for instance, we would prefer an expansion into `<integer>`, as this will not introduce further recursion (and potential size inflation); for `<integer>`, likewise, an expansion into `<digit>` is preferred, as it increases tree size less than `<digit><integer>`.

To identify the *cost* of expanding a symbol, we introduce two functions that mutually rely on each other:

- `symbol_cost()` returns the minimum cost of all expansions of a symbol, using `expansion_cost()` to compute the cost for each expansion.
- `expansion_cost()` returns the sum of the costs of all symbols in a given expansion. If a nonterminal is encountered again during traversal, the cost of the expansion is $\infty$, indicating (potentially infinite) recursion.
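The mutual recursion can be sketched as follows. This is an illustration under stated assumptions rather than the chapter's code: the functions are stand-alone (not methods), `TOY_GRAMMAR` is a made-up arithmetic grammar, and the cost of an expansion is taken to be the sum of its nonterminals' costs plus one, with an expansion without nonterminals costing one.

```python
import re

# Hypothetical grammar for illustration (not the chapter's EXPR_GRAMMAR)
TOY_GRAMMAR = {
    "<start>": ["<expr>"],
    "<expr>": ["<term> + <expr>", "<term>"],
    "<term>": ["<factor> * <term>", "<factor>"],
    "<factor>": ["(<expr>)", "<integer>"],
    "<integer>": ["<digit><integer>", "<digit>"],
    "<digit>": ["0", "1"],
}


def nonterminals_in(expansion):
    """Return all nonterminals (assumed syntax: <...>) in an expansion."""
    return re.findall(r'<[^<> ]*>', expansion)


def sketch_symbol_cost(symbol, seen=frozenset()):
    """Minimum cost over all expansions of `symbol`."""
    return min(sketch_expansion_cost(e, seen | {symbol})
               for e in TOY_GRAMMAR[symbol])


def sketch_expansion_cost(expansion, seen=frozenset()):
    """Cost of one expansion; infinite if it revisits a symbol in `seen`."""
    symbols = nonterminals_in(expansion)
    if not symbols:
        return 1  # only terminals: a single expansion step
    if any(s in seen for s in symbols):
        return float('inf')  # (potentially infinite) recursion
    # Assumption: cost of all nonterminals inside, plus one for this step
    return sum(sketch_symbol_cost(s, seen) for s in symbols) + 1


print(sketch_symbol_cost("<digit>"), sketch_symbol_cost("<expr>"))
```

The `seen` set is what turns recursive alternatives such as `<digit><integer>` into infinite-cost choices, so that `min()` always finds a finite way out when one exists.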

In [88]:

```
f = GrammarFuzzer(EXPR_GRAMMAR)
assert f.symbol_cost("<digit>") == 1
```

The minimum cost of expanding `<expr>`, though, is five, as this is the minimum number of expansions required. (`<expr>` $\rightarrow$ `<term>` $\rightarrow$ `<factor>` $\rightarrow$ `<integer>` $\rightarrow$ `<digit>` $\rightarrow$ 1)

In [89]:

```
assert f.symbol_cost("<expr>") == 5
```

We now introduce `expand_node_by_cost(self, node, choose)`, a variant of `expand_node()` that takes the above cost into account. It determines the cost of each possible expansion, selects a cost `cost` using the `choose` function (by default `min`, the minimum cost), and then chooses a child from the list of alternatives with that cost. If multiple children all have the same chosen cost, it chooses randomly between these.

The shortcut `expand_node_min_cost()` passes `min()` as the `choose` function, which makes it expand nodes at minimum cost.
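The selection step can be illustrated in isolation. In the sketch below, the function name `sketch_choose_by_cost()` and the cost table `COSTS` are hypothetical; the costs are supplied through a lookup function instead of being computed from a grammar.

```python
import random


def sketch_choose_by_cost(expansions, cost_of, choose=min):
    """Pick one expansion among those whose cost equals choose(all costs)."""
    costs = [cost_of(e) for e in expansions]
    chosen_cost = choose(costs)

    # All alternatives that share the chosen cost are equally eligible
    candidates = [e for e, c in zip(expansions, costs) if c == chosen_cost]
    return random.choice(candidates)


# Hypothetical costs for the two alternatives of <integer>
COSTS = {"<digit>": 2, "<digit><integer>": float('inf')}
alternatives = list(COSTS)

cheapest = sketch_choose_by_cost(alternatives, COSTS.get, min)
costliest = sketch_choose_by_cost(alternatives, COSTS.get, max)
print(cheapest, costliest)
```

Passing `min` closes the tree as quickly as possible, whereas passing `max` favors the recursive alternative.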

In [91]:

```
class GrammarFuzzer(GrammarFuzzer):
    def expand_node_min_cost(self, node: DerivationTree) -> DerivationTree:
        if self.log:
            print("Expanding", all_terminals(node), "at minimum cost")

        return self.expand_node_by_cost(node, min)
```

We can now close the expansion of our derivation tree, using `expand_tree_once()` with the above `expand_node_min_cost()` as expansion function.

In [92]:

```
class GrammarFuzzer(GrammarFuzzer):
    def expand_node(self, node: DerivationTree) -> DerivationTree:
        return self.expand_node_min_cost(node)
```

In [93]:

```
f = GrammarFuzzer(EXPR_GRAMMAR, log=True)
display_tree(derivation_tree)
```

Out[93]:

In [94]:

```
# docassert
assert f.any_possible_expansions(derivation_tree)
```

In [95]:

```
if f.any_possible_expansions(derivation_tree):
    derivation_tree = f.expand_tree_once(derivation_tree)
    display_tree(derivation_tree)
```

Expanding <factor> at minimum cost

Out[95]:

In [96]:

```
# docassert
assert f.any_possible_expansions(derivation_tree)
```

In [97]:

```
if f.any_possible_expansions(derivation_tree):
    derivation_tree = f.expand_tree_once(derivation_tree)
    display_tree(derivation_tree)
```

Expanding <integer> at minimum cost

Out[97]:

In [98]:

```
# docassert
assert f.any_possible_expansions(derivation_tree)
```

In [99]:

```
if f.any_possible_expansions(derivation_tree):
    derivation_tree = f.expand_tree_once(derivation_tree)
    display_tree(derivation_tree)
```

Expanding <digit> at minimum cost

Out[99]:

We keep on expanding until all nonterminals are expanded.

In [100]:

```
while f.any_possible_expansions(derivation_tree):
    derivation_tree = f.expand_tree_once(derivation_tree)
```

Here is the final tree:

In [101]:

```
display_tree(derivation_tree)
```

Out[101]:

We see that in each step, `expand_node_min_cost()` chooses an expansion that does not increase the number of symbols, eventually closing all open expansions.

Especially at the beginning of an expansion, we may be interested in getting *as many nodes as possible* – that is, we'd like to prefer expansions that give us *more* nonterminals to expand. This is actually the exact opposite of what `expand_node_min_cost()` gives us, and we can implement a method `expand_node_max_cost()` that will always choose among the nodes with the *highest* cost:

In [102]:

```
class GrammarFuzzer(GrammarFuzzer):
    def expand_node_max_cost(self, node: DerivationTree) -> DerivationTree:
        if self.log:
            print("Expanding", all_terminals(node), "at maximum cost")

        return self.expand_node_by_cost(node, max)
```

To illustrate `expand_node_max_cost()`, we can again redefine `expand_node()` to use it, and then use `expand_tree_once()` to show a few expansion steps:

In [103]:

```
class GrammarFuzzer(GrammarFuzzer):
    def expand_node(self, node: DerivationTree) -> DerivationTree:
        return self.expand_node_max_cost(node)
```

In [104]:

```
derivation_tree = ("<start>",
                   [("<expr>",
                     [("<expr>", None),
                      (" + ", []),
                         ("<term>", None)]
                     )])
```

In [105]:

```
f = GrammarFuzzer(EXPR_GRAMMAR, log=True)
display_tree(derivation_tree)
```

Out[105]:

In [106]:

```
# docassert
assert f.any_possible_expansions(derivation_tree)
```

In [107]:

```
if f.any_possible_expansions(derivation_tree):
    derivation_tree = f.expand_tree_once(derivation_tree)
    display_tree(derivation_tree)
```

Expanding <expr> at maximum cost

Out[107]:

In [108]:

```
# docassert
assert f.any_possible_expansions(derivation_tree)
```

In [109]:

```
if f.any_possible_expansions(derivation_tree):
    derivation_tree = f.expand_tree_once(derivation_tree)
    display_tree(derivation_tree)
```

Expanding <term> at maximum cost

Out[109]:

In [110]:

```
# docassert
assert f.any_possible_expansions(derivation_tree)
```

In [111]:

```
if f.any_possible_expansions(derivation_tree):
    derivation_tree = f.expand_tree_once(derivation_tree)
    display_tree(derivation_tree)
```

Expanding <term> at maximum cost

Out[111]:

We can now put all three phases together in a single function `expand_tree()` which will work as follows:

- **Max cost expansion.** Expand the tree using expansions with maximum cost until we have at least `min_nonterminals` nonterminals. This phase can be easily skipped by setting `min_nonterminals` to zero.
- **Random expansion.** Keep on expanding the tree randomly until we reach `max_nonterminals` nonterminals.
- **Min cost expansion.** Close the expansion with minimum cost.

We implement these three phases by having `expand_node` reference the expansion method to apply. This is controlled by setting `expand_node` (the method reference) to first `expand_node_max_cost` (i.e., calling `expand_node()` invokes `expand_node_max_cost()`), then `expand_node_randomly`, and finally `expand_node_min_cost`. In the first two phases, we also set a maximum limit of `min_nonterminals` and `max_nonterminals`, respectively.
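To make the interplay of the three phases concrete, here is a compact, self-contained sketch. It is not the `GrammarFuzzer` implementation: it uses plain functions, passes the strategy as a `choose` argument (`max`, `None` for random, or `min`) instead of swapping a method reference, and works on a made-up `TOY_GRAMMAR`. All names are assumptions for illustration.

```python
import random
import re

NONTERMINAL = re.compile(r'(<[^<> ]*>)')  # assumed nonterminal syntax

TOY_GRAMMAR = {  # hypothetical grammar, for illustration only
    "<start>": ["<expr>"],
    "<expr>": ["<term> + <expr>", "<term>"],
    "<term>": ["(<expr>)", "<digit>"],
    "<digit>": ["0", "1", "2"],
}

def to_children(expansion):
    if expansion == "":
        return [("", [])]
    return [(s, None) if NONTERMINAL.fullmatch(s) else (s, [])
            for s in NONTERMINAL.split(expansion) if s]

def symbol_cost(symbol, seen=frozenset()):
    return min(expansion_cost(e, seen | {symbol}) for e in TOY_GRAMMAR[symbol])

def expansion_cost(expansion, seen=frozenset()):
    symbols = NONTERMINAL.findall(expansion)
    if not symbols:
        return 1
    if any(s in seen for s in symbols):
        return float('inf')
    return sum(symbol_cost(s, seen) for s in symbols) + 1

def expand_node(node, choose):
    """Expand `node`; `choose` is min, max, or None (random choice)."""
    symbol, _ = node
    expansions = TOY_GRAMMAR[symbol]
    if choose is not None:
        costs = [expansion_cost(e, frozenset({symbol})) for e in expansions]
        chosen = choose(costs)
        expansions = [e for e, c in zip(expansions, costs) if c == chosen]
    return (symbol, to_children(random.choice(expansions)))

def possible_expansions(node):
    symbol, children = node
    if children is None:
        return 1
    return sum(possible_expansions(c) for c in children)

def expand_tree_once(tree, choose):
    symbol, children = tree
    if children is None:
        return expand_node(tree, choose)
    open_indices = [i for i, c in enumerate(children)
                    if possible_expansions(c) > 0]
    i = random.choice(open_indices)
    new_children = list(children)
    new_children[i] = expand_tree_once(children[i], choose)
    return (symbol, new_children)

def expand_with_strategy(tree, choose, limit=None):
    # Expand until `limit` open nonterminals are reached (or none are left)
    while ((limit is None or possible_expansions(tree) < limit)
           and possible_expansions(tree) > 0):
        tree = expand_tree_once(tree, choose)
    return tree

def sketch_expand_tree(tree, min_nonterminals=0, max_nonterminals=10):
    tree = expand_with_strategy(tree, max, min_nonterminals)   # phase 1
    tree = expand_with_strategy(tree, None, max_nonterminals)  # phase 2
    return expand_with_strategy(tree, min)                     # phase 3

def leaves(tree):
    symbol, children = tree
    if not children:
        return symbol
    return ''.join(leaves(c) for c in children)

result = sketch_expand_tree(("<start>", None), min_nonterminals=3,
                            max_nonterminals=5)
print(leaves(result))
```

The final phase runs without a limit, so it only ends once no unexpanded nonterminal is left; because minimum-cost expansions never pick a recursive alternative, this phase always terminates for this grammar.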

Let us try this out on our example. We start with a half-expanded derivation tree:

In [113]:

```
initial_derivation_tree: DerivationTree = ("<start>",
                   [("<expr>",
                     [("<expr>", None),
                      (" + ", []),
                         ("<term>", None)]
                     )])
```

In [114]:

```
display_tree(initial_derivation_tree)
```

Out[114]:

In [115]:

```
f = GrammarFuzzer(
    EXPR_GRAMMAR,
    min_nonterminals=3,
    max_nonterminals=5,
    log=True)
derivation_tree = f.expand_tree(initial_derivation_tree)
```

This is the final derivation tree:

In [116]:

```
display_tree(derivation_tree)
```

Out[116]:

And this is the resulting string:

In [117]:

```
all_terminals(derivation_tree)
```

Out[117]:

'4 + 7 + 7.3 / 9'

Based on this, we can now define a function `fuzz()` that – like `simple_grammar_fuzzer()` – simply takes a grammar and produces a string from it. It thus no longer exposes the complexity of derivation trees.

In [118]:

```
class GrammarFuzzer(GrammarFuzzer):
    def fuzz_tree(self) -> DerivationTree:
        """Produce a derivation tree from the grammar."""
        tree = self.init_tree()
        # print(tree)

        # Expand all nonterminals
        tree = self.expand_tree(tree)
        if self.log:
            print(repr(all_terminals(tree)))
        if self.disp:
            display(display_tree(tree))
        return tree

    def fuzz(self) -> str:
        """Produce a string from the grammar."""
        self.derivation_tree = self.fuzz_tree()
        return all_terminals(self.derivation_tree)
```

We can now apply this to all our defined grammars (and visualize the derivation trees along the way):

In [119]:

```
f = GrammarFuzzer(EXPR_GRAMMAR)
f.fuzz()
```

Out[119]:

'+31.14 * -9 * -+(++(-(0.98 - 0 - 7) - +-+1.7 - -6 + 3 * 4)) * 5.0 + 70'

After calling `fuzz()`, the produced derivation tree is accessible in the `derivation_tree` attribute:

In [120]:

```
display_tree(f.derivation_tree)
```

Out[120]:

Let us try out the grammar fuzzer (and its trees) on other grammar formats.

In [121]:

```
f = GrammarFuzzer(URL_GRAMMAR)
f.fuzz()
```

Out[121]:

'https://www.google.com:63/x01?x71=81&x04=x51&abc=2'

In [122]:

```
display_tree(f.derivation_tree)
```

Out[122]:

In [123]:

```
f = GrammarFuzzer(CGI_GRAMMAR, min_nonterminals=3, max_nonterminals=5)
f.fuzz()
```

Out[123]:

'4%ca5%c3'

In [124]:

```
display_tree(f.derivation_tree)
```

Out[124]:

How do we stack up against `simple_grammar_fuzzer()`?

In [125]:

```
trials = 50
xs = []
ys = []
f = GrammarFuzzer(EXPR_GRAMMAR, max_nonterminals=20)
for i in range(trials):
    with Timer() as t:
        s = f.fuzz()
    xs.append(len(s))
    ys.append(t.elapsed_time())
    print(i, end=" ")
print()
```

In [126]:

```
average_time = sum(ys) / trials
print("Average time:", average_time)
```

Average time: 0.01903207001829287

In [127]:

```
%matplotlib inline
import matplotlib.pyplot as plt
plt.scatter(xs, ys)
plt.title('Time required for generating an output');
```

Does `GrammarFuzzer` work with `expr_grammar`, where `simple_grammar_fuzzer()` failed? It works without any issue:

In [128]:

```
f = GrammarFuzzer(expr_grammar, max_nonterminals=10)
f.fuzz()
```

Out[128]:

'5.5 * (6 / 4 + (2 - 5) * 6 / 1 * 0 + 1 + 8)'

With `GrammarFuzzer`, we now have a solid foundation on which to build further fuzzers and illustrate more exciting concepts from the world of generating software tests. Many of these do not even require writing a grammar – instead, they *infer* a grammar from the domain at hand, and thus allow using grammar-based fuzzing even without writing a grammar. Stay tuned!

This chapter introduces `GrammarFuzzer`, an efficient grammar fuzzer that takes a grammar to produce syntactically valid input strings. Here's a typical usage:

In [130]:

```
phone_fuzzer = GrammarFuzzer(US_PHONE_GRAMMAR)
phone_fuzzer.fuzz()
```

Out[130]:

'(694)767-9530'

The `GrammarFuzzer` constructor takes a number of keyword arguments to control its behavior. `start_symbol`, for instance, allows setting the symbol that expansion starts with (instead of `<start>`):

In [131]:

```
area_fuzzer = GrammarFuzzer(US_PHONE_GRAMMAR, start_symbol='<area>')
area_fuzzer.fuzz()
```

Out[131]:

'403'

Here's how to parameterize the `GrammarFuzzer` constructor:

In [132]:

```
# ignore
import inspect
```

In [133]:

```
# ignore
print(inspect.getdoc(GrammarFuzzer.__init__))
```

In [134]:

```
# ignore
from ClassDiagram import display_class_hierarchy
```

In [135]:

```
# ignore
display_class_hierarchy([GrammarFuzzer],
public_methods=[
Fuzzer.__init__,
Fuzzer.fuzz,
Fuzzer.run,
Fuzzer.runs,
GrammarFuzzer.__init__,
GrammarFuzzer.fuzz,
GrammarFuzzer.fuzz_tree,
],
types={
'DerivationTree': DerivationTree,
'Expansion': Expansion,
'Grammar': Grammar
},
project='fuzzingbook')
```

Out[135]: