In this chapter, we explore how to generate tests for Graphical User Interfaces (GUIs), abstracting from our previous examples on Web testing. Building on general means to extract user interface elements and activate them, our techniques generalize to arbitrary graphical user interfaces, from rich Web applications to mobile apps, and systematically explore user interfaces through forms and navigation elements.
Prerequisites
In the chapter on Web testing, we have shown how to test Web-based interfaces by directly interacting with a Web server using the HTTP protocol, and processing the retrieved HTML pages to identify user interface elements. While these techniques work well for user interfaces that are based on HTML only, they fail as soon as there are interactive elements that use JavaScript to execute code within the browser, and generate and change the user interface without having to interact with the server.
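To make this limitation concrete, here is a minimal sketch that is not part of the chapter's own code. It scans a made-up page with Python's standard `html.parser`; the page and its field names are assumptions chosen for illustration. The field that the script would create at run time never shows up in a static scan.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect the `name` attributes of all <input> tags in the HTML source."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            name = dict(attrs).get("name")
            if name is not None:
                self.names.append(name)


# Hypothetical page: one static field, and a script that
# would add a second field once it runs in a browser
PAGE = """
<form action="/order">
  <input type="text" name="name">
  <script>
    var field = document.createElement("input");
    field.name = "email";
    document.forms[0].appendChild(field);
  </script>
</form>
"""

collector = InputCollector()
collector.feed(PAGE)
static_names = collector.names  # contains only the field written in the HTML
print(static_names)
```

A browser that executes the script would offer two fields, which is why the rest of this chapter queries the running browser rather than the HTML source.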
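In this chapter, we therefore take a different approach to user interface testing. Rather than using HTTP and HTML as the mechanisms for interaction, we leverage a dedicated UI testing framework, which allows us to query the program under test for the user interface elements it offers, and to interact with these elements.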
Although we will again illustrate our approach using a Web server, the approach easily generalizes to arbitrary user interfaces. In fact, the UI testing framework we use, Selenium, is also available in variants for Android apps.
As in the chapter on Web testing, we run a Web server that allows us to order products.
# ignore
import os
import sys

if 'CI' in os.environ:
    # Can't run this in our continuous integration environment,
    # since it can't run a headless Web browser
    sys.exit(0)
db = init_db()
This is the address of our web server:
httpd_process, httpd_url = start_httpd()
print_url(httpd_url)
Using webbrowser(), we can retrieve the HTML of the home page, and use HTML() to render it.
HTML(webbrowser(httpd_url))
127.0.0.1 - - [30/Jun/2024 18:54:32] "GET / HTTP/1.1" 200 -
Let us take a look at the GUI above. In contrast to the chapter on Web testing, we do not assume we can access the HTML source of the current page. All we assume is that there is a set of user interface elements we can interact with.
Selenium is a framework for testing Web applications by automating interaction in the browser. Selenium provides an API that allows one to launch a Web browser, query the state of the user interface, and interact with individual user interface elements. The Selenium API is available in a number of languages; we use the Selenium API for Python.
A Selenium web driver is the interface between a program and a browser controlled by the program. The following code starts a Web browser in the background, which we then control through the web driver.
We support both Firefox and Google Chrome.
BROWSER = 'firefox' # Set to 'chrome' if you prefer Chrome
For Firefox, you have to make sure the geckodriver program is in your path.
import shutil

if BROWSER == 'firefox':
    assert shutil.which('geckodriver') is not None, \
        "Please install the 'geckodriver' executable " \
        "from https://github.com/mozilla/geckodriver/releases"
For Chrome, you may have to make sure the chromedriver program is in your path.
if BROWSER == 'chrome':
    assert shutil.which('chromedriver') is not None, \
        "Please install the 'chromedriver' executable " \
        "from https://chromedriver.chromium.org"
The browser is headless, meaning that it does not show on the screen.
HEADLESS = True
Note: If the notebook server runs locally (i.e. on the same machine on which you are seeing this), you can also set HEADLESS to False and see what happens right on the screen as you execute the notebook cells. This is very much recommended for interactive sessions.
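Before starting a driver, it can be handy to find out which of the two browsers is usable at all. The following is a small sketch along the lines of the checks above; the helper name `available_browsers` and the mapping `DRIVER_EXECUTABLES` are made up for this illustration.

```python
import shutil

# Driver executable each browser needs (the names used in the checks above)
DRIVER_EXECUTABLES = {"firefox": "geckodriver", "chrome": "chromedriver"}


def available_browsers(executables=DRIVER_EXECUTABLES):
    """Return the browsers whose driver executable is found on the PATH."""
    return [browser for browser, executable in executables.items()
            if shutil.which(executable) is not None]


print(available_browsers())
```

The result depends on the machine: it lists 'firefox' and/or 'chrome' if the respective driver is installed, and is empty otherwise.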
This code starts the Selenium web driver.
from selenium import webdriver

def start_webdriver(browser=BROWSER, headless=HEADLESS, zoom=1.4):
    """Start a Web browser and return the Selenium web driver controlling it."""
    # Set headless option
    if browser == 'firefox':
        options = webdriver.FirefoxOptions()
        if headless:
            # See https://www.browserstack.com/guide/firefox-headless
            options.add_argument("--headless")
    elif browser == 'chrome':
        options = webdriver.ChromeOptions()
        if headless:
            # See https://www.selenium.dev/blog/2023/headless-is-going-away/
            options.add_argument("--headless=new")
    else:
        raise ValueError("Select 'firefox' or 'chrome' as browser")

    # Start the browser, and obtain a _web driver_ object
    # such that we can interact with it.
    if browser == 'firefox':
        # For Firefox, set a higher resolution for our screenshots
        options.set_preference("layout.css.devPixelsPerPx", repr(zoom))
        gui_driver = webdriver.Firefox(options=options)

        # We set the window size such that it fits our order form exactly;
        # this is useful for not wasting too much space when taking screenshots.
        gui_driver.set_window_size(700, 300)
    elif browser == 'chrome':
        gui_driver = webdriver.Chrome(options=options)
        gui_driver.set_window_size(700, 210 if headless else 340)

    return gui_driver
gui_driver = start_webdriver(browser=BROWSER, headless=HEADLESS)
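In a notebook, the driver stays alive across cells. In a standalone script, one may prefer to shut the browser down reliably once the work is done. The following is one possible sketch; `managed_driver` is a hypothetical helper, not part of this chapter or of Selenium. It only relies on the `quit()` method that Selenium web drivers provide.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(start):
    """Start a driver via `start()` and quit it when the `with` block ends."""
    driver = start()
    try:
        yield driver
    finally:
        driver.quit()  # closes the browser, even if the block raised an exception


# Possible usage with the function defined above:
#
#     with managed_driver(start_webdriver) as driver:
#         driver.get(httpd_url)
```

Passing the start function rather than a driver keeps the helper independent of the chosen browser and its options.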
We can now interact with the browser programmatically. First, we have it navigate to the URL of our Web server:
gui_driver.get(httpd_url)
We see that the home page is actually accessed, together with a (failing) request to get a page icon:
print_httpd_messages()
127.0.0.1 - - [30/Jun/2024 18:54:37] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [30/Jun/2024 18:54:37] "GET /favicon.ico HTTP/1.1" 404 -
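When tests run automatically, failing requests like the one above are easier to spot by program than by eye. As a sketch, and assuming the log lines are available as strings in the format shown (how they are captured is not covered here), a regular expression can pull out the failed requests. The name `failed_requests` is made up for this example.

```python
import re

# Matches the request part of a log line such as
# 127.0.0.1 - - [30/Jun/2024 18:54:37] "GET / HTTP/1.1" 200 -
REQUEST_LOG = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')


def failed_requests(log_lines):
    """Return (path, status) pairs for all requests with a 4xx or 5xx status."""
    failures = []
    for line in log_lines:
        match = REQUEST_LOG.search(line)
        if match and int(match["status"]) >= 400:
            failures.append((match["path"], int(match["status"])))
    return failures


log = [
    '127.0.0.1 - - [30/Jun/2024 18:54:37] "GET / HTTP/1.1" 200 -',
    '127.0.0.1 - - [30/Jun/2024 18:54:37] "GET /favicon.ico HTTP/1.1" 404 -',
]
print(failed_requests(log))  # [('/favicon.ico', 404)]
```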
To see what the "headless" browser displays, we can obtain a screenshot. We see that it actually displays the home page.
Image(gui_driver.get_screenshot_as_png())
To interact with the Web page through Selenium and the browser, we can query Selenium for individual elements. For instance, we can access the UI element whose name attribute (as defined in HTML) is "name".
from selenium.webdriver.common.by import By
name = gui_driver.find_element(By.NAME, "name")
Once we have an element, we can interact with it. Since name is a text field, we can send it a string using the send_keys() method; the string will be translated into appropriate keystrokes.
name.send_keys("Jane Doe")
In the screenshot, we can see that the name field is now filled:
Image(gui_driver.get_screenshot_as_png())
Similarly, we can fill out the email, city, and ZIP fields:
email = gui_driver.find_element(By.NAME, "email")
email.send_keys("j.doe@example.com")
city = gui_driver.find_element(By.NAME, 'city')
city.send_keys("Seattle")
zip_field = gui_driver.find_element(By.NAME, 'zip')  # avoid shadowing the built-in zip()
zip_field.send_keys("98104")
Image(gui_driver.get_screenshot_as_png())
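The same pattern repeats for every text field: find the element by name, then send keys. A small helper can express this as a loop. This is only a sketch; `fill_form` is a hypothetical name, and the default locator "name" is the string value behind Selenium's `By.NAME`, so the helper itself does not need to import Selenium.

```python
def fill_form(driver, values, by="name"):
    """Type each value into the text field located by its key.

    `by` defaults to "name", the string value of Selenium's `By.NAME`.
    """
    for field_name, text in values.items():
        field = driver.find_element(by, field_name)
        field.clear()          # remove any text already in the field
        field.send_keys(text)  # type the new text


# Possible usage with the driver from above:
#
#     fill_form(gui_driver, {"name": "Jane Doe", "city": "Seattle"})
```

Clearing the field first is a design choice: `send_keys()` appends to existing content, so calling the helper twice would otherwise duplicate the text.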
The check box for terms and conditions is not filled out, but clicked instead using the click() method.
terms = gui_driver.find_element(By.NAME, 'terms')
terms.click()
Image(gui_driver.get_screenshot_as_png())
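Note that clicking a check box toggles it, so a second click would uncheck it again. If a test needs a definite state, it can consult the element's `is_selected()` method first. The helper below is a sketch with a made-up name, `set_checkbox`.

```python
def set_checkbox(element, checked=True):
    """Click `element` only if its current state differs from `checked`."""
    if element.is_selected() != checked:
        element.click()


# Possible usage with the element from above:
#
#     set_checkbox(terms)         # checked afterwards, however often we call it
#     set_checkbox(terms, False)  # unchecked afterwards
```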
The form is now fully filled out. By clicking on the submit button, we can place the order:
submit = gui_driver.find_element(By.NAME, 'submit')
submit.click()
We see that the order is being processed, and that the Web browser has switched to the confirmation page.
print_httpd_messages()
127.0.0.1 - - [30/Jun/2024 18:54:38] INSERT INTO orders VALUES ('tshirt', 'Jane Doe', 'j.doe@example.com', 'Seattle', '98104')
127.0.0.1 - - [30/Jun/2024 18:54:38] "GET /order?item=tshirt&name=Jane+Doe&email=j.doe%40example.com&city=Seattle&zip=98104&terms=on&submit=Place+order HTTP/1.1" 200 -
Image(gui_driver.get_screenshot_as_png())
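The request in the server log shows how the browser encoded our input: spaces became `+`, and `@` became `%40`. To check that the values typed into the form are the ones that arrived, one can decode the query string with Python's standard `urllib.parse`. The following sketch uses the request path copied from the log above.

```python
from urllib.parse import urlsplit, parse_qsl

# Request path copied from the server log above
request = ("/order?item=tshirt&name=Jane+Doe&email=j.doe%40example.com"
           "&city=Seattle&zip=98104&terms=on&submit=Place+order")

# Decode the query string into a dictionary of field names and values
fields = dict(parse_qsl(urlsplit(request).query))
print(fields["name"], fields["email"], fields["terms"])
# Jane Doe j.doe@example.com on
```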
Just as we fill out forms, we can also navigate through a website by clicking on links. Let us go back to the home page:
gui_driver.back()
Image(gui_driver.get_screenshot_as_png())
We can query the web driver for all elements of a particular type. Querying for HTML anchor elements (<a>), for instance, gives us all links on a page.
links = gui_driver.find_elements(By.TAG_NAME, "a")
We can query the attributes of UI elements – for instance, the URL the first anchor on the page links to:
links[0].get_attribute('href')
'http://127.0.0.1:8800/terms'
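As the output shows, the `href` attribute comes back as an absolute URL. When exploring a site, one usually wants to follow only those links that stay on the server under test. A possible filter, sketched here with the hypothetical name `same_site`, compares host and port using the standard `urllib.parse` module.

```python
from urllib.parse import urlsplit


def same_site(hrefs, base_url):
    """Keep only the URLs that point to the same host and port as `base_url`."""
    base = urlsplit(base_url).netloc
    # Anchors without an href yield None; skip those
    return [href for href in hrefs
            if href and urlsplit(href).netloc == base]


hrefs = ["http://127.0.0.1:8800/terms", "https://example.com/", None]
print(same_site(hrefs, "http://127.0.0.1:8800"))
# ['http://127.0.0.1:8800/terms']
```

With the driver from above, the input list could be obtained as `[link.get_attribute('href') for link in links]`, and the base URL would be `httpd_url`.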
What happens if we click on it? Very simple: We switch to the Web page being referenced.
links[0].click()
print_httpd_messages()
127.0.0.1 - - [30/Jun/2024 18:54:38] "GET /terms HTTP/1.1" 200 -
Image(gui_driver.get_screenshot_as_png())
Okay. Let's get back to our home page again.
gui_driver.back()