- Structure
- Dependencies
- Running the Tests
- Viewing Failure Traces
- Test Code Standards and Style Guide
- How the Tests Work
- Why Aren't We Using Existing Django Testing Modules?
Feature files live in the features directory.
The step implementations live in the steps directory, and it is recommended to group them into files based on the domain they interact with. For example, certain page editor interactions are common to several features, so the steps for these can be made generic and kept in page_editor.py, rather than being duplicated per feature.
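For orientation, the layout looks roughly like the sketch below. The functional_tests root directory name is an assumption; the other entries are the directories and files referred to throughout this document.

```
functional_tests/          # assumed root directory name
├── features/              # Gherkin .feature files
├── steps/                 # step implementations, grouped by domain (e.g. page_editor.py)
├── step_helpers/          # shared helper code usable from any step file
├── behave_fixtures.py     # database setup, snapshot and restore fixtures
├── environment.py         # behave hooks, fixture registration and custom parse types
└── docker-compose.yml     # backing services for the functional tests
```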
See the repo README section for the basic setup guide for dependencies.
These tests require the backing services for the app to be running, as they run a live server instance of the app. We accomplish this with docker compose, specifying the required backing services in the functional tests docker-compose.yml. These services are intentionally exposed on different ports from the main local development dependencies, to avoid conflicts and allow both to run simultaneously.
You can start the dependencies in the background with:
make functional-tests-up
and stop them with:
make functional-tests-down
Since the tests use a clean database state for every scenario, it is easier to create tests against a similarly fresh, empty database. However, it could quickly become inconvenient to have to tear down your main local environment every time you want to replicate this state. To get around this, we can use a separate test instance, found at http://localhost:18000/.
To start this test development app instance in docker along with the dependencies, run:
make functional-tests-dev-up
Then to stop it when you are finished, run:
make functional-tests-down
This will also stop and remove the functional tests development app along with the dependencies, if it is running.
To replicate the clean test database environment, you may want to completely clear your database. You can use the reset_db management command to do this, then re-run migrations to re-initialise the tables and seeded data. Ensure you include the DJANGO_SETTINGS_MODULE environment variable (or the --settings flag, as below), to avoid accidentally wiping your normal local development environment.
poetry run python manage.py reset_db --settings=cms.settings.functional_test
poetry run python manage.py migrate --settings=cms.settings.functional_test
Then for logging into the CMS, create a superuser with
DJANGO_SETTINGS_MODULE=cms.settings.functional_test make createsuperuser
Resetting and migrating the DB will always work, but it is slow. A faster option is to use DSLR snapshots. Note, however, that these snapshots may break if the database structure has changed, at which point you will need to reset and migrate again, then create a new, known-good snapshot.
Create a DSLR snapshot of the functional tests dev database with:
poetry run dslr --url postgresql://ons:ons@localhost:15432/ons snapshot <SNAPSHOT_NAME> # pragma: allowlist secret
Then restore it with:
poetry run dslr --url postgresql://ons:ons@localhost:15432/ons restore <SNAPSHOT_NAME> # pragma: allowlist secret
See the main README functional tests section for the basic commands for running the tests.
Some Playwright configuration options can be passed in through environment variables:
| Variable | Description | Default |
| --- | --- | --- |
| PLAYWRIGHT_HEADLESS | Toggle headless browser mode; set to "False" to show the browser window | True |
| PLAYWRIGHT_SLOW_MO | Sets the Playwright slow-mo delay in milliseconds | 0 |
| PLAYWRIGHT_BROWSER | Set the browser for Playwright to use; must be one of chromium, firefox, or webkit. NOTE: Currently only chromium is supported and tests may fail in other browsers | chromium |
| PLAYWRIGHT_TRACE | Toggle Playwright trace recording | True |
| PLAYWRIGHT_TRACES_DIR | Sets the location to write Playwright trace files if PLAYWRIGHT_TRACE is enabled. The default location <project_root>/tmp_traces is git ignored for convenience | <working_directory>/tmp_traces |
The tests record traces of all their actions, allowing you to follow through tests that previously ran and debug issues in remote environments.
Our GitHub Action is configured to save traces of any failed scenario and upload them.
- Navigate to the failed Action functional test run, expand the "Upload Failure Traces" job step and click the download link to download the zip file of all failed scenario traces.
- Unzip the downloaded file on the command line with unzip <path_to_file> (note that un-archiving using the macOS Finder may not work, as it recursively unzips the files inside, where we need the inner files to remain zipped).
- This should leave you with a zip file for each failed scenario.
- Open the trace zip files one at a time using the Playwright Trace Viewer (for example with the playwright show-trace command).
You should then be able to step through the failed tests and get a better idea of the state and cause of the failure.
Some tests may require objects to be set up in the database, such as a user or set of pages that the feature relies
upon. For this, we can use Factory Boy to seed data
directly into the database. These factories should be shared with the rest of the tests and kept in the tests
modules
alongside the code for consistency. If the functional tests need different defaults or behaviour from the factories,
then they can make use of factory traits.
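As a rough illustration, a shared factory might expose a trait that the functional tests switch on. This is a minimal sketch: the ArticlePageFactory name, model path and fields below are hypothetical.

```python
import factory
from django.utils import timezone


class ArticlePageFactory(factory.django.DjangoModelFactory):
    """Hypothetical factory shared between the unit, integration and functional tests."""

    class Meta:
        model = "articles.ArticlePage"  # hypothetical app and model

    title = factory.Sequence(lambda n: f"Article {n}")
    live = False  # default used by the other test suites

    class Params:
        # Functional tests can opt in with ArticlePageFactory(published=True)
        published = factory.Trait(live=True, first_published_at=factory.LazyFunction(timezone.now))
```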
We make use of the Behave Context object to store data that is needed across multiple steps of a scenario.
To prevent the context from becoming cluttered and confusing to use, we define some rules for how we interact with it:
Functions other than steps or hooks shouldn't be passed the entire context, and should certainly not modify it. Instead, pass in explicit variables from the context and return new ones as required. Try to make all non-step functions pure/deterministic.
This is to avoid the context use becoming obscured and hard to follow: when context variables are only set in top-level step functions, it is easy to see where the variables for any particular scenario are set, simply by walking through the steps. Passing the entire context object down to lower-level functions effectively obscures their true signatures, making the use and setting of context variables much harder to follow.
For example:
> [!TIP]
> Do this:

```python
from behave import step
from behave.runner import Context

@step('a thing happens')
def step_to_do_a_thing(context: Context):
    # Context variables are only ever set at the step level
    context.a_new_context_var = helper_function(context.my_scenario_data)
    ...

def helper_function(my_scenario_data):
    # Pure helper: explicit inputs in, new data out
    ...
    return new_data
```
> [!CAUTION]
> Not this:

```python
@step('a thing happens')
def step_to_do_a_thing(context: Context):
    helper_function_which_overwrites_context(context)
    ...

def helper_function_which_overwrites_context(context: Context) -> None:
    ...
    # The context is modified away from the step function, which is hard to follow
    context.a_new_context_var = new_data
```
Step files should not import code from other step files. Where code can be shared between steps, it should either live in the same file, or be factored out into the step_helpers module.
This is to avoid potential circular imports, and to make it clear which code is specific to certain steps and which is reusable across any steps.
Note that it is also perfectly valid to annotate the same step function with multiple different step wordings, for example so that the step reads grammatically in different scenarios.
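For instance, a single implementation can be registered under two wordings by stacking decorators. This is a minimal sketch: the wording, function name and the context.base_url attribute are illustrative.

```python
from behave import step
from behave.runner import Context


@step('the user is on the home page')
@step('the user navigates to the home page')
def step_user_on_home_page(context: Context):
    # Both wordings run the same implementation
    context.page.goto(context.base_url)
```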
Steps should be written in full and concise sentences, avoiding unnecessary abbreviations and shorthand. They should be as understandable and as non-technical as possible.
Assertions should use the Playwright assertions wherever possible. These have built-in retry and timeout logic where applicable, allowing grace periods for content to load on the page, so they are more robust than retrieving data from the page and making plain Python assertions.
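For example, a web-first assertion along these lines (the heading text is illustrative) will retry until it passes or the timeout expires:

```python
from playwright.sync_api import expect

# Retries automatically rather than failing on the first check
expect(context.page.get_by_role("heading", name="Example heading")).to_be_visible()
```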
Where we need step parameters to include more complex data than single strings or the other basic types supported by the default parser, we use custom registered types. These are registered in the environment.py so they are available to all steps.
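A minimal sketch of such a registration, assuming a hypothetical on/off toggle type (the real custom types in environment.py will differ):

```python
import parse
from behave import register_type


@parse.with_pattern(r"enabled|disabled")
def parse_toggle(text: str) -> bool:
    # Convert the matched text into a boolean for the step function
    return text == "enabled"


# Steps can then use the type in their patterns, e.g. 'the feature is {state:Toggle}'
register_type(Toggle=parse_toggle)
```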
Due to issues with the Django TransactionTestCase
which prevent
us using the built-in database teardown/setup in between scenarios, we have implemented our own database snapshot and
restore pattern between tests. We still make use of the Django test case, specifically the
LiveServerTestCase
to perform the initial database setup
and run a live server on a random port for the tests.
We are using DSLR for fast database snapshots and restores.
After we have used the Django test runner to set up the test database and initialise it by running migrations, we take a DSLR snapshot of this clean, initial state. In the test case fixture, post test, we then restore the clean snapshot, ensuring each test gets a clean, migrated database, isolated from other tests.
This setup is done in behave fixtures, kept in behave_fixtures.py. These fixtures are then registered in the behave hooks in environment.py.
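A simplified sketch of the pattern, shelling out to the dslr CLI (the real fixtures in behave_fixtures.py also handle the Django test runner setup, and the function and snapshot names here are illustrative):

```python
import subprocess

SNAPSHOT_NAME = "functional-tests-clean"  # illustrative snapshot name


def snapshot_clean_database() -> None:
    # Taken once, after the test database has been created and migrated
    # (database URL configuration is omitted here for brevity)
    subprocess.run(["dslr", "snapshot", SNAPSHOT_NAME], check=True)


def restore_clean_database() -> None:
    # Run after each scenario so every test starts from the same clean, migrated state
    subprocess.run(["dslr", "restore", SNAPSHOT_NAME], check=True)
```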
To give the scenario steps access to a playwright page, we set up a Playwright instance along with a browser and
browser context in the before_all
hook, so that it is started once at the beginning of the run. In the
before_scenario
hook, we then create a Playwright page object to be used by the scenario, passed through the behave
context. This page is closed in the after_scenario
hook, to ensure each scenario has its own separate page object.
If the PLAYWRIGHT_TRACE
environment variable is enabled, we also start trace recording at the beginning of the run,
and start a new trace "chunk" for each scenario, so that traces of individual failed scenarios can be saved to files.
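A minimal sketch of these hooks follows, using the sync Playwright API. The environment variable handling, the hard-coded tmp_traces path and the failed-only trace saving are illustrative; the real environment.py also applies the other PLAYWRIGHT_* options described above.

```python
import os

from behave.model_core import Status
from behave.runner import Context
from playwright.sync_api import sync_playwright

TRACING = os.getenv("PLAYWRIGHT_TRACE", "True") == "True"


def before_all(context: Context) -> None:
    # Started once for the whole run
    context.playwright = sync_playwright().start()
    context.browser = context.playwright.chromium.launch(
        headless=os.getenv("PLAYWRIGHT_HEADLESS", "True") != "False"
    )
    context.browser_context = context.browser.new_context()
    if TRACING:
        context.browser_context.tracing.start(screenshots=True, snapshots=True)


def before_scenario(context: Context, scenario) -> None:
    # Each scenario gets its own page, and its own trace chunk
    if TRACING:
        context.browser_context.tracing.start_chunk(title=scenario.name)
    context.page = context.browser_context.new_page()


def after_scenario(context: Context, scenario) -> None:
    context.page.close()
    if TRACING:
        # Only write the trace chunk to disk when the scenario failed
        path = f"tmp_traces/{scenario.name}.zip" if scenario.status == Status.failed else None
        context.browser_context.tracing.stop_chunk(path=path)
```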
At first glance it may appear our custom fixtures and database restore mechanism should be unnecessary, as there are multiple choices for modules out there which claim to do what we need. We tried these solutions first and ruled them out because of various incompatibilities with our app or testing requirements.
Pytest-BDD is built on Pytest, and we decided to move our unit and integration testing away from Pytest because it has compatibility issues with our multi-DB configuration, so this approach would have suffered the same issue.
Live server testing is accomplished with a fixture, which under the hood uses a Django LiveServerTestCase
. This
inherits from the
TransactionTestCase
, which causes us serious compatibility
issues, as it uses an isolated test database and flushes all data in between tests. We have migrations which seed critical data rows, so a flush operation
breaks the app. The serialised_rollback
option for the test case may present a solution in the future, but this depends on restoring any migration seeded data
with fixtures, which currently runs into an issue with Wagtail's Locale models.
We tried various workarounds such as using a fixture file to restore the data, but this runs into Wagtail issues, and even if it worked it would be non-ideal as that fixture file would have to be kept up to date and recreated when any new seeded data is added.
Behave-Django is a module which enables easier integration between the Behave BDD framework and a Django app. It uses
the Django StaticLiveServerTestCase
and Django test runner to wrap the Behave test runs. This means we run into the
exact same data flushing issue as we did with Pytest-BDD.
Also, the StaticLiveServerTestCase
is incompatible with Whitenoise, which we use to serve static content, so we would
have to override the test case. This was possible, but the setting was only exposed through command line arguments, so
it would make running the scenarios through an IDE with debugging features either imp