README for yarn, a scenario testing tool
* You need Python 2. Yarn's dependencies do not work with Python 3.
* Install ttystatus: <http://git.liw.fi/cgi-bin/cgit/cgit.cgi/ttystatus>
* Install cliapp: <http://git.liw.fi/cgi-bin/cgit/cgit.cgi/cliapp>
* Install Python-Markdown:
* You can install them from source, or using your operating system's
package manager (if they've been packaged for your operating system;
Debian has them: `python-ttystatus`, `python-cliapp`,
* Similarly, install yarn from source or with the package manager
(in Debian: `cmdtest`).
* Installation from source requires cloning the git repository and
running `python setup.py install`, perhaps with additional options
(see `--help`) to specify installation location, etc. Read also the
installation documentation of the libraries.
`yarn` is a scenario testing tool: you write a scenario describing how a
user uses your software and what should happen, and express the
scenario, using a very lightweight syntax, in such a way that it can be
tested automatically. The scenario has a simple but strict structure:
    SCENARIO name of scenario
    GIVEN some setup for the test
    WHEN thing that is to be tested happens
    THEN the post-conditions must be true
As an example, consider a very short test scenario for verifying that
a backup program works, at least for one simple case.
    SCENARIO basic backup and restore
    GIVEN some live data in a directory
    AND an empty backup repository
    WHEN a backup is made
    THEN the data can be restored
(Note the addition of AND: you can have multiple GIVEN, WHEN, and
THEN statements. The AND keyword makes the text more readable.)
Scenarios are meant to be written in mostly human-readable language.
However, they are not free-form text. In addition to the GIVEN/WHEN/THEN
structure, the text for each of the steps needs a computer-executable
implementation. This is done by using IMPLEMENTS. The backup scenario
from above might be implemented as follows:
    IMPLEMENTS GIVEN some live data in a directory
    rm -rf "$DATADIR/data"
    mkdir "$DATADIR/data"
    echo foo > "$DATADIR/data/foo"

    IMPLEMENTS GIVEN an empty backup repository
    rm -rf "$DATADIR/repo"
    mkdir "$DATADIR/repo"

    IMPLEMENTS WHEN a backup is made
    backup-program -r "$DATADIR/repo" "$DATADIR/data"

    IMPLEMENTS THEN the data can be restored
    restore-program -r "$DATADIR/repo" "$DATADIR/restored"
    diff -rq "$DATADIR/data" "$DATADIR/restored"
Each "IMPLEMENTS GIVEN" (or WHEN, THEN) is followed by a regular
expression on the same line, and then a shell script that gets executed
to implement any step that matches the regular expression. The
implementation can extract data from the match as well: for example,
the regular expression might allow a file size to be specified.
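As a rough illustration of a step that extracts data from the match, the
body of a hypothetical `IMPLEMENTS GIVEN a (\d+) byte file` step might
look like the sketch below. It assumes the matched size arrives in
`$MATCH_1`, as the specification later in this README describes. The
default values and the file name `file` are made up so that the snippet
also runs outside yarn.

```shell
#!/bin/sh
# Sketch only. Under yarn, DATADIR and MATCH_1 would be set by the test
# runner; the defaults below are assumptions for standalone use.
DATADIR="${DATADIR:-$(mktemp -d)}"
MATCH_1="${MATCH_1:-1024}"

# Body of the hypothetical step "a (\d+) byte file":
# write exactly MATCH_1 zero bytes into a file under DATADIR.
dd if=/dev/zero of="$DATADIR/file" bs=1 count="$MATCH_1" 2>/dev/null

# Read the size back so the result can be inspected.
size=$(wc -c < "$DATADIR/file" | tr -d ' ')
echo "created $DATADIR/file with $size bytes"
```

A scenario could then say `GIVEN a 4096 byte file` or `GIVEN a 1 byte
file`, and both steps would be handled by the same implementation.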
The above example seems a bit silly, of course: why go to the effort
to obfuscate the various steps? The answer is that the various steps,
implemented using IMPLEMENTS, can be combined in many ways, to test
different aspects of the program being tested. In effect, the IMPLEMENTS
sections provide a vocabulary which the scenario writer can use to
express a variety of usefully different scenarios, which together
test all the aspects of the software that need to be tested.
Moreover, because the step descriptions are human-language
text, matched by regular expressions, most of the test can
hopefully be written, and understood, by non-programmers. Someone
who understands what a program should do could write tests
to verify its behaviour. The various steps still need to be
implemented by a programmer, but given a well-designed set of
steps, with enough flexibility in their implementation, quite a
good test suite can be written.
Test language specification
A test document is written in [Markdown][markdown], with block
quoted code blocks being interpreted specially. Each block
must follow the syntax defined here.
* Every step in a scenario is one logical line, and starts with a
keyword. A step can be continued on the next physical line by
starting the next line with `...` (three dots, followed by a space).
* Each implementation (IMPLEMENTS) starts as a new block, and
continues until there is a block that starts with another
keyword.
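The continuation rule for steps can be pictured with a small shell
sketch. It is not how yarn itself parses documents: `join_steps` is a
hypothetical helper, and joining the pieces with a single space is an
assumption made for this example.

```shell
#!/bin/sh
# Sketch only: fold physical lines that start with "... " into the
# preceding logical line. join_steps is a made-up name, not part of yarn.
join_steps() {
    awk '
        # Continuation line: append the text after "... " to the current step.
        /^\.\.\. / { line = line " " substr($0, 5); next }
        # Any other line starts a new step; emit the previous one first.
        { if (NR > 1) print line; line = $0 }
        # Emit the last pending step.
        END { if (NR > 0) print line }
    '
}

printf '%s\n' \
    'GIVEN some live data' \
    '... in a directory' \
    'WHEN a backup is made' | join_steps
```

Each line printed by the sketch is one logical step starting with a keyword.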
The following keywords are defined.
* **SCENARIO** starts a new scenario. The rest of the line is the name of
the scenario. The name is used for documentation and reporting
purposes only and has no semantic meaning. SCENARIO MUST be the
first keyword in a scenario, with the exception of IMPLEMENTS.
The set of documents passed in a test run may define any number of
scenarios between them, but there must be at least one or it is a
test failure. The IMPLEMENTS sections are shared between the
documents and scenarios.
* **ASSUMING** defines a condition for the scenario. The rest of the
line is "matched text", which gets implemented by an
IMPLEMENTS section. If the code executed by the implementation
fails, the scenario is skipped.
* **GIVEN** prepares the world for the test to run. If
the implementation fails, the scenario fails.
* **WHEN** makes the change to the world that is to be tested.
If the code fails, the scenario fails.
* **THEN** verifies that the changes made by the WHEN steps
did the right thing. If the code fails, the scenario fails.
* **FINALLY** specifies how to clean up after a scenario. If the code
fails, the scenario fails. All FINALLY blocks get run either when
encountered in the scenario flow, or at the end of the scenario,
regardless of whether the scenario is failing or not.
* **AND** acts as ASSUMING, GIVEN, WHEN, THEN, or FINALLY: whichever
was used last. It must not be used unless the previous step was
one of those, or another AND.
* **IMPLEMENTS** is followed by one of ASSUMING, GIVEN, WHEN, THEN, or
FINALLY, and a PCRE regular expression, all on one line, and then further
lines of shell commands until the end of the block quoted code
block. Markdown is unclear whether an empty line (no characters,
not even whitespace) between two block quoted code blocks starts a
new one or not, so we resolve the ambiguity by specifying that a
code block directly following a code block is a continuation unless
it starts with one of the scenario testing keywords.
The shell commands get parenthesised parts of the match of the
regular expression as environment variables (`$MATCH_1` etc). For
example, if the regexp is "a (\d+) byte file", then `$MATCH_1` gets
set to the number matched by `\d+`.
The test runner creates a temporary directory, whose name is
given to the shell code in the `DATADIR` environment variable.
The test runner sets the `SRCDIR` environment variable to the
path to the directory it was invoked from (by convention, the
root of the source tree of the project).
The test runner removes all other environment variables, except
`TERM`, `USER`, `USERNAME`, `LOGNAME`, `HOME`, and `PATH`. It also
forces `SHELL` set to `/bin/sh`, and `LC_ALL` set to `C`, in order
to have as clean an environment as possible for tests to run in.
The shell commands get invoked with `/bin/sh -eu`, and need to
be written accordingly. Be careful about commands that return a
non-zero exit code. There will eventually be a library of shell
functions supplied which allow handling the testing of non-zero
exit codes cleanly. In addition, functions for handling stdout and
stderr will be provided.
The code block of an IMPLEMENTS block fails if the shell
invocation exits with a non-zero exit code. Output to stderr is
not an indication of failure. Any output to stdout or stderr may
or may not be shown to the user.
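To get a feel for what `-eu` and the scrubbed environment mean for step
code, the following sketch approximates them by hand. `run_step` is a
hypothetical helper written for this example, not something yarn
provides, and the exact way yarn builds the environment may differ.

```shell
#!/bin/sh
# Sketch only: imitate the documented step environment outside yarn.
DATADIR=$(mktemp -d)

# Hypothetical helper: run a snippet with a minimal environment under
# /bin/sh -eu. env -i drops every inherited variable, so only the ones
# listed here reach the snippet.
run_step() {
    env -i \
        PATH="$PATH" HOME="${HOME:-/}" TERM="${TERM:-dumb}" \
        SHELL=/bin/sh LC_ALL=C \
        DATADIR="$DATADIR" SRCDIR="$(pwd)" \
        /bin/sh -eu -c "$1"
}

# With -e the snippet stops at the first failing command, so the second
# echo never runs and the step as a whole counts as failed.
if out=$(run_step 'echo before; false; echo after'); then
    status=0
else
    status=$?
fi

# A command that is expected to return non-zero has to be guarded,
# for example by using it as the condition of an if.
guarded=$(run_step 'if grep -q nomatch /dev/null; then echo found; else echo "not found"; fi')

# With -u a reference to an unset variable is an error as well.
if run_step 'echo "$UNSET_VARIABLE"' 2>/dev/null; then
    unset_status=0
else
    unset_status=$?
fi

echo "out=$out status=$status guarded=$guarded unset_status=$unset_status"
rm -rf "$DATADIR"
```

Until the promised helper library exists, guarding expected failures
with `if`, `||`, or `!` is one way to keep a step from aborting early.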
* The name of each scenario (given with SCENARIO) must be unique.
* All names of scenarios and steps will be normalised before use
(whitespace collapse, leading and trailing whitespace removal).
* Every ASSUMING, GIVEN, WHEN, THEN, FINALLY must be matched by
exactly one IMPLEMENTS. The test runner checks this before running
any scenarios.
* Every IMPLEMENTS may match any number of ASSUMING, GIVEN, WHEN,
THEN, or FINALLY. The test runner may warn if an IMPLEMENTS is unused.
* If ASSUMING fails, that scenario is skipped, and any FINALLY steps
are not run.
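The normalisation mentioned in the list above can be approximated in
shell. This is only a sketch of one plausible reading (collapse runs of
whitespace, then trim both ends); `normalise` is a hypothetical function
and yarn's own rules may differ in detail.

```shell
#!/bin/sh
# Sketch only: normalise is a made-up helper, not yarn's implementation.
normalise() {
    # tr turns every whitespace character into a space and squeezes runs
    # of them into one; sed then drops a leading and a trailing space.
    printf '%s' "$1" | tr -s '[:space:]' ' ' | sed -e 's/^ //' -e 's/ $//'
}

a=$(normalise '  basic   backup and restore ')
b=$(normalise 'basic backup and restore')

# Under this reading both spellings name the same scenario, so a document
# using both would break the uniqueness rule.
if [ "$a" = "$b" ]; then
    echo "same name after normalisation: $a"
fi
```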
Wikipedia has an article on [Behaviour Driven Development][BDD],
which can provide background and further explanation of what this
tool tries to do.
* Add DEFINING, PRODUCING, if they turn out to be useful.
* Need something like ASSUMING, except fail the scenario if the
pre-condition is not true. Useful for testing that you can ssh
to localhost when flinging, for example.
**DJAS**: We think this might be 'REQUIRING' and it still does
not run the FINALLY group.