Test Format

Planemo has traditionally been used to test Galaxy tools.

$ planemo test galaxy_tool.xml

This starts a Galaxy instance, runs the tests described in the XML file, prints a nice summary of the test results (pass or fail for each test) in the console and creates an HTML report in the current directory. Additional bells and whistles include the ability to generate XUnit reports, publish test results and get embedded Markdown to link to them for PRs, and test remote artifacts in Git repositories.

Much of this same functionality is now also available for Galaxy Workflows as well as Common Workflow Language (CWL) tools and workflows. The rest of this page describes this testing format and testing options for these artifacts. For information about testing Galaxy tools specifically using the embedded tool XML tests, see the Test-Driven Development of Galaxy tools tutorial.

Unlike the traditional Galaxy tool approach, these newer types of artifacts should define tests in files located next to the artifact. For instance, if planemo test is called on a Galaxy workflow called ref-rnaseq.ga, tests should be defined in ref-rnaseq-tests.yml or ref-rnaseq-tests.yaml. If instead it is called on a CWL tool called seqtk_seq.cwl, tests can be defined in seqtk_seq_tests.yml, for instance.
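
As a rough illustration of this naming convention, the following Bash sketch lists test file names that could sit next to an artifact. It only covers the suffixes and extensions that appear in the paragraph above (-tests, _tests, .yml, .yaml); the function name candidate_test_files is hypothetical, and Planemo itself may recognize further variants.

```shell
#!/bin/bash
# List candidate test-file names for an artifact.
# Assumption: only the suffix/extension combinations mentioned in the text
# are generated; this is not Planemo's own lookup code.
candidate_test_files() {
    local artifact="$1"
    local stem="${artifact%.*}"   # strip the final extension, e.g. ".ga" or ".cwl"
    local suffix ext
    for suffix in "-tests" "_tests"; do
        for ext in yml yaml; do
            echo "${stem}${suffix}.${ext}"
        done
    done
}

candidate_test_files ref-rnaseq.ga
# prints:
#   ref-rnaseq-tests.yml
#   ref-rnaseq-tests.yaml
#   ref-rnaseq_tests.yml
#   ref-rnaseq_tests.yaml
```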

Below are two examples of such YAML files - the first for a CWL tool and the second for a Galaxy workflow. Note that the same testing file format is used for both kinds of artifacts.

- doc: simple usage test
  job: pear_job.yml
  outputs:
    assembled_pairs:
      path: test-data/pear_assembled_results1.fastq
    unassembled_forward_reads:
      path: test-data/pear_unassembled_forward_results1.fastq
- doc: Test sample data for Microbial variant calling workflow
  job:
    mutant_R1:
      class: File
      path: mutant_R1.fastq
    mutant_R2:
      class: File
      path: mutant_R2.fastq
    wildtype.fna:
      class: File
      location: https://zenodo.org/record/582600/files/wildtype.fna
    wildtype.gbk:
      class: File
      location: https://zenodo.org/record/582600/files/wildtype.gbk
    wildtype.gff:
      class: File
      location: https://zenodo.org/record/582600/files/wildtype.gff
  outputs:
    jbrowse_html:
      asserts:
        has_text:
          text: "JBrowseDefaultMainPage"
    snippy_fasta:
      asserts:
        has_line:
          line: '>Wildtype Staphylococcus aureus strain WT.'
    snippy_tabular:
      asserts:
        has_n_columns:
          n: 2

The above examples illustrate that each test file is broken into a list of test cases. Each test case should have a doc describing the test, a job description that describes the inputs for an execution of the target artifact, and an outputs mapping that describes assertions about outputs to test.

job

The job object can be a mapping embedded right in the test file or a reference to an external “job” input file. The job input file is a proper CWL job document - which is fairly straightforward, as demonstrated in the above examples. Planemo adapts the CWL job document to Galaxy workflows and tools - using input names for Galaxy tools and input node labels for workflows.

Input files can be specified using either path attributes (which should generally be file paths relative to the artifact and test directory) or location (which should be a URI). The examples above demonstrate using both paths relative to the tool file and test data published to Zenodo.

Embedded job objects result in cleaner test suites that are simpler to read. One advantage of instead using external job input files is that the job object can be reused to invoke the runnable artifact outside the context of testing with planemo run.
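
The external-file variant can be sketched as follows. The Bash function below writes a job file and a test file that refers to it by name. Everything specific here is hypothetical: the file names wf-job.yml and wf-tests.yml, the workflow wf.ga they would belong to, its input label input1, its output label wf_output_1, and the asserted text.

```shell
#!/bin/bash
# Write a standalone job file plus a test file that references it.
# All names below are made up for illustration.
write_job_and_tests() {
    local dir="$1"
    cat > "$dir/wf-job.yml" <<'EOF'
input1:
  class: File
  path: hello.txt
EOF
    cat > "$dir/wf-tests.yml" <<'EOF'
- doc: Inputs are read from an external job file
  job: wf-job.yml
  outputs:
    wf_output_1:
      asserts:
        has_text:
          text: "hello"
EOF
}

example_dir="$(mktemp -d)"
write_job_and_tests "$example_dir"
echo "wrote $example_dir/wf-job.yml and $example_dir/wf-tests.yml"

# The same job file could then be reused outside of testing, for example:
#   planemo run --engine=galaxy wf.ga wf-job.yml
```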

Note

These job objects can be run directly with planemo run.

$ planemo run --engine=<engine_type> [ENGINE_OPTIONS] [ARTIFACT_PATH] [JOB_PATH]

This should be familiar to CWL developers - and indeed with --engine=cwltool this works as a formal CWL runner. Planemo provides a uniform interface to Galaxy for Galaxy workflows and tools using the same CLI invocation with --engine=galaxy (for a Planemo managed Galaxy instance), --engine=docker_galaxy (for a Docker instance of Galaxy launched by Planemo), or --engine=external_galaxy (for a running remote Galaxy instance).

Certain Galaxy objects don’t map cleanly to CWL job objects so Planemo attempts to extend the format with new constructs for running and testing Galaxy objects - such as describing collections and composite inputs.

Galaxy Collection Inputs

The following example demonstrates two ways to create input lists for Galaxy tests.

- doc: Test Explicit Collection Creation.
  job:
    input1:
      class: Collection
      collection_type: list
      elements:
        - identifier: el1
          class: File
          path: hello.txt
  outputs:
    wf_output_1:
      checksum: "sha1$a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b"
- doc: Test CWL-style list inputs.
  job:
    input1:
      - class: File
        path: hello.txt
  outputs:
    wf_output_1:
      checksum: "sha1$a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b"

Simply specifying files in YAML lists in the input job (like vanilla CWL job descriptions) will result in a simple Galaxy list. This is simple but the downside is you have no control of the list identifiers - which are often important in Galaxy workflows. When more control is desired, you may describe an explicit Galaxy collection with an input object of class: Collection. This variant (also shown in the above example) allows creating collections of type other than list and allows specifying element identifiers with the identifier declaration under the list of collection elements.

The explicit Galaxy collection creation syntax also makes describing nested collections such as lists of pairs very natural. The following example is used in Planemo’s test suite to illustrate this:

- doc: Test Explicit Collection Creation.
  job:
    input1:
      class: Collection
      collection_type: 'list:paired'
      elements:
        - class: Collection
          type: paired
          identifier: el1
          elements:
          - identifier: forward
            class: File
            path: hello.txt
          - identifier: reverse
            class: File
            path: hello.txt
  outputs:
    wf_output_1:
      checksum: "sha1$7bd92c6cd84285e4fc7215d506bbabfe328acb8f"

Galaxy Composite Inputs

The syntax for specifying composite inputs is more basic still: they must simply be specified as a list of local files (mirroring Galaxy Tool XML test syntax). The class is assumed to be File, and URIs aren’t yet tested.

- doc: Test Composite Inputs
  job:
    input1:
      class: File
      filetype: imzml
      composite_data:
        - path: Example_Continuous.imzML
        - path: Example_Continuous.ibd
  outputs:
    wf_output_1:
      checksum: "sha1$0d2ad51f69d7b5df0f4d2b2a47b17478f2fca509"

Galaxy Tags

Requires Galaxy 20.09 or newer.

Tags and group tags play important roles in many Galaxy workflows. These can be tested by simply adding a tags: list to the YAML entry corresponding to the dataset in the collection. The following example demonstrates this:

- doc: Test using tags.
  job:
    input_c:
      class: Collection
      collection_type: list
      elements:
        - identifier: el1
          class: File
          path: hello.txt
          tags: ['group:which:moo']
        - identifier: el2
          class: File
          path: not_hello.txt
          tags: ['group:which:cow']
  outputs:
    wf_output_1:
      checksum: "sha1$a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b"

outputs

Galaxy tools and CWL artifacts have obvious output names that must match the mapping in this block of the test file. Galaxy workflows require explicit output labels to be used with tests, but the important outputs in your workflows should be labeled anyway to work with Galaxy subworkflows and more cleanly with API calls.

If an output is known, fixed, and small, it makes a lot of sense to just include a copy of the output next to your test and set file: relative/path/to/output in your output definition block, as shown in the first example above. For completely reproducible processes this is a great guarantee that results are fixed over time, across CWL engines and engine versions. If the results are fixed but large, it may make sense to just describe the outputs by a SHA1 checksum.

- doc: Simple concat workflow test
  job: wf1.gxwf-job.yml
  outputs:
    wf_output_1:
      checksum: "sha1$a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b"
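
To produce a checksum string in the sha1$<hex digest> form used above, something like the following Bash sketch could be used. It assumes either sha1sum (GNU coreutils) or shasum (common on macOS) is on the PATH; the function name sha1_checksum_string is hypothetical and not part of Planemo.

```shell
#!/bin/bash
# Print "sha1$<hex digest>" for a file, ready to paste into a checksum: entry.
# Assumption: sha1sum or shasum is installed.
sha1_checksum_string() {
    local file="$1"
    local digest
    if command -v sha1sum >/dev/null 2>&1; then
        digest="$(sha1sum "$file" | cut -d' ' -f1)"
    else
        digest="$(shasum -a 1 "$file" | cut -d' ' -f1)"
    fi
    # Single quotes keep the "$" literal.
    printf 'sha1$%s\n' "$digest"
}

# Usage (expected_output.txt is a placeholder for a known-good output file):
#   sha1_checksum_string expected_output.txt
```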

One advantage of including an exact file instead of a checksum is that Planemo can produce very nice line-by-line diffs for incorrect test results by comparing an expected output to an actual output.

There are reasons one may not be able to write such exact test assertions about outputs, however: perhaps date or time information is incorporated into the result, unseeded random numbers are used, small numeric differences occur across runtimes of interest, etc. For these cases, a variety of other assertions can be executed against the execution results to verify outputs. The types and implementation of these test assertions match those available to Galaxy tool outputs in XML but have equivalent YAML formulations that should be used in test descriptions.

Even if one can write exact tests, a really useful technique is to write sanity checks on outputs as one builds up workflows that may be changing rapidly, or develops complex tools or workflows via a Test-Driven Development cycle using Planemo. Tests shouldn’t just be an extra step you have to do after development is done; they should guide development as well.

The workflow example all the way above demonstrates some assertions one can make about the contents of files. The full list of assertions available is only documented for the Galaxy XML format but it is straightforward to adapt to the YAML format above - check out the Galaxy XSD for more information.

Some examples of inexact file comparisons, derived from an artificial test case in the Planemo test suite, are shown below. These are further options available for checking outputs that may change in small ways over time.

- doc: test_sha1_pass
  job: cat_tool_job.json
  outputs:
    output_file:
      checksum: sha1$2ef7bde608ce5404e97d5f042f95f89f1c232871
- doc: test_sha1_fail
  job: cat_tool_job.json
  outputs:
    output_file:
      checksum: sha1$2ef7bde608ce5404e97d5f042f95f89f1c232872
- doc: test_compare_direct_pass
  job: cat_tool_job.json
  outputs:
    output_file:
      file: hello.txt
- doc: test_compare_sim_size_pass
  job: cat_tool_job.json
  outputs:
    output_file:
      file: not_hello.txt
      compare: sim_size
      delta: 5
- doc: test_compare_sim_size_fail
  job: cat_tool_job.json
  outputs:
    output_file:
      file: not_hello.txt
      compare: sim_size
      delta: 3
- doc: test_compare_re_match_pass
  job: cat_tool_job.json
  outputs:
    output_file:
      file: hello_regex.txt
      compare: re_match
- doc: test_compare_re_match_fail
  job: cat_tool_job.json
  outputs:
    output_file:
      file: not_hello_regex.txt
      compare: re_match
- doc: test_compare_re_match_pass
  job: cat_tool_job.json
  outputs:
    output_file:
      file: hello_regex.txt
      compare: re_match_multiline
- doc: test_compare_re_match_fail
  job: cat_tool_job.json
  outputs:
    output_file:
      file: not_hello_regex.txt
      compare: re_match_multiline
- doc: test_contains_pass
  job: cat_tool_job.json
  outputs:
    output_file:
      file: hello_truncated.txt
      compare: contains
- doc: test_contains_fail
  job: cat_tool_job.json
  outputs:
    output_file:
      file: not_hello.txt
      compare: contains
- doc: test_diff_pass
  job: cat_tool_job.json
  outputs:
    output_file:
      file: not_hello.txt
      compare: diff
      lines_diff: 2
- doc: test_diff_fail
  job: cat_tool_job.json
  outputs:
    output_file:
      file: not_hello.txt
      compare: diff
      lines_diff: 1

Engines for Testing

Below are descriptions of various testing engines that can be used with Planemo (both with the test command and the run command) as well as some command-line options of particular interest for testing. The first two types cwltool and toil can be used to test CWL artifacts (tools and workflows). The remaining engine types are variations on engines that target Galaxy and are useful for testing workflows (and tools with newer style tests or job documents).

cwltool

$ planemo test --engine cwltool [--no-container] [--biocontainers]

This is the most straightforward engine; it can be used to test CWL tools and workflows using the CWL reference implementation cwltool (bundled as a dependency of Planemo). Use the --no-container option to disable Docker and use Conda resolution of SoftwareRequirements or applications on the PATH. Use the --biocontainers flag to use BioContainers for tools without explicit DockerRequirement hints.

toil

$ planemo test --engine toil [--no-container] [--biocontainers]

This engine largely mirrors the cwltool engine but runs CWL artifacts using Toil. Toil is an optional dependency of Planemo so you will likely have to install it in Planemo’s environment using pip install toil.

galaxy

$ planemo test [--docker] [--biocontainers] [--profile <profile>] [--galaxy_root <path>] [--extra_tools <path>]

This is the default engine type, but it can be made explicit with --engine galaxy. With this engine Planemo will start a Galaxy instance and test against it.

Planemo will automatically detect and load “stock” Galaxy tools used by workflows and install any Tool Shed tools contained in the workflow. If other non-Tool Shed tools are required for a workflow, they can be loaded using --extra_tools.

Set --galaxy_root to target an externally cloned Galaxy directory or use --galaxy_branch to target a particular branch of Galaxy other than the latest stable.

Use the --biocontainers flag to enable Docker and use BioContainers for tools or use --docker to use Docker but limited to tools configured with container tags.

By default Galaxy when configured by Planemo will attempt to run with an sqlite database. This configuration is quite buggy and should not be used to test workflows. The --profile option can be used to target a pre-configured Postgres database created with planemo profile_create and use it for testing. In addition to making Galaxy more robust this should speed up testing after the initial setup of the database.

planemo profile_create --database_type [postgres|docker_postgres] my_cool_name
planemo test --profile my_cool_name

If --database_type is specified as docker_postgres, Planemo will attempt to start up a Postgres server in a Docker container automatically for testing. If instead postgres is specified, Planemo will attempt to interact with Postgres using psql (assumed to be on the PATH). For a description of more Postgres connection options, check out the documentation for the database_create command, which has similar options.

Profiles may also really help testing local setups by saving previously installed shed repository installations and Conda environments.

docker_galaxy

$ planemo test --engine docker_galaxy [--extra_tools <path>] [--docker_extra_volume <path>] [--docker_galaxy_image <image>]

With this engine Planemo will start a Docker container to run tests against it. See the docker-galaxy-stable project spearheaded by Björn Grüning for more information on Docker-ized Galaxy execution. The exact container image to use can be controlled using the --docker_galaxy_image option.

Planemo will automatically detect and load “stock” Galaxy tools used by workflows and install any Tool Shed tools contained in the workflow. If other non-Tool Shed tools are required, they can be loaded using --extra_tools.

At the time of this writing, there is a bug in Planemo that requires using the --docker_extra_volume option to mount test data into the testing container.

external_galaxy

$ planemo test --engine external_galaxy --galaxy_admin_key <admin_key> --galaxy_user_key <user_key> [--no_shed_install] [--polling_backoff <integer>] --galaxy_url <url>

This is primarily useful for testing workflows against already running Galaxy instances. An admin or bootstrap API key should be supplied to install missing tool repositories for the workflow, and a user API key should be supplied to run the workflow. If you wish to skip tool shed repository installation (this requires all the tools be present already), use the --no_shed_install option. If you want to reduce the load on the target Galaxy while checking for status changes, use the --polling_backoff <integer> option, where integer is the incremental increase in seconds for every request.

To run tool tests against a running Galaxy, galaxy-tool-test is a script that gets installed with galaxy-tool-util and so may very well already be on your PATH. Check out the available options using galaxy-tool-test --help. If you’re interested in running all the tool tests corresponding to a workflow on a running server, check out the galaxy-workflow-tool-tests project, a wrapper around galaxy-tool-test that has all the same options but filters the tool tests to run only those from a specific workflow.

This engine can also be used to test workflows already available in the running Galaxy instance. While you don’t need to download and synchronize the target workflow on your local filesystem, you do need to provide a path to find the test definition and test data paths.

An example of doing this is included in Planemo’s test data. The workflow test definition wf11-remote.gxwf-test.yml exists but no corresponding workflow file wf11-remote.gxwf.yml exists. The workflow is assumed to already exist in some Galaxy server. For instance, it might exist somewhere with id 99113b2b119318e1. Then planemo test could be run with gxid://workflows/99113b2b119318e1?runnable_path=/path/to/wf11-remote.gxwf.yml as the last argument to test this workflow with that test data. Note this path /path/to/wf11-remote.gxwf.yml doesn’t need to exist, but it is used to find wf11-remote.gxwf-test.yml.
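
The gxid:// argument described above can be assembled from its two parts, as in this minimal Bash sketch. The helper name gxid_workflow_uri is hypothetical, and the commented invocation only combines options already listed for this engine; the URL and key variables are placeholders.

```shell
#!/bin/bash
# Build the gxid:// argument from a workflow ID and the (possibly
# non-existent) local path that is used to locate the test definition.
gxid_workflow_uri() {
    local workflow_id="$1" runnable_path="$2"
    printf 'gxid://workflows/%s?runnable_path=%s\n' "$workflow_id" "$runnable_path"
}

gxid_workflow_uri 99113b2b119318e1 /path/to/wf11-remote.gxwf.yml
# prints gxid://workflows/99113b2b119318e1?runnable_path=/path/to/wf11-remote.gxwf.yml

# Possible invocation against an already running server (placeholders throughout):
#   planemo test --engine external_galaxy \
#       --galaxy_url "$GALAXY_URL" \
#       --galaxy_admin_key "$PLANEMO_ADMIN_KEY" \
#       --galaxy_user_key "$PLANEMO_USER_KEY" \
#       "$(gxid_workflow_uri 99113b2b119318e1 /path/to/wf11-remote.gxwf.yml)"
```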

Galaxy Testing Template

The following is a script that can be used with continuous integration (CI) services such as Travis to test Galaxy workflows in a GitHub repository. This shell script can be configured via various environment variables and shows off some of the modalities Planemo test should work in (there may be bugs, but we are trying to stabilize this functionality).

#!/bin/bash

# Usage: http://planemo.readthedocs.io/en/latest/test_format.html#galaxy-testing-template

: ${PLANEMO_TARGET:="planemo==0.52.0"}
: ${PLANEMO_OPTIONS:=""}  # e.g. PLANEMO_OPTIONS="--verbose"
: ${PLANEMO_PROFILE_NAME:="wxflowtest"}
: ${PLANEMO_SERVE_PORT:="9019"}
: ${PLANEMO_GALAXY_BRANCH:="master"}
: ${PLANEMO_TEST_STYLE:="serve_and_test"}  # profile_serve_and_test, serve_and_test, docker_serve_and_test, test, docker_test, docker_test_path_paste, manual_docker_run_and_test, external_galaxy
: ${PLANEMO_SERVE_DATABASE_TYPE:="postgres"}  # used if not using Docker with PLANEMO_TEST_STYLE
: ${PLANEMO_DOCKER_GALAXY_IMAGE:="quay.io/bgruening/galaxy:20.05"}  # used if using Docker with PLANEMO_TEST_STYLE
: ${PLANEMO_VIRTUAL_ENV:=".venv"}
: ${GALAXY_URL:="http://localhost:$PLANEMO_SERVE_PORT"}

# Ensure Planemo is installed.
if [ ! -d "${PLANEMO_VIRTUAL_ENV}" ]; then
    virtualenv "${PLANEMO_VIRTUAL_ENV}"
    . "${PLANEMO_VIRTUAL_ENV}"/bin/activate
    # Quote the requirement so the shell does not treat ">7" as a redirect to a file named "7".
    pip install -U "pip>7"
    # Intentionally expand wildcards in PLANEMO_TARGET.
    shopt -s extglob
    pip install ${PLANEMO_TARGET}
fi
. "${PLANEMO_VIRTUAL_ENV}"/bin/activate

# Run test.
# This example shows off a bunch of different ways one could test with Planemo,
# but for actual workflow testing projects - probably best just to take one of the last
# two very easy invocations to simplify things.
if [ "$PLANEMO_TEST_STYLE" = "profile_serve_and_test" ]; then
    planemo $PLANEMO_OPTIONS profile_create \
        --database_type "$PLANEMO_SERVE_DATABASE_TYPE" \
        "$PLANEMO_PROFILE_NAME"
    planemo $PLANEMO_OPTIONS serve \
        --daemon \
        --galaxy_branch "$PLANEMO_GALAXY_BRANCH" \
        --profile "$PLANEMO_PROFILE_NAME" \
        --port "$PLANEMO_SERVE_PORT" \
        "$1"
    planemo $PLANEMO_OPTIONS test \
        --galaxy_url "$GALAXY_URL" \
        --engine external_galaxy \
        "$1"
elif [ "$PLANEMO_TEST_STYLE" = "serve_and_test" ]; then
    planemo $PLANEMO_OPTIONS serve \
        --daemon \
        --galaxy_branch "$PLANEMO_GALAXY_BRANCH" \
        --database_type "$PLANEMO_SERVE_DATABASE_TYPE" \
        --port "$PLANEMO_SERVE_PORT" \
        "$1"
    planemo $PLANEMO_OPTIONS test \
        --galaxy_url "$GALAXY_URL" \
        --engine external_galaxy \
        "$1"
elif [ "$PLANEMO_TEST_STYLE" = "docker_serve_and_test" ]; then
    docker pull "${PLANEMO_DOCKER_GALAXY_IMAGE}"
    planemo $PLANEMO_OPTIONS serve \
        --daemon \
        --engine docker_galaxy \
        --docker_galaxy_image "${PLANEMO_DOCKER_GALAXY_IMAGE}" \
        --port "$PLANEMO_SERVE_PORT" \
        "$1"
    planemo $PLANEMO_OPTIONS test \
        --galaxy_url "$GALAXY_URL" \
        --engine external_galaxy \
        "$1"
elif [ "$PLANEMO_TEST_STYLE" = "test" ]; then
    # TODO: this conda_init shouldn't be needed, but this mode is broken without it.
    planemo conda_init || true

    planemo $PLANEMO_OPTIONS test \
        --database_type "$PLANEMO_SERVE_DATABASE_TYPE" \
        --galaxy_branch "$PLANEMO_GALAXY_BRANCH" \
        "$1"
elif [ "$PLANEMO_TEST_STYLE" = "docker_test" ]; then
    # TODO: This variant isn't super usable yet because there is too much logging, hence the dev null
    # redirect.
    docker pull "${PLANEMO_DOCKER_GALAXY_IMAGE}"
    planemo $PLANEMO_OPTIONS test \
        --engine docker_galaxy \
        --docker_galaxy_image "${PLANEMO_DOCKER_GALAXY_IMAGE}" \
        "$1" > /dev/null
elif [ "$PLANEMO_TEST_STYLE" = "docker_test_path_paste" ]; then
    # Same as above but mount the test data and use file:// path pastes when uploading
    # files (more robust and quick if working with really large files).
    docker pull "${PLANEMO_DOCKER_GALAXY_IMAGE}"
    planemo $PLANEMO_OPTIONS test \
        --engine docker_galaxy \
        --docker_extra_volume . \
        --paste_test_data_paths \
        --docker_galaxy_image "${PLANEMO_DOCKER_GALAXY_IMAGE}" \
        "$1" > /dev/null
elif [ "$PLANEMO_TEST_STYLE" = "manual_docker_run_and_test" ]; then
    docker pull "${PLANEMO_DOCKER_GALAXY_IMAGE}"
    docker run -d -e "NONUSE=nodejs,proftp,reports" -p "${PLANEMO_SERVE_PORT}:80" "${PLANEMO_DOCKER_GALAXY_IMAGE}"
    galaxy-wait -g "http://localhost:${PLANEMO_SERVE_PORT}"
    planemo $PLANEMO_OPTIONS test \
        --engine external_galaxy \
        --galaxy_url "$GALAXY_URL" \
        --galaxy_admin_key admin \
        --galaxy_user_key admin \
        "$1"
elif [ "$PLANEMO_TEST_STYLE" = "external_galaxy" ]; then
    if [[ -n $PLANEMO_INSTALL_TOOLS ]]; then
        INSTALL_TOOLS="";
    else
        INSTALL_TOOLS="--no_shed_install";
    fi
    planemo $PLANEMO_OPTIONS test \
        --engine external_galaxy \
        --galaxy_url "$GALAXY_URL" \
        --galaxy_admin_key "$PLANEMO_ADMIN_KEY" \
        --galaxy_user_key "$PLANEMO_USER_KEY" \
        $INSTALL_TOOLS \
        "$1"
else
    echo "Unknown test style ${PLANEMO_TEST_STYLE}"
    exit 1
fi
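
The script above takes its configuration from lines of the form : ${VAR:="default"}, which assign the default only when the variable is unset or empty. The following standalone Bash sketch shows that behaviour for one variable; the helper name effective_test_style is hypothetical and is not part of the script.

```shell
#!/bin/bash
# Report the test style the script would end up using.
# The subshell keeps the assignment from leaking into the caller.
effective_test_style() {
    (
        : "${PLANEMO_TEST_STYLE:=serve_and_test}"
        echo "$PLANEMO_TEST_STYLE"
    )
}

(unset PLANEMO_TEST_STYLE; effective_test_style)      # prints serve_and_test
PLANEMO_TEST_STYLE=docker_test effective_test_style   # prints docker_test
```

Assuming the script has been saved locally as run_galaxy_workflow_tests.sh, a one-off run with a different style could then look like PLANEMO_TEST_STYLE=docker_test bash run_galaxy_workflow_tests.sh example1/ref-rnaseq.ga.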

A Travis configuration file (.travis.yml) that would test workflows using a Docker Galaxy image might look like:

language: python
sudo: true
python: 2.7
env:
  global:
    - PLANEMO_TEST_SCRIPT=https://raw.githubusercontent.com/galaxyproject/planemo/master/scripts/run_galaxy_workflow_tests.sh
    - PLANEMO_TEST_STYLE=docker_serve_and_test
    - PLANEMO_TARGET="planemo==0.52.0"
    - PLANEMO_DOCKER_GALAXY_IMAGE="quay.io/bgruening/galaxy:18.01"
  matrix:
    - WORKFLOW_TEST=example1/ref-rnaseq.ga
    - WORKFLOW_TEST=example2/chipseq.ga

script: bash <(curl -s "$PLANEMO_TEST_SCRIPT") "$WORKFLOW_TEST"

services:
  - docker

To skip Docker and instead test with a native Galaxy instance and postgres database one might use the configuration:

language: python
python: 2.7
env:
  global:
    - PLANEMO_TEST_SCRIPT=https://raw.githubusercontent.com/galaxyproject/planemo/master/scripts/run_galaxy_workflow_tests.sh
    - PLANEMO_TEST_STYLE=serve_and_test
    - PLANEMO_TARGET="planemo==0.52.0"
    - PLANEMO_GALAXY_BRANCH="release_18.05"
  matrix:
    - WORKFLOW_TEST=example1/ref-rnaseq.ga
    - WORKFLOW_TEST=example2/chipseq.ga

script: bash <(curl -s "$PLANEMO_TEST_SCRIPT") "$WORKFLOW_TEST"

services:
  - postgres