
Test harness for executables

by sfink (Deacon)
on Nov 26, 2008 at 19:20 UTC ( #726194=perlmeditation )

While I'm between jobs, I've been trawling through various old code that I've written to see if there's anything I could clean up and release. One thing that I have found particularly useful is a handrolled test harness I wrote to make it easier to create unit tests for a program I was working on.

I don't think there's anything stunningly novel about it, but I found it strikes a good balance between capability and simplicity. It was pretty easy for other people to pick up and start writing tests with, and the inevitable test failures were easy to diagnose.

I wrote it before the recent TAPification of Perl test tools, so it could probably make use of some of the newer modules. (I wrote this tool partly in response to the difficulty of extending the older Test::Harness to do what I needed.)

Enough introduction. Why should anyone care? I'll try to enumerate what seems to be unique or vaguely unusual about this tool:

  • It is intended for processing the output of an application run on a given input file.
  • It uses a simple but extensible syntax for describing the set of tests to run.
  • The user can easily rerun specific failing tests with a very flexible way to pick the appropriate one. (This is slightly harder for this tool, since multiple tests can use the output of a single run.)
  • The output checks are simple regular expression matches. You may have multiple checks per test. Tests can be either positive or negative (X must match and Y must not match.) Before release, I'd be tempted to make this a bit more flexible.
  • Simple, TAP-compatible (I think) output
  • Convenient for smoke tests. Our continuous integration tool ran a large set of these tests after every build, and only created a release package if all tests passed. If a test failed, it would say exactly what it wanted and didn't get, together with the actual output.
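The post only says the output is "TAP-compatible (I think)"; for readers unfamiliar with TAP, a minimal sketch of the basic shape such a harness would emit (hypothetical results, not the harness's actual code):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two made-up test results: [pass?, description]
my @results = ( [ 1, 'Basic test' ], [ 0, 'Memory leak check' ] );

# TAP starts with a plan line, then one "ok"/"not ok" line per test.
my @tap = ( sprintf "1..%d", scalar @results );
my $n = 0;
for my $r (@results) {
    $n++;
    push @tap, sprintf "%s %d - %s", ( $r->[0] ? 'ok' : 'not ok' ), $n, $r->[1];
}
print "$_\n" for @tap;
# 1..2
# ok 1 - Basic test
# not ok 2 - Memory leak check
```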
And now for the full description. I fortunately documented it in POD, so I'll just insert the pod2html here, rewritten to remove as much of the application-specificity as I can (I'll need to remove it for real before releasing):

NAME - Test harness for running test files through an executable application


  perl --binary=/PATH/TO/APP --tests=SPEC foo.t

The test specification is a comma-separated list of test numbers or ranges. For example:

  perl -t 2,5-8,20- mytest.t

will run the tests given in the 'mytest.t' test configuration file numbered 2, 5, 6, 7, 8, and every test starting at 20.
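The range syntax is simple enough to sketch. The following is a hypothetical implementation (not the harness's actual code) of turning a spec like "2,5-8,20-" into a predicate over test numbers, where a trailing dash means open-ended:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse a comma-separated spec of numbers and ranges into a closure
# that answers "is test number N selected?".
sub spec_matcher {
    my ($spec) = @_;
    my @ranges;
    for my $part ( split /,/, $spec ) {
        if ( $part =~ /^(\d+)-(\d*)$/ ) {
            # "5-8" or open-ended "20-" (undef upper bound)
            push @ranges, [ $1, length $2 ? $2 : undef ];
        }
        elsif ( $part =~ /^(\d+)$/ ) {
            push @ranges, [ $1, $1 ];
        }
        else {
            die "bad test spec component '$part'\n";
        }
    }
    return sub {
        my ($n) = @_;
        for my $r (@ranges) {
            return 1 if $n >= $r->[0]
                     && ( !defined $r->[1] || $n <= $r->[1] );
        }
        return 0;
    };
}

my $want = spec_matcher('2,5-8,20-');
print join( ',', grep { $want->($_) } 1 .. 22 ), "\n";   # 2,5,6,7,8,20,21,22
```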


Configuration file syntax looks something like:

  default {
    runtime = 2
  }
  return_status_zero {
    outcome = ok
  }
  new_version {
    min-version = 3.2.7
  }
  test : return_status_zero, new_version {
    input = mytest/basic.xml
    desc = Basic test
    output {
      0: /correct/
      1: ! /Memory leak/
    }
  }

The configuration file is describing a hierarchical structure of named keys and their corresponding values. The above could be written (almost) equivalently as

  default.runtime = 2
  return_status_zero.outcome = ok
  new_version.min-version = 3.2.7
  test.input = mytest/basic.xml

A section with a name followed by an open curly bracket defines a "scope", meaning that every key seen within the body of the curly brackets is assumed to be prefixed with the path of that name.

Sections introduced with "section : parent1, parent2, ... {" do the same, except they also copy all of the key/value pairs from both 'parent1' and 'parent2' into that section. So that first example is also (almost) equivalent to:

  test.outcome = ok
  test.min-version = 3.2.7
  test.input = mytest/basic.xml

This syntax is provided so that you can have a large number of tests that are almost the same, but without repeating information for every one.

The section named 'default' is always inherited by everything, so it may be used for global configuration (eg, the entire test file should only be run if the version is at least X.Y).

The only "special" section name is 'test'. 'test' is immediately expanded into 'test-NUMBER', so that each test is unique. (If you reused any other name, it would happily add to that section.)
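The flattening and inheritance described above can be sketched in a few dozen lines. This is a hypothetical toy parser, not the harness's real one (it skips the 'test' → 'test-NUMBER' expansion and handles only simple nesting), but it shows the copy-on-section-open semantics, including the always-inherited 'default':

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse the brace syntax into a flat hash of dotted keys. When a
# top-level section opens, copy in its parents' keys (and 'default'),
# so keys set inside the section override inherited ones.
sub parse_config {
    my ($text) = @_;
    my ( %conf, @scope );
    for my $line ( split /\n/, $text ) {
        next if $line =~ /^\s*$/;
        if ( $line =~ /^\s*([\w-]+)\s*(?::\s*([\w\s,-]+?))?\s*\{\s*$/ ) {
            my ( $name, $parents ) = ( $1, $2 );
            push @scope, $name;
            if ( @scope == 1 ) {    # top-level section: inherit
                my @from = split /\s*,\s*/, ( $parents // '' );
                unshift @from, 'default' unless $name eq 'default';
                for my $p (@from) {
                    for my $k ( grep { /^\Q$p.\E/ } keys %conf ) {
                        ( my $new = $k ) =~ s/^\Q$p\E/$name/;
                        $conf{$new} = $conf{$k};
                    }
                }
            }
        }
        elsif ( $line =~ /^\s*\}\s*$/ ) { pop @scope; }
        elsif ( $line =~ /^\s*([\w-]+)\s*[:=]\s*(.*?)\s*$/ ) {
            $conf{ join( '.', @scope, $1 ) } = $2;
        }
    }
    return \%conf;
}

my $conf = parse_config(<<'EOT');
default {
  runtime = 2
}
return_status_zero {
  outcome = ok
}
test : return_status_zero {
  input = mytest/basic.xml
}
EOT

print "$conf->{'test.outcome'} / $conf->{'test.runtime'}\n";   # ok / 2
```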

The above description is completely generic, and says nothing about how tests are actually run.


Each of the 'test-NUMBER' sections specifies a test to be run. Normally, this means to invoke the application on whatever the key 'test-NUMBER.input' is set to, but if no input is given (or the special flag 'test-NUMBER.use-previous-run' is set to a true value), then the output of the previous run will be used instead.

Tests are primarily specified by the name of the input to run and a description of that test. The description is only used for human-friendly output, and is not required (it will default to something like "test at mytest.t:87"). It is really just for documentation.

The following keys are used to specify how to run a test:

  • input - the input file to run the application on
  • binary - the path to the binary to use. Defaults to whatever was passed in to the --binary or -b command-line option. You'll probably never need to set this option.
  • flags - command-line flags to pass to the application
  • runtime - how long the application should be allowed to run.

The following keys are used to control whether or not this test should actually run:

  • skip - skip this test. This is an easy way to mark a test as "not ready yet".
  • min-version - skip the test if the application version is less than this value (as determined by the output of the --version flag). This allows having a common test suite across all versions of the application, but only running the applicable tests for any given version.
  • max-version - skip this test if the application version is greater than this value. Add this to a test for a feature or behavior that is meant to change in a later version. (What will actually happen is that you will put out a new version of the application, and it will break some older tests. After looking at the individual tests, you will see that some of them are testing things in a way that should no longer work, probably because they're really checking some idiosyncrasy of the feature. You can turn the test off without deleting it this way, and it will still be run against the older versions where it is expected to work.)
  • todo - if this is set, run the test but do not consider it a failure if it does not pass. This is for when you want to write a test for a feature you eventually want to start working. If the test passes, it will say "test (whatever) unexpectedly passed".
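The version gating above can be sketched as follows, assuming plain dotted numeric versions like "3.2.7" (the post doesn't show the real comparison code, so this is a hypothetical helper):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compare two dotted version strings component by component;
# missing components count as zero, so "3.2" == "3.2.0".
sub version_cmp {
    my ( $x, $y ) = @_;
    my @x = split /\./, $x;
    my @y = split /\./, $y;
    while ( @x or @y ) {
        my $c = ( shift(@x) // 0 ) <=> ( shift(@y) // 0 );
        return $c if $c;
    }
    return 0;
}

# Decide whether a test should be skipped, given the application
# version and the test's skip/min-version/max-version keys.
sub should_skip {
    my ( $app_version, %test ) = @_;
    return 1 if $test{skip};
    return 1 if defined $test{'min-version'}
             && version_cmp( $app_version, $test{'min-version'} ) < 0;
    return 1 if defined $test{'max-version'}
             && version_cmp( $app_version, $test{'max-version'} ) > 0;
    return 0;
}

print should_skip( '3.2.6', 'min-version' => '3.2.7' ) ? "skip\n" : "run\n";  # skip
print should_skip( '3.3',   'min-version' => '3.2.7' ) ? "skip\n" : "run\n";  # run
```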

The following keys are used to describe what is actually being tested:

  • outcome - either "ok" or "crash". This checks the return status code.
  • output - the main testing interface. I will describe it below.
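The 'outcome' check can be sketched from Perl's wait-status conventions. After system(), $? holds the raw status; my reading of the post (not its actual code) is that "ok" means a clean zero exit and "crash" means the child died on a signal:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Classify a raw wait status ($?): low 7 bits are the terminating
# signal (if any), the high byte is the exit status.
sub outcome_of {
    my ($wait_status) = @_;
    return 'crash' if $wait_status & 127;              # killed by a signal
    return ( $wait_status >> 8 ) == 0 ? 'ok' : 'error';
}

print outcome_of(0), "\n";        # ok     (clean exit)
print outcome_of(11), "\n";       # crash  (e.g. SIGSEGV)
print outcome_of(2 << 8), "\n";   # error  (exit status 2)
```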


The 'output' key can be set to a plain string, which will then be expected in the output.

More often, 'output' can be /regex/ (a regular expression surrounded by slashes), or m!regex! (same thing, but more convenient if what you are looking for itself contains slashes.) Each line of the output will be scanned for the regular expression, and at least one occurrence must be found for the test to pass. If you want to check for something that could span lines, use the /s modifier:

  test {
    output = /initialized.*awakened/s
  }

In this case, the regex will be applied to the entire output, not just a line at a time. (Any other Perl regex modifiers can be added in the same way, but /s is treated specially.)
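The difference between the two modes is easy to demonstrate (a made-up output string, not harness code): per-line scanning cannot match text that spans lines, while the /s form is applied to the whole output at once.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $output = "app initialized\nworker awakened\nshutting down\n";

# Per-line: no single line contains both words, so no match.
my $per_line = grep { /initialized.*awakened/ } split /\n/, $output;

# Whole output with /s: .* can cross the newline, so it matches.
my $slurped = ( $output =~ /initialized.*awakened/s ) ? 1 : 0;

print "per-line matches: $per_line, whole-output match: $slurped\n";
# per-line matches: 0, whole-output match: 1
```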

Finally, you can prefix the output value with an exclamation point to say that the test passes only if the regex does NOT match:

  test {
    output = ! /bad stuff/
  }

If you want to test multiple expressions at the same time, you have two choices: (1) have separate tests that use the same run of the application, or (2) make 'output' into its own key/value section:

  output {
    0: /foo/
    1: /bar/
    bleh: /blort/
  }

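Putting the pieces together, a hypothetical checker (again, a sketch, not the harness's code) could walk such a section, accept plain strings, /re/ or m!re! forms with optional trailing modifiers, honor the '!' negation prefix, and report which checks failed:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Apply a set of named output checks; return the keys that failed.
sub failed_checks {
    my ( $output, %checks ) = @_;
    my @failed;
    for my $key ( sort keys %checks ) {
        my $spec   = $checks{$key};
        my $negate = $spec =~ s/^\s*!\s*//;          # strip '!' prefix
        my ( $re, $mods ) =
            $spec =~ m{^/(.*)/(\w*)$}  ? ( $1, $2 )  # /re/mods
          : $spec =~ m{^m!(.*)!(\w*)$} ? ( $1, $2 )  # m!re!mods
          :                              ( quotemeta($spec), '' );
        my $qr  = length $mods ? qr/(?$mods)$re/ : qr/$re/;
        my $hit = ( $mods =~ /s/ )
            ? ( $output =~ $qr ? 1 : 0 )             # whole output
            : ( ( grep { /$qr/ } split /\n/, $output ) ? 1 : 0 );
        push @failed, $key if $hit != ( $negate ? 0 : 1 );
    }
    return @failed;
}

my @bad = failed_checks( "all correct\nno leaks\n",
    0 => '/correct/',
    1 => '! /Memory leak/',
);
print @bad ? "FAIL: @bad\n" : "all checks passed\n";   # all checks passed
```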
So my question is: is this of interest to anyone? I'm not inclined to do the work of polishing it up and severing it from the one specific application it was written for if nobody is going to use it. And I know there are similar tools out there now, which I haven't looked at enough to know whether my tool is of any use now.

Node Type: perlmeditation [id://726194]
Approved by Old_Gray_Bear
Front-paged by Arunbear