Setting up tests for cat-type filter programs

dmorgo has asked for the wisdom of the Perl Monks concerning the following question:

I have some Perl programs that are called like this (a simplification, but it captures the gist):

cat data.txt | prog1.pl | prog2.pl option | prog3.pl > out.txt

I'd like to set up simple test scripts, one per program, to check whether certain input data yields the expected output:

# input file
good1
good2
bad1
good3

# expected output
good1
good2
good3

So, just to reemphasize, there would be a separate test for each program, prog1.pl, prog2.pl, and prog3.pl, as opposed to having just one test for the entire pipeline.

There are many ways to do this, and I'm wondering what nice idioms people have come up with. One nice property of a test harness would be that it keeps the input and expected output data (which can usually be fake) in a plain-text format that is easy to read and easy to edit. That rules out this, for example:

my %testdata = (
                 'good1' => 'good1',
                 'good2' => 'good2',
                 'bad1' => '',
                 'good3' => 'good3',
               );

It seems possible that all the programs could be tested by the same test script, using different data for each program. On the other hand, I don't know if that's the best solution, because it seems nice for each test to be self-contained.

Another choice is to use a __DATA__ section. That's OK, but it comes at the end of the file, decreasing ease of reading and editing a tad. And it would require another label be embedded and parsed... not that that's hard, but the simpler, the better.

Still another choice is HERE docs. I'm leaning toward that, but wondering if there are better suggestions:

(my $args = <<'ARGS') =~ s/^\s+//gm;
arg1
ARGS

(my $input = <<'INPUT') =~ s/^\s+//gm;
good1
good2
bad1
good3
INPUT

(my $expected_output = <<'OUTPUT') =~ s/^\s+//gm;
good1
good2
good3
OUTPUT

Separate from the setup question is how to compare the actual and expected results. Or at least I thought at first that these were separate. But there may be some interaction between the two questions. When the data is stored in a hash, each element is being treated individually, with a placeholder for an empty result in some cases. With the HERE doc, when the result is empty, it just means a shorter list. The latter feels like a better approach, and for my application there doesn't need to be a one-to-one correspondence between each input and output item. My main concern is finding a nice way to set up the data. Thoughts or suggested idioms appreciated!
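As a rough illustration of the here-doc layout combined with a whole-output comparison, here is a minimal sketch. It assumes Test::More and File::Temp are available; run_filter() is a hypothetical helper, and the one-liner in $command is only a stand-in for prog1.pl so that the example runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Test::More tests => 1;

# Hypothetical helper: feed $input to $command on STDIN, return its STDOUT.
sub run_filter {
    my ($command, $input) = @_;
    my ($fh, $filename) = tempfile(UNLINK => 1);
    print {$fh} $input;
    close $fh or die "Cannot close $filename: $!";
    # scalar() makes the backticks return the output as one string
    return scalar `$command < "$filename"`;
}

# Stand-in for prog1.pl (an assumption): drop lines that start with "bad".
my $command = qq{"$^X" -ne "print unless /^bad/"};

my $input = <<'INPUT';
good1
good2
bad1
good3
INPUT

my $expected = <<'OUTPUT';
good1
good2
good3
OUTPUT

# One comparison of the whole output; no per-item placeholders needed.
is( run_filter($command, $input), $expected, 'bad lines are filtered out' );
```

Because the comparison is on the whole output string, an input line that produces nothing simply makes the expected here-doc shorter.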


Re: Setting up tests for cat-type filter programs by tachyon (Chancellor) on Oct 03, 2004 at 23:22 UTC
Your thoughts seem to focus on setting up data structures. That is IMHO wrong. You should focus on setting up a simple harness structure. You don't need (or probably want) a data structure for your tests. You want simple. You want easy. You want quick. You want understandable. You don't need easily modifiable: if the tests were valid in the first place they should always pass, so you almost never delete from a test suite, you only add. It is also a hell of a lot easier to find a test if each test has its own chunk of code....

I would have one test script per script. Typically you call these test files 'prog_or_module_name.t' and put them in a t/ directory. Here is a suggested simple framework:

    #!/usr/bin/perl -s
    use Test;
    BEGIN{ plan tests => 3 };
    my $prog = 'script.pl';
    our $TMPFILE = '/tmp/tmpfile' . time();
    my ( $input, $output, $opts );

    # make sure program exists or we are bound to fail
    ok( -e $prog ); #1

    # make sure we can write to our tmpfile
    ok( write_tmpfile('') ); #2

    END{ unlink $TMPFILE }; # clean up at end

    # tests go here
    $input = '';
    $output = '';
    $opts = '';
    ok( test_prog( $prog, $input, $opts ), $output ); #3

    # .....

    # we could use open2 or open3 but why complicate it....
    sub test_prog {
        my ( $prog, $input, $opts ) = @_;
        write_tmpfile( $input );
        my $actual_output = `cat $TMPFILE | $prog $opts`;
        return $actual_output;
    }

    sub write_tmpfile {
        # double quotes so that $TMPFILE is interpolated
        open F, ">$TMPFILE" or return 0;
        print F $_[0];
        close F;
        return 1;
    }

All you need to do to add a test is copy the input/output/opts/ok stub and fill in the blanks. As you find bugs you just add yet another test case.

Assuming you go for the t/*.t format you can make a little shell or perl script like this:

    [root@devel3 t]# cat run
    #!/bin/sh
    /usr/bin/perl -e 'use Test::Harness qw(&runtests); runtests @ARGV;' /some/path/t/*.t

And then just call run to run all your tests.

Update: Fixed typo, thanks blyman

cheers

tachyon
Re: Setting up tests for cat-type filter programs by skx (Parson) on Oct 03, 2004 at 19:35 UTC
If you're just reading in one input, piped, and producing one output then it seems like a simple job.

Simply store the input for each program in a text file, and the output. Then use redirection and diff. For example you could use:

    #!/bin/sh
    cat data/test1.in | prog1 > data/test1.tmp
    diff data/test1.out data/test1.tmp

That's the general idea. Pipe in the input, and redirect the output to a temporary file. Then compare that with the expected output.

Rewriting that to iterate over a list of commands and data files, and using perl is a simple enough job I'm sure.

Steve
---
steve.org.uk
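The iteration over data files mentioned above might look roughly like this in Perl. It is only a sketch: failed_cases() and slurp() are hypothetical helpers, and the data/NAME.in plus data/NAME.out layout and the 'perl prog1.pl' command are assumptions extrapolated from the shell example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read a whole file into one string.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot read $path: $!";
    local $/;    # slurp mode: read to end of file
    my $content = <$fh>;
    close $fh;
    return defined $content ? $content : '';
}

# Run $command once per NAME.in file in $dir and compare its output
# with NAME.out. Returns the .in files whose output differs.
# Note: glob() splits its pattern on whitespace, so $dir must not contain spaces.
sub failed_cases {
    my ($command, $dir) = @_;
    my @failed;
    for my $in_file (sort glob("$dir/*.in")) {
        (my $out_file = $in_file) =~ s/\.in\z/.out/;
        my $actual = `$command < "$in_file"`;
        $actual = '' unless defined $actual;
        push @failed, $in_file if $actual ne slurp($out_file);
    }
    return @failed;
}

# Assumed layout: data/test1.in, data/test1.out, data/test2.in, ...
my @failed = failed_cases('perl prog1.pl', 'data');
print @failed ? "FAILED: @failed\n" : "no failing cases\n";
```

Adding a test case then means dropping a new pair of .in and .out files into the directory, with no change to the script.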
Re: Setting up tests for cat-type filter programs by InfiniteSilence (Curate) on Oct 03, 2004 at 19:19 UTC
Well, if you are writing a module and you use h2xs -X -n amodulename you get several files, one of which is a test file (in the /t folder). Consider the following (fake) module Conquest::MegaGoogle:

    type conquest-megagoogle.t
    # Before `make install' is performed this script should be runnable with
    # `make test'. After `make install' it should work as `perl Conquest-MegaGoogle.t'

    #########################

    # change 'tests => 1' to 'tests => last_test_to_print';

    use Test::More tests => 1;
    BEGIN { use_ok('Conquest::MegaGoogle') };

    #########################

    # Insert your test code below, the Test::More module is use()ed here so read
    # its man page ( perldoc Test::More ) for help writing this test script.

So the idea is to have a test script that uses the Test module. (perldoc Test will clarify things a bit)

I mention this in the framework of creating a module because you may consider trying to place a lot of those smaller scripts in an actual module. This will make your maintenance a lot easier. After all of the functionality is neatly tucked away you can use Getopt::Long or something to manage command line parameters. Scripts normally are:

- easier to lose across implementations
- confusing due to lots of duplicated parameters (e.g. -i means something different in prog2 than in prog1)
- redundant. Lots of them duplicate code unnecessarily (because objects have to be reloaded every prog)

Celebrate Intellectual Diversity
Re: Setting up tests for cat-type filter programs by foss_city (Novice) on Oct 04, 2004 at 08:38 UTC
You might find Test::Harness ( http://search.cpan.org/~petdance/Test-Harness-2.42/ ) to be useful . . . .
Re: Setting up tests for cat-type filter programs by adrianh (Chancellor) on May 31, 2005 at 13:02 UTC
> There are many ways to do this. I'm wondering what nice idioms people have come up with for this. Nice properties of a test harness would be that it keeps the input and expected output data (which can be fake, usually) in a nicely plain text, easy to read and easy to edit format.

A bit late in the day - but if you've not found it already Test::Chunks handles this rather nicely.