Regression testing with dependencies

Regression testing is important. When writing Perl modules with ExtUtils::MakeMaker, it's usually quite simple, as well. Personally, I often find writing these tests to be both fun and reassuring.

However, like many people, I'm guilty of not writing tests for Every Single Module I write or work on. Often when I skip writing tests, it is because the module I'm working on has external resource dependencies, which I don't think the tests can always depend on, and/or I don't want them to use for testing purposes. At best, this makes writing tests less than simple. At worst, it can make writing tests more expensive than just rewriting the module later when it's found to be broken.

Some examples of resources a module may depend on are (by "internal" I mean things the module programmers have direct control over; external things are outsourced or developed out-of-house):

database servers
other internal servers/services
other internal modules with resource dependencies your module shouldn't need to care about
external modules with unknown dependencies
external servers/services (Credit card fulfillment, for example)
firewall status on the server your tests are running on
environment: must be running under mod_perl/apache/etc. to work

We need to test the portions of the module which depend on these resources (it might be every part of the module), or the module isn't tested properly. But we can't always use the same resources for testing as for production, which also makes testing incomplete.

Making the assumption that your test scripts always have access to these resources can be fatal:

the resources may be unavailable to the test script/server, causing the tests to fail. Sometimes this means a resource needs to be installed; sometimes it means we've firewalled it to protect it from test scripts :)
the resources may be available, but set up in production mode, and tests may break the production data
externally controlled resources may not have a test mode available

In these conditions, what do people typically do to build useful test suites? I'm specifically thinking of the case where it would be impossible or impractical to simulate every resource dependency (such as when the amount of code in the new module is small, but the level of resource dependency is high: you'd spend more time simulating external resources than in writing or testing the code).

I'm also interested in techniques people use to ensure that only test resources are used while tests are being performed; especially when using a (possibly externally controlled) module with unknown resource dependencies.

Some solutions we've used, but which haven't provided a complete answer yet, are:

test databases with a different DSN in the test script
propagate generic "test mode" flags to modules with known dependencies. This isn't always doable for external modules.
don't write tests when the test is known to be destructive
use a "todo" test when it is likely to fail
comment tests which may fail due to known resource unavailability

Any other ideas? I may have thought too much and mostly answered my own question, but I'm still interested in reading how others have dealt with this sort of problem in the past. Thanks!

Alan

Comment on Regression testing with dependencies

Replies are listed 'Best First'.

Re: Regression testing with dependencies
by chromatic (Archbishop) on May 29, 2002 at 06:12 UTC

mock objects

Decouple the code as much as possible. (avoid the issue)
Assume a baseline of working behavior. (I'm pretty sure Perl works, and I can avoid a few places where it doesn't.)
Run smoke tests.
Run integration tests.
Use testing data.

[reply]

Re: Regression testing with dependencies

by ferrency (Deacon) on May 29, 2002 at 18:49 UTC

Is the typical technique to start with a functional specification and a room full of psuedo-users, and direct them to use, abuse, and eventually hopefully break the system? Or is there a commonly used technique to define a set of integration tests for a system, and then to automatically perform the tests and evaluate the outcome, similar to the way unit tests are done in Perl with the standard\ make test functionality?

What do you mean by "smoke tests"?

Decoupling code is important, but there's also a tradeoff, if you're using mock objects to unit test the operation. The smaller the pieces you break the project into, the more mock objects you need to create to test each piece; and, the closer in size those mock objects are to the real objects they're mocking.

Design by contract seems like a useful way to design and document a project, and provides a good basis for building unit tests (before the code is even written). However, when unit testing code at higher levels of abstraction/integration (but below the user interface level), you still have to deal with resource dependencies in the lower levels of code.

For example, if I'm writing a module which integrates the product fulfillment system with the credit card system, my module needs to rely on those low level systems in order to fulfill its contract. A contract for one method may state that it's responsible for shipping a specified product to a specified destination, and making sure that it's paid for. During unit tests for the module, we need to test that this contract is being fulfilled in principle, but we don't want to Actually ship 10000 widgets to Sri Lanka, paid for by the boss's account, in order to prove that it works. That's the level of the problem I'm trying to solve.

(Please generalize to the realm of electronic product delivery for that to make sense :)

Is this an inappropriate use of design by contract at too high a level of abstraction? A poorly worded contract? A poor design of the module? Or a valid concern which I still don't have an answer for yet?

Please excuse my apparent ignorance on these topics. Hopefully I'm asking questions others also need the answer to, but didn't want to ask. If anyone can point me to other resources that might help me understand and deal with these issues better, please do.

Thanks,

Alan

[reply]
[d/l]

Re: Regression testing with dependencies
by mstone (Deacon) on May 29, 2002 at 23:29 UTC

What you're talking about -- though indirectly -- is the need for configuration management.

All software starts from some basic set of assumptions. One of the main selling points for high level languages like Perl is that they give programmers a nice, coherent set of basic assumptions from which to work. But language is only part of the picture. Programs also rely on assumptions about their operating system, filesystems, libraries, devices, and a whole layer of formats and protocols that let them communicate with the rest of the world. That set of assumptions is collectively known as a 'configuration'.

Simple programs -- ones that spend all their time working with values in their own address space -- don't base make many assumptions about configuration above and beyond the language. In essence, any code that compiles will run. More complex programs -- ones that communicate with other parts of the system -- inherit a whole load of configuration-dependent assumptions for every piece they rely on.

(and please note that in this context, 'simple' and 'complex' have no relation to the functional code itself. a genetic annealing simulation with no serious I/O would be 'simple', while a hit counter that calls a database would be 'complex')

Configuration management is the art of nailing down all the assumptions made by a given piece of software. Some of those assumptions will be internal to the language, as you mentioned, while others will involve software the programmer doesn't control. Some will reside on the machine where the program executes, while others may live on other machines (a program that connects to a departmental database server, for instance). The configuration schedule for a given program should tell you exactly how to build an environment in which that program will run as expected.

As Chromatic mentioned, you can use mock objects to build abstraction barriers into your program, and design by contract gives you a way to specify the conditions a dependency has to meet in order to work and play nicely with your software. In the long run, though, there's no way to escape your dependency on.. well.. your dependencies.

----

As to the specific problem you mentioned in your reply -- testing a CC system without actually shipping 10,000 widgets to Sri Lanka -- design by contract is your friend.

Yes, your high-level component relies on low-level systems to fulfill its contract. So to unit-test that high-level component, set it into a test harness where all the low-level systems are fakes that, by design, either meet or violate the required contract. Your unit tests will confirm that the high-level component does fulfill its contract when all its dependencies fulfill theirs, and that it fails in the required way when some or all of its dependencies fail. The problem of making sure the component works properly in the live system will then fall under the heading of integration testing, not unit testing.

----

As to your objection that mock objects can end up being as numerous and as large as the actual system, you're on the right track, but you haven't learned to love the idea yet.

Instead of thinking of mock objects as an annoying adjunct to proper testing, think of them as a living design reference. Build the mock object first, and run it through its paces before building the actual production object. In fact, you should build a whole family of mock objects, starting with the simplest, no-dependencies-canned-return-value version possible, and working up through all the contractual rights and obligations the production object will have to observe. For every condition in the contract, write objects that both meet and violate that condition, so you can be sure all the bases are covered.

What you're really doing there is working out your structural code first -- the stuff that keeps all the modules working and playing nicely together -- and putting the actual data manipulation off for last. That isn't as much fun as slamming down a 1.5 lethal dosage of caffiene and blasting your way through a 40-hour hack session, but it does tend to produce better-structured code.

In the end, you may end up with far more mock-code than production code, but the mock code will be easier to build and evolve, and every assumption in the production code will be directly testable from one of the mock units. In effect, you do all your debugging during the design stage, and by the time you get to production code, there's nothing left to test.

----

BTW - a 'smoke test' is a live test in a (presumably) safe environment: plug it in, turn it on, and see if it starts to smoke.

[reply]

Re: Regression testing with dependencies
by PodMaster (Abbot) on Aug 22, 2003 at 09:27 UTC

comment tests which may fail due to known resource unavailability

MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
** The third rule of perl club is a statement of fact: pod is sexy.

[reply]

Re: Regression testing with dependencies

by ferrency (Deacon) on Sep 04, 2003 at 15:39 UTC

When you're writing a module whose entire existence is intended to connect to some third party's server using a proprietary networking protocol, you can't very well skip every test, because then you're not testing anything at all.

Writing a fake server or dummy module to connect to also doesn't help, because then you're testing against how you think the protocol/server works, not how it actually works. You're making the same assumptions you made when you wrote the module you're testing.

It's roughly equivalent to doing this:

sub protocol_thing1 {
   return "thing1"; # protocol-dependant Magic String
}

# And a corresponding test:
use Test::More tests => 1;

ok protocol_thing1() eq "thing1";

# Gee, that was fun!
# The problem is, the server throws an error unless 
# you use "Thing 1".
[download]

Alan

[reply]
[d/l]

Re2: Regression testing with dependencies

by dragonchild (Archbishop) on Sep 04, 2003 at 16:52 UTC

You're making the same assumptions you made when you wrote the module you're testing.

One way around this is to do what large corporations do - have someone else write the tests. The developer provides the tester with the design specifications and nothing else. In fact, it's better if the developer and tester never speak of the specifications at all, to avoid contamination. Then, the tester writes tests to exercise the specification, not the component.

That last is so important, I'll say it again -
The tester writes tests to exercise the specification, not the component.

Remember - the specification is everything. The component implements it and the tester tests against it. It is your whole world.

Component testing has another name - unit-testing. Testing the implementation against the specification is something that the developer cannot do.

Now, this doesn't mean you have to have an entire testing team. At one position, we did cross-testing. I develop something, then hand another developer the specification I developed against. Based solely on that, the other deveveloper tests my implementation. You'd be surprised how many bugs are found with just this simple solution. Only when the cross-tester ok's the implementation is the test team even notified that the specification has been implemented and is ready for their testing.

Automated testing, imho, should be built from the test-team's tests. When I worked as a test-tool supporter for Motorola, the developers did unit- and cross-testing. (They may have also done some regression testing, but I wasn't aware of that.) It was then handed to the test team. The test team did both regression and integration testing. Once that was completed, a subset of the integratration tests was chosen and added to the regression test suite. We also had a regression test team that was involved in doing the complete regression suite. (When I was there over two years ago, the regression suite for any version of Motorola's BTS software exceeded 10k tests. A full automated run took almost 80 hours. I'm sure it's larger now.)

------
We are the carpenters and bricklayers of the Information Age.

The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

[reply]


P is for Practical
	PerlMonks