perlmeditation
eyepopslikeamosquito
<P>
I'll be giving a talk at work about improving our test automation.
Possible talk sections are listed below.
Feedback on talk content and general approach is welcome,
along with any automated-testing anecdotes you'd like to share.
</P>
<P><B>Automation Benefits</B></P>
<P>
<ul>
<li> Reduce cost.
<li> Improve testing accuracy/efficiency.
<li> Regression tests ensure new features don't break old ones. Essential for continuous delivery.
<li> Automation is essential for tests that cannot be done manually: performance, reliability, stress/load testing, for example.
<li> Psychological. More challenging/rewarding. Less tedious. Robots never get tired or bored.
</ul>
</P>
<P><B>Automation Drawbacks</B></P>
<P>
<ul>
<li> Opportunity cost of not finding bugs had you done more manual testing.
<li> Automated test suite needs ongoing maintenance. So test code should be well-designed and maintainable; that is, you should avoid the common pitfall of <I>"oh, it's only test code, so I'll just quickly cut n paste this code"</I>.
<li> Cost of investigating spurious failures. It is wasteful to spend hours investigating a test failure only to find out the code is fine, the tests are fine, it's just that someone kicked out a cable. This has been a chronic nuisance for us, so ideas are especially welcome on techniques that reduce the cost of investigating test failures.
<li> May give a false sense of security.
<li> Still need manual testing. Humans notice flickering screens and a white form on a white background.
</ul>
</P>
<P><B>When and Where Should You Automate?</B></P>
<P>
<ul>
<li> Testing is essentially an economic activity. There are an infinite number of tests you could write. You test until you cannot afford to test any more. Look for value for money in your automated tests.
<li> Tests have a finite lifetime. The longer the lifetime, the better the value.
<li> The more bugs a test finds, the better the value.
<li> Stable interfaces provide better value because it is cheaper to maintain the tests. Testing a stable API is cheaper than testing an unstable user interface, for instance.
<li> Automated tests give great value when porting to new platforms and when upgrading existing ones.
<li> Writing a test for customer bugs is good because it helps focus your testing effort around things that cost you real money and may further reduce future support call costs.
</ul>
</P>
<P><B>Adding New Tests</B></P>
<P>
<ul>
<li> Add new tests whenever you find a bug.
<li> Around code hot spots and areas known to be complex, fragile or risky.
<li> Where you fear a bug. A test that never finds a bug is poor value.
<li> Customer focus. Add new tests based on what is important to the customer. For example, if your new release is correct but requires the customer to upgrade the hardware of 1000 nodes, they will not be happy.
<li> Documentation-driven tests. Go through the user manual and write a test for each example given there.
<li> Add tests (and refactor code if appropriate) whenever you add a new feature.
<li> Boundary conditions.
<li> Stress tests.
<li> Big tests, but not too big. A test that takes too long to run is a barrier to running it often.
<li> Tools. Code coverage tools tell you which sections of the code have not been tested. Other tools, such as static (e.g. lint, Perl::Critic) and dynamic (e.g. valgrind) code analyzers, are also useful.
</ul>
</P>
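<P>
As a concrete sketch of "add a test whenever you find a bug", here is a minimal regression test using core Test::More. The <C>trim()</C> function and the trailing-tab bug it pins down are hypothetical, purely for illustration:
</P>

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical function under test: strip leading/trailing whitespace.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;
    return $s;
}

is( trim('  hello  '), 'hello', 'strips leading/trailing spaces' );

# Regression test added when a (hypothetical) bug report showed
# trailing tabs surviving trim() - the test outlives the fix.
is( trim("hello\t\t"), 'hello', 'regression: strips trailing tabs' );

# Boundary condition: the empty string.
is( trim(''), '', 'empty string unchanged' );
```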
<P><B>Test Infrastructure and Tools</B></P>
<P>
<ul>
<li> Single step, automated build and test. Aim for continuous integration.
<li> Clear and timely build/test reporting is essential.
<li> Keep metrics (via test metadata, say) on the test suite itself. Is a test providing value? How often does it fail validly? How often does it fail spuriously? How long does it take to run?
<li> Aim for around 80% code coverage (for most applications 100% code coverage is not worth it).
<li> It's vital to quarantine intermittently failing tests quickly and to fix them promptly, only returning them to the main build when reliable (otherwise people start ignoring test failures!). No [wp://Broken windows theory|broken windows].
<li> Make it easy to find and categorize tests. Use test metadata.
<li> Integrate automated tests with revision control, bug tracking, and other systems, as required.
<li> Divide test suite into components that can be run separately and in parallel. Quick test turnaround time is crucial.
</ul>
</P>
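<P>
On dividing the suite into parallel runs: a minimal sketch using core <C>TAP::Harness</C> (the engine behind <C>prove</C>), which runs tests across parallel jobs just as <C>prove -j4 t/</C> does. The two generated one-liner <C>.t</C> files stand in for a real suite:
</P>

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use TAP::Harness;   # core module; the engine behind the prove(1) utility

# Two trivial generated .t files stand in for a real test suite.
my $dir = tempdir( CLEANUP => 1 );
for my $n (1, 2) {
    open my $fh, '>', "$dir/t$n.t" or die "open: $!";
    print $fh qq{print "1..1\\nok 1 - test $n\\n";\n};
    close $fh;
}

# jobs => 4 runs tests in parallel, as `prove -j4` does.
my $harness = TAP::Harness->new({ jobs => 4, verbosity => -1 });
my $agg     = $harness->runtests( "$dir/t1.t", "$dir/t2.t" );
print $agg->all_passed ? "PASS\n" : "FAIL\n";
```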
<P><B>Design for Testability</B></P>
<P>
<ul>
<li> It is easier/cheaper to write automated tests for systems that were designed with testability in mind in the first place.
<li> Interfaces Matter. Make them: consistent, easy to use correctly, hard to use incorrectly, easy to read/maintain/extend, clearly documented, appropriate to audience, testable in isolation.
<li> [wp://Dependency Injection] is perhaps the most important design pattern in making code easier to test.
<li> <a href="http://en.wikipedia.org/wiki/Mock_object">Mock Objects</a> are frequently useful, and not only in unit tests - for example, a mock server written in Perl (e.g. a mock SMTP server) can simulate errors, delays, and so on.
<li> Consider ease of support and diagnosing test failures during design.
</ul>
</P>
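<P>
To make the Dependency Injection point concrete, a minimal sketch (all class names hypothetical): <C>Notifier</C> receives its mailer via the constructor instead of hard-coding a real SMTP client, so a test can inject a lightweight test double in its place:
</P>

```perl
use strict;
use warnings;

# Dependency injection: the collaborator is passed in, not hard-coded.
package Notifier;
sub new   { my ($class, %args) = @_; bless { mailer => $args{mailer} }, $class }
sub alert { my ($self, $msg) = @_; $self->{mailer}->send("ALERT: $msg") }

# A hand-rolled spy standing in for the real mailer: it records calls.
package SpyMailer;
sub new  { bless { sent => [] }, shift }
sub send { my ($self, $msg) = @_; push @{ $self->{sent} }, $msg }

package main;
my $spy      = SpyMailer->new;
my $notifier = Notifier->new( mailer => $spy );
$notifier->alert('disk full');
print scalar @{ $spy->{sent} }, " message(s): $spy->{sent}[0]\n";
```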
<P><B>Test Driven Development (TDD)</B></P>
<P>
<ul>
<li> Improved interfaces and design. Especially beneficial when writing new code. Writing a test first forces you to focus on interface - from the point of view of the user. Hard to test code is often hard to use. Simpler interfaces are easier to test. Functions that are encapsulated and easy to test are easy to reuse. Components that are easy to mock are usually more flexible/extensible. Testing components in isolation ensures they can be understood in isolation and promotes low coupling/high cohesion. Implementing only what is required to pass your tests helps prevent over-engineering.
<li> Easier Maintenance. Regression tests are a safety net when making bug fixes. No tested component can break accidentally. No fixed bugs can recur. Essential when refactoring.
<li> Improved Technical Documentation. Well-written tests are a precise, up-to-date form of technical documentation. Especially beneficial to new developers familiarising themselves with a codebase.
<li> Debugging. Spend less time in [id://11127878|crack-pipe debugging sessions]. When you find a bug, add a new test before you start debugging (see practice no. 9 at <a href="https://www.perl.com/pub/2005/07/14/bestpractices.html/">Ten Essential Development Practices</a>).
<li> Automation. Easy to test code is easy to script.
<li> Improved Reliability and Security. How does the code handle bad input?
<li> Easier to verify the component with [id://11116639|memory checking and other tools].
<li> Improved Estimation. You've finished when all your tests pass. Your true rate of progress is more visible to others.
<li> Improved Bug Reports. When a bug comes in, write a new test for it and refer to the test from the bug report.
<li> Improved test coverage. If tests aren't written early, they tend never to get written. Without the discipline of TDD, developers tend to move on to the next task before completing the tests for the current one.
<li> Psychological. Instant and positive feedback; especially important during long development projects.
<li> Reduce time spent in System Testing. The cost of investigating a test failure is much lower for unit tests than for complex black box system tests. Compared to end-to-end tests, unit tests are: fast, reliable, isolate failures (easy to find root cause of failure). See also <a href="https://martinfowler.com/bliki/TestPyramid.html">Test Pyramid</a>.
</ul>
</P>
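<P>
A TDD micro-cycle in miniature (the <C>leap_year()</C> function is purely illustrative): the tests come first and pin down the interface; the implementation below them is the minimal code written to make them pass:
</P>

```perl
use strict;
use warnings;
use Test::More tests => 3;

# TDD sketch: these tests were written first, pinning down the
# interface of a (hypothetical) leap_year() before implementing it.
is( leap_year(2000), 1, 'divisible by 400 is a leap year' );
is( leap_year(1900), 0, 'divisible by 100 but not 400 is not' );
is( leap_year(2024), 1, 'divisible by 4 is a leap year' );

# The minimal implementation written to make the tests above pass.
sub leap_year {
    my ($y) = @_;
    return 0 if $y % 4;
    return 1 if $y % 400 == 0;
    return 0 if $y % 100 == 0;
    return 1;
}
```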
<P><B>Test Doubles</B></P>
<P>
<ul>
<li> <B>Dummy objects</B> are passed around but never actually used. Usually they are just used to fill parameter lists.
<li> <B>Fake objects</B> actually have working implementations, but usually take some shortcut which makes them not suitable for production (an InMemoryTestDatabase for example).
<li> <B>Stubs</B> provide canned answers to calls made during the test, usually not responding at all to anything outside what's programmed for the test.
<li> <B>Spies</B> are stubs that also record some information based on how they were called; for example an email service that records how many messages were sent.
<li> <B>Mocks</B> are pre-programmed with expectations which form a specification of the calls they are expected to receive; they can throw an exception if they receive a call they don't expect and are checked during verification to ensure they got all the calls they were expecting. Note that only mocks insist upon behavior verification. The other doubles can, and usually do, use state verification. Mocks behave like other doubles during the exercise phase because they need to make the SUT (System Under Test) believe it's talking with its real collaborators - but mocks differ in the setup and the verification phases. While mocks are valuable when testing side-effects, protocols and interactions between objects, note that overuse of mocks inhibits refactoring due to tight coupling between the tests and the implementation (instead of just the interface contract).
</ul>
</P>
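<P>
All of these flavours can be hand-rolled in a few lines of Perl. A minimal stub sketch (class and function names hypothetical): <C>StubWeather</C> returns canned answers in place of a real forecast service, so the decision logic in <C>advice()</C> can be tested without any network access:
</P>

```perl
use strict;
use warnings;

# A hand-rolled stub: canned answers for calls made during the test.
package StubWeather;
sub new      { bless { temp => $_[1] }, $_[0] }
sub temp_for { $_[0]->{temp} }   # canned answer, ignores the city

package main;
sub advice {
    my ($weather, $city) = @_;
    return $weather->temp_for($city) > 25 ? 'shorts' : 'jumper';
}

print advice( StubWeather->new(30), 'Sydney' ), "\n";   # shorts
print advice( StubWeather->new(5),  'Oslo'   ), "\n";   # jumper
```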
<P>
See also:
</P>
<P>
<ul>
<li> <a href="https://martinfowler.com/articles/mocksArentStubs.html">Mocks Aren't Stubs article by Martin Fowler</a> (mockists vs classicists; classicist: use real objects if possible and a double if it's awkward to use the real thing, with state verification; mockist: always use a mock for any object with interesting behavior, with behavior verification - mocks are pre-programmed with expectations, a specification of the calls they are expected to receive, and verification ensures they got all the calls they were expecting).
<li> <a href="https://rubyplus.com/articles/401-Stubs-are-not-Mocks-Concise-Version-of-Martin-Fowler-s-Article">Concise version of Fowler's Mocks Aren't Stubs</a>
<li> <a href="http://www.peterprovost.org/blog/2012/04/15/visual-studio-11-fakes-part-1/">Visual Studio 11 Fakes, Stubs and Shims (run-time method interceptors)</a> (Stub: State-based verification "Arrange, Act, Assert"; Mock: behavior-based verification: A mock provides not only a fake implementation but also logic for verifying how calls were made on the fake. When you are testing side-effects, protocols and interactions between objects, they are extremely valuable. Some folks fall into behavior verification when none is needed)
<li> <a href="http://xunitpatterns.com/Test%20Double.html">Test double flavours</a> (Test Stub, Test Spy, Mock Object, Fake Object, ...)
<li> <a href="https://stackoverflow.com/questions/3459287/whats-the-difference-between-a-mock-stub">What is the difference between a mock and a stub</a> (stack overflow)
<li> <a href="https://pythonspeed.com/articles/verified-fakes/">Verified fakes in Python</a>
</ul>
</P>
<P><B>Testing Memory and Threads</B></P>
<P>
<ul>
<li> [id://11116639] (long list of nodes on memory checking and other code analysis tools)
<li> [id://601860] -- see "Unit Testing Concurrent Code" section
</ul>
</P>
<P>
<ul>
<li> [wp://AddressSanitizer]
<li> <a href="https://clang.llvm.org/docs/ThreadSanitizer.html">Clang ThreadSanitizer</a>
</ul>
</P>
<P>
<ul>
<li> [wp://Race condition] (wikipedia)
<li> [wp://Heisenbug] (wikipedia)
<li> [wp://lsof] (wikipedia)
</ul>
</P>
<P><B>Testing Tools</B></P>
<P>
<ul>
<li> [wp://Google Test] (wikipedia)
<li> <a href="https://github.com/google/googletest">Google Test and Google Mock</a> (github)
<li> <a href="https://github.com/catchorg/Catch2">Catch2 Testing Framework</a> (github)
<li> [id://11150656] (2023) - includes an example C++ unit test using Catch2
<li> [id://11150945] (2023) - includes C++ example code and building and using Google's Abseil library
<li> [wp://Clang] and [wp://LLVM] (contain many useful tools such as AddressSanitizer and ThreadSanitizer)
</ul>
</P>
<P><B>Test Anything Protocol (TAP)</B></P>
<P>
<ul>
<li> [wp://Test Anything Protocol] (wikipedia)
<li> <a href="https://testanything.org/">Test Anything Protocol</a> (testanything.org)
<li> [id://596760] by [petdance] - <I>The reason I wrote prove was so that I could test PHP scripts</I>
</ul>
</P>
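<P>
For reference, TAP is plain text: a plan line (<C>1..N</C>) followed by numbered <C>ok</C>/<C>not ok</C> lines is all a consumer such as <C>prove</C> needs, and you can emit it without any framework:
</P>

```perl
use strict;
use warnings;

# Emitting the Test Anything Protocol by hand: plan, then test lines.
my @tap = ('1..2');
push @tap, ( 1 + 1 == 2 ? 'ok 1 - addition works' : 'not ok 1 - addition works' );
push @tap, 'ok 2 # SKIP demonstration only';
print "$_\n" for @tap;
```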
<P><B>Types of Testing</B></P>
<P>
<ul>
<li> Static testing. Code review by humans and static code analysers (e.g. lint, Perl::Critic).
<li> Passive testing. In contrast to active testing, testers do not supply test data; they only examine system logs and traces.
<li> Dynamic testing. Unit tests, Integration tests, System tests, Acceptance tests, ...
<li> Dynamic program analysis. e.g. Purify, Valgrind, ThreadSanitizer, ...
<li> Exploratory testing. Simultaneous learning, test design and test execution.
<li> Performance testing. Stress testing. Load testing.
<li> Usability testing.
<li> Regression testing.
<li> Acceptance testing.
<li> End-to-end testing.
<li> Security testing.
<li> Equivalence partitioning.
<li> Critical path testing.
<li> Failover testing.
<li> Internationalization testing.
<li> Smoke testing.
<li> Alpha, Beta testing.
<li> ... and many more :)
</ul>
</P>
<P><B>References Added Later</B></P>
<P>
<ul>
<li> [id://536384]
<li> [id://11130654] (2021 response to [Bod])
<li> [id://11150002] (2023 response to [Bod])
<li> [id://11152808] (2023 response to [Bod])
<li> [id://11149671] by [cavac] (2023) - aircraft crash caused by [wp://Confirmation bias] (example of the danger of ignoring computer warnings)
</ul>
</P>
<P>
<ul>
<li> [id://355203]
<li> [id://361821]
<li> [id://465753]
<li> [id://997301]
</ul>
</P>
<P>
<ul>
<li> [id://11158186] by [SankoR] (2024) - endorses [mod://Test2::V0] (moving large code base with existing test suite from perl <C>v5.10.0</C> to <C>v5.38.2</C>)
</ul>
</P>
<P>
<ul>
<li> [id://11149996] by [Bod] (2023)
<li> [id://11152604] by [bliako] (2023)
<li> [id://11150918] by [stonecolddevin] (2023) - argues against strict adherence to TDD
<li> [id://11151004] by [choroba] (2023) - testing anecdote from early in his career, see also [id://1124000]
<li> [id://11152787] by [hv] (2023) - if something is difficult to write tests for, maybe it's the interface that should change
</ul>
</P>
<P>
<ul>
<li> [id://11150656] by me (2023) (Matchers are essential when unit testing statically typed languages)
<li> [id://11135668] by me (2021) (prefer <C>cmp_ok</C> to <C>ok</C> because you get clearer diagnostics when a test fails)
<li> [id://11129243] by [LanX] and me (2021)
</ul>
</P>
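<P>
To illustrate the <C>cmp_ok</C> vs <C>ok</C> point above, a minimal sketch using core Test::More - both assertions pass here, but only <C>cmp_ok</C> would report the actual operands if they didn't:
</P>

```perl
use strict;
use warnings;
use Test::More tests => 2;

my $got = 6 * 7;

# ok() reports only pass/fail; on failure you learn nothing about $got.
ok( $got == 42, 'ok: answer is 42' );

# cmp_ok() reports both operands on failure (e.g. got: 41, expected: 42),
# which makes diagnosing a broken test much quicker.
cmp_ok( $got, '==', 42, 'cmp_ok: answer is 42' );
```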
<P>
<ul>
<li> [id://11150485] (discusses testing vs remotely supporting a module) by [Bod] (2023)
<li> [id://11128001] by me (2021) (example of remotely supporting products that run on many customer machines)
</ul>
</P>
<P>
<ul>
<li> [id://11102443] by [hippo] (2019)
<li> [id://1195817] by [talexb] (2017)
<li> [id://1227454] by [Discipulus] (2018)
<li> [id://945069] by [nbezzala] (2011)
</ul>
</P>
<P>
<ul>
<li> [id://11146368] by [davies] (2022)
<li> [id://1072921] by [tizatron] (2014)
<li> [id://1072954] by me (2014)
</ul>
</P>
<P>
<ul>
<li> [id://856415]
<li> [id://11127878] (on debuggers)
<li> [id://48629] by [merlyn] (aggressive approach to rewriting unmaintainable code)
</ul>
</P>
<P>
<ul>
<li> [id://1226132] by [davido] (2018)
<li> [id://1178489] (2016) - an example of a Perl syslog server (using [doc://IO::Select]) used during automated smoke testing
</ul>
</P>
<P>
<ul>
<li> [id://427823] by [talexb] (2005)
<li> [id://428047] by [xdg] (2005 - what to put in each .t file, e.g. a .t file might cover a [wp://Use case])
<li> [id://427899] by [dragonchild] (2005 - testing with mock objects)
<li> [id://500174] by me (2005 - I like to run all my tests both in normal mode and taint mode ... plus test in "persistent" environments, such as mod_perl)
<li> [id://496867] by [Perl_Mouse] (2005)
<li> [id://596305] by [Tanktalus] (2007)
</ul>
</P>
<P><B>CPAN Testing Tools</B></P>
<P>
<ul>
<li> [id://11106010] by [hippo]
</ul>
</P>
<P>
<ul>
<li> [mod://Test::More] - classic Perl testing framework
<li> [mod://Test::Harness]
<li> [mod://Test::Class]
<li> [mod://Test::Deep]
<li> [mod://Test::Most]
</ul>
</P>
<P>
<ul>
<li> [mod://Test2::Suite] - the most recent and modern set of tools for testing
<li> [mod://Test::Script]
<li> [mod://Test2::V0]
<li> [mod://Test2::API]
</ul>
</P>
<P>
<ul>
<li> <a href="https://toby.ink/blog/2023/01/24/perl-testing-in-2023/">Perl Testing in 2023</a> by [tobyink] (blog)
<li> <a href="https://perlmaven.com/getting-started-with-test2">Getting started with Test2</a> (perlmaven)
<li> [id://1185203] by [stevieb] (2017) - no replies
<li> [id://11155602] by [choroba] (2023) - examples using <C>Test2::V0</C>
<li> [id://11137357] by [choroba] (2022) - ditto
<li> [id://11145243] by [tobyink] (2022) - ditto
</ul>
</P>
<P>
<ul>
<li> [id://11156894] by [Bod] (2024) - see [id://11156898|response] from [choroba] (modern Perl testing: [mod://Test::Deep], [mod://Test2::V0], ...)
</ul>
</P>
<P><B>General References</B></P>
<P>
<ul>
<li> [wp://Software testing] (wikipedia)
<li> [wp://Software quality] (wikipedia)
<li> [wp://Static program analysis] (wikipedia)
<li> [wp://Dynamic program analysis] (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Dynamic_testing">Dynamic testing</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Unit_testing">Unit testing</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Exploratory_testing">Exploratory testing</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/White-box_testing">White box testing</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Black-box_testing">Black box testing</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Graphical_user_interface_testing">GUI testing</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Equivalence_partitioning">Equivalence partitioning</a> (wikipedia)
<li> [wp://Test double] (wikipedia)
</ul>
</P>
<P>
<ul>
<li> <a href="https://en.wikipedia.org/wiki/List_of_unit_testing_frameworks">List of Unit Testing Frameworks</a> (wikipedia)
<li> [wp://Hamcrest] (wikipedia) (pioneered <I>assertion matchers</I> - <I>Hamcrest</I> is an anagram of <I>matchers</I>)
</ul>
</P>
<P>
<ul>
<li> <a href="https://en.wikipedia.org/wiki/Security_testing">Security testing</a> (wikipedia)
<li> [wp://Penetration test] (wikipedia)
<li> [wp://Dynamic application security testing] (wikipedia)
<li> [wp://Static application security testing] (wikipedia)
</ul>
</P>
<P>
<ul>
<li> <a href="https://en.wikipedia.org/wiki/Test_automation">Test Automation</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Regression_testing">Regression testing</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Data-driven_testing">Data-driven testing</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Keyword-driven_testing">Keyword-driven testing</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Broken_windows_theory">Broken windows theory</a> (wikipedia)
</ul>
</P>
<P>
<ul>
<li> <a href="https://en.wikipedia.org/wiki/Continuous_integration">Continuous integration</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Continuous_delivery">Continuous delivery</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Continuous_deployment">Continuous deployment</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Dependency_injection">Dependency Injection</a> (wikipedia)
<li> <a href="https://en.wikipedia.org/wiki/Mock_object">Mock Object</a> (wikipedia)
<li> [cpan://Test::MockObject] by [chromatic]
</ul>
</P>
<P>
<ul>
<li> <a href="http://www.stickyminds.com/article/when-should-test-be-automated">When should a test be automated?</a> by Brian Marick
<li> <a href="https://www.atlassian.com/test-automation?tab=test-automation-basics">Atlassian test automation</a>
<li> <a href="https://www.atlassian.com/continuous-delivery/software-testing/types-of-software-testing">Atlassian types of testing</a>
<li> <a href="http://support.smartbear.com/articles/testcomplete/manager-overview/">Why Automated Testing?</a> (Smartbear)
</ul>
</P>
<P><B>Related References</B></P>
<P>
<ul>
<li> [id://553487]
<li> <a href="https://www.perl.com/pub/2005/07/14/bestpractices.html/">Ten Essential Development Practices</a> by Damian Conway
</ul>
</P>
<P>
<ul>
<li> [id://461311] (long list of references on coding standards)
<li> [id://560637] (long list of references on dealing with legacy code)
<li> [id://417902] (long list of Security references)
<li> [id://11137640]
</ul>
</P>
<P>
<small>
Updated: many extra references were added long after the original node was written.
2019: Added <I>Test Doubles</I> section. 2021: Added <I>Types of Testing</I> section.
2023: Added links to C++ examples using Catch2 and Google Abseil library.
</small>
</P>