PerlMonks

[RFC] Discipulus's step by step tutorial on module creation with tests and git -- second part
by Discipulus (Canon) on Dec 19, 2018 at 11:16 UTC (#1227455)
day four: the PODist and the coder

step 1) the educated documentation

We get up in the morning and suddenly realize that yesterday we forgot something very important: documentation! Good documentation is like an educated person, while poor documentation is like a boor: which one would you prefer to meet? The same goes for your module's users: they hope and expect to find good documentation, and writing it is our duty. Full stop.

In my limited experience, documentation content can be heavily affected by even small changes in the code or interface, so I generally write the larger part of the docs once the implementation or interface is well shaped. On the other hand, a good approach is to put into the docs every little statement that has been true since the very beginning of your module's development. At the moment we can state that our validate sub accepts both strings and ranges and always returns an array.

At the moment the relevant part of the POD documentation is:
We do not plan to export functions: our sub must be called via its fully qualified name, as we do in the test we created: Range::Validator::validate() so we can delete the EXPORT part and add something in the subroutines part:
We do not need =cut anymore because we no longer have POD blocks interleaved with the code, but a single POD block in the __DATA__ section.

step 2) git status again and commit again

Since we are now very fast with git commands, let's commit this little change; the push to the remote repository can be left for the end of the work session. So status (check it frequently!) and commit:
step 3) more code...

Now it's time to add more checks for the incoming string: we accept neither a lone dot between non-dots nor more than two consecutive dots:
The whole sub now looks like:
step 4) ...means more and more tests

Now it is time to test this behaviour. Edit ./t/01-validate.t, appending to the end of the test file some dies_ok statements preceded by a note:
Run the test via prove
Fine! But.. too much repetition in the test code. Aren't we expected to be DRY (Don't Repeat Yourself)? Yes we are, and since we have been lazy enough to put use Test::More qw(no_plan) we can add a good loop of tests (replace the last two dies_ok with the following code in the test file):
Run the test again:
FAIL? Fortunately we showed the current range in the text generated by the test, so go and examine it:

not ok 7 - expected to die with a lone dot in range [.]

Can you spot why it fails (i.e. it does not die as expected)? Our regex to check for a lone dot is /[^.]+\.{1}[^.]+/ and it reads (as per the YAPE::Regex::Explain output): any character except . (1 or more times, matching the most amount possible), followed by . (1 time), followed by any character except . (1 or more times, matching the most amount possible). The string '.' simply does not satisfy this: there is nothing before or after the dot. So we try replacing both plus signs in the regex with question mark quantifiers: it does not help. As a wise friend explains, we need lookaround: /(?<!\.)\.(?!\.)/ will work! So we change the check in the module as follows:
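To see why the first pattern misses the edge case, the two regexes can be compared side by side. This is only a minimal sketch, not the module's code: the helper names lone_dot_old and lone_dot_new are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this comparison.
# Original check: needs at least one non-dot on BOTH sides of the dot.
sub lone_dot_old { return $_[0] =~ /[^.]+\.{1}[^.]+/ ? 1 : 0 }

# Lookaround check: a dot neither preceded nor followed by another dot.
sub lone_dot_new { return $_[0] =~ /(?<!\.)\.(?!\.)/ ? 1 : 0 }

for my $string ( '.', '.1', '1.', '1.2', '1..3' ) {
    printf "%-6s old=%d new=%d\n", "[$string]",
        lone_dot_old($string), lone_dot_new($string);
}
```

Only the lookaround version flags '.', '.1' and '1.', while both versions flag '1.2' and leave a legal range like '1..3' alone.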
As we spot this edge case we add two similar ones to the test:
The next regex in the module (aimed at finding three dots) does not work either, for the very same reason; change it from /[^.]+\.{3}/ to simply /\.{3}/

The moral? Tests are your friends! We spotted, by chance, an edge case that our code must be able to deal with, so give free rein to your imagination when writing your tests. Cockroaches come from box corners.. oops, no, I mean: bugs come from edge cases.

Now we add some tests to check that we spot three dots, and die, with the new simpler regex /\.{3}/ so we append the following code to the test:
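As a rough illustration of such a loop of tests, the sketch below checks the simpler pattern on its own, using only Test::More. The has_three_dots helper is a hypothetical stand-in for the check inside validate, so the real test file would call the module and use dies_ok instead.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for the check done inside the module.
sub has_three_dots { return $_[0] =~ /\.{3}/ ? 1 : 0 }

note 'strings with three consecutive dots';
foreach my $string ( '...', '1...3', '...3', '1...', '1,5...7' ) {
    ok has_three_dots($string), "three dots spotted in range [$string]";
}

# A regular two-dot range must not be caught by the check.
ok !has_three_dots('1..3'), 'regular range [1..3] left alone';

done_testing;
```

The older pattern /[^.]+\.{3}/ would have missed '...' and '...3', because it requires at least one non-dot character before the dots.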
We run the test:
So now your sub is:
And our test file ./t/01-validate.t is as follows:
step 5) git: a push for two commits

Time to review the status of the local repository, commit the changes and push them online:
Today we committed twice, do you remember? The first time just the POD we added for the sub, and the second time just a few moments ago. We pushed only once. What is really in the online repository now? Go to the online repository, Insights, Network: the last two dots on the line segment are our two commits, pushed together in a single push. Handy, no? Click on the second-last dot and you will see the details of the commit concerning the POD, with the lines we removed in red and the lines we added in green.

Commits are free: committing small changes frequently is better than committing a lot of changes all together.

day five: deeper tests

step 1) more validation in the code

Today we plan to add two new validations to our sub: the first is intended for ranges passed in string form, the second applies to all ranges, just before returning them. The constraint for the string form is about reversed ranges, like 3..1, and it is added just after the last croak we added yesterday:
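One way such a check could look is sketched below. It is an assumption, not the tutorial's own listing: the sub name check_not_reversed and the error text are invented, and in the module the loop would live inside validate rather than in a separate sub.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical stand-alone version of the reversed range check.
sub check_not_reversed {
    my $range = shift;
    # examine every "start..end" pair found in the string
    while ( $range =~ /(\d+)\.\.(\d+)/g ) {
        croak "Reversed range found in [$range]" if $1 > $2;
    }
    return 1;
}

for my $range ( '1..3', '3..1', '1,5..2' ) {
    my $ok = eval { check_not_reversed($range) };
    print "[$range] ", ( $ok ? 'accepted' : 'rejected' ), "\n";
}
```

The eval block in the usage loop only keeps the demonstration running after each croak.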
Now it is important that we get into the habit of committing on our own whenever we add an atomic piece of code: so go and commit! From now on not every git operation will be shown with full output, only the important ones (is this a git guide? No!).

The other validation, applied before returning the range as an array, is about overlapping ranges: (0..2,1) expands to 0,1,2,1, with a nasty repetition that is terrible for the rest of the code outside the present module (this assumption is related to our current, fictional, scenario). So, just before returning from the sub, we simply use a hash to get unique elements in the resulting array:
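The hash technique can be sketched as follows. The sub name unique_sorted is hypothetical and the numeric sort is an assumption based on the later statement that unordered lists are silently reordered.

```perl
use strict;
use warnings;

# Hypothetical helper: in the module these lines would sit just before
# the return statement of validate.
sub unique_sorted {
    my @range  = @_;
    my %seen   = map { $_ => 1 } @range;         # hash keys are unique
    my @unique = sort { $a <=> $b } keys %seen;  # numeric order
    return @unique;
}

my @range = unique_sorted( 0 .. 2, 1 );
print "@range\n";    # prints "0 1 2"
```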
As previously said, commit on your own, with a meaningful comment. You end with:
New features are worth pushing to the online repository: you know how it can be done. Do it.

step 2) a git excursus

Did you follow my small pieces of advice about git commits and meaningful messages? If so, it's time to see why it is better to be diligent: with git log (whose man page is probably longer than this guide..) you can review a lot about previous activity:
This is definitely handy. HEAD is where your activity is focused at this moment. Try removing the --oneline switch to also see the author and date of each commit. As you can see, git is a vast world: explore it to suit your needs. This is not a git guide ;)

step 3) add deeper tests

Until now we have used a limited test arsenal: ok from Test::Simple, use_ok and note from Test::More, and dies_ok from Test::Exception. As we added a croak in our sub to prevent reversed ranges, we can use dies_ok again to check such situations (append the following code to our test file):
Commit at will. Test::More has a lot of useful testing facilities (review the module documentation for inspiration) and now we will use is_deeply to implement some positive tests about expected and returned data structures. This is useful, in our case, to test that overlapping or unordered ranges are returned correctly. To do this we can use a hash of inputs and their expected return values (append the following code to our test file):
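A minimal sketch of a hash-driven is_deeply test follows. To keep it runnable on its own it uses validate_sketch, a simplified stand-in written for this example (string form only, no error checks); the real test file would call Range::Validator::validate, and the inputs chosen here are only examples.

```perl
use strict;
use warnings;
use Test::More;

# Simplified stand-in for the module's sub: NOT its implementation.
sub validate_sketch {
    my $string = shift;
    my %seen;
    for my $part ( split /,/, $string ) {
        my ( $from, $to ) = split /\.\./, $part;
        $to = $from unless defined $to;     # a single number, not a range
        $seen{$_} = 1 for $from .. $to;
    }
    my @range = sort { $a <=> $b } keys %seen;
    return @range;
}

# input string => expected returned list
my %test = (
    '1,1..3'   => [ 1, 2, 3 ],
    '1,2..5,4' => [ 1, 2, 3, 4, 5 ],
    '3,1,2'    => [ 1, 2, 3 ],
    '0..2,1'   => [ 0, 1, 2 ],
);

foreach my $input ( sort keys %test ) {
    my @got = validate_sketch($input);
    is_deeply \@got, $test{$input}, "correct result for range [$input]";
}

done_testing;
```

Keeping inputs and expectations in one data structure means a new case costs one line, with no new test code.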
Last two tests we added will produce the following output:
Commit.

step 4) who is ahead? git branches and log

Just at a glance: it is up to you to explore this topic. Look at the following git session: two commands (one of which we have never used until now) issued just before the push and reissued just after it:
step 5) overall check and release

We just need some small changes and our module will be ready for production. We left part of our code behind, namely the else branch dedicated to incoming arrays:
and we can just fill it with @range = @_; and commit the change. But if the above is true, we have to move all the string checks inside the if ( @_ == 1) {... block! Do it and commit the change. Now our sub is like the following one:
Is our change safe? Well, we have a test suite: prove -l -v will tell you whether the change affects the test suite (if the tests are poor you can never be sure).

Now our module is ready for production. It just lacks some good documentation. Not a big deal, but it is our duty to document what the sub actually does. Add to the POD of our sub:
[...] passed as string will also cause an exception. In both string and list form any duplicate element (overlapped range) will be silently removed. Any form of unordered list will be silently reordered.

Check git status and commit. Use git log HEAD --oneline to see that the local repository is three steps ahead of the remote one. Push the changes to the online repository. Use git log HEAD --oneline again to see what happened.

step 6) test list form

Even if it is simpler, we have to test the array form of our sub. This time we can use an array of tests, each element being another array with two elements: first the list we pass to the sub, then the list we expect back from the sub. Again, is_deeply is a good choice. Add the following test to our file 01-validate.t
Run the test: we have reached the big number of 32 successful tests! Congratulations! As always, commit the change with a meaningful comment and push this important set of changes to the online repository.

day six: testing STDERR

step 1) the problem of empty lists

Our assumption, in "day zero - the plan", was to accept only ordered, non-overlapping lists or their string representations. Other software in the project (again: a fictional scenario), where our validation module is used, blindly passes whatever it receives from outside (many different sources) to our validate sub. Many other subs or methods are then called with the output produced by our sub. All this software (out of our control) assumes that, if an empty list is received, then ALL elements are to be processed. This seemed the right thing to do. After the advent of our module some example usage can be:
Right? The module goes into production and 98% of the errors from the foreign part of the code base disappear. Only 98%? Yes.. Miss A of department Z calls your boss in a berserk state: not all their errors have gone away. They use the list form, but Miss A and developer B are sure no empty lists are passed to your validate sub. You call developer B, a good fellow, who explains to you that the lists are generated from a database field that cannot be empty (NOT NULL constraint in the database):

You - Listen B, if I emit a warning will you be able to trap which list generated from the database provoked it?
B - Sure! Can you add this?
You - Yes, for sure. I can use a variable in the Range::Validator namespace, let's name it warnings: you'll set it to a true value and only you, and not the rest of the company, will see errors on STDERR. Ok?
B - Fantastic! I'll add the variable as soon as you tell me.
You - Ok, but then I want to know which list provoked the error, right? For a coffee?
B - Yeah man, for a coffee, as always.

step 2) adding a Carp to the lake

So we add a line at the top of the module, just after VERSION: our $WARNINGS = 0; to let dev B turn on our warnings. We commit even this small change. Then we add to the sub a carp call, triggered if $WARNINGS == 1 and @_ == 0, and we add this as an elsif condition:
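The shape of that elsif can be sketched as below. The package name Range::Validator::Sketch, the warning text and the placeholder parsing of the string form are all assumptions made for the example; only the $WARNINGS variable and the carp call come from the description above.

```perl
use strict;
use warnings;

{
    package Range::Validator::Sketch;    # hypothetical name, not the real module
    use Carp;

    our $WARNINGS = 0;                   # off unless the caller turns it on

    sub validate {
        my @range;
        if ( @_ == 1 ) {
            # string form: the checks on the string would go here
            @range = split /,/, $_[0];   # placeholder, not the real parsing
        }
        elsif ( $WARNINGS and @_ == 0 ) {
            # carp reports the warning from the caller's perspective
            carp "Empty list passed in";
        }
        else {
            @range = @_;
        }
        return @range;
    }
}

# Developer B opts in, then an empty list produces a warning on STDERR.
$Range::Validator::Sketch::WARNINGS = 1;
my @range = Range::Validator::Sketch::validate();
```

Because the flag is a package variable, a caller can also enable it for a limited scope with local.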
Git status, git commit on your own.

step 3) prepare the fishing rod: add a dependency for our test

To grab STDERR in a test we have to add a dependency on the Capture::Tiny module which is able, with its capture function, to catch STDOUT, STDERR and the results emitted by an external command or a chunk of perl code. A handy and tiny module. Do you remember the place to specify a dependency? Bravo! It is in Makefile.PL, and we did the same in "day three step 3" when we added two modules to the BUILD_REQUIRES hash. Now we add Capture::Tiny to this part (remember to specify the module name as a quoted string):
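For orientation, the relevant part of a Makefile.PL might look roughly like the excerpt below. Apart from Capture::Tiny, the module names and the '0' version requirements are assumptions for the sake of the example, and the WriteMakefile call is left commented out so the snippet can be run without generating a Makefile.

```perl
use strict;
use warnings;

# Excerpt in the shape of a Makefile.PL; only the relevant keys are shown.
my %WriteMakefileArgs = (
    NAME           => 'Range::Validator',
    BUILD_REQUIRES => {
        'Test::More'      => '0',    # assumed entry
        'Test::Exception' => '0',    # assumed entry
        'Capture::Tiny'   => '0',    # new: needed only by the tests
    },
    PREREQ_PM => {
        'Carp' => '0',               # assumed entry
    },
);

# In the real file the hash is then handed to ExtUtils::MakeMaker:
# WriteMakefile(%WriteMakefileArgs);
print join( ', ', sort keys %{ $WriteMakefileArgs{BUILD_REQUIRES} } ), "\n";
```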
Commit this change.

step 4) go fishing the Carp in our test

Now in the 01-validate.t test file we first load the module with use Capture::Tiny qw(capture) and then, at the end, we add some tests of the warning behaviour:
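A sketch of what such tests could look like is given below. It is self-contained, so it defines Range::Validator::Sketch, a hypothetical stand-in with an invented warning text; the real test file would exercise Range::Validator itself. The BEGIN block is an extra precaution of this sketch: it skips everything if Capture::Tiny is not installed.

```perl
use strict;
use warnings;
use Test::More;

# Capture::Tiny is not a core module: skip the whole file if it is missing.
BEGIN {
    eval { require Capture::Tiny; Capture::Tiny->import('capture'); 1 }
        or plan skip_all => 'Capture::Tiny is needed to grab STDERR';
}

{
    package Range::Validator::Sketch;    # hypothetical stand-in for the module
    use Carp;
    our $WARNINGS = 0;

    sub validate {
        carp "Empty list passed in" if $WARNINGS and @_ == 0;
        return @_;
    }
}

note 'warnings on empty lists';
{
    # enable the warnings only inside this block
    local $Range::Validator::Sketch::WARNINGS = 1;
    my ( $stdout, $stderr, @result ) = capture {
        Range::Validator::Sketch::validate();
    };
    like $stderr, qr/^Empty list passed in/, 'warning emitted when the flag is set';
}
{
    my ( $stdout, $stderr, @result ) = capture {
        Range::Validator::Sketch::validate();
    };
    is $stderr, '', 'no warning when the flag is left at 0';
}

done_testing;
```

capture returns STDOUT, STDERR and then whatever the block returned, so the warning text can be matched like any other string.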
Run the test suite, commit this change. Use git log HEAD --oneline to view our progress and push all recent commits to the online repository. The new version goes into production. The good fellow calls you:

B - Hey, we added your warnings..
You - And..?
B - We spotted our errors in the database..
You - And what kind of error?
B - Well.. do you know what perl sees when it spots 1, followed by FOUR dots, followed by 3?
You - Ahh aha ah ah.. unbelievable! Yes, I suppose it parses as: from 1 to .3 aka nothing, aka empty list..
B - Exactly! Can you imagine my boss's face?
You - I don't want to! A coffee is waiting for you. Thanks!

step 5) document the new warning feature

Add some POD to the module documentation; a few lines are better than nothing:
[...] is set to a true value then an empty list passed to validate will provoke a warning from the caller perspective.

Commit this change and update the online repository.

day seven: the module is done but not ready

step 1) sharing

As stated in "day zero - the plan", sharing early is a good principle: it can be worth asking in a forum dedicated to Perl (like perlmonks.org) by posting an RFC (Request For Comments), or using the dedicated website http://prepan.org/ to collect suggestions about your module idea and implementation.

step 2) files in a CPAN distribution

Your module is ready to be used, and it is already used, but at the moment it is neither installable by a CPAN client nor indexable by a CPAN indexer. Read the short but complete description of the possible files at What are the files in a CPAN distribution? The following tests are not needed to install or use your module, but they help you spot what can be wrong in your distribution.

step 3) another kind of test: MANIFEST

In "day one - preparing the ground" we used module-starter to create the base of our module. Under the /t folder the program put three tests we have not seen until now: manifest.t, pod-coverage.t and pod.t. These three tests are here for us and they will help us check that our module distribution is complete. Let's start with the first:
Ok, no test run, just skipped. Look at what is inside the test: it skips all actions unless the RELEASE_TESTING environment variable is set. It will also complain unless a minimal version of Test::CheckManifest is installed. So set this variable in the shell (how to do this depends on your operating system: Linux users probably need export RELEASE_TESTING=1 while Windows users will use set RELEASE_TESTING=1), use your CPAN client to install the required module (normally cpan Test::CheckManifest is all you need) and rerun the test:
Omg?! What is all that output? The test complains about a lot of files that are present in the filesystem, in our module folder, but are not listed in the MANIFEST file. This file contains a list (one per line) of the files contained within the tarball of your module. In the above output we have seen a lot of, if not all, the files under the .git directory. Obviously we do not want them included in our module distribution. How can we skip them? By using a MANIFEST.SKIP file, which basically contains regular expressions describing which files should be excluded from the distribution. So create this file in the main folder of the module and add to it a line with a regex saying we do not want the .git directory: ^\.git\/ Then add this file with git add MANIFEST.SKIP and commit this important change. Rerun the test (some newlines added for readability):
Much better: the test points us to two files we certainly need to include in MANIFEST, namely MANIFEST.SKIP and t/01-validate.t. Go to the MANIFEST file and add them (where appropriate, near similar files, and paying attention to case and paths), then commit the change. If you rerun the above test you'll see that the files added to MANIFEST are no longer present in the failure output.

Let's examine the remaining two files. What is ignore.txt? It was created as the default ignore list by module-starter and it contains many lines of regexes. If we want module-starter to create MANIFEST.SKIP instead, the next time we use it we will specify --ignores='manifest'. For the moment we can delete it. Commit. If you rerun the test you now see only /xt/boilerplate.t and, if you open it, you'll see that it just checks whether you left some default text in the module, text put there by module-starter. Ah! A useful test: let's run it:
Ok no boilerplate garbage left. We can delete this test file and commit the change. Now, finally:
Push recent changes to the online repository.

step 4) another kind of test: POD and POD coverage

In our /t folder we still have two tests we did not run: shame! module-starter created pod.t and pod-coverage.t for us. The first one checks that every POD in our distribution has no errors, and the second ensures that all relevant files in the distribution are appropriately documented in POD. Thanks for this. Run them:
step 5) some README and final review of the work

The README must contain some general information about the module. Users can read this file via a cpan client, so put a minimal description in it. The GitHub website uses it as the default page, so it is useful to have some meaningful text there. Some people generate the text from the POD section of the module. Put in a short description, maybe the synopsis, and commit the change. Push it online. Now we can proudly look at our commit history in --reverse order:
A good look at two dozen commits! We have done a good job, even if with some errors: committing multiple changes in different parts of the project (like in our third commit) is not wise: atomic commits are better. We also have some typos in commit messages..

step 6) try a real CPAN client installation

It's now time to see if our module can be installed by a cpan client. Nothing easier: if you are in the module folder just run cpan . and enjoy the output (note that this command will modify the content of the directory!).

day eight: other module techniques

option one - the bare bone module

This is the option we chose for the above example and, even if it is the least favorable one, we used this form because it is extremely easy. The module is just a container of subs, and all subs are available in the program that uses our module, but only through their fully qualified name, i.e. including the namespace where they are defined: Range::Validator::validate was the syntax we used all over the tutorial. Nothing bad if the above behaviour is all you need.

option two - the Exporter module

If you need more control over what is made available to the end user of your module, the Exporter CORE module will be a better approach. Go and read the module documentation to get an idea of its usage. You can control what to export into the program using your module, so the fully qualified package name will no longer be needed. I suggest you export nothing by default (i.e. leaving @EXPORT empty), using @EXPORT_OK instead to let the end user import subs from your module on explicit request. With Exporter you can also export variables into the program using your module, not only subs. It's up to you to decide if this is the right thing to do. Pay attention to the names you use and the risk of name collision: what will happen if two modules export two functions with the same name?
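The Exporter approach suggested here, with an empty @EXPORT and an explicit @EXPORT_OK, can be sketched in a single file as follows. The package name Range::Validator::Exporting and the body of validate are placeholders invented for the example.

```perl
use strict;
use warnings;

{
    package Range::Validator::Exporting;    # hypothetical name for the sketch
    use Exporter 'import';                  # borrow Exporter's import method

    our @EXPORT    = ();                    # nothing is exported by default
    our @EXPORT_OK = qw(validate);          # available on explicit request

    # placeholder body: only uniqueness and ordering, list form
    sub validate {
        my %seen  = map { $_ => 1 } @_;
        my @range = sort { $a <=> $b } keys %seen;
        return @range;
    }
}

# With the package in its own .pm file this line would be written as
#   use Range::Validator::Exporting qw(validate);
Range::Validator::Exporting->import('validate');

my @range = validate( 3, 1, 2, 1 );    # short name, no package prefix
print "@range\n";                      # prints "1 2 3"
```

Since the importing program names what it wants, a collision with another module's validate becomes a visible, deliberate choice of the end user.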
Perl is not restrictive in any sense of the word: nothing will prevent the end user of your module from calling Your::Module::not_exported_at_all_sub() and accessing its functionality. A fully qualified name will always be available. The end user is then breaking the API you provide, an API where not_exported_at_all_sub is not even mentioned.

option three - the OO module

Preferred by many is the Object Oriented (OO) way. OO is neither better nor worse: it's a matter of aptitude or a matter of needs. See the relevant section of the core documentation, To-OO-or-not-OO?, about the choice. An object is just a little data structure that knows the class (the package) it belongs to. Nothing more complex than this. The data structure is generally a hash, and its consciousness of its class (package) is provided by the bless core function. Your API will just provide a constructor (conventionally new) and a series of methods this object can use. Again: nothing prevents the end user from calling one of your functions by its fully qualified name, as in Your::Module::_not_exported_at_all_sub(): it's just a matter of being polite. The core documentation includes some tutorials about objects:

Many perl authors nowadays use specialized modules to build OO projects, notably the Moose module or its lighter flavor Moo. An OO module has many advantages in particular situations but is generally a bit slower than other module techniques.

advanced Makefile.PL usage

Until now we modified BUILD_REQUIRES to specify dependencies needed while testing our module and PREREQ_PM to include modules needed by our module to actually run. The file format is described in the documentation of ExtUtils::MakeMaker, where it is stated that, since version 6.64, another field is available: TEST_REQUIRES, defined as: "A hash of modules that are needed to test your module but not run or build it".
This is exactly what we need, but it forces us to specify, also in Makefile.PL, that we need 'ExtUtils::MakeMaker' => '6.64' in the CONFIGURE_REQUIRES hash. Version 6.64 of ExtUtils::MakeMaker was released in 2012 but you cannot be sure end users have a modern perl, so we can safely use BUILD_REQUIRES as always, or use some logic to fall back to the "older" functionality if ExtUtils::MakeMaker is too old. You can use the WriteMakefile1 sub used in the Makefile.PL of App::EUMM::Upgrade

other testing modules

In the current tutorial we used Test::Exception to test failures: consider also Test::Fatal. Overkill for simple test cases, but useful for complex ones, is the module Test::Class. Other modules worth a look are in the Task::Kensho list.

advanced testing code

If in your tests you run the risk of code repetition (against the DRY - Don't Repeat Yourself - principle) you may find it handy to have a module used only in your tests, a module under the /t folder. You need some precautions, though. Let's assume you plan a test helper module named testhelper, contained in the /t/testhelper.pm file and used by many of your test files. You don't want CPAN to index your testhelper module and, to prevent that, you can put a no_index directive into your main META.yml or META.json file (or both). You can also use a trick in your package definition, inserting a newline between the package keyword and the name of the package. Like in:
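A small sketch of the newline trick, kept in a single runnable file. The sub valid_ranges is a hypothetical piece of shared test data, invented only to show that the package behaves normally.

```perl
use strict;
use warnings;

package    # the line break keeps the name away from the CPAN indexer
    testhelper;

# Hypothetical shared data for the tests.
sub valid_ranges { return ( '1..3', '1,3..5' ) }

package main;

# Perl itself does not care about the line break: the package works as usual.
print join( '|', testhelper::valid_ranges() ), "\n";    # prints "1..3|1,3..5"
```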
In every test where you want to use that helper module you must:
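One possible arrangement is sketched below; it is an assumption, not the tutorial's own listing. So that it can run anywhere, the script first writes a throwaway t/testhelper.pm into a temporary folder and then loads it at run time. The comments show the two lines a real test file, run from the distribution folder, might contain instead. The sub reversed_ranges is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use File::Spec;

# Build a throwaway distribution folder containing t/testhelper.pm
my $dist = tempdir( CLEANUP => 1 );
make_path( File::Spec->catdir( $dist, 't' ) );
my $file = File::Spec->catfile( $dist, 't', 'testhelper.pm' );

open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} <<'END_OF_HELPER';
package    # on two lines, to stay out of the CPAN index
    testhelper;
use strict;
use warnings;
sub reversed_ranges { return ( '3..1', '1,9..2' ) }    # hypothetical shared data
1;
END_OF_HELPER
close $fh or die "Cannot close $file: $!";

# A real test file, run from the distribution folder, could simply say:
#   use lib '.';
#   use t::testhelper;
# Here the same is done at run time against the temporary folder.
unshift @INC, $dist;
require t::testhelper;

print join( ' ', testhelper::reversed_ranges() ), "\n";    # prints "3..1 1,9..2"
```

The explicit search path matters because, since perl 5.26, the current directory is no longer part of @INC by default.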
bibliography

CORE documentation about modules
CORE documentation about testing
further readings about modules
further readings about testing
acknowledgements

As with all my works, the present tutorial would not have been possible without the help of the perlmonks.org community. This is not an exhaustive list, but I want to thank: Corion, choroba, Tux, 1nickt, marto, hippo, haukex, Eily, eyepopslikeamosquito, davido and kschwab (to be updated ;)