PerlMonks  

Prefer Pure Perl Core Modules

by Leitz (Scribe)
on Jul 13, 2021 at 14:03 UTC (#11134957, perlmeditation)

"I prefer to use pure Perl core modules instead of depending on the CPAN."

Saying that on IRC usually causes a flurry of negative comments. I agree that the CPAN is resource intense, and I've put stuff there. If you use lots of code from the CPAN, I'm not going to make fun of you. However, I would ask that you give me the same respect. Here's why I choose this path.

1. Compatibility: Pure Perl modules are portable; the target node doesn't need a set of compiler tools. While XS-based modules might improve performance, not all nodes have compiler tools. Some nodes are precluded from having compilers based on resources or security mandates. Depending on a module that may not be installable creates a production risk.

2. Idempotence: Well-engineered software can be installed and removed cleanly. When I worked for a telco, we would install, remove, and then reinstall software during QA. If it failed at any of those, it failed. Period. There is no "cpan" command to uninstall a module. The "cpanm" -U option for "uninstall" is marked "EXPERIMENTAL". The few times I've tried to use it to uninstall modules that were installed via "cpan", it could not find the modules, even when I told the command where the module was. If a module cannot be installed and removed cleanly, then it does not belong on a production system.

3. Upgradeability: The "cpan" -u option (upgrade) comes with the warning: "Blindly doing this can really break things, so keep a backup." This conflicts with the concept of keeping software current to reduce security vulnerabilities and bugs. If a system's software cannot be cleanly upgraded, it should not be in production.

4. Security (see Addendum below): One of Perl's common object-oriented modules is Moose. Installing Moose adds roughly 900 modules to the node. Who is security checking all those dependencies? Who wants to explain each and every one of those modules to a security auditor? In truth, how many of us could explain the risks and benefits of all nine hundred dependencies? And are we being paid to check someone else's code, or are we paid to keep a production system running?

5. Immiscibility: Most Linux distributions require some version of Perl for operation. This sounds good for Perl, until you realize that the versions are often very out of date. If you want to use a semi-recent Perl, you usually have to compile your own and install it somewhere. You also have to install any CPAN modules separately, which means your backups are now taking longer and you have more to sift through when trying to make space. And, of course, anyone who wants to use your code has to concoct the same environment.
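
For point 2, one way to see what a clean removal would have to undo is to read the ".packlist" file that the installer records for a module. The following is only a minimal sketch: it assumes the module was installed in a way that wrote a packlist (OS packages often do not), "installed_files" is a hypothetical helper name, and the module name is taken from the command line with YAML::Tiny as an arbitrary default. ExtUtils::Installed is a core module.

```perl
use strict;
use warnings;
use ExtUtils::Installed;

# Hypothetical helper: return the files recorded in a module's .packlist,
# or an empty list if no packlist is known for that module.
sub installed_files {
    my ($module) = @_;
    my $inst = ExtUtils::Installed->new;
    # files() croaks for unknown modules, so trap that and return nothing.
    my @files = eval { $inst->files($module) };
    return @files;
}

# Module name from the command line; the default is only an example.
my $module = @ARGV ? $ARGV[0] : 'YAML::Tiny';
my @files  = installed_files($module);
printf "%s: %d files recorded in its packlist\n", $module, scalar @files;
```

If the count is zero, there is no packlist to work from, which is roughly the situation described above where the uninstall could not find the module.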

Solution

My personal solution is to use pure Perl core modules or pure Perl CPAN modules that do not have a large dependency list. Large, in this sense, is "Am I willing to deal with these dependencies manually?" At some point in time I hope to be Perl-smart enough to help improve CPAN, but I'm not there yet.
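
As a rough illustration of how I check a dependency list against core, here is a minimal sketch using Module::CoreList (itself a core module). The helper name "non_core_modules" and the module list are made up for the example. Note the limits: this only says whether a module has ever shipped with some perl release, not whether it is pure Perl, and not whether the perl on the target node is new enough to include it.

```perl
use strict;
use warnings;
use Module::CoreList;

# Hypothetical helper: return the modules that have never shipped
# with any perl release known to Module::CoreList.
sub non_core_modules {
    my @modules = @_;
    return grep { !defined Module::CoreList->first_release($_) } @modules;
}

# Example dependency list; the names are only for illustration.
my @wanted   = qw(File::Spec Data::Dumper List::Util Moose);
my @non_core = non_core_modules(@wanted);

print "Not in core: @non_core\n";
```

Anything reported here is a module I would have to install, package, or replace myself.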

Addendum

An earlier version of this page referenced YAML::Tiny in the security section. Investigating, based on hippo's comment (below) about "suggested_options" ('suggests_policy' in MyConfig.pm) removed YAML::Tiny as a culprit. YAML::Tiny has no non-core module dependencies.

Chronicler: The Domici War (domiciwar.net)

General Ne'er-do-well (github.com/LeamHall)

Replies are listed 'Best First'.
Re: Prefer Pure Perl Core Modules
by eyepopslikeamosquito (Bishop) on Jul 14, 2021 at 01:13 UTC

      Yes and no; mostly yes. I'm building an application to build my Perl skills, and one of the design goals is reduced complexity so someone else could use it. Building the app is pushing me to evaluate the trade-offs and giving me a chance to implement the good practices I've been exposed to for the past few decades. All five points are based on failures I have dealt with.

      There are a lot of development teams and environments, and a lot of different assumptions and expectations. For me, not depending on a lot of CPAN modules really works well. For others CPAN is quite useful.

      Chronicler: The Domici War (domiciwar.net)

      General Ne'er-do-well (github.com/LeamHall)

        «…I'm building an application to build my Perl skills…»

        A strange approach, with all respect, if I understood it right. I built applications because I was in it for the money. Becoming better was rather a side effect, and a question of my professional ethics and my technical interests. Customers don't really care about this. Best regards, Karl

        «The Crux of the Biscuit is the Apostrophe»

Re: Prefer Pure Perl Core Modules
by stevieb (Canon) on Jul 13, 2021 at 14:27 UTC

    Are you talking from the perspective of writing CPAN distributions, or simply a user of the Perl language writing scripts for your own use?

    For the former, I definitely try to minimize my use of external dependencies where possible and practical. As a user though, I don't care whatsoever. I use whatever is available that makes it easier to get the results I want, dependably, from the script I'm writing. I don't care if things have to be compiled or not. Several of my own CPAN distributions are C/C++/XS based, for various reasons (speed is but a single one of them).

    With the likes of perlbrew/berrybrew, adding/removing/changing perls and modules is trivial, and has no effect on system components whatsoever. At least with berrybrew (unsure about perlbrew), you can simply copy a full-blown Perl installation to another system by copying a folder, and voila, it works just dandy with no recompilation needed at all (in fact, that's my next update to the software... an import/export feature to do exactly that, instead of the manual process it is now).

      Good point! If I'm going to share the code then I try to minimize the dependencies as much as possible. Because I try to keep my "production" and "personal" habits unified (mostly due to lack of brains), I will often minimize dependencies in all my code.

      Chronicler: The Domici War (domiciwar.net)

      General Ne'er-do-well (github.com/LeamHall)

Re: Prefer Pure Perl Core Modules
by hippo (Bishop) on Jul 13, 2021 at 15:23 UTC
    > I like using YAML, and thought YAML::Tiny would be a light-weight option. Sadly, it required more than a hundred dependencies to be installed. ... My YAML::Tiny example has been brought under question, so I will replicate that and share the data.

    That will be interesting to read. YAML::Tiny has no non-core dependencies. Perhaps you accidentally installed all the suggested options?


    🦛

      Yup! Thanks for pointing that out! I've edited MyConfig.pm and will re-do the test with suggests_policy and recommends_policy at 0.

      Do you feel the 'large number of dependencies' point is valid for other distros?

      Chronicler: The Domici War (domiciwar.net)

      General Ne'er-do-well (github.com/LeamHall)

        I prefer to avoid large numbers of dependencies but that comes from an efficiency standpoint rather than a security one. For a persistent process, more dependencies usually means more RAM which could often be put to better use. For a non-persistent process, more dependencies usually means slower start-up time and that's not good either. It's not cast in stone - sometimes the trade-off is worth it.
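
A quick way to get a feel for this is to count how many files a single "use" pulls in. The sketch below is only an illustration: "files_loaded_by" is a hypothetical helper, File::Temp is an arbitrary core module chosen as the example, and the count depends on what the program has already loaded. It measures the number of files, not RAM or start-up time directly.

```perl
use strict;
use warnings;

# Hypothetical helper: load a module and return the files that were
# added to %INC as a result (the module itself plus what it pulled in).
sub files_loaded_by {
    my ($module) = @_;
    my %before = map { $_ => 1 } keys %INC;

    # Turn Some::Module into Some/Module.pm, the form require() expects.
    (my $file = "$module.pm") =~ s{::}{/}g;
    require $file;

    return grep { !$before{$_} } sort keys %INC;
}

my @new = files_loaded_by('File::Temp');
printf "File::Temp loaded %d additional files\n", scalar @new;
print "  $_\n" for @new;
```

Running the same check against a module with a deep dependency tree makes the trade-off easy to see.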


        🦛

Re: Prefer Pure Perl Core Modules
by Fletch (Bishop) on Jul 13, 2021 at 15:04 UTC

    WRT your item 5: If you're doing anything serious with Perl you DO NOT want to use the OS' perl as that way lies much pain. Doing so couples you tightly to the OS' upgrade schedule for both the language and (if you're using its package manager for them) CPAN modules. If you're in some sort of (e.g.) shared hosting environment where you don't have tight control over the OS and what's installed, it's probably just a matter of time until something "upgrades" underneath you and you're stuck either pleading for a rollback or trying to patch your code to work around some issue. (Edit: Even if you do have some degree of control you can still shoot yourself when you find your OS upgrade installed a breaking change; speaking from experience here as I learned this principle the hard way . . .)

    You should always whip up your own "application perl" (by hand or with perlbrew) and use that to run your code and probably even use something like local::lib or Carton to maintain your CPAN dependencies.

    (Granted, if your "OS'" perl is from a container context you could possibly relax this requirement because you're in effect baking up your own controlled "separate" perl install; that's a bit different in the details but similar in spirit.)

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

      Excellent answer.

      > If you're doing anything serious with Perl you DO NOT want to use the OS' perl as that way lies much pain

      Strongly agree!

      Unfortunately in Re^4: Rediscovering Hubris (in January) the OP stated:

      > I also failed to communicate something that impacts what you wrote. I work on system tools that rely on the customer's OS version of Perl. We don't embed Perl into the tool, and the customer base tends to use older OS versions.

      You should also use the "shared hosting environment" strategy of having your own application-dependent directory listed first in PERL5LIB, installing your own copy of every package that your application needs, so that you do not have unmapped dependencies on the OS-installed versions of packages.
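
A minimal sketch of that strategy from inside the script itself, assuming the application's private modules live in "local/lib/perl5" next to the script (the layout that Carton and "cpanm -L local" produce; the path is an assumption for this example). "use lib" prepends to @INC, so the private copies are found before the OS-installed ones. Setting PERL5LIB to the same directory has a similar effect without editing the script.

```perl
use strict;
use warnings;
use FindBin qw($Bin);

# Assumed layout: private modules are installed under local/lib/perl5
# in the same directory as this script. "use lib" puts that directory
# ahead of the OS-installed module directories in @INC.
use lib "$Bin/local/lib/perl5";

print "Module search order:\n";
print "  $_\n" for @INC;
```
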
Re: Prefer Pure Perl Core Modules
by LanX (Sage) on Jul 14, 2021 at 14:51 UTC
    I have trouble understanding your post, since you mix notions about "core" and "CPAN".

    "Core modules" are bundled with your Perl distribution; it doesn't really matter if they have XS dependencies since Perl itself is compiled too.

    They are not installed via CPAN (only if you desperately need an update and wanna risk inconsistencies) and they are maintained by P5P.

    see also

    Though, surprisingly, "core" is not listed in perlglossary.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

      My apologies, I'm not sure where the confusion comes from. I thought I made it clear that there were Perl core modules, and Perl CPAN modules. The other difference is "pure Perl", which means no XS. A Perl core module can be "pure Perl", or include C code. However, if the target node has a version of Perl compiled for that OS and architecture combination, then they should have what they need. If a CPAN module needs to be compiled then the target node needs to have a compiler toolset available. Some nodes cannot do that.

      Does that help?

      Chronicler: The Domici War (domiciwar.net)

      General Ne'er-do-well (github.com/LeamHall)

Re: Prefer Pure Perl Core Modules
by eyepopslikeamosquito (Bishop) on Jul 15, 2021 at 00:58 UTC

    I'm still not 100% clear on just how extreme your position is on this. A concrete example might clarify.

    Suppose your boss asked you to write a Perl script to read a large and complex XML stream from an important client and update a number of different relational databases with the extracted information. How would you tackle this task? Would you at least consider installing, say, XML::LibXML and DBI from the CPAN?

    See also a couple of old nodes on this topic:

      Excellent question! Hopefully my answer will be of the same quality.

      The decision requires a lot of data, and a good bit of thought. Is this an existing client with an established base of Perl software and skill? Are we installing our own node to do the work, or will the client install and support on their nodes? Do they have a relatively homogeneous environment or will the code need to run on different operating systems and CPU architectures? Do they have the ability (internet access, compilers) and skill to install from CPAN or do we need to package the dependencies with our application? If we package, how many OS/Arch combinations do we support, and do we have access to those combinations to build and test on? Is this complex enough to require compiling Perl and bundling that as well?

      I would probably do a Proof of Concept on an isolated VM, with compiled Perl and included CPAN modules. When we present to the boss, and the client, we discuss what support is needed, how upgrades are done, and who owns what responsibilities. If the resource consumption for maintenance via CPAN and compiled Perl is higher than the boss or the client want to commit to then alternatives can be discussed. Maybe we provide a VM for upgrades, or we use an older Perl that their OS vendor provides. Or we look at the subset of critical activities and evaluate how difficult they would be to replicate in-house or to contract out for.

      Chronicler: The Domici War (domiciwar.net)

      General Ne'er-do-well (github.com/LeamHall)

Re: Prefer Pure Perl Core Modules
by hrcerq (Scribe) on Jul 14, 2021 at 00:30 UTC

    Coincidentally I started a thread yesterday on a related subject. Until now, I've relied exclusively on operating system packages, because CPAN's integrity and authenticity checks are disabled by default and depend on the programmer having signed the packages (which is not always the case).

    System packages, on the other hand, are always checked, can be uninstalled without surprising side-effects, and are more likely to be system-compatible. Unfortunately, using exclusively system packages means not getting many nice contributions available in CPAN repos.

    Usually, I tend to prefer security, efficiency and robustness over getting things done and working faster, though I recognize this is often painfully difficult to put into practice.

    So, CPAN is a great help, but I find it a shame that such problems still weren't addressed. I'm hoping they will be, before I actually depend on CPAN for anything more than getting to know it. I'd be glad to help improve it, but right now, I'm clearly unprepared for that.

    return on_success() or die;

      Good thread and great comments! You're right, improving the security around CPAN would help and it would be labor intense. CPAN offers a lot to Perl users, but there are significant risks.

      Take Moose/Moo for example. Once you know how to build a Perl object without them, you can see the advantages they bring. Then you have to think through the trade-offs; is the benefit given worth the personal effort required? Once you go outside of OS-provided packages you have to keep OS and CPAN installed modules separate, and that can become a real challenge. If your application needs a lot of CPAN you probably want to look at compiling your own Perl and completely moving away from the OS-provided packages. My desktop provides Perl 5.32 and I install CPAN modules to /usr/local. It will cause a problem if I use a lot of CPAN, so I avoid that. Other places I provide code for are limited to a much older Perl, and I have to package any module my application uses. Thus I use few. Think about how much you want to share your code, and what someone would have to do to use it. The easier you can make it for your user, the higher the chance of your code being used.

      You can also look at virtualenv and carton; they may give you what you need. I do not use them because they add more administrative overhead than I want to deal with.

      Chronicler: The Domici War (domiciwar.net)

      General Ne'er-do-well (github.com/LeamHall)

        > You can also look at virtualenv and carton; they may give you what you need.

        Maybe you mean plenv instead of virtualenv? (AFAIK virtualenv is used for python projects)

        Anyway, I think plenv helps in the sense that I can create application-specific repositories, instead of mixing dependencies for a lot of different applications, which could easily become a mess. From a security perspective, however, it remains the same situation, unfortunately.

        return on_success() or die;

Re: Prefer Pure Perl Core Modules
by mr_mischief (Monsignor) on Jul 29, 2021 at 19:37 UTC

    In response to points 1, 2, 3, 4, AND 5, there's an array of solutions any of which solve or partially solve all five of them.

    If you want to ensure compatible dependencies, build and test an image/package (in tar, zip, Docker, kvm disk image, RPM, APT/deb, flatpak, perlbrew, Carton, or whatever) and ship the image/package to production as a first-class semantically versioned entity.

    If you want idempotence of installations, make the image or package removable and reinstallable as a whole.

    If you want consistent upgradability, upgrade things in your build system and ship an updated, versioned image or package.

    If you want to know your security posture and be consistent with it, test, evaluate, inspect, and otherwise poke at your dependencies before or during build before shipping them to production. Yes, at this point fewer, smaller dependencies can be a boon if you don't trust upstream very much. You might find a bug in upstream code, be it security-related or not. But better that than shipping new bugs to production without looking.

    Don't depend on the OS's perl for your Perl application. Re-vendor your dependencies. Specifically, put your application and its dependencies, again, in an image or package for self-contained, repeatable installations. You can use perlbrew or another tool for this. We're currently evaluating replacing our own RPM builds of perl and modules with just throwing a perlbrew with all our needed modules into a repo or container. In any case, we ship our own perl and our own copy of all relevant modules.

    You don't necessarily need to go hog wild with high-end orchestration systems like Kubernetes, but the modern DevOps/DevSecOps/GitOps/application deployment movement has solved many problems associated with long-lived hand-rolled server systems. They mostly solve that with the very simple philosophy of "well, don't do that then". Just don't have unmanaged systems, manually upgraded systems, or systems with periodic configuration management runs in production when you're building new applications. Just don't, and your problems 1 through 5 are largely solved, mitigated, or simplified.

Re: Prefer Pure Perl Core Modules
by Anonymous Monk on Jul 15, 2021 at 13:23 UTC
    Some CPAN modules don't have a "pure Perl" equivalent that actually does the same thing. Some, like XML::LibXML, are simply "wrappers" for binary libraries. (In this case, one that is an accepted industry standard for managing XML, and therefore quite likely to be the same one that produced the file that your program is now consuming.)

Node Type: perlmeditation [id://11134957]
Approved by marto