http://qs321.pair.com?node_id=11118570

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have to work with old Perl code grown organically, written by others over the period of time, with 1000s of lines, and sometime with single function over 100 lines. I often think, to convert it into organized fashion with small subroutines and clear logical flows, so that I can maintain it better going forward. I am afraid to compromise business logic. What solutions are recommended ? Thanks.

Replies are listed 'Best First'.
Re: Working with old code
by Your Mother (Archbishop) on Jun 26, 2020 at 17:04 UTC

    Write tests for every single part of the business in:out you can possibly figure out. Perl testing, even system testing, is extremely easy and flexible once you get a feel for it. Once you have a deep test suite, you can refactor with confidence and speed because you will know immediately if you broke something or changed behavior.

      ++ ... and run coverage (Devel::Cover) to prove the tests hit what you expect them to hit.
Re: Working with old code
by haj (Vicar) on Jun 26, 2020 at 19:02 UTC

    You are not alone. Throughout my business life I have worked with old code written by someone else (To be honest, I've hardly ever written new code).

    As others have recommended, a good set of tests is absolutely necessary for such a refactoring. Old code is often battle hardened: It covers some edge cases in ways which are hard to understand because back then it seemed to be the easiest way to fix a long forgotten bug. Probably you'll need to write a lot of tests before even writing the first line of your new code.

    I also recommend to refactor only when you have a concrete goal in mind: Some new feature, better security, whatever. Don't write the feature right now, but think about how you would like to integrate the new feature, if only the old code would allow it. Then refactor the relevant parts of the old code, still without changing its behavior - and only when you're done with that, add the new feature. Or, in other words: Don't refactor just for the fun of refactoring. Subroutines with more than 100 lines aren't immoral.

    The details depend on the ecosystem of the code:

    • Does the code have a test suite? Unit tests and/or application tests? Good. Still, add your own tests. This will go a long way to understand the old code. The test coverage can also be helpful to decide where to start.
    • Do you have a source code repository? Good. Does it contain good commit messages? Even better. Reading those together with the old code can help to avoid re-introducing bugs, because that's really embarrassing. If your code dosn't have a source code repository, start one. Also, grow the habit to write good commit messages: If you are new to the code, you might want to revisit your first decisions when you're more familiar with it.

    So, to sum it up, my recommendation is: Don't focus on the code alone. Improve the ecosystem as well: Documentation, tests, tools. If the old code is still relevant (why else would you have to work with it?), then there are chances that you need to pass it to someone else in the future.

Re: Working with old code
by Tux (Canon) on Jun 26, 2020 at 19:32 UTC

    I don't get what people dislike about a 100+ line method/function/subroutine if it has no reason to be split and it the chuncks in the sub have been well separated and commented. There are ereasons enough to keep something that is only ever run as one process to keep together. If none of the parts is reused or used by others there is no reason whatsoever to break it apart, which only *complicates* readability.

    Most people that I met that want to break up long sub's seem to come fro a java-enviroment where people are tought that long subs are bad and that even quality checks mark these methods as "bad" as they are too long or too complicated. BULLOCKS!

    I 100% agree with the other comments. Start with tests and documentation.

    If a found bug needs a fic that would warrant a break-up, only then consider to break up in smaller pieces. Really smaller subs are by no means always better to maintain.


    Enjoy, Have FUN! H.Merijn
      I don't get what people dislike about a 100+ line method/function/subroutine if it has no reason to be split and it the chuncks in the sub have been well separated and commented.

      I agree that long subroutines aren't necessarily evil. It depends on the case: You have listed some conditions... which quite often aren't met in old code. Subroutines grow, because, let's just add a small feature here, because of why not, only one source file to be changed, and then there's another edge case to be covered, let's just copy that check from the other file where it works just fine. So in my experience long subroutines are often due to "maintenance with small commits", so to minimize the effort for code/patch review, and not because the subroutines were designed that way.

      BTW: My most recent bugfix was in a 1120-line sub, mostly written in the previous century, and guess what: Now it has 1123 lines. And I'm not going to split it anytime soon.

      I’m on your side, I just want to say I think “one screen” of code for a sub is a decent rule of thumb that catches a lot of problems; especially for coders like me who often write code like prose without always having a plan. It’s the point where I ask “Is this the best way to do this?” but not “Oh, no! Better cut this into parts quick.”

      I would also rather have a LONG sub than 5 private classes, not an exaggeration in my (well, our) code base, or whatever a Java-oriented hacker might use instead. One of the Perl hackers in my group—designed and wrote most of the early code—was Java-oriented and it’s ostensibly “better engineering” but in practice close to zero-reusuability, widely-denormalized because of it, painful, and confusing code.

Re: Working with old code
by eyepopslikeamosquito (Archbishop) on Jun 27, 2020 at 03:05 UTC
Re: Working with old code
by Fletch (Bishop) on Jun 26, 2020 at 19:30 UTC

    Somewhat more aimed at Java-y OO, but Refactoring by Fowler is a good backgrounder for sound approaches.

    But if you don't have tests, make those first. Even if you've only got a minimal set of tests to start with it's better than nothing.

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

Re: Working with old code
by BillKSmith (Monsignor) on Jun 26, 2020 at 20:39 UTC
    It is always hard to know when to retire legacy software. It is almost always faster (cheaper) to shoe-horn in one new feature than it is to start over. However, we often see with hindsight that if we had paid the start-over price three versions ago, we would already have recovered that cost and every future version would have been cheaper.

    We only see a cost benefit in a redesign if we really do make a significant improvement and if enough future changes benefit from it that we recover the investment. On the first point, our new designs are seldom as good as we expect. Customers seem to find new requirements that do not quite fit with our brilliant designs. On the other hand, the need for any program usually lasts much longer than we expect. In summary, you must predict the number and nature of future of changes. This seems impossible, but remember that your best guess is almost certainly better than not even trying.

    Bill
Re: Working with old code
by Marshall (Canon) on Jun 29, 2020 at 09:40 UTC
    A 100 line function is not wrong on the surface -that's just a page and a half or maybe 2 pages of code.
    Your first step should be to understand the existing code.
    I don't see how you can refactor the existing code if you do not understand how the existing code works.
Re: Working with old code
by perlfan (Vicar) on Jun 27, 2020 at 07:01 UTC
    modulinos provide a really nice incremental path to modularization with some immediate benefits of making your scripts testable. So that'd be my recommendation. Start small, break out as time and risk permits. I have converted many scripts this way.
Re: Working with old code
by karlgoethebier (Abbot) on Jun 27, 2020 at 09:07 UTC

    If I were in your shoes I wouldn’t hesitate and rewrite the stuff. Remember: «Better an end with terror than terror without an end». A nice approach would be to use Class::Tiny and Role::Tiny. See also Class::Method::Modifiers. Especially around mighty be interesting.

    «The Crux of the Biscuit is the Apostrophe»

    perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'Help

      I wouldn’t hesitate and rewrite the stuff

      Though it might feel more enjoyable, do you have a sound reason to believe you'll do a better job than the original author/s? Are you smarter? Do you have more time? Do you understand the business better? Have the business constraints changed?

      Some cautionary advice on the perils of rewriting from scratch:

      Netscape 6.0 is finally going into its first public beta. There never was a version 5.0. The last major release, version 4.0, was released almost three years ago. Three years is an awfully long time in the Internet world. During this time, Netscape sat by, helplessly, as their market share plummeted. It's a bit smarmy of me to criticize them for waiting so long between releases. They didn't do it on purpose, now, did they? Well, yes. They did. They did it by making the single worst strategic mistake that any software company can make: They decided to rewrite the code from scratch.

      It's important to remember that when you start from scratch there is absolutely no reason to believe that you are going to do a better job than you did the first time. First of all, you probably don't even have the same programming team that worked on version one, so you don't actually have "more experience". You're just going to make most of the old mistakes again, and introduce some new problems that weren't in the original version.

      -- Joel Spolsky on not Rewriting

      Now the two teams are in a race. The tiger team must build a new system that does everything that the old system does. Not only that, they have to keep up with the changes that are continuously being made to the old system. Management will not replace the old system until the new system can do everything that the old system does. This race can go on for a very long time. I've seen it take 10 years. And by the time it's done, the original members of the tiger team are long gone, and the current members are demanding that the new system be redesigned because it's such a mess.

      -- Robert C Martin in Clean Code (p.5)

      More detail on this topic can be found at: Nobody Expects the Agile Imposition (Part VI): Architecture

        «...a sound reason...»

        Sure. Consider this true story: Once upon a time a young C programmer needed a little script. In his hubris - a real Dr Wisenheimer - he wrote it in an obsolete Perl 4 style. Everything self made. No libs etc. Some years later a bash programmer (a real Perl hater) inherited the script. At this point the little script was already swollen to a monster of many hundred lines of code. The basher left the company and a real Java fan boy inherited it. He added tons of features and libs. Now this whole crap is by far > 1000 lines and really stinks. It is mission-critical. If it fails: Airplanes grounded, 100 printers down, VPNs broken, monitoring runs amok, what ever desaster you can imagine. And a poor 24/7 admin needed to restart the stuff every time it crashed. Sure, this is prose. A short summary of my experience. The statement is: Sometimes it’s better to make a hard break. Who will be killed if such crap fails? The poor guy which inherited it. And: If you want to maintain or extend such stuff you need to understand it’s logic. BTW, my second project ever was a complete rewrite of the crap I wrote in my first project. After I «discovered» how it really should be done.

        Best regards, Karl

        «The Crux of the Biscuit is the Apostrophe»

        perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'Help

      To rewrite the stuff is an advice which is often given in third person. Seriously, if the OP has doubts about the business logic, then rewriting is a bad advice, regardless of how funky an object model (or modulinos) might be.

      Rewriting is fine if you can afford to do it for fun, or if you have a sponsor/employer who pays you for exactly that job. Otherwise: just don't do it.

      >rewrite the stuff

      This would be nice, but is often not possible when dealing with tons legacy code or thousands (or millions) of LoC. There are many reasons for this: time, risk, money - but most importantly, business logic - much of which is baked into legacy code. Besides, once you move to modulinos (my suggestion) then progress to full modules, it's then much more convenient to refactor in a more modern way as you suggest. Consider the number of man-hours put into the "working" code - it will take even more time to replace the code due to the requirement to necessarily understand the code (even if you wrote it) and replicate all the assumptions both inside of the code and by anything using the code as a black box. It's super hard. Personally, I aways recommend against rewriting or fully replacing code, especially replacing Perl with another, inferior language xD.