http://qs321.pair.com?node_id=506511

My job consists of writing web applications: perl programs running under Apache::Registry that call other perl modules I've written. Over the past years my development process has improved to include some elements of a "professional" process, such as testing, a testing server, source control, documentation, and coding style guides.

Over this time I've noticed a dearth of documentation on how people do their work in perl/on the web. Most established procedures follow the old C model:

Personal machine: check out, edit, compile, run the tests, check in.

Dev server: check out the integrated code, build it, and run the full test suite.

Production server: install the tested release.

Which doesn't meld at all with the way I've been doing web development. My questions:

• How do you compensate for templates, template plugins (I use Template Toolkit (Template)), application modules, instance scripts (I use CGI::Application) and non-application modules being inter-dependent?
• How do you write tests that depend on your webserver config (and its environment) and package them with your module?
• How do you get your personal machine to emulate the environment of multiple machines?
• How do you move material to your test server, and from there to production server?
• How do you do some of these things in a multi-user environment? (Is the test server a "check out"? Who checks it out?)

I know these questions have been answered by others wiser than I...but I'm not finding the answers. The material on perl testing I've seen is all based on testing stand-alone modules, not web apps. When I asked these questions of the gurus at YAPC, I got some blank looks and statements that the "normal" process works fine for their apps, so maybe I just have a contorted version of what is "normal".

I have lots of theoretical scenarios, but I'm curious to see what people are REALLY doing in their environments, because I don't have ample time to retool for, and debug, a theoretical one.

Help appreciated.

Edit: g0n - moved from SoPW to Meditations

Re: What is YOUR Development Process?
by dragonchild (Archbishop) on Nov 08, 2005 at 02:30 UTC
    I do web apps. My ideal world:
    1. Everything is under source control. I use Subversion, and recommend it highly.
    2. Everything has an automated test that, at the bare minimum, verifies that it will compile (a minimal compile-test sketch follows this list). Preferably, everything is tested to at least 95% coverage, with 99% being the goal.
    3. Everything is developed and tested in the developer's environment. This may be a separate machine or it may be a separate port on the same machine. I've done both and there's no benefit vis-a-vis development process. If you check something in, you're asserting that you have done at least some manner of testing. In other words, HEAD will, at the very least, compile.
    4. The full test suite runs whenever a checkin occurs (ideal) and/or nightly on some smoke server (good). Best is both.
    5. Did I mention that EVERYTHING is under source control? This includes anything that your application might need in order to run.
    6. Any external items (CPAN modules, system commands, etc.) and their versions (min and max) are marked as dependencies somewhere in the installation script. It is so annoying when you try to install something and it breaks because of a forgotten dependency. (The automake19 port on FreeBSD 5.3 has this problem.)
    7. Everything that happens in the application is logged. Every request, who made the request and from where, how long it took, and if there were any errors. If you have the space (and you probably do), log the response as well. Tracing errors is a lot easier when you can see exactly what your app sent back in response to what.
    8. If an error occurs on prod, it should, at the very least, email someone. Preferably, someone gets paged. Customers love it when you call them 15 minutes after the site says "I can't let you do that, Dave.", apologize for the inconvenience, and inform them of both a workaround and that the problem has been logged as a bug. If you have a pager, rotate it. That way, everyone gets experience.
    9. If you don't have code reviews, write each other's tests. You have to, at the very least, have working familiarity with the APIs that your colleagues are working on. Otherwise, how can you possibly suggest improvements at the time when it's cheapest to make those changes?
    10. The dev machine/dev port is for developer integration testing. It's also useful for demonstrating to management what the new, unfinished feature will look like.
    11. There is a completely separate environment called "test". Ideally, this is on a separate machine in order to test the upgrade protocol, but it doesn't have to be. This is where the testers test and, most likely, where user acceptance testing (UAT) will occur.
    12. Every modification to the application has a change request associated with it. This means features, enhancements, bugs, and retirements. They are all change requests and need to be identified. All checkins (ideally) are for a single change request and should be marked with the request ID.
    13. When doing an upgrade, every action is scripted. This means that if a table has to change, the ALTER TABLE statement(s) are in a script, preferably identified by the change request ID that it is handling. If the upgrade is complex, there should be a single master script that will call all the other scripts to do the work. These scripted actions are how you move changes from dev to test.
    14. There is a prod environment. This will probably be multiple machines. No-one is ever logged into prod. Ever.
    15. There is one single person responsible for any given install. Ideally, each person gets to handle a prod install. That person is the only person allowed to "pull the trigger" on the install. In other words, they are the only person logged into that machine during the install window. And, this is the only exception to the prior rule.
    16. There is a maintenance window on prod where the client understands installations may be occurring. Unless it is a crash-bug, this should be the only time changes are made to prod. Period.
    17. Let me repeat - Prod is sacrosanct. If you need to find something, look in the logs. That's what they're for. If you can't find it there, log it for next time. Every time you touch prod, YOU WILL BREAK IT.
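
    A minimal sketch of the bare-minimum compile test from point 2, assuming modules live under lib/ (the discovery code here is illustrative, not dragonchild's actual harness):

        use strict;
        use warnings;
        use lib 'lib';
        use Test::More;
        use File::Find;

        # collect every .pm under lib/ and map file paths to package names
        my @modules;
        find(sub {
            return unless /\.pm\z/;
            my $name = $File::Find::name;
            $name =~ s{^lib/}{};
            $name =~ s{/}{::}g;
            $name =~ s{\.pm\z}{};
            push @modules, $name;
        }, 'lib');

        plan tests => scalar @modules;

        # the bare-minimum assertion: every module at least compiles
        require_ok($_) for @modules;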

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
      "There is a maintenance window on prod where the client understands installations may be occurring. Unless it is a crash-bug, this should be the only time changes are made to prod. Period."

      At my workplace each machine on the farm runs two web servers: one with the production code and the other with the final QA code. Come release time, once a week, we switch the network so that the final QA web servers become production, then update the old production web servers with the next QA version in the pipeline.

      No maintenance downtime, and the final QA code is checked on the exact same machines that will serve production.
        Do you also mount /usr/local so that both servers mount from the same physical location? That way, you guarantee that the perl installs are identical ...

      I'm just another anonymous sysadmin, but I wanted to make one comment: there needs to be a way to back-port the prod environment into test (and possibly even into dev). This way you can recover from situations where the upstream environments have become gunked up with oddities that just naturally happen during the dev-test-prod cycle. And this backport from prod to test needs to be either a script or a full backup/restore cycle, to ensure complete fidelity to prod.
Re: What is YOUR Development Process?
by philcrow (Priest) on Nov 07, 2005 at 20:37 UTC
    I too am a web developer (primarily). Our shop uses mod_perl (currently for Apache 1, but we've abstracted that out and could use Apache 2).

    We have a development/test server. Our svn lives on another box; this gives us about three backups each day (one per developer and one for the svn box).

    We have a standard set of templates which are under version control. Each app can also have its own templates (also under version control). Documentation in the templates and in the corresponding app modules is key, but standardization is more important.

    We have a home-brew application framework (which is open source, but not ready for public viewing just yet; look for announcements in the next couple of months). We also have an app generator with its own little language. It makes the SQL statements to build our database (in Postgres), the httpd.conf section we need to Include in our main httpd.conf, the Class::DBI subclasses which model our data, the Template Toolkit wrapper (for navigation) and the controlling modules (with basic CRUD and stubs for other things). There is a normal test suite for the generator, so we don't retest the code it makes later (except to make sure that it compiles).
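
    As a rough illustration of what such generated code can look like (the names here are invented, not philcrow's actual generator output), a classic Class::DBI model subclass:

        package MyApp::Model::Apartment;
        use strict;
        use warnings;
        use base 'Class::DBI';

        # one connection shared by all generated model classes
        __PACKAGE__->set_db('Main', 'dbi:Pg:dbname=myapp', '', '');
        __PACKAGE__->table('apartment');
        __PACKAGE__->columns(All => qw(id city_id address rent));
        __PACKAGE__->has_a(city_id => 'MyApp::Model::City');

        1;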

    Once the app is generated, each developer checks out code to their home directory and hacks away. When something seems right, we run ./Build test (which for us does compile tests only). Then we restart our personal apache server, whose conf has a use lib pointing at our checked-out source. This keeps us from affecting the other person(s). If it looks right in a browser, we check in (and possibly announce by turning to the other developer and talking).
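
    One way to wire up that per-developer server (a sketch; the path and file name are invented, and philcrow's actual conf may differ):

        # startup.pl, pulled into each developer's httpd.conf via PerlRequire
        use strict;
        use warnings;

        # point this apache instance at this developer's own checkout, so
        # restarting it never disturbs anyone else's running code
        use lib '/home/phil/src/myapp/lib';

        1;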

    If the data model is flawed, we update the description file and regenerate; our generator carefully avoids overwriting code written by hand.

    If an app involves cross box communication, we use a second test machine for the purpose. This also allows us to test how connectivity will happen by controlling which firewalls separate machines.

    Some of these things would be different if there were more than two full-time developers, or if there were more users of our apps (most of which serve employees). For instance, we might automate testing of the finished pages. Now we just eyeball them. Our apps are so similar, and so much of them is generated, that more testing still seems like overkill. Some of our apps haven't had maintenance (neither bug fixes nor feature additions) since I started in March.

    For deployment, we ./Build test, then ./Build dist, then move the tar to a staging directory on the prod box. There are multiple prod boxes, but most apps live on only one (that makes things a lot easier and shows the size of our operation). Then we visit the box with ssh and repeat ./Build, ./Build test, ./Build install.

    Phil

Re: What is YOUR Development Process?
by badaiaqrandista (Pilgrim) on Nov 07, 2005 at 23:34 UTC

    # How do you compensate for templates, template plugins (I use Template Toolkit (Template)), application modules, instance scripts (I use CGI::Application) and non-application modules being inter-dependent?
    I put everything I code in version control. For modules, I depend on debian's packaged perl libraries. If I can't find one, I create a deb file with dh-make-perl and put it in version control. If I want to modify a library's behaviour, I copy the library into version control, modify it, and make sure my app uses it. My 'build' script apt-gets all those perl libraries and installs the deb files I keep with the code.

    # How do you write tests that depend on your webserver config (and its environment) and package them with your module?
    I don't write automated tests :(. But the configuration environment is abstracted into a module. I also create template files for the apache configuration, so on every box I only need to add an 'Include' directive to pull them in.

    # How do you get your personal machine to emulate the environment of multiple machines?
    I don't. I just make sure the code works correctly; all the multiple-machine problems get fixed in production. That's scary, but that's all I've got right now.

    # How do you move material to your test server, and from there to production server?
    svn up?

    # How do you do some of these things in a multi-user environment? (Is the test server a "check out"? Who checks it out?)
    Every instance of the web app is actually an svn checkout directory, so I only need to do 'svn up' to upgrade it to the new version. I don't tag the versions because my environment is very small and informal, with no release management at all. Two major versions can coexist on one box as two checkout directories of different branches, but each has its own apache configuration files.

    My wishlist for my app:

    • Have an automated test
    • Have an installation script
    • Have a code monkey to code for me so I can read perlmonks all day
    Badai
      # How do you move material to your test server, and from there to production server?
      svn up?

      Do you do so on a per-app basis (updating templates, modules, and cgis separately), or on a per-server basis (updating everything on the box at once)?

        I update everything together. I structure my svn repository like this:

        /
        -- lib
           -- modules
           -- shared template includes
        -- app-1
           -- templates
           -- cgis
        -- app-2
           -- templates
           -- cgis
        

        I put modules and shared template includes in lib (I use Mason just as an example) because, very likely, different apps need to share functionality and some general templates (e.g. templates to show currencies).

        The benefit of this structure is that I have everything in one tree, so I don't need to remember which module is needed by which cgi script at which version. The downside is that it takes a lot of space, but space is cheap nowadays.

        Badai
Re: What is YOUR Development Process?
by InfiniteSilence (Curate) on Nov 07, 2005 at 21:01 UTC
    # How do you compensate for templates...

    Put them in a version control system.

    # template plugins ...

    Use a version control system in binary mode.

    # application modules

    Put those in a version control system too

    #instance scripts

    Do you mean configuration files that are different on different servers? Put those in a version control system.

    #and non-application modules being inter-dependent?

    What are non-application modules?

    # How do you write tests that depend on your webserver config (and its environment) and package them with your module?

    Are you sure you want to do that? Why not separate out the configuration stuff that is server/environment-specific? My guess is that a module should be configurable regardless of environment (if possible).

    # How do you get your personal machine to emulate the environment of multiple machines?

    There are virtual machines that can do that (Google: vmware)

    # How do you move material to your test server..

    FTP/RSYNC?

    #... and from there to production server?

    Same...

    # How do you do some of these things in a multi-user environment?

    Describe a non-multi-user-environment.

    #...(Is the test server a "check out"? Who checks it out?)

    Depending on your version control system, you typically tag all of the files you want for a release and then check out everything with that tag. It would be a good idea to put it on a test server at that point. A good person to do this would be a project manager or someone equivalent.

    Celebrate Intellectual Diversity

      Put them in a version control system.

      The question was how to handle the inter-dependence of these items, not how to version them.

      What are non-application modules?

      CGI::Application uses a model where you take the normal if/else ladder analyzing the input and create a module with different methods that get called for different inputs. You create an "instance" script that sets any parameters and calls this module. Non-application modules would be modules that are used by the application module but don't assume I/O via the web (and thus would include My::Module as well as DBI). There's nothing magical about it: it's just a standardization of what people did without a framework.
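
      A minimal sketch of that pattern, with invented names (not my actual code):

          package My::App::Widgets;
          use strict;
          use warnings;
          use base 'CGI::Application';

          # the run modes replace the old if/else ladder on the input
          sub setup {
              my $self = shift;
              $self->start_mode('list');
              $self->run_modes(
                  list => 'show_list',
                  edit => 'show_edit_form',
              );
          }

          sub show_list {
              my $self = shift;
              return 'widget list for ' . $self->param('site_name');
          }

          sub show_edit_form {
              my $self = shift;
              return '<form>...</form>';
          }

          1;

      And one of several "instance" scripts: the same module, parameterized per site:

          #!/usr/bin/perl
          use strict;
          use warnings;
          use My::App::Widgets;

          My::App::Widgets->new(
              PARAMS    => { site_name => 'example.com' },
              TMPL_PATH => '/var/www/example.com/templates',
          )->run();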

      Describe a non-multi-user-environment

      My personal machine, where a check out can be owned by me with no problems, vs the shared machines.

      you typically tag all of the files you want for a release and then checkout everything with that tag.

      If application Foo depends on module Baz, and application Bar depends on a newer version, how do you mark those dependencies? What if the required module is not of your authorship, such as a DBD, and thus not in version control? Do you tag every file in the dependency chain? That could get very long and would put extra effort on the tagging process.


      No offense, but your answers are exactly the kind of material I've been seeing: "use version control", without telling me how to synchronize files of different types; "describe a non-multi-user-environment", when the apparent majority of perl developers are in shops of only one or two people, which is effectively not multi-user development; "rsync the files to your production machines", when I'm worried about making sure all the proper files are copied and tests pass. Your answers may be correct, but they don't actually tell me what I need to be able to implement a real system.
        If I understand you correctly, you're trying to solve a very difficult problem that you don't need to solve.

        The problem that you're trying to solve is how to manage a situation where you have many different components which have cross dependencies and are released on independent schedules. That's a very hard problem, not the least because each component needs to know about what is happening with every other component that it might care about.

        But you don't need to solve that. Put everything in version control and have one release cycle where you release everything. Every time. Now all of your version dependency problems go away. Rather than needing to know all of the combinations that might work together, you need only know that this combination works together. If you set things up carefully, the entire application can live in one source tree, allowing you to have multiple copies on one machine that do not interact with each other. (Configuration modules that set the right path are a good thing.)
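
        A minimal sketch of such a configuration module, assuming each copy of the tree carries its own (the paths and names here are invented):

            package My::Config;
            use strict;
            use warnings;
            use File::Basename qw(dirname);
            use File::Spec;

            # resolve everything relative to wherever this tree was checked
            # out, so several copies can coexist on one machine without
            # interacting with each other
            my $root = File::Spec->rel2abs(
                File::Spec->catdir(dirname(__FILE__), '..')
            );

            sub root         { $root }
            sub template_dir { File::Spec->catdir($root, 'templates') }
            sub log_dir      { File::Spec->catdir($root, 'logs') }

            1;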

        Most people don't do this with modules that aren't under their control, such as a DBD. The solution there is to rely on modularity. Make sure that every production machine has the same versions of everything. If you want to upgrade a key module, make sure that the old and the new work the same as far as your application goes (regression tests are a good thing here), then switch only that module, everywhere. If the part of the API that you're relying on hasn't changed (generally this is true, though you need to test this assumption), then it doesn't matter at the point of rollout whether you're using the old or the new.

        The motto here is "work in reversible and independent changes". That way each change can be tested and rolled out. If anything breaks then you know what it was and can easily roll it back.

        But if you really want to be paranoid, what you can do is have a special subdirectory for external code, and then everything can be in there. Personally I don't like doing that, though, since I've had worse experiences with binary incompatibility between machines than with the moderately careful system administration described above. (For instance, I've been left with no incremental upgrade path between different versions of Linux, which is something that I can't put into source control.)

Re: What is YOUR Development Process?
by neniro (Priest) on Nov 07, 2005 at 21:15 UTC
    The posters above said a lot of useful stuff about technical details. I'll focus on something else:

    First I take paper and pens and start drawing. That's pretty old-school and may sound odd to some of you, but I like to sketch my classes and tables and whatever else on paper, and if I change something I redraw the necessary parts. I need to see what I'm doing (that's also the reason why I like Data::Dumper and GraphViz and ...). If I have to stop my work for quite some time to work on other projects, I just have to look at my papers when I restart working on the project.

      I forgot to mention the drawings. They are sitting right beside me. I have a data model (second draft; the project started last week) and screenshot mock-ups. If object interaction or network communication gets complicated, I'll make a sequence diagram or two (with UML::Sequence). Or I might just make an outline of the proposed call stack/protocol in a text file.

      Phil

      It's certainly "old-school", but then again, so am I. I'm not a professional programmer; it's a pastime for me. However, any time I undertake a non-trivial project, I get out a small journaling book to write in, and first work out the general scheme: diagrams, flow charts, and, if using a GUI, a general sketch of how it ought to look. Over time I update the journal, but at no time do I really do without it. This may be more a reflection of how I was taught (mostly on paper) than necessarily a good idea.

Re: What is YOUR Development Process?
by jimX11 (Friar) on Nov 08, 2005 at 06:07 UTC
    I do the following:
        cd ~/src/SomeWebApp
        make clean
        cvs up -Pd
        perl Makefile.PL -httpd /usr/sbin/apache -port select
        make
        emacs lib/Foo/Bar.pm &
        make test

    If the tests pass I check the changes into cvs.

    To install, on the prod server I do make install (more or less).

    It's easy to develop web apps this way when you use Apache::Test (A-T). You can have a highly customized apache config run your tests, and A-T tracks dependencies too. All in cvs.
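
    A bare-bones A-T test file looks something like this (the URL and content check are invented; the custom httpd.conf that serves them travels with the project):

        use strict;
        use warnings;
        use Apache::Test;
        use Apache::TestRequest qw(GET_BODY);

        # Apache-Test starts its own httpd using the project's bundled config
        plan tests => 1, need_lwp;

        my $body = GET_BODY '/some-web-app/index';
        ok $body =~ /Welcome/;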

    My current problem seems to be that I'm stuffing too many tests into it: testing business rules for base tables, for example. Maybe every city in the city table for some web app must have at least one row in the apartments table. It's easy for me to make a test that checks that; only, that test is not really related to the web app, other than making sure the data is as we expect.

    I started checking the data (every city must have an apartment) using the web app (Test::WWW::Mechanize running inside of A-T), but checking the module that pulls the data makes more sense.

    So what I'm working on now is having a subdir in the test dir, t/, that holds these back-end tests. At night I'll run all the tests; during the day, before checking in, I'll run only the tests that are directly related to the web app.
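
    One of those back-end data tests might look something like this (a sketch; the DSN, table, and column names are invented):

        use strict;
        use warnings;
        use Test::More tests => 1;
        use DBI;

        my $dbh = DBI->connect('dbi:Pg:dbname=somewebapp', '', '',
                               { RaiseError => 1 });

        # business rule: every city must have at least one apartment
        my ($orphans) = $dbh->selectrow_array(q{
            SELECT COUNT(*)
              FROM city c
             WHERE NOT EXISTS (SELECT 1 FROM apartment a
                                WHERE a.city_id = c.id)
        });
        is $orphans, 0, 'every city has at least one apartment';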

    Apache-Test really allows you to carry your custom httpd.conf with your perl modules, your perl scripts, and your test suite in cvs (or subversion). It's not that hard to test things using A-T. I even have a test that does html validation and link checking for my static pages; all my static pages are in cvs (and all are built using ttree from Template Toolkit). For logging I use Log::Log4perl, btw.

Re: What is YOUR Development Process?
by perrin (Chancellor) on Nov 08, 2005 at 20:38 UTC
    Something that has helped me a lot recently is to package ALL the dependencies for an application together, with an automated build process. Take a look at the way Krang handles this. It includes the application code, the templates, the executable scripts, the CPAN module sources, and even the apache server source in one bundle that gets built by one script. This goes a long way towards fixing dependency problems and "wrong version" issues with CPAN. It supports building on the target machine, or building a binary that can be installed on multiple target machines later. I'd like to see packaging like RPM and DEB as an option eventually too.

    An additional advantage is that you can put multiple versions of the app on the same machine safely. You have to be aware of things like common ports of course, but it allows you to have two versions of the same module on the same machine with no conflicts.

Re: What is YOUR Development Process?
by talexb (Chancellor) on Nov 08, 2005 at 16:24 UTC
      How do you compensate for templates, template plugins (I use Template Toolkit (Template)), application modules, instance scripts (I use CGI::Application) and non-application modules being inter-dependent?

    Templates are part of the web application and get checked in (and I use Template Toolkit as well). Likewise, template plug-ins and instance scripts get checked in.

    The 'non-application modules' are part of a system installation that we track using Red Hat RPMs. That's not easy, but I work with the SysAdmin and we keep that system in check.

      How do you write tests that depend on your webserver config (and its environment) and package them with your module?

    Because we used a kind of Rapid Application Development approach without tests, we don't have any automated tests for the web application. Yes, I wish we did, and I hope to be able to use WWW::Mechanize to address that in the future.

      How do you get your personal machine to emulate the environment of multiple machines?

    I don't -- I use a combination of my development system and my test system to develop and test new versions of the web application.

      How do you move material to your test server, and from there to production server?

    I wrote an installer for the web application that's part of the checked-in source. I extract a specific revision from the source control system, run the installer, answer a few questions, make a very small number of tweaks (they're on the ToDo list for the installer), and the system is ready for a smoke test (if this is for a Production system).
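
    A rough sketch of the shape such an installer might take (the prompts, paths, and File::Copy::Recursive dependency are illustrative assumptions, not talexb's actual script):

        #!/usr/bin/perl
        use strict;
        use warnings;
        use ExtUtils::MakeMaker qw(prompt);
        use File::Copy::Recursive qw(dircopy);

        # ask a few questions, then copy the checked-out tree into place
        my $target = prompt('Install directory?', '/var/www/myapp');
        my $dbname = prompt('Database name?',     'myapp');

        dircopy('.', $target) or die "copy failed: $!";

        # write the instance-specific config the app reads at startup
        open my $fh, '>', "$target/myapp.conf"
            or die "can't write config: $!";
        print {$fh} "dbname = $dbname\n";
        close $fh;

        print "Installed to $target. Ready for a smoke test.\n";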

    Once the smoke test passes, the system is passed on to the customer rep, who passes it on to the customer.

      How do you do some of these things in a multi-user environment? (Is the test server a "check out"? Who checks it out?)

    I'm not sure I understand your question; I'm the only developer for the web application, so I don't have to negotiate with anyone for a common file that we both want to work on. I can check files out of the source control system onto the test system for development, just because it's a more capable system than my own workstation for some configurations. That's not a big deal.

    Hope that answers some of your questions.

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

      Templates are part of the web application and get checked in (and I use Template Toolkit as well). Likewise, template plug-ins and instance scripts get checked in.

      What do you consider a "web application"? If I have a CRUD app module (as mentioned elsewhere), I can have multiple instances of that module (meaning one physical copy of the module per server, but multiple instance scripts and template sets, so that there are several "apps").

      The 'non-application modules' are part of a system installation that we track using Red Hat RPMs.

      I'm including CDBI backends, and any other modules we write in the "non-application modules" category. Are you? If not, are they "part" of your application? If not (and most of them are not application specific), what do you do with them?

      Yes, I wish we did, and I hope to be able to use WWW:Mechanize to address that in the future.

      After YAPC I started using Test::WWW::Mechanize, and I can say that it is a huge timesaver on repetitive app testing. I can also say that most of my apps don't have a full suite of tests, but as soon as I need one, it takes about the same time to write the test as to do the check by hand, and it's much faster to run again. And again. So you don't need to adopt full testing to get some benefits.
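
      To give a flavor, a typical Test::WWW::Mechanize check is only a few lines (the URL and form fields here are invented):

          use strict;
          use warnings;
          use Test::More tests => 3;
          use Test::WWW::Mechanize;

          my $mech = Test::WWW::Mechanize->new;
          $mech->get_ok('http://localhost/app/instance.cgi', 'app responds');
          $mech->submit_form_ok(
              { form_number => 1, fields => { name => 'Test Widget' } },
              'create form submits',
          );
          $mech->content_contains('Test Widget', 'new record shows up');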

      I'm the only developer for the web application, so I don't have to negotiate with anyone for a common file that we both want to work on. I can check files out of the source control system onto the test system for development,

      How do you select what files to check out? (Is everything by itself, or are they tagged as a set?) What do you do when you're done testing? I have a designer who will edit templates; how would you recommend I get them in on it? (They run Windows and FTP changes to the servers.)

          What do you consider a "web application"? If I have a CRUD app module (as mentioned elsewhere), I can have multiple instances of that module (meaning one physical copy of the module per server, but multiple instance scripts and template sets, so that there are several "apps").

        A web application is a piece of software that runs through a web interface.

        I do have multiple instances of my application on a server; each one runs for a specific customer on its own web server, using its own database and directory path.

        If I were being awfully clever I could have one set of code and multiple sets of data .. but I don't have to be that clever. Yet.

          I'm including CDBI backends, and any other modules we write in the "non-application modules" category. Are you?

        Are you talking about a Class::DBI backend? I'm not even sure I know what that is. Can you explain?

          How do you select what files to check out? (is everything by itself, or are they tagged as a set?)

        After installing (which means I have read-only copies of all of the files), I check out individual files against a specific issue. That makes them writable.

          What do you do when you're done testing? I have a designer who will edit templates; how would you recommend I get them in on it? (They run Windows and FTP changes to the servers.)

        When I'm done testing, I check everything in and ask the Release guy to make a version out of it. That gets passed to the QA person for a smoke test.

        We've decided on three levels of builds: development builds (just for me), milestone builds (to mark the completion of a specific set of features and bug fixes) and releases (to mark a milestone that is sufficiently good to give to the customer).

        If you have a designer who wants to edit templates, set them up with a userId on your source control system and show them how to check files out and then back in again. If you're on Linux and they're on Windows, perhaps have them ssh in to their home directory on a Linux system, use the command line to check files out, then ftp those files home for editing and testing. When they're done, they ftp the same files back and check them in.
