
Fast processing of XML files for CGI

by AcidHawk (Vicar)
on Dec 08, 2003 at 10:21 UTC (#313047)

AcidHawk has asked for the wisdom of the Perl Monks concerning the following question:


I am running ASPerl 5.6.1 on Windows 2000 with Apache as the web server. I need a quick solution to display a page while our production Helpdesk has its legs in the air. This solution may need to be used again in the future, so I would like to start with a quick win and expand it into something more later.

The Problem: I have an automated process that creates small XML files in several dirs. There are about 7 dirs, and each can hold upwards of 200 files.


Each XML file looks similar to the following:

<?xml version="1.0"?>
<opt>
  <Agent_Class>Backup</Agent_Class>
  <MD>dirA</MD>
  <Agent_Instance>NetBackup</Agent_Instance>
  <Date>2003/12/08</Date>
  <Server>ServerABC</Server>
  <Instance_Detail>Backup Problems ServerABC:57</Instance_Detail>
  <Time>08:04:58</Time>
  <Header>X-CALL</Header>
  <State>CRITICAL</State>
</opt>

Basically I need to display a table with the dir name and some of the contents of all the files in its dir. Something like:

Managing Server | Call Details
ServerABC - Backup - NetBackup - Backup Problems ServerABC:57
ServerDEF - Drive - Space - Drive C: is out of disk space

This is a snippet of what I have at the moment which is proving FAR too slow:
if (opendir (DIR, $path)) {
    while( my $dir = readdir( DIR ) ) {
        next if( ( "." eq $dir ) || ( ".." eq $dir ) );
        if (-d "$path/$dir") {
            print "<TR><TD>$dir</TD><TD></TD></TR>\n";
            print "<TR><TD>$dir</TD>\n";
            if (opendir (CUSTDIR, "$path/$dir")) {
                my $count = 0;
                while (my $file = readdir( CUSTDIR)) {
                    next if (("." eq $file) || (".." eq $file));
                    $count++;
                    # Every file is parsed in full on every page request
                    if (eval { $callvals = XMLin("$path/$dir/$file") }) {
                        # $calldetails{$callvals->{Server}} = "$callvals->{State} $callvals->{Agent_Class} $callvals->{Agent_Instance} $callvals->{Instance_Detail}";
                        print "<TR><TD></TD><TD><B>$callvals->{Server}</B> - $callvals->{State} - $callvals->{Agent_Class} - $callvals->{Agent_Instance} - $callvals->{Instance_Detail}</TD></TR>\n";
                    } else {
                        die "Cannot Read $path/$dir/$file: $@\n";
                    }
                }
                closedir (CUSTDIR);    # directory handles need closedir, not close
                print "<TD>$count XML files</TD></TR>\n";
            } else {
                die "Cannot find path $path/$dir: $!\n";
            }
        }
    }
    closedir (DIR);
} else {
    die "Cannot find path $path: $!\n";
}
I thought of putting all the data from the files into a hash so I only had to process the relevant bits when I build the web page.
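A rough sketch of that hash idea, assuming every file is as flat as the sample above (simple elements directly under <opt>, no attributes, nesting or entities): pull the fields out with a regular expression and store them per directory. The names parse_call and load_calls are made up for illustration, and the regex is not a substitute for a real parser if the files ever get more complicated.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the simple child elements of <opt> from one
# event document. Assumes flat elements with no attributes or nesting, and
# does not decode entities such as &amp;.
sub parse_call {
    my ($xml) = @_;
    my %call;
    while ($xml =~ m{<(\w+)>([^<]*)</\1>}g) {
        $call{$1} = $2;
    }
    return \%call;
}

# Hypothetical helper: read every file in each subdirectory of $path and
# return { dirname => [ \%call, ... ] }.
sub load_calls {
    my ($path) = @_;
    my %calls;
    opendir(my $top, $path) or die "Cannot open $path: $!\n";
    for my $dir (sort grep { !/^\.\.?$/ && -d "$path/$_" } readdir $top) {
        opendir(my $sub, "$path/$dir") or die "Cannot open $path/$dir: $!\n";
        for my $file (sort grep { -f "$path/$dir/$_" } readdir $sub) {
            open(my $fh, '<', "$path/$dir/$file")
                or die "Cannot read $path/$dir/$file: $!\n";
            my $xml = do { local $/; <$fh> };    # slurp the whole file
            close $fh;
            $xml = '' unless defined $xml;       # empty file
            push @{ $calls{$dir} }, parse_call($xml);
        }
        closedir $sub;
    }
    closedir $top;
    return \%calls;
}
```

With this in place, building the page is a loop over the hash, and the parsing step can be timed or swapped out on its own.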

What can I do to be able to read these files and put some of the detail in a web page before the web page times out or tries to refresh itself (120 Secs)?

It must be said that I am using CGI, but CGI/HTML is NOT where my little experience lies.

Of all the things I've lost in my life, it's my mind I miss the most.

Replies are listed 'Best First'.
Re: Fast processing of XML files for CGI
by inman (Curate) on Dec 08, 2003 at 10:51 UTC
    Two strategies spring to mind:

    Process Files asynchronously - Use a background process to do the work and produce a static HTML file that can be served to a number of clients. This adopts a write once read many strategy.

    Concatenate and transform - Concatenate the XML files into one large XML file and use XSLT to transform the XML into an HTML page.

    In both cases I am assuming that you are looking for the HTML page to be refreshed regularly (e.g. every 30 seconds or so) and be viewed by a number of people (the help desk operatives).
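The first strategy could look roughly like the sketch below: a background script renders the table into a temporary file and then renames it over the page Apache serves, so a reader should never see a half-written file. The helper names, the column headings and the { dirname => [ \%call, ... ] } input layout are assumptions for illustration, and it is worth checking how rename behaves over an existing file on your Windows setup.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text.
sub escape_html {
    my ($text) = @_;
    $text = '' unless defined $text;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: render { dirname => [ \%call, ... ] } as one table.
# The meta refresh asks browsers to reload every 30 seconds.
sub render_page {
    my ($calls) = @_;
    my $html = qq{<html><head><meta http-equiv="refresh" content="30">}
             . qq{<title>Outstanding events</title></head><body>\n}
             . qq{<table border="1">\n}
             . qq{<tr><th>Managing Server</th><th>Call Details</th></tr>\n};
    for my $dir (sort keys %$calls) {
        my $count = @{ $calls->{$dir} };
        $html .= sprintf "<tr><td>%s</td><td>%d XML files</td></tr>\n",
            escape_html($dir), $count;
        for my $call (@{ $calls->{$dir} }) {
            my $detail = join ' - ', map { escape_html($call->{$_}) }
                qw(State Agent_Class Agent_Instance Instance_Detail);
            $html .= sprintf "<tr><td></td><td><b>%s</b> - %s</td></tr>\n",
                escape_html($call->{Server}), $detail;
        }
    }
    return $html . "</table></body></html>\n";
}

# Write the page to a temporary file first, then move it into place.
sub write_page {
    my ($calls, $outfile) = @_;
    my $tmp = "$outfile.tmp";
    open(my $fh, '>', $tmp) or die "Cannot write $tmp: $!\n";
    print $fh render_page($calls);
    close $fh or die "Cannot close $tmp: $!\n";
    rename $tmp, $outfile or die "Cannot rename $tmp to $outfile: $!\n";
}
```

The CGI script then disappears from the picture entirely: Apache serves the static file, and however long the parsing takes, no request waits for it.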

      The following code relates to the earlier post and shows the processing of a concatenated XML file using XSLT. This has been tested using Xalan on the command line, but it is also available from Perl as XML::Xalan.

      Example of concatenated XML reports:

      Example of XSLT to do the conversion:

Re: Fast processing of XML files for CGI
by Roger (Parson) on Dec 08, 2003 at 13:16 UTC
    Have you thought about preprocessing the XML files? If your XML files are fairly static, then you can, say, have a background process that wakes up every 10-20 min and checks for new or updated files. If it finds any, it updates the cached results. If not, it exits. This way, all you have to do is display the cached result, instead of doing XML parsing on the fly.
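The "check for new or updated files" step might be sketched like this, assuming a file counts as changed when its name, size or modification time changes. The helper names and the signature file are hypothetical, and the signature file should live outside the event dirs so it is not counted itself.

```perl
use strict;
use warnings;

# Hypothetical helper: summarise the files in each subdirectory of $path as
# sorted "dir/file <tab> mtime <tab> size" lines. Any added, removed or
# rewritten file changes the resulting string.
sub tree_signature {
    my ($path) = @_;
    my @lines;
    opendir(my $top, $path) or die "Cannot open $path: $!\n";
    for my $dir (grep { !/^\.\.?$/ && -d "$path/$_" } readdir $top) {
        opendir(my $sub, "$path/$dir") or die "Cannot open $path/$dir: $!\n";
        for my $file (grep { -f "$path/$dir/$_" } readdir $sub) {
            my ($size, $mtime) = (stat "$path/$dir/$file")[7, 9];
            next unless defined $mtime;    # file vanished since readdir
            push @lines, join("\t", "$dir/$file", $mtime, $size);
        }
        closedir $sub;
    }
    closedir $top;
    return join("\n", sort @lines);
}

# Returns 1 (and records the new signature in $sigfile) when the tree
# differs from what the previous run saw, or when there was no previous
# run. Returns 0 when nothing has changed.
sub tree_changed {
    my ($path, $sigfile) = @_;
    my $new = tree_signature($path);
    my $old;
    if (open(my $in, '<', $sigfile)) {
        $old = do { local $/; <$in> };
        $old = '' unless defined $old;     # empty signature file
        close $in;
    }
    return 0 if defined $old && $old eq $new;
    open(my $out, '>', $sigfile) or die "Cannot write $sigfile: $!\n";
    print $out $new;
    close $out or die "Cannot close $sigfile: $!\n";
    return 1;
}
```

The background process would call tree_changed on each wake-up and only rebuild the cached result when it returns true.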

Re: Fast processing of XML files for CGI
by CountZero (Bishop) on Dec 08, 2003 at 23:10 UTC
    Looking at it from a different perspective (and not immediately helpful): did you think of putting your data in a database instead of in a lot of small XML-files?


    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

      Typically, when an event occurs in our organisation, an action is triggered which creates one file for the event that occurred. Other processes monitor these dirs for the XML files and handle them as soon as they appear, i.e. log/update/close a call in the helpdesk. Once these background processes have handled an event XML file, the file is deleted.

      The problem is that if the helpdesk is down, we stop the processes that handle these files. So every time there is an event the files get created but not removed. We would like to keep it like this because then when the helpdesk comes back up we can simply start these processes again and all the event files will be processed (in sequence).

      What we need, while the call logging processes are down, is visualisation for the operators. This is what I am trying to accomplish with a view of each dir in a web page. Some way of viewing the contents of the files. I would like to extend this a little also in that if an event is 'Critical' and later there is a corresponding event which changes the status to 'Repaired' I don't want to see this in the web page.
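One possible way to hide repaired events, sketched under some loud assumptions: that two events "correspond" when Server, Agent_Class and Agent_Instance all match, that the repair arrives with a State of REPAIRED (compared case-insensitively), and that the Date and Time fields sort correctly as strings in the format shown in the sample file. None of that is confirmed here, so treat the matching key as a placeholder for whatever really identifies a call.

```perl
use strict;
use warnings;

# Build a sortable "date time" string from one event hash.
sub stamp {
    my ($call) = @_;
    return join ' ', map { defined $_ ? $_ : '' } @$call{qw(Date Time)};
}

# Hypothetical helper: given a list of event hashes, keep only the most
# recent event per Server/Agent_Class/Agent_Instance and drop it when that
# most recent state is REPAIRED. Call it in list context.
sub outstanding_calls {
    my (@calls) = @_;
    my %latest;
    for my $call (sort { stamp($a) cmp stamp($b) } @calls) {
        my $key = join "\0", map { defined $call->{$_} ? $call->{$_} : '' }
            qw(Server Agent_Class Agent_Instance);
        $latest{$key} = $call;    # later events overwrite earlier ones
    }
    return sort { stamp($a) cmp stamp($b) }
           grep { uc(defined $_->{State} ? $_->{State} : '') ne 'REPAIRED' }
           values %latest;
}
```

The files themselves are left untouched, so the call logging processes can still replay every event in sequence once the helpdesk is back.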

      Of all the things I've lost in my life, it's my mind I miss the most.
Re: Fast processing of XML files for CGI
by CountZero (Bishop) on Dec 08, 2003 at 23:15 UTC
    If your users have IE 6.0 or better, you can simply concatenate all XML-files (minus the headers), add the directory name, provide a link to a suitable XSLT-file and have it transformed client-side.
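The concatenation step might be sketched as follows. The <calls> and <dir name="..."> wrapper elements and the stylesheet name passed in are invented for illustration; the XSLT file itself is not shown and would have to match whatever wrapper you choose.

```perl
use strict;
use warnings;

# Escape a string for use inside a double-quoted XML attribute.
sub escape_attr {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: join every event file below $path into one document.
# Each file loses its own <?xml ...?> declaration and is wrapped in a
# <dir name="..."> element for the directory it came from.
sub concatenate_events {
    my ($path, $stylesheet) = @_;
    my $xml = qq{<?xml version="1.0"?>\n}
            . qq{<?xml-stylesheet type="text/xsl" href="}
            . escape_attr($stylesheet) . qq{"?>\n<calls>\n};
    opendir(my $top, $path) or die "Cannot open $path: $!\n";
    for my $dir (sort grep { !/^\.\.?$/ && -d "$path/$_" } readdir $top) {
        $xml .= qq{<dir name="} . escape_attr($dir) . qq{">\n};
        opendir(my $sub, "$path/$dir") or die "Cannot open $path/$dir: $!\n";
        for my $file (sort grep { -f "$path/$dir/$_" } readdir $sub) {
            open(my $fh, '<', "$path/$dir/$file")
                or die "Cannot read $path/$dir/$file: $!\n";
            my $body = do { local $/; <$fh> };
            close $fh;
            next unless defined $body;
            $body =~ s/^\s*<\?xml[^>]*\?>\s*//;    # strip the per-file header
            $body =~ s/\s+$//;
            $xml .= "$body\n" if length $body;
        }
        closedir $sub;
        $xml .= "</dir>\n";
    }
    closedir $top;
    return $xml . "</calls>\n";
}
```

A CGI script or background job could print the result of something like concatenate_events($path, 'calls.xsl') with a text/xml content type and leave the transformation to the browser.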


    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law
