XML & Schema VS Config::IniFiles

HeatSeekerCannibal has asked for the wisdom of the Perl Monks concerning the following question:

Wise Monks,

I'm about to embark on a rather hairy problem and am in dire need of your enlightened views on it to prevent further suffering.

I'm using Config::IniFiles to create and manage the configuration files for both the client and server side of a distributed application.

I need to be able to read and modify my client side configurations remotely by sending text commands through tcp/ip sockets to listening ports in the remote machines from my server application in my local worst-station.

Of course, the ability to modify implies that I need some validation of what is being written to the config files, since it is what will be driving my (mostly) unsupervised app. And, if possible, of what is being read from the ini files.

Taking advantage of the grouping feature of the Config::IniFiles module, I'm trying to create a hierarchical tree that follows a certain structure and in which only the final nodes hold the most critical information.

To validate that what I write to the ini file, as well as what I read from it, has the structure I want, I'm creating a hash that outlines the... let's say "logic" structure that I want the ini file to have, plus a bunch of regexps to match stuff against.

Here's the (hopefully) self-explanatory validating hash:

my %IniStruct = (
        "^MONICA ROOT$" => {
                "^DEBUG$" => "^(YES|NO)$",
                "^BASEDIR$" => {
                        "REGEXP" => "^\/\w+[\/\w]*$",
                        "VALUE" => "-e FILENAME && -d _ && -r _"
                        },
                "^LOGFILE$" => {
                        "REGEXP" => "^\w+[\/\w]*$",
                        "VALUE" => "-e FILENAME && -f _ && -T _ && -r _ && -w _",
                        },
                "^MONICASRVADDR$" => "^\d{1,3}\.\d{1,2}\.\d{1,2}\.\d{1,3}$",
                "^CONNECTIONTIMEOUT$" => "^\d{1,2}$"
                },
        "^MONICA CAT=\w+$" => {
                "^ENABLE$" => "^(YES|NO)$",
                "^DEBUG$" => "^(YES|NO)$",
                },
        "^CAT=\w+ SUBCAT=\w+$" => {
                "^ENABLE$" => "^(YES|NO)$",
                "^DEBUG$" => "^(YES|NO)$",
                },
        "^SUBCAT=\w+ (PARAM=[\/\w]+|GROUP=\w+)$" => {
                "^SCRIPT$" => {
                        "REGEXP" => "^\/\w+[\/\w]*$",
                        "VALUE" => "-e FILENAME && -s _ && -r _ && -x _",
                        },
                "^ARGUMENTS$" => ".*",
                "^REGEXP$" => ".*"
                },
 );

And here's an example of the corresponding ini file:

[MONICA ROOT]
DEBUG=NO
BASEDIR=/export/home/monica/
LOGFILE=log/clntMonica.log
MONICASRVADDR=<<EOT
10.6.1.122:9000
EOT
CONNECTIONTIMEOUT=10

[MONICA CAT=OPSIST]
ENABLE=YES
DEBUG=NO

[MONICA CAT=INFORMIX]
ENABLE=YES
DEBUG=NO

[CAT=OPSIST SUBCAT=SECURITY]
ENABLE=YES
DEBUG=YES

[CAT=OPSIST SUBCAT=FSYSTEM1]
ENABLE=YES
DEBUG=NO

[SUBCAT=SECURITY PARAM=ZEROUID]
SCRIPT=bin/checkZeroUid.pl
ARGUMENTS= <<EOT
EOT
REGEXP=/^\d{1,3}$/

[SUBCAT=FSYSTEM1 PARAM=/]
SCRIPT=shells/fs.sh
ARGUMENTS= <<EOT
/
EOT
REGEXP=/^\d{1,3}$/

[SUBCAT=FSYSTEM1 PARAM=/VAR/RUN]
SCRIPT=shells/fs.sh
ARGUMENTS= <<EOT
/var/run
EOT
REGEXP=/^\d{1,3}$/

[CAT=INFORMIX SUBCAT=RATINGS]
ENABLE=YES
DEBUG=NO

[SUBCAT=RATINGS GROUP=DBSPACE]
SCRIPT=shells/informix.sh
ARGUMENTS= <<EOT
dbspace
EOT
EXPREG=/((^dbspace\s+allocated\s+free\s+pcused$)|(^\w+\s+\d+\s+\d+\s+\d{1,3}$)/

So basically, what I want to do (when reading) is to "walk" the ini file using the hash as a guide to validate both structure (by matching section headers and parameters within against the corresponding regexp in the hash) and content (by matching against a regexp AND replacing the string for the uppercase part in the VALUE nodes and then evaluating it).
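A minimal sketch of that walk, under a few assumptions: the rule table below is hypothetical and covers only two section types, the patterns are stored as qr objects instead of strings, and the file tests are code refs instead of strings to be evaluated (a deliberate swap, since evaluating strings built from remotely supplied values is risky). The ini data is a plain hash here so the example runs on its own; with Config::IniFiles it could be filled from the `Sections`, `Parameters` and `val` methods.

```perl
use strict;
use warnings;

# Hypothetical rule table: [ section-name pattern, { parameter => check } ].
# A check is either a qr// pattern or a code ref returning true/false.
my @rules = (
    [ qr/^MONICA ROOT$/, {
        DEBUG             => qr/^(?:YES|NO)$/,
        CONNECTIONTIMEOUT => qr/^\d{1,2}$/,
        # Pattern check plus file tests, instead of an eval'ed string.
        BASEDIR           => sub { $_[0] =~ m{^/\w+[/\w]*$} && -d $_[0] && -r _ },
    } ],
    [ qr/^MONICA CAT=\w+$/, {
        ENABLE => qr/^(?:YES|NO)$/,
        DEBUG  => qr/^(?:YES|NO)$/,
    } ],
);

# Returns a list of human-readable problems; an empty list means valid.
sub validate_ini {
    my ($ini) = @_;    # hash ref: { section => { param => value } }
    my @errors;
    for my $section (sort keys %$ini) {
        my ($rule) = grep { $section =~ $_->[0] } @rules;
        if (!$rule) {
            push @errors, "unknown section [$section]";
            next;
        }
        my $checks = $rule->[1];
        for my $param (sort keys %{ $ini->{$section} }) {
            my $check = $checks->{$param};
            my $value = $ini->{$section}{$param};
            if (!defined $check) {
                push @errors, "[$section] unknown parameter $param";
            }
            elsif (ref($check) eq 'CODE' ? !$check->($value) : $value !~ $check) {
                push @errors, "[$section] bad value for $param: '$value'";
            }
        }
    }
    return @errors;
}

my %ini = (
    'MONICA ROOT'       => { DEBUG  => 'NO',  CONNECTIONTIMEOUT => '10' },
    'MONICA CAT=OPSIST' => { ENABLE => 'YES', DEBUG => 'MAYBE' },
    'BOGUS'             => { X => 1 },
);
print "$_\n" for validate_ini(\%ini);
```

The same `validate_ini` call could be run on a proposed change before it is written, so reading and writing share one rule table.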

When writing to the ini file, I'd again use the hash to validate what I'm about to write.

Of course, in both cases code would be needed to enforce the dependency of, say, a group on a subcategory, a category on the "monica" node...etc.
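One way such a dependency check might look, as a sketch only: `orphan_sections` is a hypothetical helper, and it assumes the parent of a section can be read off the section name alone, following the naming scheme of the example ini file above.

```perl
use strict;
use warnings;

# Hypothetical helper. Assumed rules, derived from the section names:
#   [MONICA CAT=x]            needs [MONICA ROOT]
#   [CAT=x SUBCAT=y]          needs [MONICA CAT=x]
#   [SUBCAT=y PARAM|GROUP=..] needs some [CAT=... SUBCAT=y]
# Returns the names of sections whose parent is missing.
sub orphan_sections {
    my @sections = @_;
    my %seen = map { $_ => 1 } @sections;

    # Collect every subcategory that is declared under some category.
    my %subcat_defined;
    for (@sections) {
        $subcat_defined{$1} = 1 if /^CAT=\w+ SUBCAT=(\w+)$/;
    }

    my @orphans;
    for my $s (@sections) {
        if ($s =~ /^MONICA CAT=\w+$/) {
            push @orphans, $s unless $seen{'MONICA ROOT'};
        }
        elsif ($s =~ /^CAT=(\w+) SUBCAT=\w+$/) {
            push @orphans, $s unless $seen{"MONICA CAT=$1"};
        }
        elsif ($s =~ /^SUBCAT=(\w+) (?:PARAM|GROUP)=/) {
            push @orphans, $s unless $subcat_defined{$1};
        }
    }
    return @orphans;
}

my @orphans = orphan_sections(
    'MONICA ROOT',
    'MONICA CAT=OPSIST',
    'CAT=OPSIST SUBCAT=SECURITY',
    'SUBCAT=SECURITY PARAM=ZEROUID',
    'CAT=INFORMIX SUBCAT=RATINGS',    # no [MONICA CAT=INFORMIX] in this list
    'SUBCAT=FSYSTEM1 PARAM=/',        # no [CAT=... SUBCAT=FSYSTEM1] in this list
);
print "orphan: [$_]\n" for @orphans;
```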

Now, my first thought was to use XML and a DTD (or even Schema?) for all this. From what I've read, it's the most logical way to go. But I had a number of problems installing the modules and their dependencies. After a number of days trying, I was at a dead end with an expat library (or header, can't remember, drank a lot to forget).

Since these ini files (and their apps) will be installed on a number of machines (with various flavors/versions of Unix and various versions of Perl), I thought that having problems installing XML on my machine could very well indicate possible problems when installing my code on other, less up-to-date hosts. And only a few of those would have an internet connection, so just downloading what I need from the web is usually not an option.

Believe me, I DID give XML (or at least the modules I tried back then) installation a good try (I have my boss's whip marks on my back to prove it). Granted, I must have done something wrong, but at the time I couldn't spend more effort on it and had to let go.

So I guess a consolidated form of my question would be: would you consider it better to invest time and effort in the Config::IniFiles workaround? Or should I suck it up and learn how to create a local repository of all the stuff I need, plus a reliable method to automate the installation of a bundle that includes the XML modules and all their dependencies, learn to use Schema and (conceivably) make my future life easy at the expense of a lot of exploration (wasted time, from my boss's point of view) and pressure at the present time?

Thanks for taking the time to read my question. Made it this long to explain where I'm coming from. Hope you can share your thoughts with me. Thanks in advance.

Heatseeker Cannibal


Replies are listed 'Best First'.
Re: XML & Schema VS Config::IniFiles by shmem (Chancellor) on Dec 15, 2007 at 00:03 UTC
KISS. Make a version that works. Don't add unnecessary layers. Why XML? Because it's "the right thing to do?" Then you'd have to convert your validation hash into an XSLT file... or isn't that "the right thing to do" any more? I'd send the configuration serialized via Storable, and done. --shmem
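A minimal sketch of what the Storable suggestion could look like, assuming the configuration is held as a hash of hashes (the `%config` contents below are taken from the example ini file; the socket transfer itself is left out). Storable's `nfreeze` and `thaw` are used because `nfreeze` writes in network byte order, which seems the safer choice between machines of different architectures.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

my %config = (
    'MONICA ROOT'       => { DEBUG  => 'NO',  CONNECTIONTIMEOUT => 10 },
    'MONICA CAT=OPSIST' => { ENABLE => 'YES', DEBUG => 'NO' },
);

# Serialize the whole structure into one byte string.
my $bytes = nfreeze(\%config);

# ... $bytes would travel over the socket here ...

# Rebuild an independent copy of the structure on the receiving side.
my $copy = thaw($bytes);
print "DEBUG on the receiving side: $copy->{'MONICA ROOT'}{DEBUG}\n";
```

Two caveats worth checking against the Storable documentation before relying on this: compatibility between the Storable versions shipped with different Perl releases, and the fact that `thaw` should only be fed data from a trusted sender. The received structure would still need validating before it is written to the ini file.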
Re: XML & Schema VS Config::IniFiles by plobsing (Friar) on Dec 14, 2007 at 21:17 UTC
libexpat is not the only solution to XML parsing (although it is a decent one). There are many Pure Perl (no compile required at installation) solutions as well which should work on most recent versions of perl. I would however use these only as a stop-gap or fallback solution as they tend to be slower.
Re^2: XML & Schema VS Config::IniFiles by HeatSeekerCannibal (Beadle) on Dec 14, 2007 at 23:23 UTC
Much obliged. I'll take a look. Heatseeker Cannibal
Re: XML & Schema VS Config::IniFiles by eserte (Deacon) on Dec 16, 2007 at 18:29 UTC
Take a look at Kwalify. This is a Perl implementation of the same-named schema language for data structures (as opposed to tree-like XML structures). I am not sure if this can be made to work with ini files, but it works fine with YAML, JSON and similar serialization formats.
Re: XML & Schema VS Config::IniFiles by bart (Canon) on Dec 20, 2007 at 16:00 UTC
There's a difference between regexps and strings. To you, `"^\w+[\/\w]$"` and `/^\w+[\/\w]$/` may look the same, but to Perl, at least in source code, they're not the same. If you want to use a string as a regexp, you should make sure the string is the same as your regexp: for example, when you print it out, what is printed should look like your desired regexp. That means that, at least when using double quotes, you'll have to double your backslashes. And escape that '$'. Or use single quotes, which would solve most of the problems (except when you want a double backslash, or when a backslash is in front of a single quote). Just double the backslashes, already:

`"^\\w+[\\/\\w]\$"`

But using qr is easier. This is one reason why it was invented: to allow you to create a first class regexp object, that you can store in a variable, with the exact same syntax as you'd use when using the regexp directly:

`qr/^\w+[\/\w]$/`

With `qr`, you don't have to use slashes as the delimiters, you can choose any delimiter (or any pair of delimiters, for paired delimiters like "`<>`", "`[]`" and "`()`"), but it'll look even more like a regexp if you do.

BTW, to the regexp engine there's not much difference between a string and a qr object. The main difference is that a qr object is syntax checked where you put it in the source. Some people (like Abigail-II, not the least of Perl hackers) claim that when deeply nesting qr objects, it'll be much slower than strings.
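The three spellings discussed in this reply can be compared directly. This is only an illustrative sketch; the variable names and the two test strings are made up for the example.

```perl
use strict;
use warnings;

my $single = '^\w+[\/\w]$';        # single quotes keep the backslashes as typed
my $double = "^\\w+[\\/\\w]\$";    # double quotes: backslashes doubled, $ escaped
my $regex  = qr/^\w+[\/\w]$/;      # qr// is written like the regexp itself

# Both string spellings hold exactly the same characters.
print "same string\n" if $single eq $double;

# The string (interpolated into a match) and the qr object agree.
for my $candidate ('abc/', 'log/clntMonica.log') {
    my $by_string = $candidate =~ /$single/ ? 'match' : 'no match';
    my $by_qr     = $candidate =~ $regex    ? 'match' : 'no match';
    print "$candidate: string says $by_string, qr says $by_qr\n";
}
```

Printing `$single` or `$double` is a quick way to see what the regexp engine will actually receive.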
Re^2: XML & Schema VS Config::IniFiles by HeatSeekerCannibal (Beadle) on Dec 20, 2007 at 19:29 UTC
Yes, I modified my hash accordingly and it now passes the syntax check. But from what you told me on the chatterbox and from what eserte suggested, I'm now taking a look at JSON. Perhaps I can skip this whole regexp madness. Not that it's bad... I kinda like regexps, feel like I understand a bit. Heatseeker Cannibal
