Trying to name a file parsing module for CPAN

by fx (Pilgrim)
on Jan 09, 2004 at 01:22 UTC ( [id://319997] : perlquestion )

fx has asked for the wisdom of the Perl Monks concerning the following question:

I want to release a module to CPAN. My module parses a file containing records and returns one object/hash/doesn't-matter-in-this-context per record. Each record contains one or more key:value pairs. Each line of the file contains only one key:value pair.

My problem is that I don't know if this style of file has a particular name and therefore I cannot name the module :)

In case it matters, an example of such a file may look like:

    HEADER1:value
    HEADER2:value
    ID:value
    KEY1:value
    KEY2:value
    ...
    ID:value
    KEY1:value
    KEY2:value
    FOOTER1:value

so each record, in this case, has an ID, a KEY1 and a KEY2.

In real life, however, it is more complicated than that. Each record always has an ID, and always starts with an ID in the file (this is how I can separate the records out). However, there are around 10 different KEYs. In each record, the KEYs may appear in a different order:

    HEADER1:value
    HEADER2:value
    ID:value
    KEY1:value
    KEY2:value
    KEY3:value
    ID:value
    KEY2:value
    KEY1:value
    KEY3:value
    ...

and not every record has all the keys:

    HEADER1:value
    HEADER2:value
    ID:value
    KEY1:value
    KEY2:value
    KEY3:value
    ID:value
    KEY3:value
    KEY2:value
    ID:value
    KEY5:value
    ...
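
For illustration only, the record-splitting rule described above (a new record begins at every ID line) could be sketched as follows. This is not the module itself: the sub name `split_records` is hypothetical, values are assumed to contain no newlines, and anything before the first ID is treated as a header. Footers are not handled, because nothing in the format as described distinguishes a footer line from an ordinary key.

```perl
use strict;
use warnings;

# Hypothetical helper: split "KEY:value" lines into header pairs and records.
# Assumes one pair per line and that every record starts with an ID line.
sub split_records {
    my @lines = @_;
    my (%header, @records);
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /^([^:]+):(.*)$/;   # skip lines that are not key:value
        my ($key, $value) = ($1, $2);
        if ($key eq 'ID') {
            push @records, { ID => $value };     # an ID line opens a new record
        }
        elsif (@records) {
            $records[-1]{$key} = $value;         # later lines belong to the current record
        }
        else {
            $header{$key} = $value;              # lines before the first ID are headers
        }
    }
    return (\%header, \@records);
}

my ($header, $records) = split_records(
    'HEADER1:h1', 'HEADER2:h2',
    'ID:id1', 'KEY1:A', 'KEY2:B', 'KEY3:C',
    'ID:id2', 'KEY3:E', 'KEY2:F',
    'ID:id3', 'KEY5:G',
);
printf "%d records, %d header pairs\n", scalar @$records, scalar keys %$header;
```

Because each record is a plain hash, the order in which the keys appear within a record does not matter, and a key that is absent from a record is simply absent from its hash.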

So, what should I call my module? File::Parser::Something? Something::Parser? Parser::Something?

What should the Something be?

(Please note that I am not asking for solutions as to how to parse the file and return the records - I have already done this. I am purely asking about the module's potential name.)

Replies are listed 'Best First'.
Re: Trying to name a file parsing module for CPAN
by flyingmoose (Priest) on Jan 09, 2004 at 14:57 UTC
    I do not mean to offend, but your module sounds like a solution in search of a problem. Why would someone else (who is not you) use this module instead of the other available modules using more standard formats, such as CSV, XML, etc?

    Only after answering that would I consider uploading to CPAN. By then you would also have a better idea of what to name it, because you would know how others could use it.

    I would not create a module that uses a non-standard file format when that format is significantly like other formats. For instance, we can give a nod to YAML, since it is substantially different from CSV (and has Data::Dumper-like functionality built in), whereas this looks like an incompatible version of CSV that would only create confusion.

    Thus, my opinion is... leave it off CPAN entirely until the use cases can be rationalized. And until you think your version is superior to other competing formats. The name should come more easily once you know what it is.

      I do not mean to offend
      On the contrary, I welcome your feedback.

      your module sounds like a solution in search of a problem
      My module has solved a problem for me - in my case, at least, there was already a pre-existing problem ;)

      Why would someone else ... use this module
      ...
      I would not create a module that uses a non-standard file format

      Part of my original question was whether the file format I am using has a name. If the style did have a name it would then make a release to CPAN more acceptable as, IMHO, it would mean enough people produce/use it to give it a name. If it has a name it probably has uses, and therefore others may benefit.

      Thus, my opinion is... leave it off CPAN entirely until the use cases can be rationalized
      I agree with you - there is no point in clogging up the CPAN directory with a pointless module that no-one will ever need, use or care about.

      I did originally say that I wanted to release it to CPAN. The Perl community (this includes CPAN, Perlmonks and various Perl mailing lists) has done much to assist me over the years. I finally thought that I could give something back. It would appear that this occasion will have to wait until something more suitable comes along.

      Many thanks.

Re: Trying to name a file parsing module for CPAN
by jjhayes84 (Novice) on Jan 09, 2004 at 02:27 UTC
    Well, there's such a thing as a CSV file (comma separated values file) and since you're using colons to separate some of the data, the acronym still works out. I'd call it Parser::CSV or something like that.

    That's the best thing I can think of.

      My understanding is that a CSV has the values separated by commas and records separated by new lines. If my understanding is correct then I don't have a CSV file.

Re: Trying to name a file parsing module for CPAN
by Anonymous Monk on Jan 09, 2004 at 03:35 UTC
    You mean putting something like this on to CPAN?

        use strict;
        use Data::Dumper;

        my @blocks = do { local $/='ID:'; <DATA> };
        shift @blocks;

        my %config;
        foreach my $line (@blocks) {
            my ($id) = $line =~ /^(\w+)/;
            my %keys = $line =~ /^(\w+):(\w+)/gms;
            $config{$id} = \%keys;
        }
        print Dumper(\%config);

        __DATA__
        HEADER1:value
        HEADER2:value
        ID:id1
        KEY1:A
        KEY2:B
        KEY3:C
        ID:id2
        KEY3:E
        KEY2:F
        ID:id3
        KEY5:G

        __OUTPUT__
        $VAR1 = {
          'id1' => {
            'KEY3' => 'C',
            'KEY1' => 'A',
            'KEY2' => 'B'
          },
          'id2' => {
            'KEY3' => 'E',
            'KEY2' => 'F'
          },
          'id3' => {
            'KEY5' => 'G'
          }
        };

      You mean putting something like this on to CPAN?

      No, I don't.

      I did say in my original post that I wasn't asking for solutions, and that in the context of the question what was returned did not matter.

      My module allows you to specify which of the keys you want returned, which keys are mandatory, whether keys can be repeated for a single ID, which header key:value pairs are present, which footer pairs, etc.

      It's not as simple as the example I gave - that is why it was an example.
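
      To give a rough idea of the kind of options meant here, a minimal sketch might look like the one below. It is not the actual module or its interface: the sub name `build_record` and the option names `want`, `required` and `allow_repeat` are invented for illustration, and header/footer handling is left out entirely.

```perl
use strict;
use warnings;

# Hypothetical sketch: build one record from its [key, value] pairs.
# Option names (want, required, allow_repeat) are invented for illustration.
sub build_record {
    my ($pairs, %opt) = @_;
    my %want = map { $_ => 1 } @{ $opt{want} || [] };
    my (%seen, %record);
    for my $pair (@$pairs) {
        my ($key, $value) = @$pair;
        # Reject a second occurrence of a key unless repeats are allowed
        die "Key '$key' repeated\n" if $seen{$key}++ && !$opt{allow_repeat};
        next if %want && !$want{$key};           # keep only requested keys, if any were named
        if (exists $record{$key}) {
            $record{$key} = [ $record{$key} ] unless ref $record{$key};
            push @{ $record{$key} }, $value;     # repeated keys collect into an array ref
        }
        else {
            $record{$key} = $value;
        }
    }
    # Mandatory keys are checked against everything seen, not just what was kept
    for my $key (@{ $opt{required} || [] }) {
        die "Missing mandatory key '$key'\n" unless $seen{$key};
    }
    return \%record;
}

my $rec = build_record(
    [ [ ID => 'id1' ], [ KEY1 => 'A' ], [ KEY2 => 'B' ], [ KEY1 => 'C' ] ],
    want         => [qw(ID KEY1)],
    required     => ['ID'],
    allow_repeat => 1,
);
print "$rec->{ID}: @{ $rec->{KEY1} }\n";   # prints "id1: A C"
```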

      My question was aimed at trying to find a name for that style of file. I believe that the files I have are being generated from some db running on some kind of mainframe system. Maybe this format is seen frequently in the mainframe world - I don't know - and so I thought I would throw the question to the world.

      Thanks.