http://qs321.pair.com?node_id=21360

Buckaroo Buddha has asked for the wisdom of the Perl Monks concerning the following question:


here's a question for the best of you ...
maybe this is an algorithm question, i'm not sure.
i've got a pattern that may change slightly from one
situation to another; let's call it a title.

this title can be passed from one person to another or can change slightly and still be the same thing.

i've got to run a match on that title over a period
of time and run a calculation based on a data-element
at that period of time.
I'm using a multi-dimensional hash of arrays.

here is the code for that, which should give a better understanding of the data structure and what i'm trying to do ...

sub Compress {
    foreach my $KEYmonth (sort keys %{$_[0]}) {
        foreach my $KEYcat (sort keys %{$_[0]{$KEYmonth}}) {    # cat = category
            foreach my $KEYsubcat (sort keys %{$_[0]{$KEYmonth}{$KEYcat}}) {
                $i = 0;
                foreach my $value (@{$_[0]{$KEYmonth}{$KEYcat}{$KEYsubcat}}) {
                    $OUTPUT{'YTD'}{$KEYcat}{$KEYsubcat}[$i++] += $value;
                }
            }
        }
    }
}
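For readers who want to see the shape of the data, here is a small self-contained sketch of the same year-to-date summation. The month keys, the SALES category, the sample values, and the name compress_ytd are all made up for illustration; the only thing taken from the post is the month => category => sub-category => array layout. It returns the totals rather than writing to a global %OUTPUT, which is a choice made for this sketch and not the original design.

```perl
use strict;
use warnings;

# Hypothetical sample data: month => category => sub-category => [values].
my %data = (
    '2000-01' => { SALES => { 'INITIAL.LASTNAME(x)(LONG NAME TITLE)' => [ 10, 20 ] } },
    '2000-02' => { SALES => { 'INITIAL.LASTNAME(x)(LONG NAME TITLE)' => [ 1, 2 ] } },
);

# Element-wise summation across months, as in Compress above, but the
# totals are returned in a fresh hash instead of a global.
sub compress_ytd {
    my ($months) = @_;
    my %ytd;
    for my $month (sort keys %$months) {
        for my $cat (sort keys %{ $months->{$month} }) {
            for my $subcat (sort keys %{ $months->{$month}{$cat} }) {
                my $values = $months->{$month}{$cat}{$subcat};
                # Add each position of this month's array to the running total.
                $ytd{$cat}{$subcat}[$_] += $values->[$_] for 0 .. $#$values;
            }
        }
    }
    return \%ytd;
}

my $ytd = compress_ytd(\%data);
print join(',', @{ $ytd->{SALES}{'INITIAL.LASTNAME(x)(LONG NAME TITLE)'} }), "\n";
```

Because the sub-category string is used directly as the hash key, two months only add up when their keys are exactly equal.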

i do this over a number of months for many different
categories, and sub-categories. Unfortunately, in one (or two)
of the situations the subcategory-keys are variable enough
to prevent a positive match.

the data could be:
 "INITIAL.LASTNAME(x)(LONG NAME TITLE)" one month and:
 "INTL.SOMEOTHERNAME(x)(LONG NAME TITLE)" or:
 "INITIAL.LASTNAME(x)(LONG NAME. TITLE)" the next.

that's a mighty long setup for the question, but i'm
trying to make it both specific to my problem and generalizable
to the world (sort of like something you could put into a textbook
example if it was well answered, eh merlyn ? *jk* ;)

the question, then, is: how would one go about designing a
matching process that could calculate a confidence interval for NEAR-MATCHES
and, in the event of finding a close match, use it (and its key) for the
calculation?
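
To make the question concrete, here is one possible shape for such a matcher, offered as a rough sketch and not as a known-good answer. It assumes three things that are not in the post: that keys can be normalized by upper-casing and dropping everything except letters, digits and parentheses; that Levenshtein edit distance is a reasonable measure of closeness; and that a threshold of 0.8 is sensible. The names normalize_key, levenshtein, similarity and best_match are hypothetical, and the result is a plain similarity score between 0 and 1, not a statistical confidence interval.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper: canonicalize a key so that trivial differences
# (case, dots, spaces) do not count against the match.
sub normalize_key {
    my ($key) = @_;
    (my $norm = uc $key) =~ s/[^A-Z0-9()]+//g;
    return $norm;
}

# Levenshtein edit distance, dynamic programming with two rows.
sub levenshtein {
    my ($s, $t) = @_;
    my @prev = (0 .. length($t));
    for my $i (1 .. length($s)) {
        my @cur = ($i);
        for my $j (1 .. length($t)) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min($prev[$j] + 1, $cur[$j - 1] + 1, $prev[$j - 1] + $cost);
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Similarity in [0, 1] between the normalized forms: 1 means identical.
sub similarity {
    my ($s, $t) = map { normalize_key($_) } @_;
    my $longest = max(length($s), length($t));
    return 1 if $longest == 0;
    return 1 - levenshtein($s, $t) / $longest;
}

# Return (key, score) for the closest known key, or an empty list when
# nothing reaches the threshold (the caller then treats the key as new).
sub best_match {
    my ($key, $candidates, $threshold) = @_;
    my ($best, $best_score) = (undef, -1);
    for my $cand (@$candidates) {
        my $score = similarity($key, $cand);
        ($best, $best_score) = ($cand, $score) if $score > $best_score;
    }
    return $best_score >= $threshold ? ($best, $best_score) : ();
}

# The second known key is invented for the example.
my @known = ('INITIAL.LASTNAME(x)(LONG NAME TITLE)', 'OTHER.PERSON(x)(SHORT TITLE)');
my ($match, $score) = best_match('INITIAL.LASTNAME(x)(LONG NAME. TITLE)', \@known, 0.8);
printf "%s (%.2f)\n", $match, $score if defined $match;
```

In the summation loop, the matched key would be used in place of the incoming sub-category key. Normalization alone covers the "LONG NAME. TITLE" variant; a case like "INTL.SOMEOTHERNAME", where the title moves to a different person, may score too low for any safe threshold and probably needs a rule that compares only the title part.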

this is intended to be a fun, interesting exercise; i've tried to
write a NEAR-MATCH function before (and failed miserably).
I'm looking to learn some process or method for tackling
this sort of problem (which i can only assume others will have had experience with),
but i might not necessarily do it (i'm allowed to tell them 'it can't be done in
situation x').