
here's a question for the best of you ...
maybe this is an algorithm question, i'm not sure.
i've got a pattern that may change slightly from one
situation to another; let's call it the title.

this title can be passed from one person to another or can change slightly and still be the same thing.

i've got to run a match on that title over a period
of time and run a calculation based on a data-element
at that period of time.
I'm using a multi-dimensional hash of arrays.

here is the code for that calculation, which should give a better understanding of the data structure and what i'm trying to do ...

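sub Compress {
    # $data is a hashref: month => category => subcategory => [values]
    my ($data) = @_;
    foreach my $KEYmonth (sort keys %$data) {
        foreach my $KEYcat (sort keys %{ $data->{$KEYmonth} }) {    # cat = category
            foreach my $KEYsubcat (sort keys %{ $data->{$KEYmonth}{$KEYcat} }) {
                my $i = 0;    # position within this subcategory's value list
                foreach my $value (@{ $data->{$KEYmonth}{$KEYcat}{$KEYsubcat} }) {
                    # add this month's value to the year-to-date total in the
                    # same slot (%OUTPUT is a package global declared elsewhere)
                    $OUTPUT{'YTD'}{$KEYcat}{$KEYsubcat}[$i++] += $value;
                }
            }
        }
    }
}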

i do this over a number of months for many different
categories, and sub-categories. Unfortunately, in one (or two)
of the situations the subcategory-keys are variable enough
to prevent a positive match.

the data could be:
"INITIAL.LASTNAME(x)(LONG NAME TITLE)" one month and:
"INTL.SOMEOTHERNAME(x)(LONG NAME TITLE)" or:
"INITIAL.LASTNAME(x)(LONG NAME. TITLE)" the next.

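one cheap first step for keys like the samples above might be to normalize them before using them as hash keys, so that differences in case, stray dots and spacing disappear. this is only a sketch: normalize_key is a made-up name, and the rules in it are guesses based on the three sample keys, not on the real data.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a subcategory key to a canonical form so
# that trivial differences (case, stray dots, spacing) no longer matter.
sub normalize_key {
    my ($key) = @_;
    my $norm = uc $key;               # ignore case
    $norm =~ s/[^A-Z0-9()]+/ /g;      # runs of punctuation/whitespace -> one space
    $norm =~ s/\s*([()])\s*/$1/g;     # no spaces around parentheses
    $norm =~ s/^\s+|\s+$//g;          # trim both ends
    return $norm;
}
```

with these rules the first and third sample keys collapse to the same string, but the "INTL.SOMEOTHERNAME" variant still differs, so normalization alone would not cover every case.
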
that's a mighty long setup for the question but i'm
trying to make it both specific to my problem and generalizable
to the world (sort of like something you could put into a textbook
example if it was well answered, eh merlyn ? *jk* ;)

the question is then: how would one go about designing a
matching process that could calculate a confidence score for NEAR-MATCHES
and, in the event of finding a close enough match, use it (and its key) for the
calculation?
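
to make the idea concrete, here is a rough sketch of one possible approach: score each known key against the incoming key with Levenshtein edit distance, scaled to a 0..1 similarity, and accept the best candidate only if it clears a threshold. the names edit_distance, similarity and best_match are invented for illustration, the threshold is an arbitrary cut-off that would need tuning on real data, and the score is a plain similarity measure rather than a statistical confidence interval.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Levenshtein edit distance by dynamic programming, keeping one row at a time.
sub edit_distance {
    my ($s, $t) = @_;
    my @prev = (0 .. length($t));
    for my $i (1 .. length($s)) {
        my @cur = ($i);
        for my $j (1 .. length($t)) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min($prev[$j] + 1,            # deletion
                           $cur[$j - 1] + 1,         # insertion
                           $prev[$j - 1] + $cost);   # substitution or match
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Similarity in [0, 1]: 1 means identical strings.
sub similarity {
    my ($s, $t) = @_;
    my $longest = length($s) > length($t) ? length($s) : length($t);
    return 1 if $longest == 0;
    return 1 - edit_distance($s, $t) / $longest;
}

# Return (best_key, score) from @$known, or an empty list when no
# candidate reaches $threshold (an assumed, tunable cut-off).
sub best_match {
    my ($key, $known, $threshold) = @_;
    my ($best, $best_score) = (undef, 0);
    for my $candidate (@$known) {
        my $score = similarity($key, $candidate);
        ($best, $best_score) = ($candidate, $score) if $score > $best_score;
    }
    return (defined($best) && $best_score >= $threshold) ? ($best, $best_score) : ();
}
```

inside a loop like the one in Compress, one could call best_match on each $KEYsubcat against the keys already present in the YTD hash and accumulate under the returned key when there is one. a changed name with the same long title would probably need something smarter, such as comparing the parenthesized parts separately.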

this is intended to be a fun, interesting exercise. i've tried to
write a NEAR-MATCH function before (and failed miserably).
I'm looking to learn some process or method for tackling
this sort of problem (which i can only assume others will have had experience with),
but i might not necessarily do it (i'm allowed to tell them 'it can't be done in
situation x').

In reply to pattern matching with heuristics by Buckaroo Buddha

	PerlMonks