PerlMonks
Re^5: Remove duplicate from the same line..
by sundialsvc4 (Abbot) on Jun 02, 2013 at 17:05 UTC ( [id://1036601]=note )
By suggesting a “separate file with a list of replacements,” I think that you just hit the nail on the head. This is obviously a human-generated list, with variations in names that humans know refer to the same legal entity. It would be quite difficult to write an algorithm that reliably decides, on its own, that some particular replacement should be made. But if you provide a human-generated, human-maintained list of the replacements, then you can not only sanitize the data effectively, but also control and guide exactly how that is done.
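A minimal sketch of that idea, under some loud assumptions: the rule format (`variant => canonical`), the semicolon-separated record layout, the sample names, and the sub names `parse_substitutions` and `sanitize_record` are all made up here for illustration and are not taken from the original thread. Each name in a record is mapped to its canonical form if a human has listed it, and repeats within the same line are then dropped.

```perl
use strict;
use warnings;

# Parse rule lines of the ASSUMED form "variant => canonical name".
# Returns a hash reference mapping each variant to its canonical name.
sub parse_substitutions {
    my @lines = @_;
    my %canonical_for;
    for my $line (@lines) {
        next if $line =~ /^\s*(?:#.*)?$/;    # skip blank lines and comments
        my ($variant, $canonical) = $line =~ /^\s*(.*?)\s*=>\s*(.*?)\s*$/
            or next;                          # skip lines without "=>"
        next unless length($variant) && length($canonical);
        $canonical_for{$variant} = $canonical;
    }
    return \%canonical_for;
}

# Sanitize one record, ASSUMED to be a semicolon-separated list of names:
# replace listed variants by their canonical name, then drop duplicates
# while keeping the first occurrence of each name.
sub sanitize_record {
    my ($record, $canonical_for) = @_;
    my $trimmed = $record;
    $trimmed =~ s/^\s+|\s+$//g;
    my %seen;
    my @names =
        grep { !$seen{$_}++ }
        map  { exists $canonical_for->{$_} ? $canonical_for->{$_} : $_ }
        grep { length }
        split /\s*;\s*/, $trimmed;
    return join '; ', @names;
}

# Hypothetical rules and data, for illustration only.
my $canonical_for = parse_substitutions(
    '# variant => canonical',
    'Goldman Sachs & Co. => Goldman Sachs',
    'GS => Goldman Sachs',
);
print sanitize_record(
    'Goldman Sachs; Goldman Sachs & Co.; Morgan Stanley; GS',
    $canonical_for,
), "\n";
# prints: Goldman Sachs; Morgan Stanley
```

Nothing is ever guessed by the program: a name is only rewritten when someone has put it in the list, which is what keeps the human in control of the result.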
For example, let’s say that you have a data-file containing records such as:

Finally, a filter program could be constructed that scans the file for strings containing more than one occurrence of the same alphanumeric token, e.g. Goldman. A human would eyeball that list and add to the substitutions file as he or she sees fit.
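Such a filter might look something like the sketch below. The sub name `repeated_tokens` and the sample lines are hypothetical, and I am assuming that “alphanumeric token” means a run of ASCII letters and digits, compared case-sensitively. It reports only the lines in which some token occurs more than once, together with the offending tokens, so that a human can decide what belongs in the substitutions file.

```perl
use strict;
use warnings;

# Return, in sorted order, the alphanumeric tokens that occur more than
# once in the given line. Call it in list context.
sub repeated_tokens {
    my ($line) = @_;
    my %count;
    $count{$_}++ for $line =~ /([A-Za-z0-9]+)/g;
    my @repeated = grep { $count{$_} > 1 } keys %count;
    my @sorted   = sort @repeated;
    return @sorted;
}

# Hypothetical sample data, for illustration only.
my @sample_lines = (
    'Goldman Sachs; Goldman Sachs & Co.',
    'Morgan Stanley; Lehman Brothers',
);

for my $line (@sample_lines) {
    my @repeated = repeated_tokens($line);
    # Only lines with at least one repeated token are worth a human's time.
    print "$line\t[@repeated]\n" if @repeated;
}
# prints the first sample line, then a tab, then "[Goldman Sachs]"
```

A filter this crude will also flag harmless repeats of short tokens, which is fine, since its output is only a worklist for a person to review and nothing is changed automatically.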