PerlMonks
Regex Question
by shemp (Deacon) on Nov 15, 2005 at 19:31 UTC ( [id://508753]=perlquestion )
shemp has asked for the wisdom of the Perl Monks concerning the following question:
Hi all, I've been trying to construct a regex to help clean up names that can appear in some pretty nasty formats; it's raw data from government agencies.
Anyway, one thing I do is look for various 'care of' variants and standardize them to the % symbol. That is done through this regex:
(Yes, I am looking for the case of 'C%O', which I also see sometimes.)

And then I want to replace forward or backslashes with &, as long as they do not have digits on both sides (which would be a fraction, which does sometimes appear in the names I process). That is done through this regex:

The problem is that sometimes I want to perform only the second transformation without having done the first one, but I could not come up with any decent way to accomplish the second part while excluding the cases that are care-ofs, as defined by the first part. Any thoughts?

Update: I got this working effectively, but then realized that my spec gets even worse, because if I see something like "D/B/A", I want to change that to "DBA" (Doing Business As), so this adds a completely new twist onto when and when not to replace the slashes. So I added this regex after the care-of regex:

Also, I think that using the separate regexes will work fine; I have worked around the problem of only wanting to perform the final stage without messing up the earlier stages. Thanks for all the suggestions!

I use the most powerful debugger available: print!
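
For readers following along, here is a rough sketch of how the three stages described above could be kept as separate, independently callable substitutions. None of these patterns are the regexes from the post: the sub names (`care_of_to_percent`, `collapse_dba`, `slashes_to_amp`) are made up for illustration, and the set of 'care of' variants (C/O, C\O, C%O) is an assumption based only on the prose. The slash stage protects care-of tokens by capturing them and putting them back unchanged, so it can be run on its own.

```perl
use strict;
use warnings;

# Stage 1 (assumed variants only): C/O, C\O and C%O become '%'.
sub care_of_to_percent {
    my ($name) = @_;
    $name =~ s{\bC[/\\%]O\b}{%}gi;
    return $name;
}

# Stage 2: collapse D/B/A (or D\B\A) to DBA before any slash replacement.
sub collapse_dba {
    my ($name) = @_;
    $name =~ s{\bD[/\\]B[/\\]A\b}{DBA}gi;
    return $name;
}

# Stage 3: turn a forward or back slash into '&' unless it has a digit on
# both sides. A care-of token is matched first and re-emitted untouched,
# so this stage is safe to run even when stage 1 was skipped.
sub slashes_to_amp {
    my ($name) = @_;
    $name =~ s{
        (\bC[/\\%]O\b)      # capture group one: a care-of token, kept as-is
      | (?<!\d)[/\\]        # a slash with no digit before it
      | [/\\](?!\d)         # a slash with no digit after it
    }{ defined $1 ? $1 : '&' }gexi;
    return $name;
}

# Full pipeline, in the order the post describes.
my $raw = 'ACME D/B/A WIDGETS C/O BOB/SUE 3/4';
print slashes_to_amp(collapse_dba(care_of_to_percent($raw))), "\n";

# Final stage only: the care-of token and the fraction are left alone.
print slashes_to_amp('JOHN/MARY SMITH 1/2 C/O BOB'), "\n";
```

Note that `slashes_to_amp` on its own would still turn "D/B/A" into "D&B&A", so under these assumptions `collapse_dba` has to run first whenever that abbreviation can appear.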