I have a medium-sized file of about 6,000 records. They are diagnosis codes for emergency room patient visits, but that's beside the point. Consider them a string of numbers arranged like:
There can be as few as 2 numbers in each record or as many as 10. I need to filter out a certain subset of records. Every row contains at least one number between 296.1 and 314.0. However, a record can be excluded if "305.1" is the only number in that range in the row. In the sample rows above, row 1 would be discarded and rows 2 and 3 kept.
The person requesting this data suggested that I dump it into an Excel spreadsheet, sort the data, and manually remove the records that I don't need. That seems far too labor-intensive to me, and I would think that Perl has some quick and easy way to sort this out. I can't quite get my mind around the best way to do it, however.
I was thinking that I'd read each row and use a regular expression to find rows with 305.1, then check for the existence of another qualifying number. Based on that, I could either delete the rows I don't need or save the ones I want to a new file. I'm a little rusty with Perl right now, and I don't even know how to start. I thought that if I organized it into a node and tossed it out here, some discussion might help get me going. I'd appreciate any suggestions. Thanks.
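To make the idea concrete, here is a rough sketch of what I'm picturing. It rests on several assumptions of mine: the codes are separated by whitespace and/or commas, the 296.1 to 314.0 range is inclusive, and anything that doesn't look like a plain number can be ignored. The name `keep_record` and the rows in `@rows` are made up for illustration and are not taken from the real file. Instead of matching "305.1" with a regular expression, it compares the codes numerically, which seemed simpler.

```perl
use strict;
use warnings;

# Assumed inclusive bounds, plus the one code that does not count on its own.
my ($LOW, $HIGH, $IGNORE) = (296.1, 314.0, 305.1);

# Returns a true value if the row holds at least one in-range code
# other than 305.1; returns 0 (false) otherwise.
sub keep_record {
    my ($line) = @_;

    # Assumption: codes are separated by whitespace and/or commas.
    # Tokens that are not plain numbers are skipped.
    my @codes = grep { /^\d+(?:\.\d+)?$/ } split /[\s,]+/, $line;

    return scalar grep { $_ >= $LOW && $_ <= $HIGH && $_ != $IGNORE } @codes;
}

# Invented rows, for illustration only.
my @rows = (
    '305.1 401.9',          # 305.1 is the only in-range code: discard
    '296.2 305.1 250.00',   # 296.2 is also in range: keep
    '311 786.50',           # 311 is in range: keep
);

print "$_\n" for grep { keep_record($_) } @rows;
```

To run it over the real file, I'd presumably swap the last statement for something like `while (my $line = <>) { print $line if keep_record($line) }` and redirect the output to a new file, but I haven't tried that against the actual data.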