I am trying to get my head around negative lookbehinds. This is in a script that I am working on but did not write.
I think the regex counts the number of spaces to decide which right-hand text to remove. If I use the script as written, it does NOT grab the STATE ZIPCODE section.
If I change
$addr =~ s/(?<![|\s])\s{25,}[^|]+//g; # extra right-text
to
$addr =~ s/(?<![|\s])\s{27,}[^|]+//g; # extra right-text
then I get the STATE and ZIPCODE.
This is the example script (shown with the changed value, 27):
use strict;
use warnings;
my $addr = '|
+ Note 00
+001| FIRST LAST NAME| ADDRESS 1
+ Interest Rate
+ 5.450000| CITY STATE ZIPCODE|
+ YTD Interest $4,442.64|
Total Payment Amount
+ $886.00|
+ Escrow Portion $
+344.49|';
print "address to parse : \n $addr \n";
$addr =~ s/\|\s{25,}[^|]+//g; # rm spaces left
$addr =~ s/(?<![|\s])\s{27,}[^|]+//g; # extra right-text
$addr =~ s/\s*\|\s*/|/g;
$addr =~ s/\|{2,}/|/g;
$addr =~ s/\s+\|$//;
$addr =~ s/\s+/ /g;
# use up to 6 of last lines for addr
$addr = join('|', (split('\|', "||||||$addr"))[-6..-1]);
$addr =~ s/^\|+//;
print "Result addr : \n $addr \n";
Output:
c:\Users\collinsc\dev>perl lookBehind.pl
Address to parse :
|
+ Note 00001|
+FIRST LAST NAME| ADDRESS 1
Interest Rat
+e 5.450000| CITY
+ STATE ZIPCODE|
+ YTD Interest $4,442.64|
Total Payment Amount
+ $886.00|
+ Escrow Portion $
+344.49|
Result addr :
FIRST LAST NAME|ADDRESS 1|CITY STATE ZIPCODE
Question: If the spaces are outside of the negative lookbehind, how is this working?
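To make the question concrete, here is a minimal sketch of what the lookbehind seems to contribute. The sample string is made up (the 30-space gaps and the field text are assumptions, not the real input). As far as I understand it, `(?<![|\s])` consumes nothing itself; it only checks the single character just before the position where `\s{25,}` starts, so the run of spaces must begin right after a character that is neither a pipe nor whitespace.

```perl
use strict;
use warnings;

# Hypothetical data: right-hand text 30 spaces after 'ADDRESS 1',
# then a field that starts with 30 spaces directly after a pipe.
my $text = 'ADDRESS 1' . (' ' x 30) . 'Interest Rate|' . (' ' x 30) . 'CITY|';

# Original pattern: the space run may not follow a pipe or another space,
# so it can only start at the very beginning of a run that follows text.
(my $with_lookbehind = $text) =~ s/(?<![|\s])\s{25,}[^|]+//g;

# No lookbehind: any run of 25+ spaces matches, including the one after '|'.
(my $without = $text) =~ s/\s{25,}[^|]+//g;

# Lookbehind that only excludes '|': the match can start one space into
# the run after the pipe, because that position is preceded by a space.
(my $pipe_only = $text) =~ s/(?<!\|)\s{25,}[^|]+//g;

print "with lookbehind : [$with_lookbehind]\n";
print "no lookbehind   : [$without]\n";
print "pipe-only       : [$pipe_only]\n";
```

With this sample, the original pattern removes only the text after 'ADDRESS 1' and keeps the CITY field, the version without a lookbehind also removes CITY, and the pipe-only version removes CITY but leaves one stray space behind. That suggests the `\s` inside the lookbehind is there to stop a match from starting partway through a run of spaces.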