When I am splitting the key to lat long, it gets split with the
number, not the decimal part.
I'm not the best one to help with XML stuff, but I can perhaps offer
help on this. I think the best approach is not to split away
what you do not want, but to extract what you do want: real numbers in
this case. The Regexp::Common module is useful here, for it
defines a real-number pattern.
Win8 Strawberry 5.30.3.1 (64) Tue 01/12/2021 14:15:24
C:\@Work\Perl\monks
>perl -Mstrict -Mwarnings
# pm#11126809
use Regexp::Common qw(number);
my $rx_lat_lon = qr{ (?<! \d) $RE{num}{real} (?! \d) }xms;
for my $lat_lon (
    '123,456', '12.345, -0', ' -2. , 0.987', ' -.0 ,0.987 ',
    '0, -0', '.0 ,-.0', '0. , -0.', ' 0.0 , -0.0 ',
    '1, -1', '.1 ,-.1', '1. , -1.', ' 1.2 , -1.2 ',
    '123 456', '123', 'foo', '123,foo', 'foo,123', '1.234.5',
) {
    my $got_lat_lon =
        my ($lat, $lon) =
        $lat_lon =~ m{ \A \s* ($rx_lat_lon) \s* , \s* ($rx_lat_lon) \s* \z }xms;
        # $lat_lon =~ m{ $rx_lat_lon }xmsg; # no data validation
    if ($got_lat_lon) {
        printf "%18s -> lat %-10s lon %-10s \n",
            "'$lat_lon'", n_or_undef($lat), n_or_undef($lon);
    }
    else {
        print "'$lat_lon' FAILED to extract lat/lon \n";
    }
}
sub n_or_undef { return defined $_[0] ? "'$_[0]'" : 'undef'; }
^Z
'123,456' -> lat '123' lon '456'
'12.345, -0' -> lat '12.345' lon '-0'
' -2. , 0.987' -> lat '-2.' lon '0.987'
' -.0 ,0.987 ' -> lat '-.0' lon '0.987'
'0, -0' -> lat '0' lon '-0'
'.0 ,-.0' -> lat '.0' lon '-.0'
'0. , -0.' -> lat '0.' lon '-0.'
' 0.0 , -0.0 ' -> lat '0.0' lon '-0.0'
'1, -1' -> lat '1' lon '-1'
'.1 ,-.1' -> lat '.1' lon '-.1'
'1. , -1.' -> lat '1.' lon '-1.'
' 1.2 , -1.2 ' -> lat '1.2' lon '-1.2'
'123 456' FAILED to extract lat/lon
'123' FAILED to extract lat/lon
'foo' FAILED to extract lat/lon
'123,foo' FAILED to extract lat/lon
'foo,123' FAILED to extract lat/lon
'1.234.5' FAILED to extract lat/lon
(I haven't tested it on other versions, but I think this code will
work under virtually any Perl version. Update: Counter-examples
are welcome!)
Some comments:
- The $RE{num}{real} pattern is not bounded (by design!), so defining
  an explicit bounded $rx_lat_lon pattern with
      my $rx_lat_lon = qr{ (?<! \d) $RE{num}{real} (?! \d) }xms;
  is IMHO good practice and gives you a single point at which to
  change the definition of a lat/lon value should this become
  necessary.
- Defining an explicit pattern for matching a lat/lon field with
      m{ \A \s* ($rx_lat_lon) \s* , \s* ($rx_lat_lon) \s* \z }xms
  is again good practice IMHO because it can provide for a high degree
  of data validation. The $got_lat_lon flag is true only if valid
  data is extracted.
- If you are certain there will always be exactly two lat/lon
  sub-fields present and you don't care what else is there, you can
  use the simpler
      m{ $rx_lat_lon }xmsg
  matching expression (note the /g modifier). The $got_lat_lon
  flag becomes almost meaningless in this case, because it is true
  whenever at least one number is found anywhere in the string.
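To make the last comment concrete, here is a minimal sketch of the /g variant on a few of the test strings used above. It assumes Regexp::Common is installed; extract_numbers() is a hypothetical helper name introduced only for this illustration and is not part of the original code.

```perl
use strict;
use warnings;
use Regexp::Common qw(number);

# Same bounded real-number pattern as in the session above.
my $rx_lat_lon = qr{ (?<! \d) $RE{num}{real} (?! \d) }xms;

# extract_numbers() is a hypothetical helper (name is an assumption).
# Called in list context, it returns every bounded real number found
# in the string, with no validation of what surrounds the numbers.
sub extract_numbers {
    my ($string) = @_;
    return $string =~ m{ $rx_lat_lon }xmsg;
}

for my $lat_lon ('12.345, -0', '123 456', '123', 'foo') {
    # A list assignment in scalar context yields the number of values
    # on its right-hand side, i.e. how many numbers were matched.
    my $got_lat_lon = my ($lat, $lon) = extract_numbers($lat_lon);
    printf "%-14s -> count %d  lat %-10s lon %-10s\n",
        "'$lat_lon'", $got_lat_lon,
        defined $lat ? "'$lat'" : 'undef',
        defined $lon ? "'$lon'" : 'undef';
}
```

With this variant, '123 456' yields two values even though the comma is missing, and '123' yields a true flag with $lon left undefined. If you go this route, checking that the count is exactly 2 is probably the least you would want to do.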
Give a man a fish: <%-{-{-{-<