Unfortunately, that doesn't seem to me to eliminate the problem entirely.
Win8 Strawberry 5.8.9.5 (32) Tue 09/13/2022 15:47:56
C:\@Work\Perl\monks
>perl
use strict;
use warnings;
# use Data::Dump qw(dd); # for debug
my $unitslist = join "|", qw(
1001 1002 1003 1004 1101 1102 1103 1104 1201 1202 1203 1204 1301 1302
1303 1304 1401 1402 1403 1404 1501 1502 1503 1504 1601 1602 1603 1604
1701 1702 1703 1704 1801 1802 1803 1804 1901 1902 1903 1904 2001 2002
2003 2004 2101 2102 2103 2104 2201 2202 2203 2204 2301 2302 2303 2304
2401 2402 2403 2501 2502 2503 2504 2505
);
my $text =
"Visit units 1101 or 2202, call us at 555-555-2202 and
call before 13 Aug, 2202\n";
$text =~ s{ \b($unitslist)\b }
{\n<a href="apartments.pl?do_what=view&unit=$1"><b>$1</b></a>\n}xg;
print $text;
^Z
Visit units
<a href="apartments.pl?do_what=view&unit=1101"><b>1101</b></a>
or
<a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>
, call us at 555-555-
<a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>
and
call before 13 Aug,
<a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>
Indeed, it doesn't seem as if the problem can be entirely eliminated unless the input text is constrained to a much stricter format. E.g., uniquely delimit all unit number sub-strings: %1234% or {{1234}}. This would also allow for easy support of unit numbers like 123A or 12-B.
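As a rough sketch of the delimiter idea, assuming unit numbers are always written as {{...}} in the source text (the sub name link_delimited_units and the accepted unit format of digits with an optional hyphen and letter are illustrative assumptions, not part of the original code):

```perl
use strict;
use warnings;

# Hypothetical convention: every unit number appears as {{UNIT}} in the text.
# The helper name and the accepted unit format are assumptions for illustration.
sub link_delimited_units {
    my ($text) = @_;

    # Digits, optionally followed by a hyphen and/or one letter:
    # matches 1234, 123A and 12-B, but only inside {{ }}.
    $text =~ s{ \{\{ ( \d+ (?: -? [A-Za-z] )? ) \}\} }
              {<a href="apartments.pl?do_what=view&unit=$1"><b>$1</b></a>}xg;

    return $text;
}

print link_delimited_units(
    "Visit units {{1101}} or {{12-B}}, call us at 555-555-2202\n");
```

Because only delimited sub-strings are considered, the phone number is left untouched and no list of valid unit numbers is needed in the regex.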
It's possible to somewhat mitigate the problems associated with completely free-form text by adding more boundary conditions.
Win8 Strawberry 5.8.9.5 (32) Wed 09/14/2022 0:21:27
C:\@Work\Perl\monks
>perl
use strict;
use warnings;
# use Data::Dump qw(dd); # for debug
my ($rx_all_units) =
map qr{ (?<! [-.:]) \b (?: $_) \b (?! [-.:]) }xms,
join '|',
reverse sort
qw(
1001 1002 1003 1004 1101 1102 1103 1104 1201 1202 1203 1204 1301 1302
1303 1304 1401 1402 1403 1404 1501 1502 1503 1504 1601 1602 1603 1604
1701 1702 1703 1704 1801 1802 1803 1804 1901 1902 1903 1904 2001 2002
2003 2004 2101 2102 2103 2104 2201 2202 2203 2204 2301 2302 2303 2304
2401 2402 2403 2501 2502 2503 2504 2505
);
my $text =
"Visit units 1101 or 2202, call us at 555-555-2202 and
call before 13 Aug, 2202\n";
$text =~ s{ ($rx_all_units) }
{\n<a href="apartments.pl?do_what=view&unit=$1"><b>$1</b></a>\n}xg;
print $text;
^Z
Visit units
<a href="apartments.pl?do_what=view&unit=1101"><b>1101</b></a>
or
<a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>
, call us at 555-555-2202 and
call before 13 Aug,
<a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>
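To see what the extra boundary conditions do and do not catch, here is a small sketch that applies the same lookbehind/lookahead to a few sample strings. The two-unit alternation, the has_unit helper and the sample strings are assumptions made up for illustration:

```perl
use strict;
use warnings;

# Same boundary conditions as above, with a shortened (illustrative) unit list.
my $rx_unit = qr{ (?<! [-.:]) \b (?: 2202 | 1101 ) \b (?! [-.:]) }xms;

# Hypothetical helper: true if the string contains a linkable unit number.
sub has_unit {
    my ($s) = @_;
    return $s =~ $rx_unit ? 1 : 0;
}

for my $sample (
    'unit 2202,',       # plain mention
    '555-555-2202',     # phone number: preceded by '-'
    '2202-555',         # followed by '-'
    '12:2202',          # preceded by ':'
    '13 Aug, 2202',     # a year that happens to equal a unit number
    'see unit 2202.',   # unit number at the end of a sentence
) {
    printf "%-16s => %s\n", $sample, has_unit($sample) ? 'linked' : 'skipped';
}
```

The phone-number and colon cases are skipped, but the year is still linked, and a unit number directly followed by a period is skipped as well, so the mitigation trades some false positives for a possible false negative.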
Give a man a fish: <%-{-{-{-<