http://qs321.pair.com?node_id=11146850


in reply to Text switching

You can replace all unit numbers at once if you build your regex pattern out of them. For instance:

    use strict;

    my $unitslist = join "|", qw(
        1001 1002 1003 1004 1101 1102 1103 1104 1201 1202 1203 1204
        1301 1302 1303 1304 1401 1402 1403 1404 1501 1502 1503 1504
        1601 1602 1603 1604 1701 1702 1703 1704 1801 1802 1803 1804
        1901 1902 1903 1904 2001 2002 2003 2004 2101 2102 2103 2104
        2201 2202 2203 2204 2301 2302 2303 2304 2401 2402 2403
        2501 2502 2503 2504 2505
    );

    my $text = "Visit units 1101 or 2202, call us at 555-555-5555\n";

    $text =~ s{ \b($unitslist)\b }
              {<a href="apartments.pl?do_what=view&unit=$1"><b>$1</b></a>}xg;

    print $text;

Update: Added \b boundary matches to pattern matcher

Good Day,
    Dean

Re^2: Text switching
by AnomalousMonk (Archbishop) on Sep 14, 2022 at 05:02 UTC

    Unfortunately, that doesn't seem to me to eliminate the problem entirely.

    Win8 Strawberry 5.8.9.5 (32) Tue 09/13/2022 15:47:56
    C:\@Work\Perl\monks
    >perl
    use strict;
    use warnings;
    # use Data::Dump qw(dd); # for debug

    my $unitslist = join "|", qw(
        1001 1002 1003 1004 1101 1102 1103 1104 1201 1202 1203 1204
        1301 1302 1303 1304 1401 1402 1403 1404 1501 1502 1503 1504
        1601 1602 1603 1604 1701 1702 1703 1704 1801 1802 1803 1804
        1901 1902 1903 1904 2001 2002 2003 2004 2101 2102 2103 2104
        2201 2202 2203 2204 2301 2302 2303 2304 2401 2402 2403
        2501 2502 2503 2504 2505
    );

    my $text = "Visit units 1101 or 2202, call us at 555-555-2202 and call before 13 Aug, 2202\n";

    $text =~ s{ \b($unitslist)\b }
              {\n<a href="apartments.pl?do_what=view&unit=$1"><b>$1</b></a>\n}xg;

    print $text;
    ^Z
    Visit units
    <a href="apartments.pl?do_what=view&unit=1101"><b>1101</b></a>
     or
    <a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>
    , call us at 555-555-
    <a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>
     and call before 13 Aug,
    <a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>

    Indeed, I don't think the problem can be eliminated entirely unless the input text is made much more structured, e.g., by uniquely delimiting every unit number sub-string: %1234% or {{1234}}. This would also make it easy to support unit numbers like 123A or 12-B.
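
    A minimal sketch of the delimited approach, assuming the people writing the text agree to mark units as {{unit}}. The link_units name and the set of characters allowed in a unit (letters, digits, hyphen) are assumptions made for illustration, not something the original site defines.

```perl
use strict;
use warnings;

# Assumption: every unit reference is written as {{unit}} in the input.
# link_units is a hypothetical helper name, not part of the original code.
sub link_units {
    my ($text) = @_;

    # Only text wrapped in double braces is linked, so phone numbers and
    # years that happen to look like unit numbers are left alone.
    $text =~ s{ \{\{ ( [0-9A-Za-z-]+ ) \}\} }
              {<a href="apartments.pl?do_what=view&unit=$1"><b>$1</b></a>}xg;

    return $text;
}

print link_units(
    "Visit units {{1101}} or {{12-B}}, call us at 555-555-2202 before 13 Aug, 2202\n"
);
```

    With this convention no list of unit numbers is needed in the pattern at all, although checking the captured value against a list of known units before linking would still catch typos.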

    It's possible to somewhat mitigate the problems associated with completely free-form text by adding more boundary conditions.

    Win8 Strawberry 5.8.9.5 (32) Wed 09/14/2022 0:21:27
    C:\@Work\Perl\monks
    >perl
    use strict;
    use warnings;
    # use Data::Dump qw(dd); # for debug

    my ($rx_all_units) =
        map qr{ (?<! [-.:]) \b (?: $_) \b (?! [-.:]) }xms,
        join '|',
        reverse sort
        qw(
            1001 1002 1003 1004 1101 1102 1103 1104 1201 1202 1203 1204
            1301 1302 1303 1304 1401 1402 1403 1404 1501 1502 1503 1504
            1601 1602 1603 1604 1701 1702 1703 1704 1801 1802 1803 1804
            1901 1902 1903 1904 2001 2002 2003 2004 2101 2102 2103 2104
            2201 2202 2203 2204 2301 2302 2303 2304 2401 2402 2403
            2501 2502 2503 2504 2505
        );

    my $text = "Visit units 1101 or 2202, call us at 555-555-2202 and call before 13 Aug, 2202\n";

    $text =~ s{ ($rx_all_units) }
              {\n<a href="apartments.pl?do_what=view&unit=$1"><b>$1</b></a>\n}xg;

    print $text;
    ^Z
    Visit units
    <a href="apartments.pl?do_what=view&unit=1101"><b>1101</b></a>
     or
    <a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>
    , call us at 555-555-2202 and call before 13 Aug,
    <a href="apartments.pl?do_what=view&unit=2202"><b>2202</b></a>
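
    To see what the extra boundary conditions do and do not catch, here is a small check of the same pattern against a few contexts. The three-unit list and the sample strings are made up for brevity; only the shape of the regex is taken from the code above.

```perl
use strict;
use warnings;

# Shortened, hypothetical unit list; the full list is in the session above.
my $alternation = join '|', reverse sort qw(1101 2202 2203);

# Same shape as above: no '-', '.' or ':' directly before or after the number.
my $rx_unit = qr{ (?<! [-.:]) \b (?: $alternation) \b (?! [-.:]) }xms;

for my $sample (
    'unit 2202, second floor',    # plain mention
    '555-555-2202',               # tail of a phone number
    '13 Aug, 2202',               # a year that looks like a unit
    'ask about unit 2202.',       # unit followed by a full stop
    'order 22021',                # longer number containing a unit
) {
    printf "%-26s %s\n", $sample, ( $sample =~ $rx_unit ? 'linked' : 'skipped' );
}
```

    The phone number and the longer number are skipped and the year is still linked, as in the session above. Note the side effect on the fourth sample: because '.' is in the excluded set, a genuine unit number that ends a sentence is skipped as well, so the excluded characters may need tuning for real text.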


    Give a man a fish:  <%-{-{-{-<