http://qs321.pair.com?node_id=11113388

IB2017 has asked for the wisdom of the Perl Monks concerning the following question:

Hello

I check whether a word should be excluded from processing by testing whether it appears in a stop-words list. This is the method I have been using:

my $CkDiscardCommonwords = 1;    # check if use stopwords or not
my $term = "word";
my $commonwordsRX = loadCommonWords();

if ($CkDiscardCommonwords == 1) {
    # this snippet lives inside a sub, hence the return
    if ($term =~ /^(?:$commonwordsRX)$/) {
        return (0);
    }
}

sub loadCommonWords {
    my @commonwords;
    my $filename = "commonWords.txt";
    if (open my $FH, "<:encoding(UTF-8)", $filename) {
        while (my $line = <$FH>) {
            chomp $line;
            push @commonwords, $line;
        }
        close $FH;
    }
    # escape each word and join into one alternation
    my $commonwordsRX = join "|", map quotemeta, @commonwords;
    return $commonwordsRX;
}

Now my software has changed, and the list of common words saved in commonWords.txt may grow considerably. It used to be small (~300 words); now it could reach several thousand.

I would like to hear what expert monks think about this implementation. Would a Regex constructed in this way cause problems when it grows? Should I choose another approach?
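
For comparison, one alternative for this kind of exact whole-word test is a hash lookup, whose cost in the usual case does not depend on how long the list is. Below is a minimal sketch, assuming one word per line in the list. The name load_common_words_hash and the in-memory sample data are hypothetical and only for illustration; they are not part of my current code.

```perl
use strict;
use warnings;

# Hypothetical helper: read one stop word per line from an open filehandle
# and return a hash reference keyed by word.
sub load_common_words_hash {
    my ($fh) = @_;
    my %stop;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;    # skip blank lines
        $stop{$line} = 1;
    }
    return \%stop;
}

# An in-memory filehandle stands in for commonWords.txt here (assumption).
my $sample = "the\nand\nof\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $stop = load_common_words_hash($fh);
close $fh;

for my $term (qw(the word)) {
    # exists() tests the whole term, case-sensitively
    print "$term: ", (exists $stop->{$term} ? "discard" : "keep"), "\n";
}
```

With a real file, the filehandle would be opened on commonWords.txt with the same "<:encoding(UTF-8)" layer as in loadCommonWords. One small difference from the anchored regex: `$` also matches before a trailing newline, whereas the hash lookup requires the term to be identical to the stored word.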