Re: Fastest way to minimally check that file contains perl code?

Perl is notorious for being able to parse stuff that looks like garbage; there's a whole category for Obfuscated Code on PerlMonks. So let's hope your Perl programmers do this in less than 20% of their files ;)

For the general task of classifying data, there's AI::NaiveBayes and AI::Categorizer. Both need some adaptation to sort text into the categories "Perl source code" and "garbage". I would guess that you get 80% accuracy with a filter based on the regular expressions presented by other monks, so training a Bayesian classifier is only worth considering if that filter turns out not to be good enough.
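A minimal sketch of what such a regex pre-filter could look like, written in shell with `grep -E`. The function name `looks_like_perl` and the patterns are my own illustrative assumptions, not the regular expressions the other monks posted, and they will misjudge unusual or obfuscated files.

```shell
#!/bin/sh
# Hypothetical helper: cheap textual check; the file is never executed.
# Returns 0 if the file looks like Perl, non-zero otherwise.
looks_like_perl() {
    file=$1

    # A shebang line that mentions perl is a strong hint.
    if head -n 1 "$file" | grep -Eq '^#!.*perl'; then
        return 0
    fi

    # Otherwise look for a few common Perl constructs at the start of a line:
    # "use strict"/"use warnings", "package Foo::Bar;", "sub name", "my $x".
    # (Assumed patterns, chosen for illustration only.)
    grep -Eq '^[[:space:]]*(use[[:space:]]+(strict|warnings)|package[[:space:]]+[A-Za-z_][A-Za-z0-9_:]*[[:space:]]*;|sub[[:space:]]+[A-Za-z_][A-Za-z0-9_]*|my[[:space:]]+[$@%])' "$file"
}
```

Something like `looks_like_perl some_file && echo "probably Perl"` would then be the fast path, with anything it rejects handed to a slower check.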

Re^2: Fastest way to minimally check that file contains perl code? by LanX (Saint) on Mar 13, 2020 at 16:28 UTC
> there's a whole category for Obfuscated code on PerlMonks

On a side note: It's possible to run Perl::Tidy in a server mode, which is far faster than starting it up for each file. Though I doubt it's faster than `perl -c`, unless using/requiring a large tree of dependencies (like Moose) is causing the lag here.

Cheers Rolf
(addicted to the Perl Programming Language :) Wikisyntax for the Monastery)
Re^3: Fastest way to minimally check that file contains perl code? by hippo (Bishop) on Mar 13, 2020 at 16:46 UTC
The Moose argument is why I like vr's idea. If `perl -c` doesn't bomb out early on, you can be pretty confident it is actually compiling Perl (with Moose or something equally heavy).
Re^4: Fastest way to minimally check that file contains perl code? by LanX (Saint) on Mar 13, 2020 at 17:22 UTC
> it is actually compiling Perl (with Moose or something equally heavy)

Well, after the timeout you'd only have proven (again and again) that Moose contains Perl; the file in question could still be just garbage starting with `use Moose`.

But yeah, this should be sufficient for the 80% threshold. :)

Cheers Rolf
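
To make the timeout idea concrete, here is a rough sketch of running `perl -c` under a time limit. The function name `check_perl_syntax`, the three result labels and the 5-second default are my assumptions, and it relies on GNU coreutils `timeout`, which exits with status 124 when the limit is hit. A timeout is reported as "unknown" rather than "perl", for exactly the reason given above. Note that `perl -c` still executes `BEGIN` blocks and `use` statements, so this is not something to point at untrusted files.

```shell
#!/bin/sh
# Hypothetical helper: classify a file by compiling it with `perl -c`
# under a time limit. Prints "perl", "not-perl" or "unknown" (timed out).
check_perl_syntax() {
    file=$1
    limit=${2:-5}   # seconds; an arbitrary illustrative default

    status=0
    timeout "$limit" perl -c "$file" >/dev/null 2>&1 || status=$?

    case $status in
        0)   echo perl ;;       # compiled cleanly
        124) echo unknown ;;    # GNU timeout: limit reached, nothing proven
        *)   echo not-perl ;;   # perl -c reported an error
    esac
}
```

Treating "unknown" as "probably Perl" is then a policy decision for the 80% case, not something the check itself establishes.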

