Re: Extracting text after a keyword

If you can afford to read the entire file before processing it, this would be a simple solution:

use strict;

my $keyword = "START";
my $length  = 20;

my $file = <<EOF;
something START text1
 text2 ........................ text 3
text 4 START more text
still more text
EOF

$file =~ s/(?:\s+|\n+)//gc;

# \Q...\E quotes any regex metacharacters in the keyword
my @hits = $file =~ /\Q$keyword\E(.{$length})/g;

for (@hits) {
  print $_."\n";
}
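
The heredoc above only stands in for a real file. Here is a minimal sketch of the same idea reading from disk, assuming the file fits in memory. The helper name extract_after and the temporary file are made up for illustration, and the /c flag is left off because it has no effect on s///.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of the post above): slurp a file, strip all
# whitespace, and return the $length characters following each $keyword.
sub extract_after {
    my ($path, $keyword, $length) = @_;

    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $text = do { local $/; <$fh> };   # undefined $/ reads the whole file
    close $fh;

    $text =~ s/\s+//g;                   # remove every run of whitespace
    return $text =~ /\Q$keyword\E(.{$length})/g;
}

# Demo: write the sample data to a temporary file, then extract from it.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} <<'EOF';
something START text1
 text2 ........................ text 3
text 4 START more text
still more text
EOF
close $tmp_fh;

print "$_\n" for extract_after($tmp_path, 'START', 20);
```

Because whitespace is stripped before matching, the captured text runs across the original line breaks.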

---- kurt


Re: Re: Extracting text after a keyword
by suaveant (Parson) on Jul 02, 2002 at 13:44 UTC

use Benchmark;

sub a {
    $_ = "this is a\ntest\n";
    s/(?:\s+|\n+)//gc;
}
sub b {
    $_ = "this is a\ntest\n";
    s/\s+//gc;
}
sub c {
    $_ = "this is a\ntest\n";
    tr/\n\r\t //d;
}

timethese(250000,{ a => \&a, b => \&b, c => \&c });


Benchmark: timing 250000 iterations of a, b, c...
         a:  5 wallclock secs ( 4.18 usr +  0.01 sys =  4.19 CPU) @ 59665.87/s (n=250000)
         b:  1 wallclock secs ( 1.61 usr +  0.03 sys =  1.64 CPU) @ 152439.02/s (n=250000)
         c:  1 wallclock secs ( 0.66 usr +  0.02 sys =  0.68 CPU) @ 367647.06/s (n=250000)

For one, \s already includes \n, so the \n+ alternative never matches anything that \s+ would not. I believe the engine still tries it at every position where \s+ fails, though, which is why a is so much slower than b. In most cases a character class would be a better fit than alternation, but that is really beside the point here, since stripping single characters is much faster with tr///d.
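
As a quick sanity check, here is a small sketch showing that the three approaches give the same result on this kind of input. The sample string and variable names are made up for illustration. Note that the tr list only covers newline, carriage return, tab and space, so unlike \s it would leave a form feed in place.

```perl
use strict;
use warnings;

# Illustrative sample only; any text with spaces, tabs and newlines will do.
my $sample = "something START text1\n text2 .......... text 3\n";

# Copy the sample, then strip whitespace three different ways.
(my $via_alt   = $sample) =~ s/(?:\s+|\n+)//g;   # original alternation
(my $via_class = $sample) =~ s/\s+//g;           # \s on its own
(my $via_tr    = $sample) =~ tr/\n\r\t //d;      # tr///d, no regex engine

print "all three agree\n"
    if $via_alt eq $via_class && $via_class eq $via_tr;
print "$via_tr\n";
```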

Just thought I would point it out...

- Ant
