http://qs321.pair.com?node_id=731438

Bunta has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone,

I am very, very new to Perl and programming, but have been asked by my company to learn the basics (by myself, with no texts!) to help in my job.

I am trying to pluck out all the lines containing a certain word from a large text file and copy them to a new text file.
I have succeeded in doing so, but I have realized that I also need to pluck out the line PREVIOUS to each of the ones I have already been able to grab.

Here is the code I used to grab all the lines containing "adj : "

    open(FILE, "<WordNet.txt");
    my @array = <FILE>;
    close(FILE);

    open(FILE, ">>WordNetTest2.txt");
    my @array2 = <FILE>;

    print "Extracting adjectives\n";

    foreach $_ (@array) {
        @array2 = grep {$_ =~ "adj : "} (@array);
        print FILE "@array2";
        close(FILE);
        exit;
    }

Here is a sample of what the lines look like from the text file:
-----------------
Fast
adj : quick, speedy
-----------------
I've grabbed all the "adj : " lines, but the others have eluded me!
Would there happen to be anyone who could instruct me on how to take out the previous lines along with all lines containing "adj : "?

Thank you very much in advance for any help.

-Brad

Replies are listed 'Best First'.
Re: Using grep to pick out lines in a text file
by Cody Pendant (Prior) on Dec 19, 2008 at 03:21 UTC
    You're reading the whole file into an array, which you don't need to do, if the entries are on different lines. Just read the document line by line.

    I'd also say you're muddled about arrays: for each item in the array, you grep through each item in the array!

    This is how I'd do it:

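    use strict;
    use warnings;

    open( INPUTFILE, "<WordNet.txt" ) or die "$!";
    open( OUTPUTFILE, ">>WordNetTest2.txt" ) or die "$!";

    print "Extracting adjectives\n";

    my $previous_line = '';    # empty until a first line has been read
    while (<INPUTFILE>) {
        if ( $_ =~ m/adj : / ) {    # matches the sample line "adj : quick, speedy"
            print OUTPUTFILE $previous_line, $_;
            print "found one!\n";
        }
        $previous_line = $_;    # remember this line for the next iteration
    }

    close(INPUTFILE);
    close(OUTPUTFILE);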
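
    The same loop can also be wrapped in a small function, which makes it easy to try on sample data before touching the real file. This is only a sketch: the name matches_with_previous and the in-memory filehandle are illustrative assumptions, not part of the code above, and the pattern is passed in rather than hard-coded.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: return every line matching $pattern,
    # each preceded by the line that came just before it.
    sub matches_with_previous {
        my ( $fh, $pattern ) = @_;
        my @out;
        my $previous_line;
        while ( my $line = <$fh> ) {
            if ( $line =~ $pattern ) {
                # The very first line of a file has no predecessor
                push @out, $previous_line if defined $previous_line;
                push @out, $line;
            }
            $previous_line = $line;
        }
        return @out;
    }

    # Sample data in the format shown in the question
    my $sample = "Fast\nadj : quick, speedy\nRun\nv : move quickly\n";
    open( my $in, '<', \$sample ) or die "Cannot open in-memory file: $!";
    print matches_with_previous( $in, qr/adj : / );
    close($in);
    ```

    For the sample data this prints the "Fast" line followed by its "adj : " line. To use it on a real file, open that file instead of the in-memory string.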


    Nobody says perl looks like line-noise any more
    kids today don't know what line-noise IS ...
      Hi Cody Pendant

      Thanks for your help! Your code works perfectly and I've been able to change it a bit to handle some other entries in the same file as well. I really appreciate your help.

      Yes, I have only been studying Perl for around 7 days now, and it's a bit much to take in, so the more complicated (for me!) things are still eluding me. I'll keep on chugging along though!

      Thanks again for the quick reply.

      -Brad

Re: Using grep to pick out lines in a text file
by balakrishnan (Monk) on Dec 19, 2008 at 09:14 UTC
    This could be useful if you are looking for a solution with a minimal number of lines.
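    open( INPUTFILE, "<WordNet.txt" ) or die "$!";
    open( OUTPUTFILE, ">>WordNetTest2.txt" ) or die "$!";
    my $prev = '';
    # grep is used here only for its side effects; the pattern matches the sample "adj : " lines
    grep { /adj : / and print OUTPUTFILE $prev, $_; $prev = $_ } (<INPUTFILE>);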

      I disagree that this is at all helpful for someone who is just picking up the language (and programming at that!).

      Also, isn't grep going to read in the entire contents of the file? That is completely unnecessary for this task.
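
      To illustrate the point: the readline operator in list context (which is what grep gives it) returns every line at once, while a while loop reads one line per iteration. A minimal sketch, using an in-memory filehandle and made-up sample data as a stand-in for WordNet.txt:

      ```perl
      use strict;
      use warnings;

      # Made-up sample data standing in for the real file
      my $sample = "Fast\nadj : quick, speedy\nRun\nv : move quickly\n";

      # List context: readline returns all remaining lines at once,
      # so the whole file is in memory before grep sees the first line.
      open( my $fh, '<', \$sample ) or die "Cannot open in-memory file: $!";
      my @all_lines = <$fh>;
      close($fh);
      print "list context read ", scalar(@all_lines), " lines in one go\n";

      # Scalar context: each pass through the loop reads exactly one line.
      open( $fh, '<', \$sample ) or die "Cannot open in-memory file: $!";
      my $count = 0;
      while ( my $line = <$fh> ) {
          $count++;
      }
      close($fh);
      print "while loop read $count lines, one at a time\n";
      ```

      Both approaches see the same lines; the difference is only how many are held in memory at once, which matters for a large file.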

Re: Using grep to pick out lines in a text file
by ig (Vicar) on Dec 19, 2008 at 16:52 UTC

    You seem to be learning quickly!

    Just in case you haven't already found it: you may find Getting Started with Perl helpful. There is some very good advice there, including references to texts and other on-line resources.

Re: Using grep to pick out lines in a text file
by zentara (Archbishop) on Dec 19, 2008 at 17:14 UTC
    Not addressing the "save previous lines" issue, but to help you with Perl searching in general: there are a lot of Perl file greppers around, like vfgrep, peg - Perl _expression_ (GNU) grep script, and Gtk2 Visual Grep.

    I use the script below all the time for searching. There are a lot of things to consider, like searching just the filenames themselves, skipping binary files, recursion, using a regex pattern instead of literal words, precompiling your regex for efficiency, etc.

    #!/usr/bin/perl
    use warnings;
    use strict;
    use File::Find;
    $|++;

    # defaults are case-insensitive, no recurse, open and search files (not filename)

    if ($#ARGV < 0){
        die "Usage zgrep 'pattern' c(case sensitive optional) r(recurse optional) n(search name only optional)\n"
          . "examples: zgrep 'debug me' r will recursively search all files for 'debug me'\n";
    }

    my ($recurse, $name, $case) = (0,0,0);
    if( grep{/\bn\b/} @ARGV ){@ARGV = grep { $_ ne 'n' } @ARGV; $name = 1 };
    if( grep{/\br\b/} @ARGV ){@ARGV = grep { $_ ne 'r' } @ARGV; $recurse = 1 };
    if( grep{/\bc\b/} @ARGV ){@ARGV = grep { $_ ne 'c' } @ARGV; $case = 1 };
    #print "$name $recurse $case @ARGV\n";

    my $path = '.';

    #only accept 1 search string, so quote phrases
    my $search = $ARGV[0];
    my $regex;

    #defaults to case insensitive
    if ($case){$regex = qr/\Q$search\E/}
    else{$regex = qr/\Q$search\E/i}
    # use s modifier for multiline match

    find (sub {
        #skip directories which begin with 1
        if (-d && $_ =~ /^1.*$/) {
            $File::Find::prune = 1;
            return;
        }
        if( ! $recurse ){
            my $n = ($File::Find::name) =~ tr!/!!; #count slashes in file
            return $File::Find::prune = 1 if ($n > 1);
        }
        return if -d;
        return unless (-f and -T ); # don't waste time on binaries

        if($name){
            if ($_ =~ /$regex/){print "$File::Find::name\n"};
        }else{
            open (FH,"< $_");
            while(<FH>){
                print "$File::Find::name: $. :$_\n " if /$regex/;
            }
            close FH;
        }
    }, $path);
    exit;
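
    As a smaller illustration of two of the points above (literal words versus regex patterns, and precompiling), here is a sketch. The helper name build_matcher is hypothetical and not part of the script; it only isolates how qr// and \Q...\E behave.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: compile the search string once with qr//.
    # \Q...\E escapes regex metacharacters so the string is matched literally.
    sub build_matcher {
        my ( $search, $case_sensitive ) = @_;
        return $case_sensitive ? qr/\Q$search\E/ : qr/\Q$search\E/i;
    }

    # Case-insensitive by default, as in the script
    my $regex = build_matcher( 'adj : ', 0 );
    for my $line ( "Fast\n", "ADJ : quick, speedy\n" ) {
        print "match: $line" if $line =~ $regex;
    }

    # Because of \Q...\E, a dot in the search string only matches a real dot
    my $dot = build_matcher( 'a.c', 1 );
    print "literal dot does not match 'abc'\n" unless 'abc' =~ $dot;
    ```

    The compiled pattern can be reused on every line without being rebuilt. Dropping \Q...\E would let the search string act as a regex pattern instead of literal words.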

    I'm not really a human, but I play one on earth.
    Remember How Lucky You Are

Re: Using grep to pick out lines in a text file
by missingthepoint (Friar) on Dec 19, 2008 at 23:49 UTC

    I think you'd find Beginning Perl useful. It's freely available online.


    Life is denied by lack of attention,
    whether it be to cleaning windows
    or trying to write a masterpiece...
    -- Nadia Boulanger