dynamic zcat and grep

clmcshque has asked for the wisdom of the Perl Monks concerning the following question:


Re: dynamic zcat and grep
by swampyankee (Parson) on Mar 21, 2006 at 21:50 UTC

I believe this line:

open INFILE, 'zcat $filename  |' or print "ERROR - Could not open $filename";

emc

" The most likely way for the world to be destroyed, most experts agree, is by accident. That's where we come in; we're computer professionals. We cause accidents."
—Nathaniel S. Borenstein


Re^2: dynamic zcat and grep

by clmcshque (Initiate) on Mar 21, 2006 at 22:38 UTC

open(INFILE, "zcat $filename |")
open(INFILE, 'zcat $filename |')
open INFILE, "zcat $filename  |"

zcat: compressed data not read from a terminal. Use -f to force decompression.
For help, type: zcat -h


Re^3: dynamic zcat and grep

by swampyankee (Parson) on Mar 22, 2006 at 03:29 UTC

Sorry my help was not helpful.

I've looked at the docs for zcat (well, OpenBSD's docs for zcat) and it seems that your first and third choices should work: zcat unzips the input file to STDOUT. When I get a chance (probably about 24 hours from now), I'll muck about on my OpenBSD box to see if I can replicate your results.

emc


Re^3: dynamic zcat and grep

by johngg (Canon) on Mar 22, 2006 at 11:23 UTC

% cat file.gz | zcat | less

This works on Solaris but if it doesn't work for you it could be that your zcat demands the presence of a terminal as implied by your results.
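
For what it is worth, the "compressed data not read from a terminal" message is what gzip-family tools print when they receive no file operand and standard input is a terminal. One possible cause here (an assumption, the thread does not confirm it) is that $filename reached the shell empty or uninterpolated, as it would with the single-quoted form. A minimal sketch that sidesteps the shell by using the list form of a piped open is below; it assumes a gzip executable on PATH, and grep_gz_pipe is a hypothetical helper name.

```perl
use strict;
use warnings;

# Return the lines of a gzip-compressed file that match $pattern.
# Assumes a `gzip` executable on PATH. The list form of open bypasses
# the shell, so the file name is handed to gzip exactly as given.
sub grep_gz_pipe {
    my ($filename, $pattern) = @_;
    die "no file name given\n"
        unless defined $filename && length $filename;

    open(my $in, '-|', 'gzip', '-dc', $filename)
        or die "Cannot start gzip for $filename: $!\n";
    my @hits = grep { /$pattern/ } <$in>;

    # close() on a pipe returns false when the child exits non-zero.
    close($in)
        or die "gzip failed on $filename (wait status $?)\n";
    return @hits;
}

# Print matching lines for each file named on the command line.
print grep_gz_pipe($_, qr/ERROR/) for @ARGV;
```

Refusing an empty file name up front means gzip can never be started without a file operand, which is the situation that produces the terminal complaint.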

Cheers,

JohnGG


Re^4: dynamic zcat and grep

by clmcshque (Initiate) on Mar 22, 2006 at 17:08 UTC

Re^2: dynamic zcat and grep

by Anonymous Monk on Oct 03, 2012 at 02:13 UTC

This is an old thread, but this answer may prove useful to others. I had a similar problem with a command like

zcat -c test.gz | pythonscript.py > test.out

called from within a makefile; it produced the same error mentioned in this thread. I ended up changing it to

cat test.gz | zcat -c | pythonscript.py > test.out

Regards, Brad


Re: dynamic zcat and grep
by johngg (Canon) on Mar 21, 2006 at 23:32 UTC

use strict;
use warnings;

use Compress::Zlib;

# Set up what we want to match.
#
our $reg1 = "Joe Sinclair";
our $reg2 = "Bill Halburg";
our $rxNames = qr{(?:$reg1|$reg2)};

# Open compressed log file.
#
our $logFile = "file.gz";
our $gzInput = gzopen($logFile, "rb")
   or die "gzopen: $gzerrno\n";

# Read line by line into $_ counting bytes read.
#
our $bytesRead;
while(($bytesRead = $gzInput->gzreadline($_)) > 0)
{
    # Print if it matches.
    #
    print if /$rxNames/;
}

# Check that we have read to the end. Close
# file.
#
die "Incomplete read: $gzerrno\n" unless
   $gzerrno == Z_STREAM_END;
$gzInput->gzclose();

I have not tested this but I have adapted it from a script doing something similar.

Cheers,

JohnGG


Re^2: dynamic zcat and grep

by clmcshque (Initiate) on Mar 22, 2006 at 00:40 UTC

Ah yes. Without the ability to install the module I cannot try this here, but I will test it on another machine. Thank you for the input.
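
Where installing a module is the obstacle, it may be worth checking whether IO::Uncompress::Gunzip is already present; as far as I know it ships with the core distribution from Perl 5.10 onward, but treat that as an assumption and confirm with corelist. A minimal sketch of the same line-by-line filter using that module follows. It is a different module from the Compress::Zlib code above, and grep_gz is a hypothetical helper name.

```perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw($GunzipError);

# Return the lines of a gzip-compressed file that match $pattern,
# decompressing inside Perl instead of calling an external zcat.
sub grep_gz {
    my ($filename, $pattern) = @_;
    my $gz = IO::Uncompress::Gunzip->new($filename)
        or die "Cannot open $filename: $GunzipError\n";

    my @hits;
    while (defined(my $line = $gz->getline())) {
        push @hits, $line if $line =~ $pattern;
    }
    $gz->close();
    return @hits;
}

# Print matching lines for each file named on the command line.
print grep_gz($_, qr/(?:Joe Sinclair|Bill Halburg)/) for @ARGV;
```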


Re^2: dynamic zcat and grep

by clmcshque (Initiate) on Mar 23, 2006 at 17:26 UTC

Thanks, I got the module installed, and with a bit of tweaking it all works fine now.


Re: dynamic zcat and grep
by graff (Chancellor) on Mar 22, 2006 at 06:10 UTC

PerlIO::gzip

use PerlIO::gzip;

open( ICMP, "<:gzip", "sometext.gz" )    or die "sometext.gz: $!";
open( OCMP, ">:gzip", "chosenlines.gz" ) or die "chosenlines.gz: $!";
while (<ICMP>) {
    print OCMP if /something matches/;
}
close OCMP;

I can't wait for this to be part of the core distro.

(update: corrected the spelling on the cpan link)

UPDATE: (2010-10-18) It seems that PerlIO::gzip should be viewed as superseded by PerlIO::via::gzip. (see PerlIO::gzip or PerlIO::via::gzip).


Re^2: dynamic zcat and grep

by clmcshque (Initiate) on Mar 28, 2006 at 18:36 UTC

#!/usr/local/bin/perl
use Time::Local 'timelocal';
use PerlIO::gzip;
use IO::Tee;
use IO::File;

$err = 0;
$help = 1 if($ARGV[0] eq '-h');
$help = 1 if($ARGV[0] eq '--help');
$help = 1 if($ARGV[0] eq '-help');
$help = 1 if($ARGV[0] eq '');
$debug = 1 if($ARGV[0] eq '-d');

$msgHelp = "FORMAT - command [-d][-h][--help] Month StartDate EndDate\n\tStart & End Date = mm/dd/yyyy";
$msgGreps = "\n----------------------The following greps will be used for searching:\n";
$msgFiles = "\n----------------------The following files will be searched based on the dates given:\n";
$msgStarting = "\n----------------------Now Starting\n";


if($help == 1){
    print $msgHelp;
} elsif($debug == 1){
    $month = $ARGV[1]; 
    @start = split /\//, $ARGV[2];
    @end = split /\//, $ARGV[3];
}else{
    $month = $ARGV[0]; 
    @start = split /\//, $ARGV[1];
    @end = split /\//, $ARGV[2];
}

$inputpath = "/logs/";
$startdate = timelocal(0,0,0, $start[1], $start[0]-1, $start[2]-1900);
$enddate = timelocal(0,0,0, $end[1]+1, $end[0]-1, $end[2]-1900);
$currenttime = localtime time;
$fcount = 1;
$gcount = 0;

if($debug !=1){$logfile = "win_greplog.txt" }else{$logfile = "testlogfile.txt"};
$msgstarting = "\n----------------------$currenttime-----------------------\nParse will start with logs dated: startdate = $startdate\nEnding with logs dated: enddate  = $enddate\nIn the following directory: $inputpath\n";
$tee = new IO::Tee(\*STDOUT, new IO::File(">>$logfile"));

print $tee "\nDEBUG MODE ON" if($debug == 1);
print $tee $msgstarting;

opendir INPUTDIR, $inputpath;
    @inputfiles = grep {    (stat "$inputpath/$_")[9] >= $startdate and (stat "$inputpath/$_")[9] < $enddate } readdir INPUTDIR;
closedir INPUTDIR; 
$numfiles = @inputfiles;

$greps[0]  = '\SOFTWARE\Microsoft\Windows\CurrentVersion\Run';
$greps[1]  = '\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce';
$greps[2]  = '\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnceEx';
$greps[3]  = '\SOFTWARE\Microsoft\Windows\CurrentVersion\AeDebug';
$greps[4]  = '\SYSTEM\CurrentControlSet\Control\SessionManager\KnownDLLs';
$greps[5]  = '\SYSTEM\CurrentControlSet\Control\SecurePipeServers\winreg';
$greps[6]  = '\SOFTWARE\inAgents\EventLog2Syslog';
$greps[7]  = '%systemdrive%';
$greps[8]  = 'C:\\';
$greps[9]  = '\system32';
$greps[10] = '\system32\drivers';
$greps[11] = '\system32\config';
$greps[12] = '\system32\spool';
$greps[13] = '\repair';

print $tee $msgGreps;
foreach $gname (@greps) {
    print $tee "\n - greps[$gcount]\t $gname";
    $gcount++;
}

print $tee $msgFiles;
foreach $filelist (@inputfiles) {
    $filelist = $inputpath.$filelist;
    print $tee "\n - $filelist";
}

print $tee $msgStarting;
# step into each input file
foreach $inputfile (@inputfiles) {
    # step into each grep
    $gcount = 0;
    foreach $grep (@greps) {
        # build the outputfile
        $outputfile = $month."_".$gcount."_".$inputfile."_results.txt";
        @results = `zgrep $grep > $outputfile`;
        $gcount++;
    }
}

print $tee "\n\n----------------------Normal Completion\n" if ($err==0);

close(LOGFILE);


Re^3: dynamic zcat and grep

by graff (Chancellor) on Mar 29, 2006 at 03:43 UTC

As for improvements, I can think of several, but if the script works, these are less than crucial -- well, except for the fact that you really should include "use strict", and learn about scoping variables.

Apart from that, in no particular order:

You have "use PerlIO::gzip" at the top, but you never actually use the ":gzip" IO layer. You're just running "zgrep" in backticks.
Actually, looking at the zgrep command line in the backticks, I don't see you providing an input file name there -- just a pattern to search for. I would expect the resulting output files to be empty every time.
You appear to be generating 14 output files for every input file. Is that really what you want? You never actually say what the goal is here, but fourteen separate output files for each input file seems like a lot.
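
One way to address the last two points together is to read each input file once in Perl and test every line against all of the patterns, rather than running zgrep once per pattern. The sketch below shows only the matching step and assumes the lines have already been read (for example through the ":gzip" layer). It uses index() so that the backslashes in the registry paths are compared literally instead of being interpreted as regex escapes; match_literals and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Group lines by the literal search strings they contain.
# Returns a hash ref: search string => array ref of matching lines.
sub match_literals {
    my ($lines, $needles) = @_;
    my %found = map { $_ => [] } @$needles;
    for my $line (@$lines) {
        for my $needle (@$needles) {
            # index() compares raw characters, so '\S' is a backslash
            # followed by 'S', not a regex character class.
            push @{ $found{$needle} }, $line
                if index($line, $needle) >= 0;
        }
    }
    return \%found;
}

# Illustrative data only; real lines would come from the log files.
my @needles = ('\SOFTWARE\Microsoft\Windows\CurrentVersion\Run', '\system32');
my @lines   = (
    'HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run changed',
    'C:\WINDOWS\system32\drivers\foo.sys loaded',
    'nothing of interest',
);
my $found = match_literals(\@lines, \@needles);
printf "%-50s %d line(s)\n", $_, scalar @{ $found->{$_} } for @needles;
```

Collecting matches per search string in memory leaves the choice of one output file or many until after the scan.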

You can simplify and improve your handling of command line options and args. Take a look at Getopt::Std and Getopt::Long -- these are part of the core distribution; also, the following is another alternative (though it doesn't use modules):

my $debug = 0;
my $usage = "Usage: $0 [-d|-h] month start end\n blah blah";
if ( @ARGV and $ARGV[0] =~ /^-+([dh])/ ) {
    shift;
    die $usage if ( $1 eq 'h' );
    $debug++;
}
die $usage unless ( @ARGV == 3 ); # could add more conditions...
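
For comparison, here is a rough sketch of the same option handling done with Getopt::Long. The option names mirror the -d and -h flags used above; parse_args and the keys of the returned hash are hypothetical, and the usage text is a placeholder.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse "[-d] [-h] month start end" from a list of arguments.
# Returns a hash ref on success, or nothing when help was requested,
# an option was unknown, or the positional count is wrong.
sub parse_args {
    my @args = @_;
    my %opt  = (debug => 0, help => 0);

    GetOptionsFromArray(
        \@args,
        'd|debug' => \$opt{debug},
        'h|help'  => \$opt{help},
    ) or return;

    # After option processing, @args holds only the positional values.
    return if $opt{help} || @args != 3;
    @opt{qw(month start end)} = @args;
    return \%opt;
}

# Only act when arguments were actually supplied.
if (@ARGV) {
    my $opt = parse_args(@ARGV)
        or die "Usage: $0 [-d|-h] month start end\n";
    print "debug mode on\n" if $opt->{debug};
    print "month=$opt->{month} start=$opt->{start} end=$opt->{end}\n";
}
```

Parsing a copy of the argument list leaves @ARGV untouched, which makes the routine easy to exercise with made-up arguments.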

Aside from using $month when naming all those output files, it's not clear what this value is important for. If it's supposed to be different from the start and/or end dates, how should it be different?
Initializing the @greps array can be a lot simpler (and if flexibility would be useful for you, consider loading the list from a data file, which can be named on the command line):
```
my @greps = qw(\string\1
               \string\1\extra
               \string\2
               %and.so.on%
               );
```
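
Loading the list from a data file might look something like the sketch below. The file format is an assumption on my part: one search string per line, with blank lines and lines starting with "#" skipped. load_patterns is a hypothetical helper name.

```perl
use strict;
use warnings;

# Read search strings from a text file, one per line.
# Assumed format: blank lines and lines beginning with '#' are ignored,
# and surrounding whitespace is trimmed.
sub load_patterns {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!\n";

    my @patterns;
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/^\s+|\s+$//g;
        next if $line eq '' || $line =~ /^#/;
        push @patterns, $line;
    }
    close($fh);
    return @patterns;
}

# List the search strings found in each file named on the command line.
print "$_\n" for map { load_patterns($_) } @ARGV;
```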


Re: dynamic zcat and grep
by eff_i_g (Curate) on Mar 21, 2006 at 23:10 UTC

I can only use zcat on .Z files, so I used gunzip -c instead.
I created a .Z file via compress and used zcat as you have.

zcat -h


Re^2: dynamic zcat and grep

by thor (Priest) on Mar 22, 2006 at 00:33 UTC

thulben@alpha:~
17 $ md5sum /bin/gzip /bin/zcat /bin/gunzip
57cd8cdf42fbda6e0a1f5e17ac986b4f  /bin/gzip
57cd8cdf42fbda6e0a1f5e17ac986b4f  /bin/zcat
57cd8cdf42fbda6e0a1f5e17ac986b4f  /bin/gunzip

thor

The only easy day was yesterday


Re^2: dynamic zcat and grep

by clmcshque (Initiate) on Mar 21, 2006 at 23:50 UTC

gunzip: compressed data not read from a terminal. Use -f to force decompression.
For help, type: gunzip -h

 ERROR - Could not open log.gz


Re^3: dynamic zcat and grep

by jasonk (Parson) on Mar 22, 2006 at 00:39 UTC

Since you are on a Red Hat machine anyway, you can save yourself a lot of time just by using 'zgrep'.

We're not surrounded, we're in a target-rich environment!


Re^4: dynamic zcat and grep

by clmcshque (Initiate) on Mar 22, 2006 at 20:30 UTC

Re^3: dynamic zcat and grep

by johngg (Canon) on Mar 22, 2006 at 10:50 UTC

$!

open

$fn = "non_existant_file";
open IN, "<$fn" or die "open: $fn: $!\n";

would error with

open: non_existant_file: No such file or directory

Cheers,

JohnGG


