Re: Searching in binary files
by Corion (Patriarch) on Dec 14, 2005 at 18:05 UTC
|
my @lines;
$/ = \80; # assume a block size of 80
while (<$file>) {
    push @lines, $_;
    my $str = join "", @lines;
    if ($str =~ /searchword/) {
        # tell() gives the file offset just past the end of $str;
        # $-[0] is where the match starts within $str.
        my $loc = tell($file) - length($str) + $-[0];
        print "Found a match starting after $loc.\n";
    };
    if (@lines > 2) {
        shift @lines
    };
};
Instead of looking for a match in just one "line", you look for a match in the "line" and the "line" after it.
|
while ($str =~ /searchword/g) {
    my $loc = tell($fh) - length($str) + pos($str);
    print "Found a match starting after $loc.\n";
}
Update: That doesn't quite work either... it finds too many occurrences...
|
while ($str =~ /$pattern/g) {
    my $loc = tell($fh) - length($str) + pos($str) - length($pattern);
    print "Found a match starting after $loc.\n";
}
Re: Searching in binary files
by jdporter (Paladin) on Dec 14, 2005 at 18:05 UTC
|
local $/ = "find this string";
while (<FILE>)
{
    if ( chomp )
    {
        # you know you found an occurrence
    }
}
We're building the house of the future together.
|
That is rather a nice idea, except: whether one really could do this depends on file size, and whether the process happens to reach either the target string or end-of-file before it runs out of memory.
Re: Searching in binary files
by BrowserUk (Patriarch) on Dec 14, 2005 at 19:22 UTC
|
#! perl -slw
use strict;
open I, '<:raw', $ARGV[0] or die $!;
my $regex = $ARGV[1] or die 'No search pattern supplied.';
my $o = 0;
my $buffer;
## Read into the buffer after any residual copied from the last chunk
while( my $read = read I, $buffer, 4096, pos( $buffer )||0 ) {
    while( $buffer =~ m[$regex]gc ) {
        ## Print the offset, the matched text plus (following) context
        print $o + $-[0], ':', substr $buffer, $-[0], 100;
    }
    ## Slide the unsearched remainder to the front of the buffer.
    substr( $buffer, 0, pos( $buffer ) ) = substr $buffer, pos( $buffer );
    $o += $read; ## track the overall offset.
}
close I;
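For comparison, here is a minimal sketch of chunked searching that carries a small overlap from one read to the next. It is not the code above: it swaps the regex for a fixed string and index(), which makes the size of the overlap easy to state (one byte less than the search string). The name chunked_offsets and its parameters are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: byte offsets of every occurrence of a fixed string,
# reading the file $chunk bytes at a time.
sub chunked_offsets {
    my ($path, $needle, $chunk) = @_;
    die "Empty search string\n" unless length $needle;
    $chunk ||= 4096;
    my $keep = length($needle) - 1;    # longest tail that could begin a match
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my ($buffer, $base, @offsets) = ('', 0);   # $base = file offset of $buffer's first byte
    while (read($fh, $buffer, $chunk, length $buffer)) {
        my $from = 0;
        while ((my $at = index($buffer, $needle, $from)) >= 0) {
            push @offsets, $base + $at;
            $from = $at + 1;           # step by one so overlapping hits are found too
        }
        # Keep only the tail; a complete match cannot start inside it,
        # so nothing is reported twice.
        my $drop = length($buffer) - $keep;
        if ($drop > 0) {
            substr($buffer, 0, $drop, '');
            $base += $drop;
        }
    }
    close $fh;
    return @offsets;
}
```

Something like `print "$_\n" for chunked_offsets($ARGV[0], $ARGV[1]);` would list the offsets. For a real regex the overlap has to be at least as long as the longest possible match, which is not always knowable in advance.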
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Searching in binary files
by jdporter (Paladin) on Dec 15, 2005 at 15:51 UTC
|
You don't say what kind of file it is; so, if you do know anything about the structure of the file, you might do well to take a look at Tie::MmapArray. It works well for files which are strictly arrays of C-struct type data records; in general, such things look like binary to perl.
Then you can simply iterate over the array of structs, and test the various fields for your pattern. You may even know which of the fields might and might not contain what you're searching for.
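As a rough illustration of the same idea without Tie::MmapArray, the sketch below walks a file of fixed-size records using only read and unpack. The record layout, the function name matching_records and all of its parameters are assumptions made up for the example; a real file would need its own template and record size.

```perl
use strict;
use warnings;

# Hypothetical helper: return the indexes of the records whose chosen field
# matches $pattern. $template is an unpack template describing one record,
# $size is the record length in bytes, $field is the zero-based field number.
sub matching_records {
    my ($path, $template, $size, $field, $pattern) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my ($index, @hits) = (0);
    my $record;
    # Stop at end-of-file or at a short trailing record.
    while ((read($fh, $record, $size) || 0) == $size) {
        my @fields = unpack $template, $record;
        push @hits, $index if $fields[$field] =~ $pattern;
        $index++;
    }
    close $fh;
    return @hits;
}
```

For records made of a 32-bit big-endian id followed by a 12-byte space-padded name, the call might look like `matching_records($file, 'N A12', 16, 1, qr/needle/)`.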
Re: Searching in binary files
by pileofrogs (Priest) on Dec 14, 2005 at 19:01 UTC
|
This is probably an uncool suggestion, but you could always pipe your data through strings if you're on a unixy system.
Re: Searching in binary files
by GrandFather (Saint) on Dec 15, 2005 at 02:54 UTC
|
What do you actually want to do? Check that the string exists? Count the number of occurrences? Find a string that matches some pattern? Find a prefix string and extract some trailing text?
A neat way to perform some of those searches is:
local $/ = "the string to match";
while (<fileHandle>) {
    #do stuff with the "line" in $_
    #chomp will remove "the string to match";
}
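If the offsets of the occurrences are wanted as well, tell() after each read can supply them. The following is only a sketch under that assumption; find_offsets is a made-up name. Keep in mind that each "line" can be as large as the gap between two occurrences, so a file with no occurrence at all is read into memory in one piece, and overlapping occurrences are not counted separately.

```perl
use strict;
use warnings;

# Hypothetical helper: byte offsets of each (non-overlapping) occurrence of
# $needle, found by using it as the input record separator.
sub find_offsets {
    my ($path, $needle) = @_;
    die "Empty search string\n" unless length $needle;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/ = $needle;               # records now end at the search string
    my @offsets;
    while (my $record = <$fh>) {
        # Only the final record of the file can lack the separator.
        next unless length($record) >= length($needle)
            && substr($record, -length($needle)) eq $needle;
        # tell() sits just past the separator that ended this record.
        push @offsets, tell($fh) - length($needle);
    }
    close $fh;
    return @offsets;
}
```

Calling `find_offsets($file, "the string to match")` returns a list that is empty when the string never appears.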
DWIM is Perl's answer to Gödel
Re: Searching in binary files
by zentara (Archbishop) on Dec 15, 2005 at 12:32 UTC
|
How about a "sniffing dog" approach (or fingerprints?). Break your search string into small fragments, but long enough to be fairly unique. Then search for those fragments in small sliding chunks of the big file. If a fragment is found, check to see if the adjacent fragments are there. If the fragment is found near the beginning or end of the sliding chunk, load in the appropriate adjacent chunk and retest.
I'm not really a human, but I play one on earth.
flash japh