PerlMonks  

Re: Best way to a parse a big binary file

by pwagyi (Monk)
on Dec 02, 2019 at 08:27 UTC ( [id://11109533] )


in reply to Best way to a parse a big binary file

I would not recommend reading the whole file into memory, since you said it will be a big binary file :) I would have a class for each record type: MainHeader, ExtensionHeader, PartOne, PartTwo, etc. Each class constructor would take the record's binary data as its parameter.
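As a rough illustration of one such class, here is a minimal sketch. The field layout (4-byte magic, 16-bit version, 32-bit record count, big-endian) is purely an assumption for the example, since the real format was not posted; only the class name MainHeader comes from the suggestion above.

```perl
use strict;
use warnings;

# Hypothetical layout, for illustration only:
# 4-byte magic, 16-bit version, 32-bit record count (both big-endian).
package MainHeader;

sub new {
    my ($class, $bytes) = @_;
    die "MainHeader needs at least 10 bytes\n" unless length($bytes) >= 10;
    my ($magic, $version, $count) = unpack 'a4 n N', $bytes;
    return bless { magic => $magic, version => $version, count => $count }, $class;
}

# Read-only accessors for the decoded fields.
sub magic   { $_[0]{magic} }
sub version { $_[0]{version} }
sub count   { $_[0]{count} }

package main;

# Build a sample header with pack() and decode it again.
my $header = MainHeader->new(pack 'a4 n N', 'DATA', 2, 1500);
printf "%s v%d, %d records\n", $header->magic, $header->version, $header->count;
```

Keeping the unpack template inside the constructor means the rest of the parser never needs to know a record's internal layout.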

When you say each section has a field that knows its length, do you mean the format is Tag-Length-Value (https://en.wikipedia.org/wiki/Type-length-value)? In that case, you could put the main loop in a parser class and pass each record's binary chunk (sized by its length field) to the appropriate class based on the tag.
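If it does turn out to be TLV, the tag-to-class dispatch could look something like the sketch below. The tag values, the 1-byte tag / 2-byte big-endian length layout, and the decode_record() helper are all assumptions made up for the example; PartOne and PartTwo are bare stand-ins for the real record classes.

```perl
use strict;
use warnings;

# Stand-in record classes; real ones would unpack their own fields.
package PartOne; sub new { my ($class, $body) = @_; return bless { body => $body }, $class }
package PartTwo; sub new { my ($class, $body) = @_; return bless { body => $body }, $class }

package main;

# Hypothetical tag values mapped to the class that handles them.
my %CLASS_FOR_TAG = (
    1 => 'PartOne',
    2 => 'PartTwo',
);

# Decode one TLV record from the front of $buffer.
# Assumed layout: 1-byte tag, 2-byte big-endian length, then the value.
# Returns the record object and the number of bytes consumed.
sub decode_record {
    my ($buffer) = @_;
    die "Truncated header\n" if length($buffer) < 3;
    my ($tag, $length) = unpack 'C n', $buffer;
    my $class = $CLASS_FOR_TAG{$tag}
        or die sprintf "Unknown tag 0x%02x\n", $tag;
    my $body = substr $buffer, 3, $length;
    die "Truncated record\n" unless length($body) == $length;
    return ($class->new($body), 3 + $length);
}

my $data = pack('C n a*', 1, 5, 'hello') . pack('C n a*', 2, 3, 'abc');
my ($record, $used) = decode_record($data);
print ref($record), " consumed $used bytes\n";
```

A plain hash as the dispatch table keeps adding a new record type down to one line, and an unknown tag fails loudly instead of being silently skipped.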

The parser class would be something like an iterator: the client invokes a next() method (or calls ->() on a closure, in Perl land) to advance to and get the next record from the file.

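    # Sketch only; error handling is minimal. get_length() and
    # get_record_class() are placeholders for your format, and
    # $options{header_size} is an assumed option giving the fixed header size.
    sub parser_factory {
        my ($file_path, %options) = @_;
        open my $fh, '<:raw', $file_path or die "Cannot open $file_path: $!";

        # Closure: each call returns the next record object, or nothing at end of file.
        my $iterator = sub {
            return unless $fh;                          # already finished
            my $got = read $fh, my $header, $options{header_size};
            unless ($got) {                             # end of file (or read error)
                close $fh;
                undef $fh;
                return;
            }
            my $length = get_length($header);
            read $fh, my $body, $length;
            my $class = get_record_class($header);      # returns a class name
            return $class->new($body);
        };
        return $iterator;
    }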
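
To show how a client would drive such an iterator, here is a small self-contained sketch that reads from an in-memory filehandle. The make_record_iterator() name and the record layout (1-byte tag, 2-byte big-endian length, then the body) are assumptions for the example, not the original poster's format, and it returns raw [tag, body] pairs rather than record objects to stay short.

```perl
use strict;
use warnings;

# Returns a closure that yields one [tag, body] pair per call and
# nothing once the input is exhausted. Assumed layout, for illustration:
# 1-byte tag, 2-byte big-endian length, then the body.
sub make_record_iterator {
    my ($fh) = @_;
    return sub {
        my $got = read $fh, my $header, 3;
        die "Read error: $!\n" unless defined $got;
        return if $got == 0;                    # clean end of file
        die "Truncated header\n" if $got < 3;
        my ($tag, $length) = unpack 'C n', $header;
        my $body = '';
        if ($length) {
            $got = read $fh, $body, $length;
            die "Truncated body\n" unless defined $got && $got == $length;
        }
        return [$tag, $body];
    };
}

# An in-memory filehandle stands in for the real file
# (a real binary file would be opened with '<:raw').
my $bytes = pack('C n a*', 1, 5, 'hello') . pack('C n a*', 2, 3, 'abc');
open my $fh, '<', \$bytes or die "Cannot open in-memory file: $!";

# Only one record is held in memory at a time.
my $next = make_record_iterator($fh);
while (my $record = $next->()) {
    my ($tag, $body) = @$record;
    printf "tag=%d length=%d\n", $tag, length $body;
}
close $fh;
```

Because the closure pulls one record per call, memory use depends on the largest single record rather than on the size of the file.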