PerlMonks  

Question about binary file I/O

by TheMartianGeek (Novice)
on Feb 06, 2011 at 00:09 UTC ( [id://886460]=perlquestion )

TheMartianGeek has asked for the wisdom of the Perl Monks concerning the following question:

Hello. I'm trying to write a script that involves reading in data from one binary file, changing it, and outputting the result to a new binary file. To be specific, it reads in 2 bytes at a time from the first .bin file, does a bitwise AND with a specific value (0xFC7F in little-endian, if it matters) and an OR with what is supposed to be a 16-bit-long variable, and then prints the changed bytes to the second .bin file. At least, that's what it's supposed to do...I can't get it right for the life of me. Either I'm misunderstanding the read() function, I need a different function in this case, or it's impossible to do this easily. Here is the code I'm using (not the entire program, just part of a subroutine):
$i = 0;
$TileData = 0x0000;
while(!(eof(IN))) {
    $BytesRead = read(IN, $TileData, 2);
    unless($BytesRead != 2) {
        $NewTileData = (($TileData & 0xFC7F) | ($GFXFileBits));
        print OUT $NewTileData;
        $i++;
        next;
    }
}
"$GFXFileBits" is the variable that holds the OR'd bits; it should always have a value of 0x0, 0x80, 0x100, 0x180, 0x200, or 0x280. And yes, I know that $i isn't actually used for anything; it was somewhat of a debugging variable that I haven't removed yet. Can anyone help me out here? I've looked all around for more specific documentation on the read() function and binary file I/O, but I haven't turned up much. I know that code is incorrect, but after a long period of trial and error, I haven't really gotten anywhere. Oh, and yes, I am using binmode on both the input and output file handles.

Replies are listed 'Best First'.
Re: Question about binary file I/O
by BrowserUk (Patriarch) on Feb 06, 2011 at 00:32 UTC

    The problem is that you are ANDing and ORing numeric constants with the string value read from the file. If you had strict & warnings enabled, you would be getting a message telling you of the problem:

    $s = 'AB'; print $s & 0xFC7F;;
    Argument "AB" isn't numeric in bitwise and (&) at ...
    0

    You either need to unpack the two bytes read from the file to an integer before performing your boolean math:

    $s = unpack 'v', 'AB';;
    print pack 'v', $s & 0xFC7F;;
    A@

    And then repack the result before writing to the output file.

    Or, use string constants instead of numeric constants in your boolean math:

    $s = 'AB';;
    print $s & "\x7F\xFC";;
    A@

    But note how I had to reverse the order of the bytes in the string constant.

    In both cases, you have to be aware of the byte order used by the system that produces the input file, and the byte order of the system that is running your code in order to ensure that you get the desired result.
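    A minimal sketch of the first approach (unpack, do the bitwise math, repack), using the mask from the question. It assumes each 16-bit value is stored little-endian in the file, hence the 'v' template; the helper name retile_word and the sample bytes are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the two raw bytes read from the file plus the
# bits to OR in, and returns two raw bytes ready to be printed.
sub retile_word {
    my ($raw, $gfx_bits) = @_;
    my $word = unpack('v', $raw);            # 2 bytes -> integer (little-endian assumed)
    $word = ($word & 0xFC7F) | $gfx_bits;    # clear bits 7-9, then set the new ones
    return pack('v', $word);                 # integer -> 2 bytes (little-endian)
}

my $out = retile_word("\xFF\xFF", 0x0180);
print unpack('H*', $out), "\n";              # fffd
```

    If the file turns out to be big-endian, swapping 'v' for 'n' in both places should be the only change needed.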


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Question about binary file I/O
by eyepopslikeamosquito (Archbishop) on Feb 06, 2011 at 00:44 UTC

    The bitwise aspects have been answered already. Re the overall structure of your program, you should use strict and warnings, lexical file handles, and 3-argument open. Also, read is rarely used or needed in Perl programs. Finally, you should set binmode on both files to indicate they are binary files. I suggest something like:

    use strict;
    use warnings;

    my $infile = 'yourinputfilename';
    my $outfile = 'youroutputfilename';
    open(my $fhin, '<', $infile) or die "error: open '$infile': $!";
    binmode($fhin);
    open(my $fhout, '>', $outfile) or die "error: open '$outfile': $!";
    binmode($fhout);
    local $/ = \2; # make <> operator read two bytes at a time
    while (my $TileData = <$fhin>) {
        if (length($TileData) != 2) {
            warn "oops, file is of uneven length, last byte='$TileData'\n";
            last;
        }
        # do your thing with bitwise stuff here...
        # my $NewTileData = ...
        print {$fhout} $NewTileData;
    }

      BrowserUk: Hm, I thought that might be the problem, that read() was returning a string instead of a number (and it seemed to be the case while debugging). But I looked all over for a function to convert string values to (decimal) numbers and couldn't find one.

      james2vegas: Don't pack and unpack operate only on strings and lists, though?

      eyepopslikeamosquito: I did set binmode on both files...and I had "use strict" in the program originally, but it seemed to do nothing but cause trouble; it mostly meant that I had to put "my" before almost every variable, so I just got rid of it. I'm also not sure what you mean by lexical file handles and 3-argument open. If read is rarely used or needed, though, so much the better; it seems rather rigid and cumbersome. What exactly does $/ do, though? It is defined as "the input record separator", so...does setting it to \number simply tell file handles and the like to return number characters at a time?

        I had "use strict" in the program originally, but it seemed to do nothing but cause trouble; it mostly meant that I had to put "my" before almost every variable, so I just got rid of it. I'm also not sure what you mean by lexical file handles and 3-argument open.
        The "my" is actually very useful because, apart from catching typos in misspelt variable names, it limits the scope of the variable. A variable declared with "my" is known as a lexical variable because it has lexical scope; that is, its name is known from the point of declaration to the end of the block it is declared in (or end of file if not declared in a block). Hence my recommendation to use lexical file handles (e.g. my $fhin) rather than your global variable bareword file handle IN.

        Using lexical file handles is better style because:

        • They are local variables and so avoid the generally evil programming practice of keeping state in global variables.
        • They close themselves automatically when their lexical variable goes out of scope.
        • They avoid the confusion of barewords. IMHO, barewords are an unfortunate Perl feature and should be avoided (except when writing poetry).

        As for why the three-argument form of open is preferred, note that the old two-argument form of open is subject to various security exploits as described at Opening files securely in Perl.

        See also the first four items of Perl Best Practices Chapter 10 (I/O), namely:

        • Don't use bareword filehandles
        • Use indirect filehandles
        • If you have to use a package filehandle, localize it first
        • Use either the IO::File module or the three-argument form of open

        You are giving pack a list; it just happens to have only one element, and you are getting a string back. You can replace 'S*' with 'S1' to make it clearer.

        Update: the 'trouble' strict mode causes is to ensure you declare your variables, preventing typoed variables from being accepted without complaint, for example. As to $/, perlvar documents its behaviour when set to a reference to an integer, or to a variable containing an integer: it 'will attempt to read records instead of lines, with the maximum record size being the referenced integer.' As to read and the like, there are far less cumbersome versions in IO::Handle, which lets you treat filehandles like objects with methods.
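To see what setting $/ to a reference to an integer does in practice, here is a small sketch that reads from an in-memory filehandle instead of a real file. The five-byte sample string is an assumption chosen so that the last record comes back short.

```perl
use strict;
use warnings;

my $data = "ABCDE";                            # two full records plus one stray byte
open(my $fh, '<', \$data) or die "open: $!";   # in-memory filehandle, for illustration only
binmode($fh);

my @records;
{
    local $/ = \2;                             # fixed-size records of at most 2 bytes
    while (my $rec = <$fh>) {
        push @records, $rec;
    }
}
close($fh);

print join('|', @records), "\n";               # AB|CD|E
```

The short final record is why the loop in the suggested program checks length($TileData) before using the value.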
Re: Question about binary file I/O
by james2vegas (Chaplain) on Feb 06, 2011 at 00:37 UTC
    perhaps something like this:
    $i = 0;
    $TileData = 0x0000;
    while(!(eof(IN))) {
        $BytesRead = read(IN, $TileData, 2);
        unless($BytesRead != 2) {
            $NewTileData = (($TileData & pack('S*', 0xFC7F)) | (pack('S*', +$GFXFileBits)));
            print OUT $NewTileData;
            $i++;
            next;
        }
    }

    you might want to use something other than 'S' (unsigned short) for pack, depending on the format of your data; look at pack and perlpacktut for details.
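    As a rough illustration of why the template matters, the sketch below packs the mask from the question with three 16-bit templates and prints the resulting bytes in hex. 'v' and 'n' have a fixed byte order, while 'S' follows whatever machine runs the script, so its output is not predicted here.

```perl
use strict;
use warnings;

my $value = 0xFC7F;

my $le     = pack('v', $value);   # always little-endian: bytes 7F FC
my $be     = pack('n', $value);   # always big-endian:    bytes FC 7F
my $native = pack('S', $value);   # byte order of the machine running this

printf "v: %s\n", unpack('H*', $le);       # v: 7ffc
printf "n: %s\n", unpack('H*', $be);       # n: fc7f
printf "S: %s\n", unpack('H*', $native);   # matches one of the two lines above
```

    Using 'v' or 'n' explicitly keeps the script's behaviour the same regardless of where it runs.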
      Hm...so there's no way to do that without the seek function? This has come up again; here, I need to write characters to a certain position. There isn't some writetowhatever(FILEHANDLE, $variable, NUMCHARS, $offset) function or anything that would make that possible with just one function instead of two or more?
        Check perlfunc if you want. But the standard is to seek and then to write or syswrite. (The difference being whether you want to buffer your requests to write. Usually you do. Really.)
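A short sketch of the seek-then-print pattern, wrapped in a helper so that it reads like the single call asked about. The name write_at is hypothetical, not a built-in, and the scratch file from File::Temp only exists to make the example self-contained.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical helper: overwrite bytes at a given offset in a handle
# that is already open for reading and writing.
sub write_at {
    my ($fh, $offset, $bytes) = @_;
    seek($fh, $offset, SEEK_SET) or die "seek: $!";
    print {$fh} $bytes or die "print: $!";
    return;
}

my ($fh, $filename) = tempfile(UNLINK => 1);   # scratch file for the demo
binmode($fh);
print {$fh} "0123456789";

write_at($fh, 4, "AB");                        # replace the bytes at offsets 4 and 5

seek($fh, 0, SEEK_SET) or die "seek: $!";
my $contents = do { local $/; <$fh> };         # slurp the whole file back
close($fh);
print "$contents\n";                           # 0123AB6789
```

For an existing file, opening it with mode '+<' gives a read/write handle without truncating it.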
