PerlMonks  

I've muddled my bit and byte formats.

by murrayn (Sexton)
on Oct 09, 2018 at 02:10 UTC ( [id://1223702]=perlquestion )

murrayn has asked for the wisdom of the Perl Monks concerning the following question:

I have a need to read a two dimensional matrix of BITS one column at a time. (For the curious, it's a TIFF file but that's not important.)

The data is stored in a binary "chunk" with parameters telling me the physical storage represents Y rows of X BYTES each. I read the chunk into a Perl array with each element containing one row of the file. As a result, "$binimage[$row]" contains a string of bits as wide as my image - one per pixel.

To transform this data into a required format I actually need to read columns of BITS in reverse order - I want row 5/bit 0, row 4/bit 0, … row 0/bit 0 … row 5/bit 1 … row 0/bit 1 … row 0/bit 7 etc., until the end of the file. (Let's not go into why!)

Snippets of code and results...
(some variables, $nextrow, $col, $subcol, are set in outer loops not shown)

    for my $index (0..5) {
        my $thisrow = $nextrow - $index;
        print "processing row/column/bit $thisrow/$col/$subcol: " . substr($binimage[$thisrow], $col, 1) . "\n";
        printf ("Incoming byte is %d/%x/%8b/%s. Test bitmask is %d/%x/%8b/%s.\n",
            substr($binimage[$thisrow], $col, 1),
            substr($binimage[$thisrow], $col, 1),
            substr($binimage[$thisrow], $col, 1),
            unpack('B8',substr($binimage[$thisrow], $col, 1)),
            2**(7-$subcol),
            2**(7-$subcol),
            2**(7-$subcol),
            unpack('B8', 2**(7-$subcol)));
        substr ($char, $index + 2, 1) = ((substr($binimage[$thisrow], $col, 1)) & 2**(7-$subcol))? "1" : "0"; # Update tracker where a "1" is found!
    }
Generates this output
    processing row/column/bit 5/0/0: (unprintable square box symbol)
    Incoming byte is 0/0/       0/00000000. Test bitmask is 128/80/10000000/00110001.
    processing row/column/bit 4/0/0: (unprintable square box symbol)
    Incoming byte is 0/0/       0/00000000. Test bitmask is 128/80/10000000/00110001.
    processing row/column/bit 3/0/0: (unprintable square box symbol)
    Incoming byte is 0/0/       0/00000000. Test bitmask is 128/80/10000000/00110001.
    processing row/column/bit 2/0/0: (unprintable square box symbol)
    Incoming byte is 0/0/       0/00000000. Test bitmask is 128/80/10000000/00110001.
    processing row/column/bit 1/0/0: (unprintable square box symbol)
    Incoming byte is 0/0/       0/00000000. Test bitmask is 128/80/10000000/00110001.
    processing row/column/bit 0/0/0: ?
    Incoming byte is 0/0/       0/10010000. Test bitmask is 128/80/10000000/00110001.
    ...

I know from my hex editor that the first BYTE of each of rows 5..1 contains 0b00000000 and the first BYTE of row 0 contains 0b10010000, which matches unpack('B8', …).

I expect that my bitwise & of 0b10010000 with 0x80 should produce 0x80 which will be "TRUE" which should tell me that the first bit of the byte is set.

I expect that, similarly, bitwise & of 0b00000000 with 0x80 will be 0 indicating that the first bit of the byte is not set.


Why is this demonstrably not so?

Why does my unpack of 0x80 generate the pattern "00110001" when unpack of 0x00 from the file generates "00000000"?

Replies are listed 'Best First'.
Re: I've muddled my bit and byte formats.
by ikegami (Patriarch) on Oct 09, 2018 at 02:57 UTC

    First of all, please provide something runnable in your future questions instead of forcing us to do that work.

    use strict;
    use warnings qw( all );

    my @binimage = ( "\x00", "\x00", "\x00", "\x00", "\x00", "\x90" );
    my $nextrow = 5;
    my $col = 0;
    my $subcol = 0;

    for my $index (0..5) {
        my $thisrow = $nextrow - $index;
        print "processing row/column/bit $thisrow/$col/$subcol: "
            . substr($binimage[$thisrow], $col, 1) . "\n";
        printf ("Incoming byte is %d/%x/%8b/%s. Test bitmask is %d/%x/%8b/%s.\n",
            substr($binimage[$thisrow], $col, 1),
            substr($binimage[$thisrow], $col, 1),
            substr($binimage[$thisrow], $col, 1),
            unpack('B8',substr($binimage[$thisrow], $col, 1)),
            2**(7-$subcol),
            2**(7-$subcol),
            2**(7-$subcol),
            unpack('B8', 2**(7-$subcol)));

        printf "%s\n", ((substr($binimage[$thisrow], $col, 1)) & 2**(7-$subcol))? "1" : "0";
    }

    Secondly, ALWAYS USE use strict; use warnings qw( all );!!! This is the output of the above program:

    printf (...) interpreted as function at a.pl line 13.
    processing row/column/bit 5/0/0: ▒
    Argument "M-^P" isn't numeric in printf at a.pl line 13.
    Argument "M-^P" isn't numeric in printf at a.pl line 13.
    Argument "M-^P" isn't numeric in printf at a.pl line 13.
    Incoming byte is 0/0/       0/10010000. Test bitmask is 128/80/10000000/00110001.
    Argument "M-^P" isn't numeric in bitwise and (&) at a.pl line 23.
    0
    processing row/column/bit 4/0/0:
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Incoming byte is 0/0/       0/00000000. Test bitmask is 128/80/10000000/00110001.
    Argument "\0" isn't numeric in bitwise and (&) at a.pl line 23.
    0
    processing row/column/bit 3/0/0:
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Incoming byte is 0/0/       0/00000000. Test bitmask is 128/80/10000000/00110001.
    Argument "\0" isn't numeric in bitwise and (&) at a.pl line 23.
    0
    processing row/column/bit 2/0/0:
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Incoming byte is 0/0/       0/00000000. Test bitmask is 128/80/10000000/00110001.
    Argument "\0" isn't numeric in bitwise and (&) at a.pl line 23.
    0
    processing row/column/bit 1/0/0:
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Incoming byte is 0/0/       0/00000000. Test bitmask is 128/80/10000000/00110001.
    Argument "\0" isn't numeric in bitwise and (&) at a.pl line 23.
    0
    processing row/column/bit 0/0/0:
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Argument "\0" isn't numeric in printf at a.pl line 13.
    Incoming byte is 0/0/       0/00000000. Test bitmask is 128/80/10000000/00110001.
    Argument "\0" isn't numeric in bitwise and (&) at a.pl line 23.
    0

    You are repeatedly using the string resulting from "\x90" as a number, but it's not.

    You are repeatedly using the string resulting from "\x00" as a number, but it's not.

    Basically, everywhere you have

    substr($binimage[$thisrow], $col, 1)
    you should have
    ord(substr($binimage[$thisrow], $col, 1))

    (Inside unpack 'B8' is the exception. That one does expect a string.)
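    As a rough illustration of the fix above, here is a minimal sketch. The helper name bit_is_set and the sample rows are made up for this example (row 0 holds 0x90, as in the question); they are not taken from the original program. It also shows why unpack('B8', 128) printed 00110001: the number is stringified to "128" first, and the character "1" is 0x31.

```perl
use strict;
use warnings;

# Hypothetical sample data: one byte per row; row 0 holds 0x90 as in the question.
my @binimage = ( "\x90", "\x00", "\x00", "\x00", "\x00", "\x00" );

# Return 1 if bit $subcol (0 = most significant) of byte $col in $row_bytes is set.
sub bit_is_set {
    my ( $row_bytes, $col, $subcol ) = @_;
    my $byte = ord( substr( $row_bytes, $col, 1 ) );    # a number 0..255
    my $mask = 1 << ( 7 - $subcol );                    # 0x80 when $subcol == 0
    return ( $byte & $mask ) ? 1 : 0;
}

# Walk byte 0, bit 0 from row 5 up to row 0, in the order the question asks for.
my $bits = join '', map { bit_is_set( $binimage[$_], 0, 0 ) } reverse 0 .. 5;
print "$bits\n";                           # prints 000001

# unpack 'B8' wants a string. The number 128 becomes the string "128",
# whose first character "1" is 0x31; chr(128) is the single byte 0x80.
my $from_number = unpack 'B8', 128;
my $from_char   = unpack 'B8', chr(128);
print "$from_number\n";                    # prints 00110001
print "$from_char\n";                      # prints 10000000
```

    Using 1 << (7 - $subcol) instead of 2**(7 - $subcol) keeps the mask an integer throughout; either works here.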

      Thank you!

      The "ord" function was, as you stated, the key to making this work. My immediate need is resolved.

      Clearly there's a subtlety to substr of which I am unaware. $binimage[$thisrow] is just a sequence of storage locations containing bits representing pixels. I, the human being, don't care what they represent; I simply want to know whether the bit I'm looking at contains 1 or 0 and the only way I know how to address that is to grab the byte containing the bit and then try to look inside that.

      You appear to be saying that substr will return the ASCII value represented by the number contained in the selected byte(s) rather than just a sequence of 8 bits I, the computer program, can manipulate how I choose.

        Note that

        "\x01" & 1

        equals 0, not 1. See perldoc perlop "Bitwise String Operators" for why.
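        A small sketch of the three cases, assuming plain Perl 5 semantics without the 'bitwise' feature enabled. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $byte = "\x01";    # a one-character string, not the number 1

# String & string: the operator works on the characters' bytes and returns a string.
my $str_and = $byte & "\x01";
printf "string & string: %d\n", ord $str_and;    # prints 1

# Number & number: convert the character to its ordinal value first.
my $num_and = ord($byte) & 1;
printf "number & number: %d\n", $num_and;        # prints 1

# String & number: Perl numifies "\x01" to 0 (with a warning), so the result is 0.
my $mixed = do {
    no warnings 'numeric';    # silence the "isn't numeric" warning for the demo
    $byte & 1;
};
printf "string & number: %d\n", $mixed;          # prints 0
```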

        You appear to be saying that substr will return the ASCII value represented by the number contained in the selected byte(s) rather than just a sequence of 8 bits I, the computer program, can manipulate how I choose.

        substr knows nothing about ASCII. substr returns a portion of the original string.

        the only way I know how to address that is to grab the byte containing the bit and then try to look inside that.

        That is what you must do, but you didn't do that. You still had a string.

        In most current programming languages, a "string" is a "string of bytes" or rather "string of characters", not a "string of bits".
        The latter would be explicitly called "bitstring" (or "string of bits", for that matter).
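        To make the distinction concrete, a short sketch with a made-up two-byte value: unpack 'B*' turns a string of bytes into a bitstring of "0" and "1" characters, after which substr really does address individual bits.

```perl
use strict;
use warnings;

my $bytes     = "\x90\x01";            # hypothetical two-byte string
my $bitstring = unpack 'B*', $bytes;   # one character per bit, most significant first

print length($bytes), " bytes, ", length($bitstring), " bits\n";    # prints 2 bytes, 16 bits
print "bitstring: $bitstring\n";                                    # prints bitstring: 1001000000000001
print "bit 3 is ", substr( $bitstring, 3, 1 ), "\n";                # prints bit 3 is 1
```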
Re: I've muddled my bit and byte formats.
by tybalt89 (Monsignor) on Oct 09, 2018 at 04:05 UTC

    Just from reading your description, my (WA)guess is you're looking for something like this:

    #!/usr/bin/perl

    # https://perlmonks.org/?node_id=1223702

    use strict;
    use warnings;

    my @binimage = ( "\x00", "\x00", "\x00", "\x00", "\x00", "\x90" );

    my $matrix = join "\n", map({ unpack 'B*', $_ } reverse @binimage), '';
    my $transform = '';                                  # transpose matrix
    $transform .= "\n" while $matrix =~ s/^./ $transform .= $&; '' /gem;
    print "$transform\n";               # debug print to validate transpose
    my $output = pack 'B*', $transform =~ tr/\n//dr; # remove \n, convert to bits
    printf "output = %v08B\n", $output;

    Outputs

    100000
    000000
    000000
    100000
    000000
    000000
    000000
    000000

    output = 10000000.00000000.00100000.00000000.00000000.00000000
Re: I've muddled my bit and byte formats.
by BillKSmith (Monsignor) on Oct 10, 2018 at 15:16 UTC
    Your comment about TIFF inspired me to google that format. I found a warning that software to decode this format may require a license. Did you check out the built-in function 'vec'? The bits in your string are probably not in the order that you expect. You should be able to fix this with unpack. (pack/unpack are confusing, and working code probably cannot be ported to other systems).
    Bill
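    A brief sketch of the bit-order point with vec, using a made-up one-byte row. vec numbers the bits of each byte from the least significant end, which is the reverse of what unpack 'B8' shows. The helper msb_bit is hypothetical and only illustrates one way to compensate.

```perl
use strict;
use warnings;

my $row = "\x90";    # hypothetical one-byte row: 0b10010000

# vec() offsets 0..7 run from the least significant bit of the byte upward.
my $lsb_first = join '', map { vec( $row, $_, 1 ) } 0 .. 7;
my $msb_first = unpack 'B8', $row;
print "vec order:   $lsb_first\n";    # prints vec order:   00001001
print "unpack 'B8': $msb_first\n";    # prints unpack 'B8': 10010000

# Address bit $subcol (0 = most significant) of byte $col through vec().
sub msb_bit {
    my ( $bytes, $col, $subcol ) = @_;
    return vec( $bytes, $col * 8 + ( 7 - $subcol ), 1 );
}
print "bit 0 of byte 0: ", msb_bit( $row, 0, 0 ), "\n";    # prints bit 0 of byte 0: 1
```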
      I found a warning that software to decode this format may require a license.

      Please share the URL of this warning. It would be a public good to debunk it openly.

Node Type: perlquestion [id://1223702]
Approved by Athanasius
Front-paged by Corion