http://qs321.pair.com?node_id=11141549

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,
let's set so double the killer delete select all.

Sorry. I'll try that again.

I am parsing binary data with unpack. The binary data consists of segments of six bytes. There are hundreds of such six-byte datums stored in a single binary string.

The binary data looks something like this:

my $data = pack("H*", "0000230ebb0000002b0ece000000330ee200");

I can do unpack("(CCSS)*", $data); and it will output an array like: [0,0,3619,187,0,0,3627,206,0,0,3635,226]

However, I would like it to output an array like: [[0,0,3619,187],[0,0,3627,206],[0,0,3635,226]]

i.e. so that each of the six-byte segments form an arrayref of their own.

I read the venerable node 539664 but was none the wiser.

Can unpack() do this or do I need to just loop over the string with substr() and unpack one segment at a time?

I hope this question was clear enough; it's been a long time since I've asked anything.

Re: unpack into arrayrefs?
by salva (Canon) on Feb 22, 2022 at 16:25 UTC
    I don't think there is a way to do that with unpack in just one pass... but...
    my @a = map [unpack "CCSS", $_], unpack "(a6)*", $data;
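The two-pass idea above can be spelled out step by step. This is only a sketch: the names @records and @rows are illustrative, and the values in the question (3619, 187, ...) assume a little-endian machine because S uses native byte order.

```perl
use strict;
use warnings;

my $data = pack("H*", "0000230ebb0000002b0ece000000330ee200");

# Pass 1: cut the string into raw 6-byte records ("a" keeps every byte as-is).
my @records = unpack "(a6)*", $data;

# Pass 2: decode each record into its own array reference.
my @rows = map { [ unpack "CCSS", $_ ] } @records;

# One line per record, fields separated by commas.
print join(",", @$_), "\n" for @rows;
```

Splitting first and decoding second keeps each step trivial to inspect, at the cost of building an intermediate list of substrings.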

      That is a nice approach.

      I was in a hurry, so went for a quick substr approach:

      push @a, [ unpack "CCSS", substr($data, $_ * 6, 6) ] for 0..(length($data) / 6 - 1);

      Had an off-by-one error at first though. Perl doesn't seem to have a range operator that goes up to n-1, like Ruby does? (0..5 loops from 0 to 5 but 0...5 loops from 0 to 4 in Ruby)

        Perl is smart enough not to need one:

        push @a, [ unpack 'CCSS', substr $data, 0, 6, '' ] while length $data;
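One thing to keep in mind with the one-liner above: the 4-argument form of substr replaces the extracted bytes with the empty string, so the loop consumes $data. A minimal sketch that works on a copy instead (the name $buf is just an illustration):

```perl
use strict;
use warnings;

my $data = pack("H*", "0000230ebb0000002b0ece000000330ee200");

my $buf = $data;   # work on a copy; the loop below empties the buffer
my @a;
push @a, [ unpack 'CCSS', substr $buf, 0, 6, '' ] while length $buf;

printf "records: %d, bytes left in copy: %d, original length: %d\n",
    scalar @a, length $buf, length $data;
```

If the input is not needed afterwards, the copy can be skipped and the original one-liner used as is.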

        🦛

Re: unpack into arrayrefs?
by Tux (Canon) on Feb 22, 2022 at 16:54 UTC

    If the values shown in your OP are what you expect, pin the byte order explicitly (S< or v instead of S). Plain S uses native byte order and would give different values on big-endian machines.

    my $data = pack "H*" => "0000230ebb0000002b0ece000000330ee200";
    my $f = [ map [ unpack "(CCS<S<)" ] => unpack "(A6)*" => $data ];

    or

    my $data = pack "H*" => "0000230ebb0000002b0ece000000330ee200";
    my $f = [ map [ unpack "(CCvv)" ] => unpack "(A6)*" => $data ];

    See this overview for what it means.


    Enjoy, Have FUN! H.Merijn

      That should be a, not A. A will remove trailing 0x20 bytes.
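A small sketch of why this matters for the sample data. According to the pack entry in perlfunc, A strips trailing whitespace and also trailing NUL bytes when unpacking, while a strips nothing. Every record in the sample ends in a 0x00 byte, so A6 would hand back 5-byte chunks. The names @stripped and @raw are illustrative.

```perl
use strict;
use warnings;

my $data = pack("H*", "0000230ebb0000002b0ece000000330ee200");

# "A" strips trailing whitespace and NUL bytes; "a" returns the bytes untouched.
my @stripped = unpack "(A6)*", $data;
my @raw      = unpack "(a6)*", $data;

printf "A6 lengths: %s\n", join ",", map { length($_) } @stripped;
printf "a6 lengths: %s\n", join ",", map { length($_) } @raw;
```

With a chunk that is one byte short, the last 16-bit field can no longer be decoded, so a6 is the safer choice for binary records.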


      S might be correct even on big-endian machines.

      • Use S for native endianness.
      • Use S< or v for little-endian.
      • Use S> or n for big-endian.

      If the OP was using a little-endian machine, we can only rule out the last one from the info provided.

      See Mini-Tutorial: Formats for Packing and Unpacking Numbers for a convenient table of pack/unpack numeric formats.
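To make the difference concrete, here is a sketch that decodes the first record from the question with both explicit byte orders. The array names are illustrative, and the S< form needs Perl 5.10 or later.

```perl
use strict;
use warnings;

my $record = pack("H*", "0000230ebb00");   # first six bytes of the sample data

my @le     = unpack "CCvv",   $record;   # v  = 16-bit little-endian
my @le_mod = unpack "CCS<S<", $record;   # S< = same thing via the < modifier
my @be     = unpack "CCnn",   $record;   # n  = 16-bit big-endian

print "little-endian: @le\n";   # 0 0 3619 187
print "big-endian:    @be\n";   # 0 0 8974 47872
```

The little-endian result matches the numbers in the original question, which suggests the data is little-endian regardless of which machine ends up reading it.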

Re: unpack into arrayrefs?
by Fletch (Bishop) on Feb 22, 2022 at 16:23 UTC

    Perhaps List::MoreUtils natatime?

    use 5.034;
    use List::MoreUtils qw( natatime );
    use YAML::XS qw( Dump );

    my $data = pack("H*", "0000230ebb0000002b0ece000000330ee200");
    my @vals = unpack("(CCSS)*", $data);
    my $it = natatime 4, @vals;
    my @grouped;
    while( my @cur = $it->() ) {
      push @grouped, [ @cur ];
    }
    say Dump( \@grouped );
    __END__
    $ perl fooble.plx
    ---
    - - 0
      - 0
      - 3619
      - 187
    - - 0
      - 0
      - 3627
      - 206
    - - 0
      - 0
      - 3635
      - 226
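If installing List::MoreUtils is not an option, the same grouping can be sketched with the core splice function. This is an alternative to natatime, not what the reply above uses, and it empties @vals as it goes.

```perl
use strict;
use warnings;

my $data = pack("H*", "0000230ebb0000002b0ece000000330ee200");
my @vals = unpack("(CCSS)*", $data);

# Peel four values at a time off the front of the flat list.
my @grouped;
push @grouped, [ splice @vals, 0, 4 ] while @vals;

print join(",", @$_), "\n" for @grouped;
```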

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.