http://qs321.pair.com?node_id=11101569

johngg has asked for the wisdom of the Perl Monks concerning the following question:

When writing the script posted in this reply, I originally tried to use unpack to extract each line after sorting, but it mashed the output into a single line with no line feeds (I went with substr instead). The documentation states that unpack does the reverse of pack, but the 'A' template seems to lose trailing newline characters when unpacking; embedded newlines are preserved. Using the 'a' template instead works as expected.

    use strict;
    use warnings;
    use feature qw{ say };

    use List::Util qw{ max };

    my $string   = qq{abc\n};
    my $packed   = pack q{A*}, $string;
    my $unpacked = unpack q{A*}, $packed;

    say $string eq $packed
        ? q{OK - original and packed are the same}
        : q{Not OK - original and packed differ};
    sideBySide( $string, $packed );

    say $string eq $unpacked
        ? q{OK - original and unpacked are the same}
        : q{Not OK - original and unpacked differ};
    sideBySide( $string, $unpacked );

    say q{=} x 50;

    $string  .= qq{def\n};
    $packed   = pack q{A*}, $string;
    $unpacked = unpack q{A*}, $packed;

    say $string eq $packed
        ? q{OK - original and packed are the same}
        : q{Not OK - original and packed differ};
    sideBySide( $string, $packed );

    say $string eq $unpacked
        ? q{OK - original and unpacked are the same}
        : q{Not OK - original and unpacked differ};
    sideBySide( $string, $unpacked );

    # Print the character codes of both strings in two columns.
    sub sideBySide
    {
        my( $original, $modified ) = @_;

        my @origChars = map { sprintf q{%#02x}, ord } split m{}, $original;
        my @modChars  = map { sprintf q{%#02x}, ord } split m{}, $modified;
        my $nRows = max scalar( @origChars ), scalar( @modChars );

        for ( 1 .. $nRows )
        {
            printf qq{%8s%8s\n},
                scalar @origChars ? shift @origChars : q{},
                scalar @modChars  ? shift @modChars  : q{};
        }
    }

The output.

    OK - original and packed are the same
        0x61    0x61
        0x62    0x62
        0x63    0x63
         0xa     0xa
    Not OK - original and unpacked differ
        0x61    0x61
        0x62    0x62
        0x63    0x63
         0xa
    ==================================================
    OK - original and packed are the same
        0x61    0x61
        0x62    0x62
        0x63    0x63
         0xa     0xa
        0x64    0x64
        0x65    0x65
        0x66    0x66
         0xa     0xa
    Not OK - original and unpacked differ
        0x61    0x61
        0x62    0x62
        0x63    0x63
         0xa     0xa
        0x64    0x64
        0x65    0x65
        0x66    0x66
         0xa

My questions: is this a bugette, a feature, or am I having a senior moment?

Cheers,

JohnGG

Re: Problem with pack/unpack asymmetry
by Eily (Monsignor) on Jun 19, 2019 at 14:53 UTC

    The pack doc says:

    When unpacking, A strips trailing whitespace and nulls, Z strips everything after the first null, and a returns data with no stripping at all.
    Whitespace seems to be the same thing as \s in regexes.
        >perl -MData::Dump=pp -E "pp unpack 'A*', qq<aaaa\t\n >"
        "aaaa"
        >perl -MData::Dump=pp -E "pp unpack 'a*', qq<aaaa\t\n >"
        "aaaa\t\n "
        >perl -MData::Dump=pp -E "pp unpack 'a*', qq<aaaa\t\n \0>"
        "aaaa\t\n \0"
    (The last one is because \0 is the padding for 'a' patterns, so I wanted to be sure it wasn't removed)
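
The three stripping rules quoted from the pack doc can also be compared side by side on one string. What follows is only a minimal sketch: the sample data, the variable names and the `show` helper are assumptions made for illustration and are not part of the original posts.

```perl
use strict;
use warnings;

# Sample data (an assumption for illustration): text, a space, a
# newline, a null, more text, and a final trailing newline.
my $data = "abc \n\0xyz\n";

my $with_A = unpack 'A*', $data;   # strips trailing whitespace and nulls
my $with_Z = unpack 'Z*', $data;   # keeps everything before the first null
my $with_a = unpack 'a*', $data;   # returns the data with no stripping

# Hypothetical helper: make non-printable bytes visible as \xNN.
sub show {
    my ($s) = @_;
    $s =~ s/([^\x21-\x7e])/sprintf '\\x%02x', ord $1/ge;
    return $s;
}

printf "%-2s => %s\n", 'A*', show($with_A);
printf "%-2s => %s\n", 'Z*', show($with_Z);
printf "%-2s => %s\n", 'a*', show($with_a);
```

Only the final newline is removed by 'A*'; the space, newline and null in the middle of the string survive because stripping works from the end and stops at the first byte that is neither whitespace nor null.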

      A sad case of RTFM then! What's even worse, now that you've brought it to my attention I vaguely remember reading that passage some years ago :-(

      Thank you, shmem and Eily, for your replies.

      Cheers,

      JohnGG

        A sad case of RTFM then!

        See my signature ;-)

        perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
Re: Problem with pack/unpack asymmetry
by shmem (Chancellor) on Jun 19, 2019 at 14:49 UTC
    My questions: is this a bugette, a feature, or am I having a senior moment?

    Most likely all of them. A is ASCII, and probably doesn't honour control chars at the end of the string, e.g. the C string terminator \0. A bug?

    update: Try with my $string   = qq{abc\cL};
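
The suggested \cL test can be widened to a handful of trailing bytes to see which ones 'A*' removes. This is a rough sketch under assumptions: the choice of bytes and the names `%trailer` and `%stripped` are illustrative, and the results noted below reflect the documented rule (trailing whitespace and nulls are stripped) rather than anything reported in the thread.

```perl
use strict;
use warnings;

# Trailing bytes to try; \cL is the form feed from the suggestion above.
# The other entries are assumptions added for comparison.
my %trailer = (
    'form feed (\cL)' => "\cL",
    'newline'         => "\n",
    'tab'             => "\t",
    'space'           => ' ',
    'null'            => "\0",
    'bell (\cG)'      => "\cG",
);

my %stripped;
for my $name ( sort keys %trailer ) {
    my $string   = 'abc' . $trailer{$name};
    my $unpacked = unpack 'A*', pack( 'A*', $string );

    # Record whether the trailing byte was lost in the round trip.
    $stripped{$name} = $unpacked eq 'abc' ? 1 : 0;
    printf "%-16s %s\n", $name, $stripped{$name} ? 'stripped' : 'kept';
}
```

The form feed should be stripped like the other whitespace bytes and the null, while the bell, a control character that is not whitespace, should be kept. That would suggest the behaviour is about whitespace and nulls rather than control characters in general.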

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'