http://qs321.pair.com?node_id=11136191

drsweety has asked for the wisdom of the Perl Monks concerning the following question:

Hello there! I'm looking for a generic solution to split any number into single bytes printed in hex and separated by space, e.g.:
2 => 2
20 => 14
200 => c8 00
2000 => d0 07
20000 => 20 4e
200000 => 40 0d 03 00
... and so on
I've come up with the following code which works like a charm as long as I'm telling it what type of number (char, short, long, quad) it needs to convert:
$ perl -wle "print join ' ', unpack('(H2)*', pack('c', 2));"
02
$ perl -wle "print join ' ', unpack('(H2)*', pack('s', 200));"
c8 00
$ perl -wle "print join ' ', unpack('(H2)*', pack('l', 200000));"
40 0d 03 00
Is there a generic solution where I don't have to specify the data type for pack()?

Re: Split any number into string of 8-bit hex values (=1 byte) (updated)
by AnomalousMonk (Archbishop) on Aug 29, 2021 at 22:36 UTC

    Another approach:

    Win8 Strawberry 5.8.9.5 (32) Sun 08/29/2021 18:20:29
    C:\@Work\Perl\monks
    >perl -Mstrict -Mwarnings
    for my $n (qw(2 20 200 2000 20000 200000 2000000)) {
        print "$n -> ";
        print join ' ', unpack '(H2)*', pack 'V', $n;
        print "\n";
    }
    ^Z
    2 -> 02 00 00 00
    20 -> 14 00 00 00
    200 -> c8 00 00 00
    2000 -> d0 07 00 00
    20000 -> 20 4e 00 00
    200000 -> 40 0d 03 00
    2000000 -> 80 84 1e 00
    As with LanX's solution, leading | trailing zeros are present, but they are present in some of the OPed examples as well and I can't figure out when you do and don't want them.

    Update: If you're using a 64-bit Perl, you have the 'Q' pack template specifier (update: which uses native endianity by default; see below), so:

    Win8 Strawberry 5.30.3.1 (64) Sun 08/29/2021 20:05:59
    C:\@Work\Perl\monks
    >perl -Mstrict -Mwarnings
    for my $n (qw(2 20 200 2000 20000 200000 2000000)) {
        print "$n -> ";
        # print join ' ', unpack '(H2)*', pack 'Q', $n;  # original
        print join ' ', unpack '(H2)*', pack 'Q<', $n;
        print "\n";
    }
    ^Z
    2 -> 02 00 00 00 00 00 00 00
    20 -> 14 00 00 00 00 00 00 00
    200 -> c8 00 00 00 00 00 00 00
    2000 -> d0 07 00 00 00 00 00 00
    20000 -> 20 4e 00 00 00 00 00 00
    200000 -> 40 0d 03 00 00 00 00 00
    2000000 -> 80 84 1e 00 00 00 00 00
    (but you get a lot more trailing zeroes :). (Update: The 'Q' specifier should be used with a '<' little-endian specifier to force the desired byte ordering. 'Q' alone - as used in my originally posted code - uses native byte ordering, whatever that may be.)
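To make the byte-order point concrete, here is a small sketch comparing fixed-endian 32-bit templates. The helper name `hex_bytes_of` is invented for this illustration and does not appear in the thread; the `<` and `>` modifiers assume Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): pack $n with $template
# and return the resulting bytes as space-separated hex pairs.
sub hex_bytes_of {
    my ($template, $n) = @_;
    return join ' ', unpack '(H2)*', pack $template, $n;
}

my $n = 200000;    # 0x00030d40

print "V  (32-bit little-endian): ", hex_bytes_of('V',  $n), "\n";   # 40 0d 03 00
print "N  (32-bit big-endian):    ", hex_bytes_of('N',  $n), "\n";   # 00 03 0d 40
print "L< (little-endian, as V):  ", hex_bytes_of('L<', $n), "\n";   # 40 0d 03 00
print "L> (big-endian, as N):     ", hex_bytes_of('L>', $n), "\n";   # 00 03 0d 40
```

A bare 'L' or 'Q' would instead follow whatever byte order the machine uses natively, which is why the fixed-endian forms are the safer choice when the bytes leave the machine.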


    Give a man a fish:  <%-{-{-{-<

Re: Split any number into string of 8-bit hex values (=1 byte)
by LanX (Saint) on Aug 29, 2021 at 21:38 UTC
    maybe
    DB<20> print join " ", reverse split /(..)/ , sprintf "%06x\n", 2*10**$_ for 0..5
    02 00 00
    14 00 00
    c8 00 00
    d0 07 00
    20 4e 00
    40 0d 03
    DB<21>

    I'm not sure why you want low bytes first, but please note you had trailing 00s in your example too.

    update

    this will give you the number of necessary bytes

    DB<29> say 1+int (log (2*10**$_)/log 256) for 0..5
    1
    1
    1
    2
    2
    3

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

      log(0) is undefined, so I wrote a custom function hexbytes which returns a list of zero-padded bytes and special-cases 0 as '00'.

      Please note how I kept the reverse and join outside for generic flexibility of use.

      NB: you must still take care of negative $n!

      use strict;
      use warnings;
      use feature "say";

      sub hexbytes {
          my ($n) = @_;

          my $nibbles =
            $n
            ? int( log($n)/log 256 ) + 1
            : 1 ;            # 00 has no log

          $nibbles *= 2;     # 2 nibbles = 1 byte

          return sprintf( '%0*x', $nibbles, $n ) =~ /(..)/g;
      }

      say "$_ => ", join " ", reverse hexbytes($_)
        for 0,2,20,200,2000,20000,200000;

      C:/Strawberry/perl/bin\perl.exe -w d:/tmp/pm/hex_reverse.pl
      0 => 00
      2 => 02
      20 => 14
      200 => c8
      2000 => d0 07
      20000 => 20 4e
      200000 => 40 0d 03

      Compilation finished at Mon Aug 30 15:57:08
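One caveat with the log-based width: floating-point rounding may put the quotient slightly below an integer at exact powers of 256, which would undercount the bytes. A minimal sketch of an integer-only alternative follows; `byte_count` and `hexbytes_int` are names invented here, not part of the posted code, and the input is assumed to be a non-negative integer.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): minimal number of bytes needed
# for a non-negative integer, found by shifting right 8 bits at a time.
sub byte_count {
    my ($n) = @_;
    die "Can't handle negative $n" if $n < 0;
    my $count = 1;             # 0 still occupies one byte
    while ($n > 0xFF) {
        $n >>= 8;
        $count++;
    }
    return $count;
}

# Same calling convention as hexbytes: returns the bytes most significant
# first, so the caller decides whether to reverse and how to join.
sub hexbytes_int {
    my ($n) = @_;
    return sprintf('%0*x', 2 * byte_count($n), $n) =~ /(..)/g;
}

print "$_ => ", join(' ', reverse hexbytes_int($_)), "\n"
    for 0, 2, 20, 200, 2000, 20000, 200000, 256**3;
```

For 256**3 this prints `16777216 => 00 00 00 01`, i.e. four bytes, without relying on how log() rounds.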

      Cheers Rolf

        Here is a more generic solution; the group size is now a free parameter

        (left the reverse out for better testing)

        use strict;
        use warnings;
        use feature "say";

        sub hexgroups {
            my ( $n, $group ) = @_;
            die "Can't handle negative $n" if $n < 0;

            $group //= 2;          # default 1 byte = 2 nibbles

            my $nibbles =
              $n
              ? int( log($n)/log 16**$group ) + 1
              : 1 ;                # 00 has no log

            $nibbles *= $group;    # zero padding

            my @list = sprintf( '%0*x', $nibbles, $n ) =~ /(.{$group})/g;
            return @list;
        }

        for my $n (0, map { 2*"1e$_"} 0..10) {
            say "$n => ";
            say "\t"x2, join " ", hexgroups($n, $_) for 1,2,3,4;
        }

Re: Split any number into string of 8-bit hex values (=1 byte)
by tybalt89 (Monsignor) on Aug 30, 2021 at 09:06 UTC
    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11136191
    use warnings;

    for my $n (0, 2, 20, 200, 2000, 20000, 200000) {
        my @bytes = reverse sprintf('%016X', $n) =~ s/^(?:00)*\B//r =~ /../g;
        print "$n => @bytes\n";
    }

    Outputs:

    0 => 00
    2 => 02
    20 => 14
    200 => C8
    2000 => D0 07
    20000 => 20 4E
    200000 => 40 0D 03

      Your regex code is impressive to say the least!
      It replicates my output verbatim.

      But this fuzzy requirement about the number of bytes to be output seems odd to me.
      I updated my post with what I hope is a clear question to the OP.

      I guess we shall see what, if anything, develops from that.

        I do have another regex ready in case the OP wants answers that are ONLY 1, 2, 4, or 8 bytes long.

        We'll just have to wait...

      Whenever I begin to think that I am getting anywhere in Perl, I look at one of your elegant regexps and realise just how far I still have to go!

Re: Split any number into string of 8-bit hex values (=1 byte)
by Marshall (Canon) on Aug 30, 2021 at 07:50 UTC
    Here is another approach for you.
    I considered the extra zeroes in, for example, "200 => c8 00" to be a bug, and I did not attempt to replicate this behavior.
    So, here you go, each test number is shown with least significant byte first and also with most significant byte first.

    use strict;
    use warnings;

    foreach my $num (2, 20, 200, 2000, 20000, 200000) {
        my $hex_text = sprintf "%x", $num;   # use %X for ABCDEF instead of abcdef

        # we are printing bytes, not nibbles, num chars needs to be even
        $hex_text = "0$hex_text" if (length($hex_text) % 2 == 1);   # add leading 0 if odd

        $hex_text =~ s/([0-9a-fA-F][0-9a-fA-F])(?=[0-9a-fA-F])/$1 /g;
        my $reversed_bytes = join(" ", reverse split(" ", $hex_text));
        print "$num => $reversed_bytes => $hex_text\n";
    }
    __END__
    Number => LSB first => MSB first
    2 => 02 => 02
    20 => 14 => 14
    200 => c8 => c8
    2000 => d0 07 => 07 d0
    20000 => 20 4e => 4e 20
    200000 => 40 0d 03 => 03 0d 40

    Updated REGEX: instead of a-zA-Z, a-fA-F is adequate
    As an additional thought: if you don't need the MSByte first representation, instead of regex, I suppose the split could do more work. If you can explain an algorithmic rule for adding more zeroes, then code could be modified to do that.

    A question for the OP:
    I am looking at this at 2AM local, but I and all of the other Monks are confused about this:

    2 => 2
    20 => 14
    200 => c8 00
    2000 => d0 07
    20000 => 20 4e
    200000 => 40 0d 03 00

    You say that you don't want to have to specify (char, short, long, quad), and that is a good idea because these terms (except, in most cases, char) are platform- and compiler-specific. Why does 20 decimal have a single-byte output and 200 decimal a double-byte output? I have no idea! My output shows the single byte that is required in both cases. For 200,000 decimal, I show the necessary 3 bytes. 3 bytes is a "weird duck". These things usually go like 1, 2, 4, 8, 16 bytes etc. I suppose there could be a rule such that the code only emits values with those quanta of bytes. However, your example of two different numbers of bytes for 2 and 200 is quite odd. Both values can be represented by a single byte. It is pure speculation on my part, but perhaps, if the most significant bit is "one", another byte (or bytes) of zeroes is added to make it clear that this is not a 2's complement negative number? I have no idea what your intent is. What are the rules (if any) for adding additional zero bytes?
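If the rule does turn out to be "emit only 1, 2, 4 or 8 bytes", it could be sketched along the following lines. This is only a guess at the intent; `hexbytes_pow2` is a name invented for this illustration, and the input is assumed to be a non-negative integer.

```perl
use strict;
use warnings;

# Hypothetical helper: hex bytes (most significant first), zero-padded to the
# smallest of 1, 2, 4 or 8 bytes that can hold the non-negative integer $n.
sub hexbytes_pow2 {
    my ($n) = @_;
    die "Can't handle negative $n" if $n < 0;

    my $hex   = sprintf '%x', $n;
    my $bytes = int((length($hex) + 1) / 2);            # minimal byte count
    my ($width) = grep { $_ >= $bytes } 1, 2, 4, 8;     # first allowed width that fits
    die "$n does not fit in 8 bytes" unless defined $width;

    return sprintf('%0*x', 2 * $width, $n) =~ /(..)/g;
}

# reverse gives least significant byte first, as in the original question
print "$_ => ", join(' ', reverse hexbytes_pow2($_)), "\n"
    for 2, 200, 2000, 200000;
```

Under this rule 200 stays a single byte (`c8`), while 200000 is padded from 3 to 4 bytes (`40 0d 03 00`).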
Re: Split any number into string of 8-bit hex values (=1 byte)
by drsweety (Novice) on Aug 30, 2021 at 19:20 UTC
    First of all I'd like to thank everyone for their contributions! I have realised that my example, and maybe my question, is unclear and confusing (e.g. regarding endianness), sorry about that! Let me try to rephrase my question: I'm looking to split any value/number consisting of some number x of bytes into single bytes printed in hex and separated by spaces. So basically I would just like to print a "long" like 200'000 differently: 00 03 0d 40. I don't want to omit any leading zeros in any way; a "long" with 4 bytes should still consist of 4 separate bytes even if 3 would suffice, as in my example with 200'000. My example with 200 is wrong, as it can be represented by 1 byte (c8) and doesn't need a leading zero, sorry :-(

    Anyway, I would then use this code to create a function which takes any number of arguments and returns a string consisting of a series of single bytes represented in hex. The only thing I came up with is this:

    #!/usr/bin/perl
    
    sub number2hexString {
    	my $output;
    	my $packTemplate;
    	foreach my $i (@_) {
    		if ($i > 65535) {
    			$packTemplate = 'L>';
    		} elsif ($i > 255) {
    			$packTemplate = 'S>';
    		} else {
    			$packTemplate = 'C';
    		}
    		$output .= join(' ', unpack('(H2)*', pack($packTemplate, $i))).' ';
    	}
    	return $output;
    }
    
    print number2hexString(2,20,200,2000,20000,200000)."\n";
    
    this results in:
    $ ./test.pl
    02 14 c8 07 d0 4e 20 00 03 0d 40
    $
    

    This works. However, I don't need perl to know the type of number it needs to convert. It shouldn't care whether it's a char, short, long, quad, signed or unsigned or whatever. It should just take each variable as a series of bytes and convert it to single bytes represented in hex. Basically I'm looking to replace the if/elsif/else part in the above sub. I'm hoping this clears things up!
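One way the if/elsif/else chain could be replaced is with a small lookup table of widths. The sketch below is only an illustration under the same assumptions as the sub above (unsigned values, big-endian, at most 4 bytes); the `@WIDTHS` table is invented here, and the range checks are an addition.

```perl
use strict;
use warnings;

# Width table, narrowest first: [largest value that fits, pack template].
my @WIDTHS = (
    [ 0xFF,       'C'  ],   # 1 byte
    [ 0xFFFF,     'S>' ],   # 2 bytes, big-endian
    [ 0xFFFFFFFF, 'L>' ],   # 4 bytes, big-endian
);

sub number2hexString {
    my @hex;
    for my $n (@_) {
        die "Can't handle negative $n" if $n < 0;

        # pick the first (narrowest) entry whose limit covers $n
        my ($entry) = grep { $n <= $_->[0] } @WIDTHS;
        die "$n does not fit in 4 bytes" unless $entry;

        push @hex, unpack '(H2)*', pack $entry->[1], $n;
    }
    return join ' ', @hex;
}

print number2hexString(2, 20, 200, 2000, 20000, 200000), "\n";
# prints: 02 14 c8 07 d0 4e 20 00 03 0d 40
```

Supporting 8-byte values would then mean adding one more row (for example with 'Q>' on a 64-bit perl) rather than another branch. Unlike the version above, this one returns the string without a trailing space.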

    Background: I'm using this to communicate via I2C (a slow serial hardware bus) between a Raspberry and an Arduino. As it is slow, I do not want to waste unnecessary bytes (otherwise sending everything as a quad or long would be an option). And as I'm handling the Arduino side as well, I know which datatype I'm expecting based on the register and can then reassemble my series of bytes into single bytes, ints or longs etc. A sample transmission would be: master sends: 02 14 07 d0, which the slave (Arduino) interprets as follows:

    • 0x02 == 2 = I2C address of the Arduino.
    • 0x14 == 20 = I2C register (which in this example expects a short. But it could also be a long or anything else. The Arduino code will handle it)
    • 0x07 and 0xd0 == 2000 = The value to be written into the I2C register
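Since the expected datatype is known per register, the width could also be looked up by register instead of being derived from the value. The following is a rough sketch only: the register map, the register number 21 and the name `i2c_frame` are made up for illustration and are not part of the actual setup.

```perl
use strict;
use warnings;

# Hypothetical register map (invented for illustration): register => pack template.
my %REGISTER_TEMPLATE = (
    20 => 'n',   # 16-bit unsigned, big-endian
    21 => 'N',   # 32-bit unsigned, big-endian
);

# Build one frame: address byte, register byte, then the value in the
# width that the register expects. Returns space-separated hex bytes.
sub i2c_frame {
    my ($address, $register, $value) = @_;
    my $template = $REGISTER_TEMPLATE{$register}
        or die "Unknown register $register";
    return join ' ', unpack '(H2)*',
        pack "C C $template", $address, $register, $value;
}

print i2c_frame(2, 20, 2000), "\n";   # 02 14 07 d0
print i2c_frame(2, 21, 2),    "\n";   # 02 15 00 00 00 02
```

With this approach a small value such as 2 still gets its full 4 bytes when the register expects a long, which the value alone could not tell you.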
      > However, I don't need perl to know the type of number it needs to convert.

      that doesn't sound right ...

      If you have the value 2 which has to be put into a long register as 00 00 00 02 how is Perl supposed to guess that 4 bytes are needed and 02 isn't already sufficient?

      But if your transfer protocol was able to handle 02 for a long register, why would anyone need to stuff any leading/trailing 0s into it?

      The straightforward mathematical way to tell the "width" of a number is to use log, as already shown. But given the code you demonstrated, I'd rather suggest you stick with the 3 if-then-else levels inside a sub.

      Cheers Rolf

Re: Split any number into string of 8-bit hex values (=1 byte)
by drsweety (Novice) on Aug 31, 2021 at 18:49 UTC
    Dear all. Thank you for your time and suggestions/questions/etc. I get the feeling that I wasn't able to formulate my question in an understandable and unique way, sorry about that. The suggestions are getting more and more complicated which wasn't what I was hoping for. In the end I just wanted perl to print a variable in a specific way :-) (like an output modifier for a numeric datatype where a "long" or whatever shouldn't be printed as one number but instead as a series of bytes in hex). I think I will stick with my suggestion in https://perlmonks.org/?node_id=11136245 which covers my use case, can be easily adapted from unsigned to signed should the need arise and which consists of code I understand today and probably in 2 years as well :-) Thanks!