PerlMonks  

Re: Losing Bits with Pack/Unpack

by BillKSmith (Monsignor)
on Sep 18, 2020 at 15:27 UTC (#11121926)


in reply to Losing Bits with Pack/Unpack

If I ignore your text and look only at your code, it appears you are trying to repack every 2.5 8-bit characters (five hex digits) as one 20-bit code point, in order to reduce the number of 'characters' in your string. This would be valid if every 20-bit value were a valid code point. Your example does work once you get the details right: your twelve-character string is stored as a buffer of five code points.
use strict;
use warnings;

my $text     = 'Hello World!';
my $hex_text = unpack 'H*', $text;

my @code_points;
while ($hex_text) {
    my $hex_num = substr($hex_text, 0, 5, '');
    push @code_points, hex(sprintf '%05s', $hex_num);
}
my $buffer = pack '(U)*', @code_points;

my @_code_points = unpack('(U)*', $buffer);
my $_hex_text    = sprintf '%X' x scalar(@_code_points), @_code_points;
my $_text        = pack 'H*', $_hex_text;
print $_text;
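To make the arithmetic concrete, here is a minimal sketch that only shows how the twelve bytes break up into five groups. The helper name hex_groups is hypothetical (it is not part of the code above); it uses the same 'H*' and '(a5)*' templates.

```perl
use strict;
use warnings;

# Hypothetical helper: split the hex dump of $text into groups of
# up to five hex digits (20 bits each).
sub hex_groups {
    my ($text) = @_;
    return unpack '(a5)*', unpack 'H*', $text;
}

my $text   = 'Hello World!';
my @groups = hex_groups($text);

# 12 bytes -> 24 hex digits -> four full groups plus one 4-digit group.
print scalar(@groups), " groups: @groups\n";
# prints "5 groups: 48656 c6c6f 20576 f726c 6421"

# Each group, read as a hexadecimal number, is a code point handed to pack 'U'.
printf "U+%05X\n", hex($_) for @groups;
```

Note that the last group is shorter than the others whenever the number of hex digits is not a multiple of five.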

UPDATE - Added improved code (with testing)

use strict;
use warnings;
use Encode qw(decode);
use Test::More tests => 2;

my $text = 'Hello World!';

my $buffer = pack '(U)*',       # Convert to Unicode
    map { hex($_) }             # Convert to decimal
    unpack '(a5)*',             # Groups of 5
    unpack 'H*', $text;         # Convert to hex

my $num_uni_chars = length(decode('UTF-8', $buffer));
is( $num_uni_chars, int(length($text)/2.5 + .5),
    'Number of Unicode characters');

my $_text = pack 'H*',                  # Convert pairs of hex to ascii
    sprintf '%X' x $num_uni_chars,      # Convert to hex and join
    unpack('(U)*', $buffer);            # Decimal code points
is($_text, $text, 'Restored text');

OUTPUT:

1..2
ok 1 - Number of Unicode characters
ok 2 - Restored text
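One detail to watch with other inputs: '%X' prints no leading zeros, so a group whose first hex digit is 0 comes back shorter than it went in. For example, 'ab cd' has the hex dump 6162206364, which splits into 61622 and 06364, and the second group is restored as 6364. The sketch below is one possible way around that, not the only one. The names encode_20bit and decode_20bit are hypothetical, and storing the number of fill digits as an extra leading character is an assumption of this sketch. It also uses 'U*' rather than '(U)*', so the buffer is a character string and length() counts code points directly. It does not address 20-bit values that are not usable code points (the surrogate range, for instance); how those behave is something you would need to test.

```perl
use strict;
use warnings;

# Hypothetical helpers; the leading pad-count character is an assumed
# convention for this sketch.
sub encode_20bit {
    my ($text) = @_;
    my $hex = unpack 'H*', $text;
    my $pad = (5 - length($hex) % 5) % 5;   # digits needed to fill the last group
    $hex .= '0' x $pad;
    # A template that starts with 'U' yields a character string,
    # one character per code point.
    return pack 'U*', $pad, map { hex($_) } unpack '(a5)*', $hex;
}

sub decode_20bit {
    my ($buffer) = @_;
    my ($pad, @code_points) = unpack 'U*', $buffer;
    # Fixed-width %05X keeps the leading zeros that plain %X would drop.
    my $hex = join '', map { sprintf '%05X', $_ } @code_points;
    $hex = substr($hex, 0, length($hex) - $pad);   # remove the fill digits
    return pack 'H*', $hex;
}

for my $text ('Hello World!', 'ab cd', 'Hi', '') {
    my $ok = decode_20bit(encode_20bit($text)) eq $text;
    printf "%s - '%s'\n", ($ok ? 'ok' : 'not ok'), $text;
}
```

The cost is one extra character per buffer, so 'Hello World!' becomes six characters instead of five.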
Bill
