
Re: Losing Bits with Pack/Unpack

by afoken (Canon)
on Sep 17, 2020 at 07:14 UTC

in reply to Losing Bits with Pack/Unpack

I believe I can squeeze 3 8-bit ascii characters into a single 20-bit unicode character

No, you can't.

  • ASCII is 7 bit, not 8 bit.
  • Unicode defines code points from 0 to 0x10FFFF, i.e. 0x110000 code points. You need at least 21 bits for that (log2(0x110000) = 20.087...), not 20 bits. Depending on the selected Unicode Transformation Format, you need up to 32 bits to encode a single code point (see UTF-8 and UTF-16). Also note that not all 32-bit combinations are valid Unicode.
  • Three 7-bit characters need 21 bits, not 20 bits.
  • Three 8-bit characters need 24 bits, not 20 bits.
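
The arithmetic in the list above is easy to check. This is only a quick sketch in Python, not the code from the original question, and the helper name `bits_needed` is made up for illustration:

```python
import math

def bits_needed(count):
    """Smallest number of bits that can distinguish `count` different values."""
    return (count - 1).bit_length()

unicode_code_points = 0x110000            # code points 0 .. 0x10FFFF

print(math.log2(unicode_code_points))     # a little more than 20
print(bits_needed(unicode_code_points))   # 21
print(bits_needed(128 ** 3))              # three 7-bit characters: 21
print(bits_needed(256 ** 3))              # three 8-bit characters: 24

# Encoded size of the highest code point in two transformation formats
highest = chr(0x10FFFF)
print(len(highest.encode("utf-8")) * 8)      # 32 bits (4 bytes)
print(len(highest.encode("utf-16-be")) * 8)  # 32 bits (surrogate pair)
```

None of these results is 20 or less, so neither three ASCII characters nor the full Unicode range fits into 20 bits.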

If you want to store more bits in a limited storage area than that storage area allows, you need compression, either lossy or lossless. Just shifting bits around won't help.
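
To see what "just shifting bits around" does in practice, here is a small sketch, again in Python and with a made-up helper `pack3`. It packs three 8-bit values into one integer and then keeps only 20 bits, which is an assumption about what the 20-bit scheme boils down to, not the original poster's code:

```python
def pack3(a, b, c):
    """Pack three 8-bit values into one 24-bit integer."""
    return (a << 16) | (b << 8) | c

MASK_20 = (1 << 20) - 1       # keeps only the low 20 bits

x = pack3(0x41, 0x42, 0x43)   # "ABC"
y = pack3(0x51, 0x42, 0x43)   # "QBC", differs only in the top 4 bits

print(hex(x), hex(y))                       # 0x414243 0x514243
print(hex(x & MASK_20), hex(y & MASK_20))   # 0x14243 0x14243
print((x & MASK_20) == (y & MASK_20))       # True
```

Two different inputs end up as the same 20-bit value, so there is no way to unpack them correctly afterwards. The top 4 bits are simply lost.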


Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
