Re^2: Losing Bits with Pack/Unpack

by ikegami (Patriarch)
on Sep 18, 2020 at 08:54 UTC (#11121900)

in reply to Re: Losing Bits with Pack/Unpack
in thread Losing Bits with Pack/Unpack

ASCII has 30 control characters and four whitespace characters (SPACE, TAB, CR and LF). If you forgo support for the control characters, TAB and CR (but keep SPACE and LF), you end up with 128 - 30 - 2 = 0x60 characters. This isn't a power of 2 (which would help make things simple and very efficient), but it's still a nice number (3/4 of 2^7).

Storing three such characters in a single code point would require an address space of 0x60^3 = 884,736 (0xD_8000) code points. That's a fair bit smaller than the 1,114,112 (0x11_0000) code points Unicode supports.
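As a quick check of the arithmetic above, here is a minimal Python sketch. The name ALPHABET and the exact alphabet (SPACE through "~", plus LF) are my reading of the restrictions described, not something fixed by the thread.

```python
# Assumed alphabet: LF plus SPACE (0x20) through "~" (0x7E).
ALPHABET = "\n" + "".join(chr(c) for c in range(0x20, 0x7F))

alphabet_size = len(ALPHABET)        # number of supported characters
triples = alphabet_size ** 3         # distinct groups of three characters
unicode_code_points = 0x110000       # U+0000 through U+10FFFF

print(hex(alphabet_size), f"{triples:,}", hex(triples))
print(f"{unicode_code_points:,}", f"{unicode_code_points - triples:,}")
```

The last value printed is the number of code points left over once every group of three characters has been given one.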

Of those, some are best avoided. I would avoid at least the following:

  • High surrogates (1024)
  • Low surrogates (1024)
  • Non-characters (66)
  • U+FFFD
  • Control characters, which includes U+FEFF (226)

That's only 2,341 and we have a buffer of 229,376. Golden!
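The counts in the list can be reproduced roughly as follows. This is a sketch under two assumptions: that "control characters" means the Unicode general categories Cc and Cf, and that Python's bundled unicodedata is an acceptable source. The Cc/Cf total depends on the Unicode version Python ships with, so it may not come out to exactly 226. The helper names are made up for illustration.

```python
import unicodedata

def is_noncharacter(cp):
    # U+FDD0..U+FDEF, plus the last two code points of each of the 17 planes.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE

def is_avoided(cp):
    if 0xD800 <= cp <= 0xDFFF:       # high and low surrogates
        return True
    if is_noncharacter(cp):
        return True
    if cp == 0xFFFD:                 # REPLACEMENT CHARACTER
        return True
    # Assumption: "control characters" means categories Cc and Cf.
    return unicodedata.category(chr(cp)) in ("Cc", "Cf")

avoided = sum(is_avoided(cp) for cp in range(0x110000))
spare = 0x110000 - 0x60 ** 3         # code points not needed for the triples

print(avoided, spare, avoided < spare)
```

Whatever the exact total for your Unicode version, it stays far below the spare capacity, which is the point being made.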

Mapping each group of 3 ASCII characters (with the limitations mentioned above) onto only "safe" characters won't be nice and easy, but it is doable.
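One possible way to do the mapping, sketched in Python: treat the three characters as a base-0x60 number and use it as an index into the sorted list of safe code points. Everything here (pack3, unpack3, SAFE, the choice of alphabet order, and reading "control characters" as categories Cc and Cf) is a hypothetical layout for illustration, not a scheme from the thread.

```python
import bisect
import unicodedata

# Assumed alphabet: LF plus SPACE (0x20) through "~" (0x7E), 0x60 in total.
ALPHABET = "\n" + "".join(chr(c) for c in range(0x20, 0x7F))
BASE = len(ALPHABET)

def is_avoided(cp):
    if 0xD800 <= cp <= 0xDFFF:                            # surrogates
        return True
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:  # non-characters
        return True
    if cp == 0xFFFD:                                      # REPLACEMENT CHARACTER
        return True
    # Assumption: "control characters" means categories Cc and Cf.
    return unicodedata.category(chr(cp)) in ("Cc", "Cf")

# Hypothetical layout: the n-th triple maps to the n-th safe code point.
SAFE = [cp for cp in range(0x110000) if not is_avoided(cp)][:BASE ** 3]

def pack3(triple):
    """Map exactly three supported characters to one safe character."""
    a, b, c = (ALPHABET.index(ch) for ch in triple)
    return chr(SAFE[(a * BASE + b) * BASE + c])

def unpack3(ch):
    """Inverse of pack3: recover the three original characters."""
    cp = ord(ch)
    n = bisect.bisect_left(SAFE, cp)   # SAFE is sorted, so this finds the index
    if n == len(SAFE) or SAFE[n] != cp:
        raise ValueError(f"U+{cp:04X} is not a packed triple")
    n, c = divmod(n, BASE)
    a, b = divmod(n, BASE)
    return ALPHABET[a] + ALPHABET[b] + ALPHABET[c]

print(unpack3(pack3("Hi!")))
```

The table costs some memory and start-up time; skipping over the avoided ranges arithmetically would avoid that at the price of fiddlier code. Note that this sketch only steers clear of the code points listed above, so it will still land on unassigned and private-use code points.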
