Another try at explaining this:

When dealing with 7-bit ASCII, the uppercase letters begin at 65 and the lowercase at 97, which is 32 higher. Since 32 is a power of two, represented by bit 5 of the character, a letter is lc if this bit is set and Uc if it is unset.
$ perl -lwe'$,=$\;print unpack("B*","A"), unpack("B*","a"), unpack"B*","A"^"a"'
01000001   <- "A": 64 + 1
01100001   <- "a": 64 + 32 + 1
00100000   <- result of XORing
XOR the original with its lc self and bit 5 will be set only where the original was uppercase. Since XORing something with itself is always 0, that is the only bit which can be set. The lc of the replacement will have that bit set, because that's what makes it lc, with the other bits set to determine which letter.

So bit 5 is set in the XOR of the original with its lc self only if the original is Uc (the opposite of the bit's meaning!), and it is always set in the lc replacement. If both are set, XOR clears the bit in the result: hence Uc. If only the replacement's bit is set, XOR leaves it alone: lc.
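The steps above can be put together in a few lines. This is only a sketch of the idea, not necessarily the code posted earlier in the thread: the name preserve_case is made up for illustration, and it assumes both strings hold nothing but 7-bit ASCII letters (a space or digit lined up with an uppercase letter would get its bit 5 flipped too, and be mangled).

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): copy the
# per-character case of $orig onto $repl. Assumes ASCII letters only.
sub preserve_case {
    my ($orig, $repl) = @_;

    # String XOR of the original with its lc self: bit 5 (0x20) is set
    # at each position where the original was uppercase, 0 elsewhere.
    my $mask = $orig ^ lc $orig;

    # Don't let a longer original pad the result with extra bytes.
    $mask = substr($mask, 0, length $repl);

    # Bit 5 is always set in the lc replacement; XORing with the mask
    # clears it exactly where the original was uppercase.
    return lc($repl) ^ $mask;
}

print preserve_case("Hello", "world"), "\n";   # World
print preserve_case("HELLO", "world"), "\n";   # WORLD
print preserve_case("hello", "WORLD"), "\n";   # world
```

If the replacement is longer than the original, the extra characters simply stay lowercase, since the shorter mask is treated as zero bytes past its end.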

I think at this point I should exclaim "QED" and run.   It seemed clear enough before I started trying to explain it in this little box!

update: But note that jryan's answer above will work with any locale!

reupdate: IO points out (and I should've checked) that capitalizing-by-resetting-bit-5 also works for the 8-bit characters in the standard ISO8859-1 ("latin-1") character set.

  p

In reply to Re: Re: Case-preserving substitutions by petral
in thread Case-preserving substitutions by wasii
