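Codepoint U+00A4 --> hex 0xA4 --> binary 10100100

We need to store 10100100 in the two-byte UTF-8 pattern, which has 11 free bit positions (shown as dots):

110..... 10......

Left-pad the value with zeros to 11 bits (00010100100) and distribute it over the dots, 5 bits in the first byte and 6 bits in the second:

110 00010   10 100100

So U+00A4 in UTF-8 becomes 11000010 10100100, or 0xC2 0xA4.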
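The bit distribution above can be sketched in a few lines of Python. This is only an illustration of the two-byte case (U+0080 to U+07FF); the function name `encode_two_byte` is made up for this example, and the result is checked against Python's built-in `str.encode`.

```python
def encode_two_byte(codepoint: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes (hypothetical helper)."""
    if not 0x80 <= codepoint <= 0x7FF:
        raise ValueError("two-byte UTF-8 covers only U+0080..U+07FF")
    # First byte: marker bits 110, followed by the top 5 of the 11 payload bits.
    first = 0b11000000 | (codepoint >> 6)
    # Second byte: marker bits 10, followed by the low 6 payload bits.
    second = 0b10000000 | (codepoint & 0b00111111)
    return bytes([first, second])


encoded = encode_two_byte(0xA4)
print(" ".join(f"{b:08b}" for b in encoded))  # 11000010 10100100
print(" ".join(f"0x{b:02x}" for b in encoded))  # 0xc2 0xa4

# Cross-check against the built-in encoder.
assert encoded == "\u00a4".encode("utf-8")
```

Code points outside that range use the one-, three-, or four-byte patterns, which this sketch deliberately rejects rather than handles.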