Thanks for that. According to is_utf8() the string is in UTF-8; however, running encode_utf8() doesn't resolve the problem. It *does* remove the 4-character hex codes, but doesn't restore the bytes to what they were originally:
encode_utf8() version: 00000000 50 4B 03 04 14 00 09 00 08 00 67 EF BF BD 25 46 PK........g...%F
Original version: 0000000 50 4b 03 04 14 00 09 00 08 00 67 8d 25 46 00 00
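For anyone wanting to reproduce the comparison, here is a minimal sketch. The sample strings are assumptions built from the few bytes visible in the dumps above, not the real attachment, and hex_dump() is a hypothetical helper. EF BF BD is the UTF-8 encoding of U+FFFD (the Unicode replacement character), which suggests the original 0x8D byte had already been replaced before encode_utf8() was called.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
}

# Assumed sample: the character string as it might look after a lossy
# decode, with U+FFFD sitting where the 0x8D byte used to be.
my $damaged  = "g\x{FFFD}%F";
# Assumed sample: the same four bytes as they appear in the original file.
my $original = "g\x8D%F";

print hex_dump(encode_utf8($damaged)), "\n";   # 67 EF BF BD 25 46
print hex_dump($original), "\n";               # 67 8D 25 46
```

If that reading is right, no encode step can recover the 0x8D, because U+FFFD carries no record of which byte it replaced.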
I took a look at the attributes of the file, as @Veltro suggested and got the following:
content_type is: application/octet-stream
content_encoding is: none
file_name is: screenshot-172 21 242 64.zip
headers is: Content-Type: application/octet-stream; name="screenshot-172 21 242 64.zip"
Content-Disposition: attachment; filename="screenshot-172 21 242 64.zip"
Content-Transfer-Encoding: base64
Content-Length: 460749
That "base64" string in the headers section looked interesting, although the string does not seem to be encoded, insofar as it has characters in it that do not match the Base64 character set (A-Za-z0-9+/=).
I tried encoding and decoding using the MIME functions but to no avail.
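A rough sketch of the kind of check I mean, using the core MIME::Base64 module. looks_like_base64() is a hypothetical helper, and the sample data is just the first ten bytes from the dumps above, not the real attachment.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Hypothetical helper: true if the string contains only Base64 characters
# (plus the line breaks that MIME bodies normally contain).
sub looks_like_base64 {
    my ($s) = @_;
    return $s =~ m{\A[A-Za-z0-9+/=\r\n]*\z} ? 1 : 0;
}

# Assumed sample: the first ten bytes of the zip file shown above.
my $raw     = "\x50\x4B\x03\x04\x14\x00\x09\x00\x08\x00";
my $encoded = encode_base64($raw);

print "raw looks like Base64?     ", looks_like_base64($raw),     "\n";   # 0
print "encoded looks like Base64? ", looks_like_base64($encoded), "\n";   # 1
print "round trip ok\n" if decode_base64($encoded) eq $raw;
```

Running the same check on the string returned by the RT libraries reports 0, which is consistent with the body having already been decoded before I receive it, so decoding it a second time would not be expected to help.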
The content length stated is the exact size of the actual binary file (460749 bytes), but the string provided by the RT libraries is a different length (442958 characters). I would be willing to believe that the missing 17791 are accounted for by the wide characters in the RT string; that is to say, I expect there to be 17791 wide characters in the octet stream.
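One way to test that guess is to count the wide characters directly. This is only a sketch: char_stats() is a hypothetical helper and $sample is an assumed stand-in for the string that RT returns.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper: summarise a character string.
sub char_stats {
    my ($s) = @_;
    my $wide = () = $s =~ /[^\x00-\xFF]/g;   # count code points above 0xFF
    return {
        chars      => length($s),                 # length in characters
        wide       => $wide,                      # number of wide characters
        utf8_bytes => length(encode_utf8($s)),    # length once UTF-8 encoded
    };
}

# Assumed sample, not the real attachment: eight characters, one of them wide.
my $sample = "PK\x03\x04g\x{FFFD}%F";
my $stats  = char_stats($sample);
printf "chars=%d wide=%d utf8_bytes=%d\n",
    @{$stats}{qw(chars wide utf8_bytes)};   # chars=8 wide=1 utf8_bytes=10

# The gap between the Content-Length header and the RT string.
print 460749 - 442958, "\n";                # 17791
```

If the wide count for the real string comes out near 17791, the guess holds; if not, the difference must be coming from somewhere else.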