http://qs321.pair.com?node_id=224686


in reply to Pack/Unpack Tutorial (aka How the System Stores Data)

I have a few comments and I'll just leave them here in no particular order:

  1. I wish you would have defined "word" prior to using it willy-nilly. It's jargon that your tutorial's audience it's likely to be familiar with. From WordNet: a word is a string of bits stored in computer memory; "large computers use words up to 64 bits long".

  2. Bytes are almost always eight bits though that's not a universal constant. Perhaps it's infrequent enough that I didn't even need to mention but this one always gets my goat.

  3. Your use of "most" and "least" significant byte was also jargon. If you assume the value 0x12345678 then the most significant byte has the value 0x12 and the least significant has the value 0x78. From there the point on differently endian machines is just which order you start with when transcribing bytes.

  4. Your use of memory addresses is obfuscatory. This is better written as "Byte 0, byte 1, byte 2, byte 3". The only point at which a perl programmer cares about memory addresses is when doing non-perl programming or with the 'p' or 'P' format options. The point here is to indicate an order to the bytes in memory - that byte 0 might be located at a memory address 1000 is entire beside the point.

  5. White space is allowed without consequence in an unpack/pack format. It's just ignored except when it's a fatal error. I haven't nailed it down but some uses of whitespace just don't parse. That may be a bug but it's worth noting. This just means that in general people should use whitespace in a format to enhance readability - it doesn't affect it's operation.

  6. I've never been clear on the bit order within a byte - can you expand on that? I used to think that the differently endian machines also shuffled the bit order around as well. At this point I'm just confused.