Bytes always consist of 0's and 1's. ;-)
Frankly speaking, I am not sure I understand you here. Let me rephrase: you are going to replace each block of n bytes with a block of m bytes, where m > n. Your input data and output data are files consisting of 0's and 1's. I don't understand why, but I accept that. Is that correct?
If yes, you can perform any mathematical operations with any of the following three representations of the data:
- A @list of bits, e.g. @list = (0, 1, 0, 0, 0, 0, 0, 1); # 'A'; this is what you are using.
- A $bitstring, e.g. $bitstring = '01000001'; # 'A'.
- Binary data, e.g. $data = 'A'; (that is, read directly from the file using e.g. $data = <file>). Obviously, this representation uses the least amount of space. This is not really a compression (for my definition of compression), it is just the 'natural' representation of the data. On the contrary, the other two representations are (probably unnecessary) expansions.
These three representations are equivalent; you just need to use different syntax to access them. For example, to access the third bit in the data, you would use
- $third_bit = $list[2];
- $third_bit = substr($bitstring, 2, 1);
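- $third_bit = vec($data, 5, 1); (this one is a bit more tricky: vec counts bits from the least significant end of each byte, so the third bit from the left is at offset 7 - 2 = 5; see the documentation for vec)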
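To make the equivalence concrete, here is a minimal sketch that builds all three representations from the same byte and reads the same bit from each. The helper name bit_at is made up for this illustration; it just hides the offset arithmetic that vec needs when you count bits from the left.

```perl
use strict;
use warnings;

my $data      = 'A';                    # binary data: one byte, 0x41
my $bitstring = unpack 'B*', $data;     # '01000001' (most significant bit first)
my @list      = split //, $bitstring;   # (0, 1, 0, 0, 0, 0, 0, 1)

# Third bit, counting from the left of the first byte.
my $from_list   = $list[2];
my $from_string = substr $bitstring, 2, 1;
my $from_data   = vec $data, 5, 1;      # vec counts from the low end: 7 - 2 = 5

# Hypothetical helper: read bit $i (0 = leftmost bit of the first byte)
# straight from the binary data, for any byte in the string.
sub bit_at {
    my ($bytes, $i) = @_;
    return vec $bytes, 8 * int($i / 8) + 7 - $i % 8, 1;
}

print "$from_list $from_string $from_data\n";   # prints "0 0 0"

# Going back from the expanded forms to binary data.
my $packed = pack 'B*', join '', @list;         # 'A' again
```

The unpack/pack pair with the 'B*' template is what converts between the compact and the expanded forms, so you only need to expand the small piece you are currently looking at.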
To access whole bytes or blocks of bytes, you would use splice (or an array slice, if the list should stay intact), substr and substr, respectively. All the operations you will need to perform can be expressed in all three data representations -- but the last one will only use 2M of memory... Plus, for the last one, you can use Perl's bitwise operators (|, &, ^, ~) directly, whereas for the @list and $bitstring you'll have to emulate those operations yourself (using the above-mentioned substr etc.)
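As a rough illustration of that last point, the sketch below ORs two bytes directly as binary data and then does the same thing the long way on bitstrings. It assumes the classic operator behaviour (the 'bitwise' feature is not enabled, so | on two strings works byte by byte), and or_bitstrings is a name invented for this example.

```perl
use strict;
use warnings;

my $x = 'A';            # 01000001
my $y = 'B';            # 01000010

# Binary data: when both operands are strings, |, & and ^ work
# on the bytes themselves.
my $or  = $x | $y;      # 01000011, i.e. 'C'
my $and = $x & $y;      # 01000000, i.e. '@'
my $xor = $x ^ $y;      # 00000011, i.e. "\x03"

# Bitstring: the same OR has to be emulated character by character.
# Assumes both strings have the same length.
sub or_bitstrings {
    my ($s, $t) = @_;
    return join '',
        map { substr($s, $_, 1) || substr($t, $_, 1) ? 1 : 0 }
        0 .. length($s) - 1;
}

my $or_string = or_bitstrings('01000001', '01000010');   # '01000011'
```

Both routes give the same bits, but the first one touches one byte per eight bits while the second one handles one character per bit.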
You said:
I am working on an encryption system where each ASCII char is assigned a certain number of bits, so for example, if the text to be encrypted is 1000 bytes, then after encryption that text will be converted to 36000 bytes consisting of just 0's and 1's.
Is it the case that the encryption system requires access to the entire data stream in order to work at all? If encrypting, say, 10 sets of 100 bytes (producing 10 sets of 3600 bytes) works as well as cranking a lump of 1000 bytes into 36000, then you should just read, process and output a small portion of data at a time, rather than trying to hold an entire file -- with massive amounts of wasted bits -- in memory at one time.
Apart from that -- I'm sorry but... -- if memory consumption is an issue, and forcing some particular method of bit padding is a requirement, I'd use C rather than Perl.
update: Maybe what you want is sysread, to bring a stated number of bytes into an input scalar variable; e.g.:
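# The inner parentheses matter: without them, > binds tighter than =
# and $n_bytes_read would only ever hold a true/false value.
while ( ( $n_bytes_read = sysread( FILE, $inpbuf, 32 ) ) > 0 )
{
    if ( $n_bytes_read < 32 ) {  # short read: normally the last chunk of a regular file
        # ... maybe this needs special treatment
    }
    process_input_bytes( $inpbuf );
}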
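For completeness, here is a self-contained sketch of the same chunked approach with the file handling filled in. Everything specific in it is an assumption: process_input_bytes is only a stand-in that expands each byte to eight '0'/'1' characters (your real per-character encoding would go there), and process_file_in_chunks is a name chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real per-chunk transformation:
# each input byte becomes 8 characters of '0' or '1'.
sub process_input_bytes {
    my ($inpbuf) = @_;
    return unpack 'B*', $inpbuf;
}

# Read $path in pieces of at most $chunk_size bytes and print the
# processed output to $out_fh, so only one chunk is held in memory.
sub process_file_in_chunks {
    my ($path, $out_fh, $chunk_size) = @_;
    open my $in_fh, '<:raw', $path or die "Cannot open $path: $!";
    while (1) {
        my $inpbuf = '';
        my $n_bytes_read = sysread $in_fh, $inpbuf, $chunk_size;
        die "Read error on $path: $!" unless defined $n_bytes_read;
        last if $n_bytes_read == 0;     # end of file
        print {$out_fh} process_input_bytes($inpbuf);
    }
    close $in_fh;
    return;
}
```

You would call it as something like process_file_in_chunks('input.dat', \*STDOUT, 32), where 'input.dat' is a placeholder file name. Memory use then depends on the chunk size, not on the size of the file.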