http://qs321.pair.com?node_id=241332


in reply to what's faster than .=

Perl strings are allocated using the malloc() C library routine and 'grown' using the realloc() C library routine. An append operation is, at worst, a realloc() (to 'grow' the string when its buffer is too small) followed by a memcpy() that copies the additional data onto the end (in your case two bytes: the character, plus the '\0' character that is kept at the end of every Perl string).
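As a rough illustration of that realloc()-then-memcpy() pattern, here is a minimal sketch in C. The struct gstr type and the gstr_append() function are hypothetical names invented for this example; this is not Perl's actual SV code, and it reallocates on every append where a real implementation may skip the realloc() if the buffer is already large enough.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string, for illustration only; not Perl's SV layout. */
struct gstr {
    char  *buf;  /* NUL-terminated data, or NULL while still empty */
    size_t len;  /* bytes in use, not counting the trailing '\0' */
};

/* Append n bytes from data. Returns 0 on success, -1 if realloc() fails. */
int gstr_append(struct gstr *s, const char *data, size_t n)
{
    /* Grow to hold the old data, the new data, and the trailing '\0'.
       realloc(NULL, size) behaves like malloc(size). */
    char *p = realloc(s->buf, s->len + n + 1);
    if (p == NULL)
        return -1;                /* the original buffer is still valid */

    memcpy(p + s->len, data, n);  /* copy only the appended bytes */
    s->buf = p;
    s->len += n;
    s->buf[s->len] = '\0';        /* keep the terminator at the end */
    return 0;
}
```

Starting from struct gstr s = { NULL, 0 }, appending one character is gstr_append(&s, "x", 1). Whether the realloc() inside has to move any data is up to the C library, which is the point of the rest of this post.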

The malloc()/realloc() C library functions have platform-specific restrictions that define minimum space allocation increments. For example, most platforms require data structures to be aligned on a 4 or 8 byte boundary. Since malloc() can be used to allocate any type, including larger types such as 'double', malloc() will always round the size up to at *least* the next 4 or 8 byte boundary. This means that even if Perl allocated a string using malloc(1), malloc() would return a 4 or 8 byte memory area. Later realloc() calls asking for 2 or 3 bytes would end up being no-ops, as all of 1, 2, and 3 result in a minimum size of 4 or 8 bytes.
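The rounding itself is simple arithmetic. The sketch below only models the minimum-increment rule described above. The function name round_up_to_alignment() is made up for this example, and real allocators add bookkeeping overhead and may enforce a larger minimum.

```c
#include <stddef.h>

/* Round size up to the next multiple of align (align must be > 0).
   This models only the minimum allocation increment, nothing else. */
size_t round_up_to_alignment(size_t size, size_t align)
{
    return (size + align - 1) / align * align;
}
```

With an 8 byte increment, requests of 1, 2, and 3 bytes all round to 8, so growing a string from 1 to 3 bytes never needs a bigger block. A 9 byte request rounds to 16.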

The malloc() and realloc() routines are usually also optimized to handle larger data structures. One of the more straightforward implementations is for malloc() or realloc() to round allocation sizes up to the next power of 2. Therefore malloc(5) would return an 8 byte memory area, malloc(9) would return a 16 byte memory area, and malloc(17) would return a 32 byte memory area. realloc() will only need to copy bytes if the memory area cannot be enlarged to the next power of 2 without moving it.
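To make the power-of-2 scheme concrete, here is a small sketch in C. Both function names are hypothetical, and this models only the size rounding described above, not any particular C library's allocator.

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest power of 2 that is >= size (returns 1 for size 0 or 1).
   For sizes above the largest representable power of 2 the loop stops
   at that power instead of overflowing. */
size_t next_power_of_two(size_t size)
{
    size_t p = 1;
    while (p < size && p <= SIZE_MAX / 2)
        p <<= 1;
    return p;
}

/* Under this scheme, does growing from old_size to new_size cross into
   a larger size class, so that realloc() might have to move the data? */
int needs_bigger_block(size_t old_size, size_t new_size)
{
    return next_power_of_two(new_size) > next_power_of_two(old_size);
}
```

Under this model a string that grows one byte at a time from 5 to 8 bytes stays inside the same 8 byte area. Only the step from 8 to 9 bytes crosses into the 16 byte class, so the number of potentially expensive reallocations grows with the logarithm of the final length rather than with the number of appends.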

For memory areas larger than one page of memory, it is not uncommon for malloc() and realloc() to be implemented using mmap() and mremap() (mremap() exists on at least modern versions of Linux). If this is the case, realloc() can increase the size of a 2 page area (8192 bytes with 4 Kbyte pages) to 4 pages without copying the data, even if the page after the two pages is in use. The mremap() call can be implemented to re-address virtual pages, meaning that although, say, 128 Kbytes of data may appear to be 'moved' in virtual memory from one address to another, no memory copy ever needs to take place.

All of these details are highly implementation-specific. The cost of a string append operation in Perl depends almost entirely on the efficiency of the realloc() C library routine that Perl was compiled to use. In general, the implementations are quite efficient. If you want to understand the cost in terms of real numbers, play with the Benchmark module ('perldoc Benchmark') and do your own timings.