perltutorial
jdporter
<h1>Arrays: A Tutorial/Reference</h1>
<p>
An <i>array</i> is a type of Perl variable.
An array variable is an ordered collection of any number (zero or more) of <i>elements</i>.
Each element in an array has an [wp://Index (information technology)#Array_element_identifier|index] which is a non-negative integer.
Perl arrays are (like nearly everything else in the language) dynamic:<ol>
<li> they grow as necessary, without any need for explicit memory management; </li>
<li> they are [wp://heterogeneous], or generic, which is to say, an array doesn't know or enforce the type of its elements. </li>
</ol>
</p><p>
Actually, the values of Perl array elements can only be [id://591875|<i>scalars</i>].
This may sound like a limitation, if you think of scalars only as comprising numbers and strings;
but a scalar can also be a [doc://perlreftut|reference] to any Perl data type: scalar, array, [id://591877|hash], etc.
Therefore, by storing a reference to any data type in an array's elements, [doc://perllol|arbitrarily complex] [doc://perldsc|data structures] are possible. <!-- <small>(This is roughly comparable to having an array of void pointers in [wp://C (programming language)|C] — except, of course, that Perl references are "smart": they know what they refer to.)</small> -->
Other scalar types, such as [doc://perlfaq5#How-can-I-make-a-filehandle-local-to-a-subroutine?--How-do-I-pass-filehandles-between-subroutines?--How-do-I-make-an-array-of-filehandles?-filehandle,-local-filehandle,-passing-filehandle,-reference|filehandles] and the special [doc://undef] value, are also naturally allowed in the elements of an array.
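As a small sketch of that flexibility (the variable names and data here are made up purely for illustration), a single array can hold numbers, strings, <c>undef</c>, and references side by side:
<code>
use strict;
use warnings;

my %colour = ( sky => 'blue' );   # hypothetical data, for illustration only
my @mixed  = (
    42,              # a number
    'forty-two',     # a string
    undef,           # the undefined value
    [ 1, 2, 3 ],     # a reference to an anonymous array
    \%colour,        # a reference to a hash
);

print scalar(@mixed), "\n";   # 5
print $mixed[3][1], "\n";     # 2
print $mixed[4]{sky}, "\n";   # blue
</code>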
</p><p>
So, given those characteristics of a Perl array, what kinds of things would you want to do with it?
That is, what <i>operations</i> should be able to act on it?
You might conceive different sets of operations, or <i>interfaces</i>, depending on how you expect to use an array in your program:
<ol>
<li> as a monolithic whole;</li>
<li> as a [wp://Stack (data structure)|stack] or [wp://deque|queue] — that is, only working with its ends;</li>
<li> as a [wp://random access] table of scalars — that is, working with all of its elemental parts.</li>
</ol>
Perl arrays can be used in all those ways, and more.
</p>
<readmore>
<p>
Here are the fundamental Perl array operations:
</p><ul>
<li> Initialize </li>
<li> Clear </li>
<li> Get count of elements </li>
<li> Get the highest index </li>
<li> Get list of element values </li>
<li> Add new elements at the end </li>
<li> Remove an element from the end </li>
<li> Add new elements at the beginning </li>
<li> Remove an element from the beginning </li>
<li> Access one element at an arbitrary index </li>
<li> Access multiple elements at arbitrary indices </li>
<li> Insert/Delete/Replace items in the middle of an array </li>
</ul>
<p>
This tutorial focuses specifically on the array variable type. There are many things you can do in Perl with lists which will also work on arrays; for example, you can iterate over their contents using [doc://perlsyn#Foreach-Loops-for-foreach|foreach]. Those things are not discussed here.
See also: [doc://perlfaq4#What-is-the-difference-between-a-list-and-an-array?]
</p>
<h3> Initialize an array </h3>
<p>
Simple assignment does the job:
<code>
@array = ( 1, 2, 3 );
@array = function_generating_a_list();
@array = @another_array;
</code>
The key points are that <ol>
<li> the assignment to an array gives [id://738558|list context] to the right hand side; </li>
<li> the right side can be any expression which results in a list of zero or more scalar values. </li>
</ol>
The values are inserted in the array in the same order as they occur in the list,
beginning with array index zero. For example, after executing
<code>
@array = ( 'a', 'b', 'c' );
</code>
element 0 will contain 'a', element 1 will contain 'b', and so on.
</p><p>
Whenever an array is assigned to <i>en masse</i> like this,
any contents it may have had before the assignment are removed!
</p>
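<p>
The following is a minimal sketch of those points. The function <c>letters()</c> is invented for the example; it stands in for anything that returns a list.
<code>
use strict;
use warnings;

# letters() is a made-up function standing in for any list-returning code
sub letters { return ( 'x', 'y' ) }

my @array = ( 'a', 'b', 'c' );
print "$array[0] $array[1] $array[2]\n";   # a b c

@array = letters();                        # the old contents are discarded
print scalar(@array), "\n";                # 2

my @copy = @array;                         # copies the values
$copy[0] = 'changed';                      # so changing the copy ...
print "$array[0]\n";                       # x   (... leaves the original alone)
</code>
</p>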
<h3> Clear an array </h3>
<p>
Simply assign a zero-length list:
<code>
@array = ();
</code>
Assigning a value such as <c>undef</c>, <c>0</c>, or <c>''</c> <i>will not work!</i>
Rather, it will leave the array containing one element, with that one value. That is,
<code>
@array = 0;
# and
@array = ( 0 );
</code>
are functionally identical.<br/>
Note that if your goal is to assign the one-element list <c>(0)</c> to the array,
omitting the parentheses is considered bad style, even though they are not strictly necessary in this case.
</p>
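<p>
A short sketch that makes the difference visible by counting elements after each assignment:
<code>
use strict;
use warnings;

my @array = ( 1, 2, 3 );

@array = ();
print scalar(@array), "\n";    # 0

@array = undef;                # NOT a clear: one element, whose value is undef
print scalar(@array), "\n";    # 1
print defined $array[0] ? 'defined' : 'undef', "\n";   # undef

@array = 0;                    # same as @array = ( 0 )
print scalar(@array), "\n";    # 1
</code>
</p>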
<h3> Get count of elements </h3>
<p>
To get the "length" or "size" of an array, simply use it in a [id://738558|scalar context].
For example, you can "assign" the array to a scalar variable:
<code>
$count = @array;
</code>
and the scalar variable will afterwards contain the count of elements in the array.
Other scalar contexts work as well:
<code>
print "# Elements: " . @array . "\n";
</code>
(Yes, <tt>[doc://print]</tt> gives its arguments list context, but each operand of the dot (string concatenation) operator is evaluated in scalar context, so <c>@array</c> yields its element count here.)
</p><p>
You can always force scalar context on an array by using the function named <tt>[doc://scalar]</tt>:
<code>
print "# Elements: ", scalar(@array), "\n";
</code>
Note that this is a <i>get</i>-only property; you cannot change the length of the array by assigning a scalar to the array variable.
For example, <c>@array=0</c> does not empty the array (as stated in the previous section, <b>Clear an array</b>).
</p>
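<p>
As a hedged illustration of how much the context matters, compare a scalar assignment with a list assignment. The parenthesized form is not covered above; it is included only as a contrast.
<code>
use strict;
use warnings;

my @array = ( 'a', 'b', 'c' );

my $count   = @array;    # scalar assignment: the number of elements
my ($first) = @array;    # list assignment (note the parentheses): the first element

print "$count $first\n";                 # 3 a
print "at most five\n" if @array <= 5;   # a numeric comparison is a scalar context too
</code>
</p>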
<h3> Get the highest index </h3>
<p>
Often, you want to know the highest index in an array, that is, the index of its last element.
Perl provides a special syntax for obtaining this value:
<code>
$highest_index = $#array;
</code>
This is useful, for example, when you want to create a list of all the indices in an array:
<code>
foreach ( 0 .. $#array ) {
# $_ is set to each index number, in turn, from first (0) to last ($#array)
}
</code>
</p><p>
Unlike <c>scalar(@array)</c>, <c>$#array</c> <i>is</i> a settable property.
When you assign to an array's <c>$#array</c> form, you cause its length (number of elements) to grow or shrink accordingly.
If the length increases, the new elements will be uninitialized (that is, they'll be [doc://undef]).
If the length decreases, elements will be dropped from the end.
(Note, however, that perl dynamically sizes arrays, so forcing the length of an array like this is not something you'd normally need to do.)
</p>
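<p>
A minimal sketch of reading and assigning <c>$#array</c>:
<code>
use strict;
use warnings;

my @array = ( 'a', 'b', 'c' );
print $#array, "\n";           # 2

$#array = 4;                   # grow to five elements
print scalar(@array), "\n";    # 5
print defined $array[4] ? 'defined' : 'undef', "\n";   # undef

$#array = 0;                   # shrink to one element
print "@array\n";              # a
</code>
</p>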
<h3> Clear an array - Round 2 </h3>
<p>
Given that <c>$#array</c> is assignable, you can clear an array by assigning -1 to its <c>$#array</c> form.
(Why -1? Well, that's what you see in <c>$#array</c> if <c>@array</c> is empty.)
Generally, this is not considered good style, but it's acceptable.
</p>
<p>
Another way to clear an array is <c>undef @array</c>.
This technique should be used with caution, because it frees up some memory used internally to hold the elements.
In most cases, this isn't worth the processing time. About the only situation in which you'd want to do this is if <c>@array</c> has a huge number of elements, and <c>@array</c> will be re-used after being cleared but will not hold a huge number of elements again.
</p><p>
Beware: As mentioned above in <b>Clear an array</b>, assigning <c>@array = undef</c> does <i>not</i> clear an array.
Unlike the case with scalars, <c>@a=undef</c> and <c>undef(@a)</c> are not equivalent!
</p>
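<p>
Both alternatives can be checked with a quick sketch like this one:
<code>
use strict;
use warnings;

my @array = ( 1, 2, 3 );
$#array = -1;                  # clear by setting the highest index
print scalar(@array), "\n";    # 0

@array = ( 1, 2, 3 );
undef @array;                  # clear and release the internal storage
print scalar(@array), "\n";    # 0

my @empty;
print $#empty, "\n";           # -1
</code>
</p>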
<h3> Get list of element values </h3>
<p>
To get the entire list of values stored in an array at any given time, simply use it in a list context:
<code>
print "Here are your things: ", @array, "\n";
</code>
This is useful for iterating over the list of values stored in an array, one at a time:
<code>
foreach ( @array ) { print "$_\n" }   # $_ is set to each element value, in turn
</code>
This works because in the <tt>foreach</tt> control construct, the stuff inside the parentheses is expected to be a list — or, more precisely, an expression which will be evaluated in list context and is expected to result in a list of (zero or more) scalar values.
</p>
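<p>
Here is a small sketch of an array in list context. The double-quoted form is shown only for comparison: interpolating an array into a string joins its elements with a space by default.
<code>
use strict;
use warnings;

my @array = ( 'a', 'b', 'c' );

print @array, "\n";    # abc    (the elements are printed back to back)
print "@array\n";      # a b c  (interpolation joins them with a space)

foreach my $value ( @array ) {
    print "value: $value\n";
}
</code>
</p>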
<p>
<b>Quiz:</b> What's the difference between these two lines of code:
<code>
$x = @array;
@x = @array;
</code>
</p><p>
Answer:
<spoiler><br/>
In the first, the scalar $x is set to the <b>number of elements</b> in @array.<br/>
In the second, the array @x is set to a <b>copy of the contents</b> of @array.<br/>
<br/></spoiler>
</p>
<h3> Remove an element from the end </h3>
<p>
The function to remove a single element from the end of an array is <tt>[doc://pop]</tt>.
Given the code:
<code>
@array = ( 'a', 'b', 'c' );
$x = pop @array;
</code>
<c>$x</c> will contain <c>'c'</c> and <c>@array</c> will be left with two elements, <c>'a'</c> and <c>'b'</c>.
</p><p>
Note: By "end", we mean the end of the array with the highest index.
</p>
<h3> Add new elements at the end </h3>
<p>
Use the <tt>[doc://push]</tt> function to add a number of (scalar) values to the end of an array:
<code>
push @array, 8, 10 .. 15;
</code>
</p>
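<p>
Taken together, <tt>push</tt> and <tt>pop</tt> let you treat an array as a stack. A minimal sketch (the variable names are arbitrary):
<code>
use strict;
use warnings;

my @stack = ( 'a', 'b' );

push @stack, 8, 10 .. 12;      # appends 8, 10, 11, 12
print "@stack\n";              # a b 8 10 11 12

my $top = pop @stack;          # removes and returns the last element
print "$top\n";                # 12
print scalar(@stack), "\n";    # 5

my @none;
my $nothing = pop @none;       # pop on an empty array returns undef
print defined $nothing ? 'defined' : 'undef', "\n";   # undef
</code>
</p>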
<h3> Remove an element from the beginning </h3>
<p>
The <tt>[doc://shift]</tt> function removes one value from the beginning of the array.
That is, it removes (and returns) the value in element zero, and shifts all the rest of the elements down one, with the effect that the number of elements is decreased by one.
Given the code:
<code>
@array = ( 'a', 'b', 'c' );
$x = shift @array;
</code>
<c>$x</c> will contain <c>'a'</c> and <c>@array</c> will be left with two elements, <c>'b'</c> and <c>'c'</c>.
(You can see that <tt>[doc://shift]</tt> is just like <tt>[doc://pop]</tt>, but acts on the other end of the array.)
</p>
<h3> Add new elements at the beginning </h3>
<p>
Analogously, <tt>[doc://unshift]</tt> acts on the beginning of the array as <tt>[doc://push]</tt> acts on the end.
Given:
<code>
@array = ( 1, 2 );
unshift @array, 'y', 'z';
</code>
<c>@array</c> will contain <c>( 'y', 'z', 1, 2 )</c>.
</p>
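<p>
Combining the four functions, an array can serve as a queue: add at one end, remove from the other. A sketch with made-up values:
<code>
use strict;
use warnings;

my @queue = ( 1, 2 );
unshift @queue, 'y', 'z';    # the list is prepended as a unit, in order
print "@queue\n";            # y z 1 2

my $head = shift @queue;
print "$head\n";             # y
print "@queue\n";            # z 1 2

push @queue, 'tail';         # enqueue at the end ...
my $next = shift @queue;     # ... dequeue from the beginning
print "$next\n";             # z
print "@queue\n";            # 1 2 tail
</code>
</p>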
<h3> Access one element at an arbitrary index </h3>
<p>
The first element of an array is accessed at index 0:
<code>
$first_elem = $array[0];
</code>
Why the <c>$</c> sigil? Remember that the elements of an array can only be scalar values.
The <c>$</c> makes sense here because we are accessing a single, scalar element out of the array.
The thing inside the square brackets does not have to be an [id://943|integer literal]; it can be
any expression which results in a number. (If the resulting number is not an integer, it will be
truncated to an integer, that is, rounded toward zero.)
</p><p>
Change the value of the last element:
<code>
$array[ $#array ] += 5;
</code>
</p>
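<p>
A few index expressions in action, as a quick sketch:
<code>
use strict;
use warnings;

my @array = ( 10, 20, 30, 40 );
my $i = 1;

print $array[ $i + 1 ], "\n";    # 30
print $array[ 2.7 ], "\n";       # 30  (2.7 is truncated to 2)

$array[ $#array ] += 5;
print $array[3], "\n";           # 45

$array[6] = 70;                  # assigning past the end grows the array
print scalar(@array), "\n";      # 7
</code>
</p>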
<h3> Access multiple elements at arbitrary indices </h3>
<p>
By analogy, if you want to access multiple elements at once, you would use the <c>@</c> sigil instead of the <c>$</c>.
In addition, you would provide a list of index values within the square brackets, rather than just one.
<code>
( $first, $third, $fifth ) = @array[0,2,4];
</code>
<b>Jargon alert:</b> this syntax for accessing multiple elements of an array at once is called an <i>array slice</i>.
</p><p>
Never forget that with an array slice the index expression is a list: it will be evaluated in list context, and can
return any number (including zero) of index numbers. However many numbers are in the list of indices, that's how many
elements will be included in the slice.
</p><p>
Beware, though: an array slice may <i>look</i> like an array, due to the <c>@</c> sigil, but it is not.
For example,
<code>
$n = @array[0..$#array];
</code>
will <i>not</i> yield the number of items in the slice! In scalar context, a slice evaluates to its last element.
</p>
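<p>
The sketch below reads a few slices and shows what a slice gives in scalar context, namely its last element. If you need the count, one option is to copy the slice into an array first.
<code>
use strict;
use warnings;

my @array = ( 'a', 'b', 'c', 'd', 'e' );

my ( $first, $third, $fifth ) = @array[ 0, 2, 4 ];
print "$first $third $fifth\n";    # a c e

my @picked = @array[ 1 .. 3 ];     # a slice copied into a real array
print scalar(@picked), "\n";       # 3

my $n = @array[ 0 .. $#array ];    # scalar context: the last element of the slice
print "$n\n";                      # e
</code>
</p>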
<p>
Set the second, third, and fourth elements in an array:
<code>
@array[1..3] = ( 'x', 'y', 'z' );
</code>
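Because a slice can appear on either side of an assignment, it can also rearrange elements in place. A minimal sketch (swapping elements this way is a common idiom, not something specific to this tutorial):
<code>
use strict;
use warnings;

my @array = ( 'a', 'b', 'c', 'd', 'e' );

@array[ 1 .. 3 ] = ( 'x', 'y', 'z' );
print "@array\n";                   # a x y z e

@array[ 0, 4 ] = @array[ 4, 0 ];    # swap the first and last elements
print "@array\n";                   # e x y z a
</code>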
<blockquote>
<table border=1 cellspacing=0><tr><td>
<h2>Sidebar: More about indices</h2>
<p>
We said earlier that array indices are non-negative integers.
While this is strictly true at some level, perl conveniently lets you index elements from the <i>end</i> of the array using negative indices. <tt>-1</tt> refers to the last element, <tt>-2</tt> to the next-to-last element, and so on.
To oversimplify a bit, <tt>-1</tt> acts like an alias for <tt>$#array</tt>... <i>but only in the context of indexing <tt>@array</tt>!</i>
</p>
<p>
So the following are equivalent:
<code>
$array[ -1 ]
$array[ $#array ]
</code>
But beware:
<code>
@array[ 0 .. $#array ]
</code>
<b>cannot</b> be written as:
<code>
@array[ 0 .. -1 ]
</code>
because in this situation the -1 is an argument of the [doc://perlop#Range-Operators-operator,-range-range-..-...|<tt>..</tt> range operator], which has no idea what "highest index number" is actually wanted. Since 0 is greater than -1, the range <c>0 .. -1</c> is simply empty, and so is the resulting slice.
</p>
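<p>
A quick sketch of negative indices, including the empty-range trap:
<code>
use strict;
use warnings;

my @array = ( 'a', 'b', 'c', 'd' );

print $array[-1], "\n";             # d
print $array[-2], "\n";             # c

my @last_two = @array[ -2, -1 ];    # negative indices work in slices too
print "@last_two\n";                # c d

my @nothing = @array[ 0 .. -1 ];    # 0 .. -1 is an empty range
print scalar(@nothing), "\n";       # 0
</code>
</p>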
</td></tr></table>
</blockquote>
<h3>Insert/Delete/Replace items in the middle of an array</h3>
<p>
It is possible to insert items into the middle of an array and remove items from the middle of an array.
<!-- (You can think of it like pushing and popping items in the middle of an array; but don't use those words, as they're not technically accurate for this.) -->
The function which enables this is called <tt>[doc://splice]</tt>.
It can insert items anywhere in an array (including the ends), and it can remove (and return) any sub-sequence of items from an array. In fact, it can do both of these at once: remove some sub-sequence of items and put another list of values in their place.
<tt>[doc://splice]</tt> always returns the list of removed values, if any.
</p>
<p>
The second argument of <tt>[doc://splice]</tt> is an array index, and as such, everything we've said about indices applies to it, including negative values that count from the end.
</p>
<p>
The [wp://deque|queue]-like array functions could have been implemented in terms of <tt>[doc://splice]</tt>, as follows:
<code>
unshift @a, @b;
# could be written as
splice @a, 0, 0, @b;
</code>
<code>
push @a, @b;
# could be written as
splice @a, $#a+1, 0, @b; # we have to index to a position PAST the end of array!
</code>
<code>
$b = shift @a;
# could be written as
$b = splice @a, 0, 1;
</code>
<code>
$b = pop @a;
# could be written as
$b = splice @a, -1, 1;
</code>
(Beware that in scalar context splice returns the last of the list of values removed;
shift and pop always return the one value removed.)
</p>
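<p>
These equivalences can be checked with a sketch like the following. The variable names are arbitrary, and <c>scalar(@a)</c> is used in place of <c>$#a+1</c>, which is the same number.
<code>
use strict;
use warnings;

my @a = ( 1, 2, 3, 4 );
my @b = ( 'x', 'y' );

splice @a, 0, 0, @b;              # like: unshift @a, @b
print "@a\n";                     # x y 1 2 3 4

splice @a, scalar(@a), 0, @b;     # like: push @a, @b
print "@a\n";                     # x y 1 2 3 4 x y

my $first = splice @a, 0, 1;      # like: shift @a
my $last  = splice @a, -1, 1;     # like: pop @a
print "$first $last\n";           # x y
print "@a\n";                     # y 1 2 3 4 x

my $scalar = splice @a, 0, 2;     # removes 'y' and 1, but returns only the last of them
print "$scalar\n";                # 1
</code>
</p>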
<p>
Remove 3 items, beginning with the 3rd:
<code>
@b = splice @a, 2, 3;
</code>
Insert some new values after the 2nd item (that is, starting at index 2), without deleting any:
<code>
splice @a, 2, 0, @b;
</code>
Replace the 4th and 5th items with three other values:
<code>
splice @a,              # array to modify
       3,               # starting with 4th item
       2,               # remove (replace) two items
       'x', 'y', 'z';   # arbitrary list of new values to insert
</code>
And while we're at it: <b>Clear an array - Round 3:</b>
<code>
@a = ();
# could be written as
splice @a, 0;
</code>
</p>
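<p>
Here is a worked run of the remove, insert, and replace forms on one sample array, so the effect of each call can be followed step by step:
<code>
use strict;
use warnings;

my @a = ( 'a' .. 'h' );

my @removed = splice @a, 2, 3;     # remove 3 items starting at index 2
print "@removed\n";                # c d e
print "@a\n";                      # a b f g h

splice @a, 2, 0, @removed;         # put them back at index 2, deleting nothing
print "@a\n";                      # a b c d e f g h

splice @a, 3, 2, 'x', 'y', 'z';    # replace the 4th and 5th items with three values
print "@a\n";                      # a b c x y z f g h

splice @a, 0;                      # no length given: remove everything from index 0 on
print scalar(@a), "\n";            # 0
</code>
</p>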
<h2>Any Questions?</h2>
<p>
The Perl FAQ has a section on [doc://perlfaq4#Data:-Arrays|Arrays].
</p>
<h2>Related Resources</h2>
<p>
<ul>
<li> [id://17890] </li>
<li> [id://90647] </li>
</ul>
</p>
<hr/>
<h3><i>What about <tt>wantarray</tt>?</i></h3>
<p>
Despite its name, [doc://wantarray] has nothing to do with arrays. It is misnamed.
It should have been named something like <tt>detect_context</tt>.
It is used inside subroutines to detect whether the sub is being called in list, scalar, or void [id://738558|context].
It returns true, false, and undef in those cases, respectively.
</p>
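<p>
A minimal sketch of the three cases. The sub name <c>context</c> and the variable <c>$seen</c> are invented for this example; <c>$seen</c> exists only so that the void-context call leaves something to inspect.
<code>
use strict;
use warnings;

my $seen;    # records the most recently detected context

sub context {
    my $want = wantarray;
    $seen = defined $want ? ( $want ? 'list' : 'scalar' ) : 'void';
    return $seen;
}

my @list   = context();    # list context
my $scalar = context();    # scalar context
print "$list[0] $scalar\n";    # list scalar

context();                 # void context: the return value is discarded
print "$seen\n";           # void
</code>
</p>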
<hr/>
<h3>Other possible topics:</h3>
<ul>
<li> [doc://perltie|tie]ing arrays; the [mod://Tie::Array] module </li>
<li> [doc://delete] and how it doesn't work on arrays </li>
<li> [doc://exists] and how it DOES work on arrays </li>
<li> Various related Perl FAQ entries </li>
<li> Array-related modules, such as those in the [cpan://Array::] family </li>
<li> Traps/gotchas, such as deleting from an array while iterating over it </li>
<li> multidimensional arrays </li>
</ul>
</readmore>
<hr/>
<p>
<i>If you have corrections or suggestions for changes to this tutorial, please [/msg] me if possible, rather than posting a reply. Thanks.</i>
</p>