Looping Over Arrays, Revisited

© 2012, Martin Rinehart

Prerequisites:

1) JavaScript with Native Objects, Volume III of the five-volume Frontend Engineering series, or a good grounding in basic JavaScript.

2) The Sparse Arrays page.

In JavaScript, an array is an object, commonly having property names (strings) that look like small, non-negative integer subscripts. Arrays are constructed using array literals or by calling the Array() constructor:

var array0 = [];
var array1 = ['cat', 'dog', 'mouse'];

// not recommended:
var array2 = new Array();

// not recommended, creates ghosts:
var array3 = new Array(10);

For the remainder of this page we use the word "object" as shorthand for "objects that are not arrays."

JavaScript, born 1995, features two distinct loop types. First is the arithmetic loop, which comes straight from C, 1972 (see Genealogy of Programming Languages):

for ( initializers;
      condition;
      increment action ) {
    /* loop body */
}

// Example:

var len = array.length;
for ( var i = 0; i < len; i++ ) {
    /* act on array[i] */
}

The second type, an enumerated loop, uses JavaScript-specific syntax; variants are common in other languages of the late 20th century:

for ( propname in object ) {
    /* loop body */
}

// Example (recommended?):

for ( var i in array ) {
    /* act on array[i] */
}

This leads to the question: is it appropriate to use the enumerated loop (the for/in loop) on arrays?

The Conventional Wisdom

The conventional wisdom has been that the for/in loop should be used for non-array objects while the arithmetic loop should be used for arrays. Here's a common answer from a popular forum in response to a newbie question about using for/in to iterate over an array: "... for iterating over arrays sequential [arithmetic] loops are always recommended."

Of course the word "always" is almost never right, but in this case it is very nearly accurate. Only beginners who don't know any better use a for/in loop to iterate over array values.

Well, make that beginners and me, too. Let's look at this carefully.

Reasons to Prefer for/in Loops

Enumerated loops had become popular by the mid-90s, when JavaScript was first written. Java, another language born in '95, originally featured just the arithmetic loop but later added an enumerated loop. Why are they popular?

No Errors About Ends

How long is that array? Typically, it is array.length elements long, and the last element (assuming the first subscript is zero and the rest are contiguous) is array[array.length-1]. If you have never looped until one more or one less than the end of an array, you haven't written many loops. This is an invitation to human error.

No Problems with Sparse Arrays¹

The typical arithmetic loop assumes that the array elements are contiguous, which is often, but not always, the case. What happens if array[i] has been deleted? Actually, this is harmless if the program recognizes the possibility:

if ( array[i] === undefined ) { continue; }

How many loops have you seen that make no such check?
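A minimal sketch of that check in a complete loop. The helper name collectDefined and the sample data are illustrative assumptions, not part of the original page. Note that the guard also skips any element deliberately set to undefined, which may or may not be what you want.

```javascript
// Hypothetical helper: gather the values an arithmetic loop sees,
// skipping subscripts whose value is undefined (deleted or never set).
function collectDefined(array) {
    var found = [];
    var len = array.length;
    for ( var i = 0; i < len; i++ ) {
        if ( array[i] === undefined ) { continue; }
        found.push( array[i] );
    }
    return found;
}

var pets = ['cat', 'dog', 'mouse'];
delete pets[1]; // leaves a hole; pets.length is still 3

var seen = collectDefined(pets); // 'cat' and 'mouse' only
```

Without the guard, the loop body would run three times and receive undefined on the second pass.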

Non-Numeric Subscripts

(This heading will reappear when we look at the advantages of arithmetic loops. It cuts both ways.) An array is an object, a collection of name/value pairs. There is no law that says you cannot do this:

var array = [1, 2, 3];
    array['bogus'] = 'loop death';

The enumerated loop will find that "bogus" element and present it to you for your handling. The arithmetic loop will never find it.
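The difference can be seen by recording which property names each loop visits. This is a sketch; the two helper functions are hypothetical names introduced only for the comparison.

```javascript
// Collect the property names a for/in loop visits.
function namesFromForIn(array) {
    var names = [];
    for ( var name in array ) {
        names.push( name ); // for/in yields names as strings
    }
    return names;
}

// Collect the subscripts an arithmetic loop visits, as strings.
function namesFromArithmetic(array) {
    var names = [];
    for ( var i = 0; i < array.length; i++ ) {
        names.push( String(i) );
    }
    return names;
}

var numbers = [1, 2, 3];
    numbers['bogus'] = 'loop death';

var enumerated = namesFromForIn(numbers);      // includes 'bogus'
var arithmetic = namesFromArithmetic(numbers); // '0', '1', '2' only
```

Adding 'bogus' does not change numbers.length, which is why the arithmetic loop never reaches it.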

Efficiency

No, we are not going to look at some timings "proving" that one loop is better than the other. That "proof" is on its way to obsolescence before you view the page.

What we are going to do, however, is stick to first principles: tell your compiler what is to be done, not how to do it. Code written to that principle has the best chance to produce good results if it stays in use for years.

Reasons to Prefer C-Style Loops

hasOwnProperty() Not Needed

If anyone has added anything to Array.prototype a for/in loop will find it and present it to your loop code, which is almost certainly not what you want. This is one of two reasons cited by Crockford (in The Good Parts) for preferring arithmetic loops.

There are two ways to avoid this problem. One is to wrap all for/in loop code with:

for ( var name in object ) {
    if ( object.hasOwnProperty(name) ) {
        /* loop code */
    }
}

If you use Crockford's JSLint you recognize this as an optional check. Some prefer to turn it off, myself included. The other way, which I prefer, is to refuse to use any library or other foreign code that might add anything to Array.prototype. There are far too many other things that can go wrong if you permit this.
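A small sketch of the problem and the hasOwnProperty() guard. The method name shuffle is made up to stand in for whatever foreign code might add to Array.prototype.

```javascript
// Simulate foreign code that extends Array.prototype.
// "shuffle" is a hypothetical name used only for this illustration.
Array.prototype.shuffle = function () { /* body irrelevant here */ };

var animals = ['cat', 'dog'];

// Unguarded for/in: also visits the inherited 'shuffle' property.
var unguarded = [];
for ( var name in animals ) {
    unguarded.push( name );
}

// Guarded for/in: visits only the array's own properties.
var guarded = [];
for ( var key in animals ) {
    if ( animals.hasOwnProperty(key) ) {
        guarded.push( key );
    }
}

// Remove the simulated extension so it does not leak into other code.
delete Array.prototype.shuffle;
```

Here unguarded contains 'shuffle' alongside '0' and '1', while guarded contains only '0' and '1'.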

Non-Numeric Property Names

Above we discussed this as an advantage of the for/in loop. Looping through the numeric names of array elements, ignoring any others, may be precisely what you want to do. (The coder who uses non-numeric names in an array is probably the same one who adds properties to Array.prototype. This is another rich source of bugs that should be avoided at all costs, never mind the loop issue.)

Order May Not Be Numeric

The second reason that Crockford cites for using arithmetic loops is that the order of items in an array was not guaranteed. A for/in loop in an older MSIE might present array[5] before array[3], for example. Don't let this scare you.

All browsers except MSIE order arrays in ascending numeric sequence. MSIE before IE 9 stored arrays in the order the elements were created². If you create an array using an array literal (var array = ['cat', 'dog', 'mouse']), it will be in numeric order in all browsers. Similarly, if you create an empty array and then push values, the values will be in numeric order in all browsers. For example:

// in numeric order
var array0 = ['cat', 'dog', 'mouse'];

// also in numeric order
var array1 = [];
    array1.push( 'first' );
    array1.push( 'second' );

Order is an issue if you add array elements randomly, or if you must process arrays from elsewhere that might have been constructed randomly using MSIE 8 and below:

// MSIE order issue
var array2 = [];
    array2[5] = 'first?';
    array2[3] = 'second?';

Now let's consider some specific cases.

Cases

We'll consider cases from least to most controlled.

Array Content Unknown

Your code is passed an array from elsewhere. You are not sure that the elements were created in order, nor can you be sure that all the property names are numeric. You are in trouble! The arithmetic loop will a) protect you from absurd values, or b) pass over important exceptions. You had better find out which one is true. Good luck.

Array Content Mixed

If the array mixes non-numeric property names with normal numerics, but does so in a controlled, defined way you can pick the loop type intelligently: for/in if you want to look at all properties, arithmetic loop if you want to pass over non-numerics. All browsers tested present properties with non-negative integer subscripts in numeric order, followed by other properties in the order they were added to the array. (Talk to your backend engineers about an appropriate ORDER BY clause for the SQL. The order issue applies only to content generated in the frontend code.)
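The ordering described above can be checked with a short sketch. The array contents are invented for illustration, and the expected order in the comment assumes an engine that follows the property-order rules in the current ECMAScript specification, which postdate the 2012 tests on this page.

```javascript
// Add properties in a deliberately scrambled order.
var mixed = [];
    mixed['label'] = 'added first';
    mixed[5] = 'added second';
    mixed[3] = 'added third';
    mixed['note'] = 'added fourth';

// Record the order in which for/in presents the names.
var order = [];
for ( var name in mixed ) {
    order.push( name );
}
// In current engines: '3', '5', 'label', 'note'
```

An arithmetic loop over the same array would visit subscripts 0 through 5 and never see 'label' or 'note'.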

All Numeric Property Names

If you only have to consider numeric indices, you still have issues to consider. If index numbers are negative, they are treated as non-numeric. If index numbers are fractional, they are treated as non-numeric. If you are getting numbers from user input, use an anchored mask such as /^\s*(\d+)\s*$/ so that input like "-3" or "2.5" is rejected rather than partially matched. Other issues remain.
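One way to apply such a mask is sketched below. The function name parseIndex is a hypothetical helper, and the pattern is anchored with ^ and $ so that the whole input must match; an unanchored pattern would accept the "3" inside "-3".

```javascript
// Hypothetical helper: return a non-negative integer subscript,
// or null if the text is anything else (negative, fractional, empty).
function parseIndex(text) {
    var match = /^\s*(\d+)\s*$/.exec( text );
    return match ? parseInt( match[1], 10 ) : null;
}

var good = parseIndex(' 42 ');     // 42
var negative = parseIndex('-3');   // null
var fraction = parseIndex('2.5');  // null
```

Rejecting the input outright is a design choice; you might prefer to report an error to the user instead of returning null.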

Sparse Numerics

Can non-contiguous array elements be added? Or can an array, once constructed from contiguously subscripted elements, become sparse due to deletions? In either case, you should process the elements with a for/in loop. An arithmetic loop that carefully stops at the last element will be utterly defeated if array.length - 1 does not predict the index of the last element. Worse, sparse arrays may have ghost elements which give a bogus value for array.length (see the Sparse Arrays page). The enumerated loop will succeed with these; the arithmetic loop will fail.
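A sketch of how far array.length can drift from the number of real elements. The subscripts 2 and 7 are arbitrary choices for the illustration.

```javascript
// Build a sparse array with only two real elements.
var sparse = [];
    sparse[2] = 'two';
    sparse[7] = 'seven';

// The arithmetic loop runs once per subscript below length.
var passes = 0;
var empties = 0;
for ( var i = 0; i < sparse.length; i++ ) {
    passes++;
    if ( sparse[i] === undefined ) { empties++; }
}

// The for/in loop runs once per element that actually exists.
var present = [];
for ( var name in sparse ) {
    present.push( name );
}
```

Here sparse.length is 8, so the arithmetic loop makes eight passes and six of them find nothing, while the for/in loop visits just '2' and '7'.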

Origin 0

JavaScript, when it decides for itself, starts counting at zero. Dijkstra (famous for insisting "counting begins at zero") would be pleased. However, you are free to challenge Dijkstra, and arrays that other code builds may do so, too. If you are not sure that an array's first subscript is zero, you have to abandon the arithmetic loop, even though you are sure that all properties have contiguous, non-negative integer names.
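As a sketch of the problem, consider an array that some other code has numbered from one. The variable names and values are illustrative only.

```javascript
// An array whose first subscript is 1, not 0.
var oneBased = [];
    oneBased[1] = 'first';
    oneBased[2] = 'second';
    oneBased[3] = 'third';

// An arithmetic loop starting at zero begins with a missing element.
var firstSeen = oneBased[0]; // undefined

// A for/in loop starts wherever the elements start.
var values = [];
for ( var name in oneBased ) {
    values.push( oneBased[name] );
}
```

The length is 4 even though there are only three elements, so a loop from 0 to length - 1 makes one pass with nothing to act on.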

Negative Subscripts²

Numeric array subscripts are coerced to strings to create the name/value pair that all JavaScript objects require. Small, non-negative integers (and strings that look like small, non-negative integers) create the behaviors described above. Negative numbers (actual numbers and strings starting with hyphens) are treated as if they were not numerics. Arithmetic loops will not find them; for/in loops will treat them normally. They will not create ghosts and they will not count as part of length.
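The behavior described above can be sketched as follows. The sample values are invented, and a fractional subscript is included because it is treated the same way as a negative one.

```javascript
var list = ['a', 'b'];
    list[-1] = 'negative';    // stored under the name '-1'
    list[2.5] = 'fractional'; // stored under the name '2.5'

// Neither assignment changes length or creates ghosts.
var lengthAfter = list.length; // still 2

// The arithmetic loop sees only subscripts 0 and 1.
var arithmeticCount = 0;
for ( var i = 0; i < list.length; i++ ) {
    arithmeticCount++;
}

// The for/in loop sees all four properties.
var allNames = [];
for ( var name in list ) {
    allNames.push( name );
}
```

Both extra values remain reachable by name, for example list[-1] or list['2.5'], but only the for/in loop will discover them.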

Large Subscripts³

I tested starting with "5" and then adding zeros: "5", "50", etc. These behaved consistently through "500". Length was one more than the subscript and ghosts filled in the blanks preceding the subscript. At "5000" MSIE showed serious bugs and, as I was working on my favorite laptop, I stopped testing. If large subscripts and MSIE are both possible, picking a loop type is not the first problem to solve. Good luck.

Conclusion

Your first choice for iterating through an array should, except in very rare circumstances, be the for/in loop.

Exceptions: The arithmetic loop is preferred when your array may have non-numeric subscripts that you want to ignore. The arithmetic loop may also be needed if you must process, using older versions of MSIE, array elements in numeric subscript order regardless of the order in which they were created.

Otherwise, the enumerated loop (for/in) will be the preferred choice. It will save you from a common class of human errors and it will happily process sparse arrays, even if they don't originate at zero.


All tests performed early July, 2012, with the latest versions of Chrome, Firefox, Internet Explorer and Opera. Except as noted below, there were no browser differences.

¹ The for/in loop has no problems with sparse arrays, but you must understand that sparse arrays have ghost elements. See the Sparse Arrays page to understand the ghosts.

² All tested browsers (latest versions) return array elements in numeric order. Older MSIE (before IE 9) returns array elements in the order in which they were created.

Most browsers tested returned subscripted (non-negative integer strings) elements before others. Non-numerics were returned after numerics in the order in which they were added. "2.5" was considered by all to be non-numeric. After inserting "2.5", Firefox apparently decided that further insertions were not to be considered as subscripted entries.

³ MSIE failed to return all elements when a sparse array was created with a large (5000) subscript.

Feedback: MartinRinehart at gmail dot com.

# # #