Functional Programming for Sorts in JavaScript

© 2012 Martin Rinehart

Prerequisite: Good understanding of JavaScript basics. This page is introduced late in JavaScript with Native Objects, Volume III of the five-volume Frontend Engineering series.

"Functional programming" is a scary topic for many JavaScript newcomers. It shouldn't be. It just means that functions can be passed around just like strings, arrays or other objects. Jargon: in JavaScript, functions are "first-class objects." Sounds impressive. It really means that functions can be passed around just like strings, arrays or other objects.

Why would you want to pass functions around? The more often you use functional programming, the more often you'll see problems that can be solved, and solved easily, using functional programming. A common first need is to sort things.

Sorting in JavaScript: Already Programmed for You!

Sorting was one of the first computer algorithms to be explored in depth. Some of the earliest programmers wrote sorts, each one a bit better than the last. Today, sorting is built into every modern language. There's really no need to program your own sort (and there's little likelihood that your own sort routine would be as good as the ones your language provides).

Let's look at an example.

<!doctype html>
<html><head><title>Sort Tests!</title>

</head><body>

    <h1>Sort Tests!</h1>

    <script>

var arr = [ 2, 3, 1 ];

alert( arr.sort() ); // 1, 2, 3

</script></body></html>

To master this topic, copy this template into your own page file, open it in one of your browsers and see the results for yourself. From here on, we'll highlight the changes and you can enter them yourself (and fiddle with any that aren't immediately obvious). (You'll probably want to comment out the arrays, not replace them. You'll need them again soon enough.)

As you see here, sorting an array of numbers is as simple as calling the built-in sort() method. At least it's that simple if you don't need to sort from high to low. For that, you could write a reverse() function and reverse the result, but there's a better way. Let's look at another example first.

    <script>

var arr =
    [ 'cherry', 'apple', 'blueberry' ];

alert( arr.sort() );
    // apple, blueberry, cherry

</script></body></html></div>

Go ahead and try a new array in your own test page with a few things you enjoy. Have a steak, a cake or a ... Well, you decide. Got it?

Now let's cause some trouble. The default JavaScript sort for strings is in ASCII order. "A" through "Z" all come before "a" through "z", which is very seldom what you want. Consider this:

var arr = [ 'Tom', 'dick', 'Harry' ];

alert( arr.sort() ); // Harry, Tom, dick

You might have been thinking that "dick" should precede "Harry". Sorry.

To get it to sort right, you could convert the array to all lowercase (or UpperCase, if you prefer). If you needed the original sorted, however, it would be pretty hard to reverse that conversion. Still we can make this worse. One more and we'll start to solve problems, not just create them:

var arr = [ "2", "10", "9" ];

alert( arr.sort() ); // 10, 2, 9

Say what? Ten is less than two?

Actually, that's not the result you got. The result was that "10" is less than "2", which is a comparison between strings, not numbers. Strings are given an ASCII-ordered, lexicographic sort. "Lexicographic" means the order you would find in a dictionary. Remember that a dictionary sort (in English and other western languages) compares the leftmost characters in a word. If they differ, the lower one comes first and the comparison is done. If they are the same, the process repeats with the next character. An "apple 1" comes before "apple 2" (as decideded at the first non-identical characters). A "1" comes before a "2". A "10" also comes before a "2" as "1" comes before "2".

So a lexicographic sort of numeric strings is really most unsatisfactory. A case-insensitive sort of strings is often unsatisfactory. You may want to sort from high to low, which the default sort doesn't support.

The built-in sort very often fails you when you get to the real world. (In the library, the author catalog is sorted by last name. Should "MacPherson" come before "McPherson"? In a pure lexicographic sort, it would, but in a library the two are considered identical because the library's visitor probably has no idea about how they are spelled. Test: spell the name of Apple's famous computer and America's famous apple—the one spread by Johnny Appleseed. See the problem? (Answer below.)

Comparison Rescue!

All this is why modern sort routines allow you to replace the compare function that is buried somewhere deep inside them. You might originally think of some simple test, like if (a > b) to compare array elements, but that's not going to solve any of the problems above. Replace it with a function, like this: compare( a, b ). The compare() function in many languages returns one, zero or minus one. (Often it returns positive, zero or negative, but you'll commonly code it with ones and zeros. We'll get there soon.)

Let's begin by coding a very simple compare, and using it to repeat our very first test.

var arr = [ 2, 3, 1 ];

function compare( a, b ) {
    if ( a > b ) return 1;
    if ( b > a ) return -1;
    return 0;
}

alert( arr.sort(compare) ); // 1, 2, 3

Don't overlook that call to the sort() method. The method is called with compare (a reference to the function you wrote): arr.sort(compare). Your function will be the one used inside the sort method.

If you got this right, you should now have successfully sorted 1, 2 and 3 (which you could probably have done without a computer program). So what? We'll illustrate this, but first a quick improvement. If all we needed was a numeric compare() we could have written it this way:

var arr = [ 2, 3, 1 ];

function compare( a, b ) {
    return a - b;
}

alert( arr.sort(compare) ); // 1, 2, 3

Remember we need a function that returns positive, zero or negative (though we often use one, zero and minus one). If a is greater than b, the expression a - b will be positive. Think about the other two cases to convince yourself that this is all you need.

With that trick, here's a compare() that will sort numbers from high to low:

var arr = [ 2, 3, 1 ];

function compare( a, b ) {
    return b - a;
}

alert( arr.sort(compare) ); // 3, 2, 1

All you did was change a - b to b - a and you reversed the sort order. You should now be getting a sneaky feeling that this compare() function might solve some serious problems. It does.

Let's start with a case-insensitive, lexicographic sort. (What a lot of syllables! Luckily it doesn't take much code.) Just convert to all UPPER or lowercase, before you compare. Note that you aren't converting the array members, just the parameters inside the function. This will do the job:

var arr = [ 'Tom', 'dick', 'Harry' ];

function compare_case_insensitive( a, b ) {
    a = a.toLowerCase();
    b = b.toLowerCase();

    if ( a > b ) { return  1; }
    if ( b > a ) { return -1; }
    return 0;
}

alert( arr.sort(compare_case_insensitive) );
    // dick, Harry, Tom

Here we've changed the name from compare() to compare_case_insensitive(). It's not the name of the function that matters; it's the fact that we pass it as an argument when we call the sort() method. A longer name will, for this one, help when we file it away for future use. For a test, why don't you modify your case-insensitive lexicographic sort to work from high to low. (Transposing a and b will work, but so will moving a single character.)

Ready to write a sort for strings containing integers, so "10" will sort after "9"? Try it on your own, and then check our "Coding Challenge" answer at the bottom of this page. Don't start typing until you see that it's pretty simple.

Summary

Writing a compare function and passing it to the sort() method lets you control your sorts to meet lots of diverse needs. And it still saves you from ever having to write your own sort() routine, which is not a simple thing to do, and which won't, when you are done, solve any of these problems: you'll still need a variety of compare() methods.

Next on our "to do" list? We're going to sort the definitions in the various HTML standards. To begin, we're going to be sorting records in a table, not just single values, which means we've got to bore into the record to drill out the location in, for example, the HTML 4.01 standard. It might be section 8.9.10. In a standards document, that's the section between 8.9.9 and 8.9.11, of course. (Put that in your compare() and sort it!) And like most data processing sorts, if two records are found for section 8.9.10, we've got to be careful to keep them in their original order. We won't be bored!


Coding Challenge: Here's our integer-string sort:

var arr = [ '2', '10', '9' ];

function compare_integer_strings( a, b ) {
    while ( a.length < b.length ) {
            a = '0' + a; }
    while ( a.length > b.length ) {
           b = '0' + b; }

    if ( a > b ) { return  1; }
    if ( b > a ) { return -1; }
    return 0;
}

alert( arr.sort(compare_integer_strings) );
    // 2, 9, 10

We changed the compare function's name again, so we could add it to our collection. Aside from that, just changing the first two lines so that they padded the shorter string with "0"s was all that was needed. In a lexicographic sort, "09" is less than "10".

If your significant other comes by and asks what you are doing, try, "I'm working on improved lexicographic sorts." That sounds a lot more impressive than "I'm trying to get nine to come before ten."


Test answer: Fans of each call them a "Mac," but for a delicious pie, use the McIntosh apple, not the Apple Macintosh.


P.S. An example, sorting a list of online resources. Added April, 2015.

A Practical Example

The Java language defines an interface, Comparable, that a class can implement if it wants its members to be compared to each other (for example, to sort them). It is a simple interface, with one method: compareTo(). Not surprisingly, the compareTo() method (called with a.compareTo(b), where a and b are both instances of any one Comparable class) returns positive, zero or negative, for a greater than, equal to or less than b.

In Oracle's Java docs (as of 2 April, 2015) over 100 Java classes implement Comparable. Java uses it for BigInteger, file dates, XML access orders and many more. We can implement it in JavaScript, too.

A Comparable compare() Function

First, we'll need a generic compare() function to pass to the sort method. This will do the trick:

function compare( a, b ) { return a.compareTo(b); }

Now we can sort() an array of any object type that implements compareTo(), which makes the type Comparable.

Comparing URLs

The immediate problem was to sort a list of URLs. We want to sort by domain name within top-level domains (TLDs) so that the .COMs come before the .EDUs, for example. We want to additionally sort all the MIT.COM URLs by subdomains so that "bar.mit.edu" precedes "foo.mit.edu", and so on.

To do this we can use an OOP approach, creating a ComparableURL class. One way to do this is to store the URL as a string, and then break it into an array of items between the periods that separate domains and subdomains. Reversing the domain array will make the sorting simpler: "foo.mit.edu" becomes ["edu", "mit", "foo"].

This class function (constructor) will do the trick:

function ComparableURL( str ) {
    var new_C_URL = this;

    new_C_URL.string = str;
    new_C_URL.domains = str.split( '.' ).reverse();

} // end: ComparableURL()

Whenever you can, doing a little preparation (like our split() and reverse()) will save lots of work when it's time to sort.

Now, we need a compareTo() method. Here's one, attached to the class constructor's prototype property:

ComparableURL.prototype.compareTo = function ( b ) {
    var a = this; // 'a' in 'a.compareTo(b)'

    var i = 0;
    while ( true ) {
        var adom = a.domains[ i ], bdom = b.domains[ i ];

        // past the end (undefined) of one or both domain lists?
        if ( adom === undefined ) {
            if ( bdom === undefined ) { return 0; }
            /* else */ return -1; // b is longer
        }
        /* else */ if ( bdom === undefined ) { return  1; } // a is longer

        // are the [sub]domains unequal?
        if ( adom > bdom ) { return  1; }
        if ( bdom > adom ) { return -1; }

        // else proceed to the next [sub]domain
        i += 1;
    }

} // end: ComparableURL.compareTo()

Note that more than half the work is done when one or both URL's domain lists are empty. (You want "mit.edu" to come before "foo.mit.edu.")

This compareTo() method correctly sorts:

Input

foo.com
bar.com
this.edu
that.edu
mit.edu
cmu.edu
foo.mit.edu
bar.mit.edu
extra.bar.mit.edu
another test item
Output

another test item
bar.com
foo.com
cmu.edu
mit.edu
bar.mit.edu
extra.bar.mit.edu
foo.mit.edu
that.edu
this.edu

Feedback: MartinRinehart at gmail dot com.

# # #