Even faster String.prototype.trim() implementation in JavaScript.

I’ll begin with a quote:

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

Those are the words of one of my favorite hackers, Jamie Zawinski.

Most JavaScript trim() implementations you see around are based on regular expressions and work fairly well for infrequent use on short strings. Some of them are well-crafted, but I think plain old loops can do better. Back in 2007, Steve Levithan covered some really fast implementations of trim() followed by a few more articles by others, including Luca Guidi.

Many of these implementations were based on regular expressions and a few were exceptionally brilliant. But did they answer the question of speed? In one of the comments posted on Steve Levithan’s blog there was a little gem written by Michael Lee Finney, and truly that little function proves to be the fastest of all of the implementations mentioned in those articles. Finney claimed to better previous implementations by 20x, and he was right. His code totally smoked the regular-expression based implementations.

The Details

Arrays in JavaScript behave like hash tables and can therefore be sparse: only the indices you actually assign hold entries. Finney’s implementation exploits this feature to its fullest. Here is the lookup table he uses to mark characters as whitespace:

String.whiteSpace = [];
String.whiteSpace[0x0009] = true;
String.whiteSpace[0x000a] = true;
String.whiteSpace[0x000b] = true;
String.whiteSpace[0x000c] = true;
String.whiteSpace[0x000d] = true;
String.whiteSpace[0x0020] = true;
String.whiteSpace[0x0085] = true;
String.whiteSpace[0x00a0] = true;
String.whiteSpace[0x1680] = true;
String.whiteSpace[0x180e] = true;
String.whiteSpace[0x2000] = true;
String.whiteSpace[0x2001] = true;
String.whiteSpace[0x2002] = true;
String.whiteSpace[0x2003] = true;
String.whiteSpace[0x2004] = true;
String.whiteSpace[0x2005] = true;
String.whiteSpace[0x2006] = true;
String.whiteSpace[0x2007] = true;
String.whiteSpace[0x2008] = true;
String.whiteSpace[0x2009] = true;
String.whiteSpace[0x200a] = true;
String.whiteSpace[0x200b] = true;
String.whiteSpace[0x2028] = true;
String.whiteSpace[0x2029] = true;
String.whiteSpace[0x202f] = true;
String.whiteSpace[0x205f] = true;
String.whiteSpace[0x3000] = true;
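
As a rough illustration of why a sparse array works as a lookup table, the sketch below builds a much smaller table and probes it. The names `ws`, `codes` and `isWhiteSpace` are hypothetical and not part of Finney’s code, and only five of the code points listed above are included.

```javascript
// Hypothetical, reduced version of the table above (names are illustrative).
var ws = [];
var codes = [0x0009, 0x000a, 0x0020, 0x00a0, 0x3000];
for (var n = 0; n < codes.length; n++) {
    ws[codes[n]] = true;   // only these indices are ever assigned
}

// Unassigned indices read back as undefined, which is falsy.
function isWhiteSpace(ch) {
    return ws[ch.charCodeAt(0)] === true;
}

console.log(isWhiteSpace(' '));   // true
console.log(isWhiteSpace('\t'));  // true
console.log(isWhiteSpace('a'));   // false
console.log(ws.length);           // 12289 (0x3000 + 1), though only 5 slots are set
```

Every test is a single indexed read, regardless of how many whitespace characters the table knows about.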

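The implementation of the function simply indexes into this array to check whether each character encountered while traversing the string is whitespace. Because every check is a single array lookup rather than a regular-expression match, the function runs noticeably faster than the regex-based versions. Here’s his code: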

function trim14(str) {
    var len = str.length, whiteSpace, i;

    if (!len) {
        return str;
    }

    whiteSpace = String.whiteSpace;

    if (len && whiteSpace[str.charCodeAt(len-1)])
    {
        do{
            --len;
        }
        while (len && whiteSpace[str.charCodeAt(len - 1)]);

        if (len && whiteSpace[str.charCodeAt(0)]){
            i = 1;
            while (i < len && whiteSpace[str.charCodeAt(i)]){
                ++i;
            }
        }
        return str.substring(i, len);
    }
    if (len && whiteSpace[str.charCodeAt(0)]) {
        i = 1;
        while (i < len && whiteSpace[str.charCodeAt(i)]){
            ++i;
        }
        return str.substring(i, len);
    }
    return str;
};

Neatly written. I did spot a few unnecessary checks and removed them for a retest. The following version sometimes shaved off a couple more milliseconds.

function trim16(str) {
    var len = str.length, whiteSpace = String.whiteSpace, i = 0;

    if (len) {
        if (whiteSpace[str.charCodeAt(len - 1)]) {
            // Remove from the end.
            while (--len && whiteSpace[str.charCodeAt(len - 1)]);

            // Remove from the beginning.
            if (len && whiteSpace[str.charCodeAt(0)]){
                i = 1;
                while (i < len && whiteSpace[str.charCodeAt(i)]){
                    ++i;  // Keep this here.
                }
            }
            return str.substring(i, len);
        }

        // Remove from the beginning.
        if (whiteSpace[str.charCodeAt(0)]) {
            i = 1;
            while (i < len && whiteSpace[str.charCodeAt(i)]){
                ++i;
            }
            return str.substring(i, len);
        }
    }

    return str;
};

Can this be any better and faster?

I think it can. There’s repetition in the code that could be removed as well. I rewrote the function to use Finney’s lookup table and ended up with this:

function trim17(str){
    var len = str.length;
    if (len){
        var whiteSpace = String.whiteSpace, i = 0;
        while (whiteSpace[str.charCodeAt(--len)]);
        if (++len){
            while (whiteSpace[str.charCodeAt(i)]){ ++i; }
        }
        str = str.substring(i, len);
    }
    return str;
}
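
The loops in trim17 carry no explicit bounds checks, which can look unsafe at first glance. The sketch below is one way to see why they still terminate: an out-of-range index makes charCodeAt() return NaN, and looking up NaN in the table yields undefined. The names `table` and `trimWith` are hypothetical; passing the table as a parameter is an assumption made here so the example does not touch the global String object.

```javascript
// Reduced table for illustration; the real one is String.whiteSpace above.
var table = [];
table[0x0020] = true;
table[0x0009] = true;

// Out-of-range indices make charCodeAt return NaN ...
console.log(isNaN('abc'.charCodeAt(-1)));  // true
console.log(isNaN('abc'.charCodeAt(3)));   // true
// ... and table[NaN] is undefined, so either loop stops at the string's edge.
console.log(table[NaN]);                   // undefined

// Same shape as trim17, but with the lookup table passed in.
function trimWith(table, str) {
    var len = str.length, i = 0;
    if (len) {
        while (table[str.charCodeAt(--len)]);      // scan back from the end
        if (++len) {                               // len is 0 if all whitespace
            while (table[str.charCodeAt(i)]) { ++i; }
        }
        str = str.substring(i, len);
    }
    return str;
}

console.log(trimWith(table, '  a b \t'));  // a b
console.log(trimWith(table, '   ') === ''); // true
```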

Guess what? Here are the benchmark results from a Chrome developer build running on Linux with 100000 iterations:

Original length: 27663
trim10: 363ms (length: 27656)
trim11: 4668ms (length: 27656)
trim12: 4671ms (length: 27656)
trim13: 322ms (length: 27656)
trim14: 197ms (length: 27656)
trim15: 195ms (length: 27656)
trim16: 197ms (length: 27656)
trim17: 187ms (length: 27656)

trim17 is indeed even faster and shorter.
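
For readers who want to reproduce numbers of this kind, a minimal timing harness might look like the sketch below. It is not the code used to produce the results above; `benchmark`, `trimRegex` and `sample` are hypothetical names, the input is synthetic, and `iterations` is assumed to be at least 1.

```javascript
// Hypothetical harness: times `iterations` calls of fn(input) and reports
// the elapsed wall-clock time plus the length of the trimmed result.
function benchmark(name, fn, input, iterations) {
    var result, start = Date.now();
    for (var n = 0; n < iterations; n++) {
        result = fn(input);
    }
    var elapsed = Date.now() - start;
    return name + ': ' + elapsed + 'ms (length: ' + result.length + ')';
}

// A regex-based trim, standing in for whichever implementation is under test.
function trimRegex(str) {
    return str.replace(/^\s+|\s+$/g, '');
}

// Synthetic input: whitespace padding around 1000 repetitions of a phrase.
var sample = '  \t ' + new Array(1001).join('lorem ipsum ') + ' \n  ';
console.log('Original length: ' + sample.length);
console.log(benchmark('trimRegex', trimRegex, sample, 1000));
```

Timings from a loop like this vary with the engine, the machine and the input, so they are best compared only within a single run.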

Updates

Nikolay “MadRabbit” V. Nemshilov, the author of RightJS (a beautiful JavaScript library), provides another implementation of the function:

function trim19(str){
    var str = str.replace(/^\s\s*/, ''),
        ws = /\s/,
        i = str.length;
    while (ws.test(str.charAt(--i)));
    return str.slice(0, i + 1);
}

which is faster than many other regexp-based implementations.

I have included that in the benchmark as well and generated a couple of graphs.

Average performance of all trim() implementations across major platforms and browsers (smaller is better).
Average performance of the latter half of the set of trim() implementations (smaller is better).

Feel free to use my code under the terms of the MIT license. As usual, suggestions and constructive criticism are most welcome.


21 thoughts on “Even faster String.prototype.trim() implementation in JavaScript.”

  1. Hi, can you benchmark jQuery’s built-in $.trim()? I’d be curious to see how it stacks up. If yours smokes it you might want to drop John Resig a line to see about getting it in jQuery.

    1. I have taken the latest trim implementation from jQuery svn and compared that with mine. When it comes to speed, many other implementations (including mine) smoke it.

      jQuery’s original implementation goes like this:

      var jQuery = function( selector, context ) {
      /* … */
      // Used for trimming whitespace
      rtrim = /^\s+|\s+$/g,
      /* … */
      trim: function( text ) {
      return (text || "").replace( rtrim, "" );
      },
      /* … */

      Here’s trim18 (jQuery):

      var rtrim = /^\s+|\s+$/g;
      function trim18(text) {
      return (text || "").replace(rtrim, "");
      }

      Running the benchmark for only 200 iterations produced this:

      Original length: 27662
      trim10: 2ms (length: 27656)
      trim11: 20ms (length: 27656)
      trim12: 14ms (length: 27656)
      trim13: 2ms (length: 27656)
      trim14: 1ms (length: 27656)
      trim15: 1ms (length: 27656)
      trim16: 1ms (length: 27656)
      trim17: 0ms (length: 27656)
      trim18: 96ms (length: 27656)

      on a Chrome developer build running on Linux. When I tried running the benchmark with 100000 iterations, the browser prompted me twice to “kill the offending script”.

      I’ve seen jQuery.trim() used in the liveSearch plugin as well, and wonder if that is one of the possible reasons why that plugin is slow for even a moderate number of ‘li’ child items.

  2. My implementation is a bit slower when running on Firefox on Windows. Apparently, trim17 has fairly consistent behavior across different browsers. Perhaps a better approach would be to not define String.prototype.trim on Firefox, since Firefox 3.5 now comes with a native implementation. Also, regular-expression-based implementations run a lot faster on Firefox than on WebKit-based browsers.

    So the quest for the fastest trim function isn’t over just yet!

    I’ll post updates as soon as I can.
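
The idea of skipping the custom definition when a native trim() exists can be sketched with a feature check. This is only an illustration: `trimFallback` is a hypothetical stand-in for trim17 or any other implementation, and the native method's notion of whitespace may differ from the lookup table used in the post.

```javascript
// Hypothetical fallback; stands in for trim17 or any other implementation.
function trimFallback(str) {
    return str.replace(/^\s\s*/, '').replace(/\s\s*$/, '');
}

// Only define String.prototype.trim when the engine lacks a native one.
if (typeof String.prototype.trim !== 'function') {
    String.prototype.trim = function () {
        return trimFallback(String(this));
    };
}

console.log('  hello \n'.trim());  // hello
```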

  3. I think you still can improve this.
    You could check start and end of the string at the same time, thus avoiding some iterations.
    Something like :
    function trim (s) {
    var x=' ',y=' ';
    for(var j=s.length, i=0, k=j-1; i < j, k > i;) {
    if(x === ' ') x = s.charAt(i);
    if(y === ' ') y = s.charAt(k);
    if(x !== ' ' && y !== ' ') break;
    if(x === ' ') i++;
    if(y === ' ') k--;
    }
    return s.substring(i,k+1);
    }

    I guess it’s even more noticeable when trimming “highly trimmable” strings like ( x=' x '; )

  4. … wordpress …
    for(var j=s.length, i=0, k=j-1; ii;) {
    should be
    for(var j=s.length, i=0, k=j-1; i < j, k > i;) {

  5. Hmm, I see this as a purely academic endeavor. In typical ECMAScript usage, you simply won’t do enough trimming to see a noticeable difference in speed.

    That said, if it’s implemented in a js framework like jQuery, it’s great! But I’d leave trimming optimizations to js engine developers, and continue to write readable scripts using REs for now…

    1. @N::
      Newer JavaScript engines can optimize loops a lot better than older ones. So, to my mind, the faster the browsers get, the faster some of these implementations will become.

      Also, as a matter of fact, this started because a bunch of code at work was calling jQuery.trim() repeatedly and caused the browser to lock up quite often. A guided replacement of the implementation reduced that. We also found and trashed a lot of spurious calls to trim(), but that’s another story.

      The bottom line is that if trim() can be this fast, why not have it in the libraries where it truly belongs? Hopefully, the good people out there writing such beautiful libraries will pick up faster implementations (not necessarily mine in particular, though) so all those little plugins that depend on trim() can run a bit or a lot (depending on how it’s used) faster.

      Cheers,
      Yesudeep.

  6. Thanks for correcting me on your real name, Yesudeep. I guess I’m a little too anal when it comes to correct attribution. I’ve posted changes. Please accept my humble apologies for a rather brain-dead rant (though, I couldn’t find anywhere on your site that had an indication the subdomain was, in fact, you!).

    You’ve got a lot of great info here. This site’s going in my permanent bookmarks.

  7. In my tests a simple loop performs fastest, can you post the code you’re using to benchmark? I’d love to be able to compare results more accurately.


    function trim_loop(str) {
    for(var i=0;i=0 && str[j] == ' ';j--);
    return str.substring(i,j+1);
    }

  8. Alright, wordpress killed it,
    http://dpaste.com/hold/89276/

    Or maybe this will work…? dpaste above is a safe bet

    function trim_loop(str) {
    for(var i=0; str.length && str[i] == ' ';i++);
    for(var j=str.length-1;j>=0 && str[j] == ' ';j--);
    return str.substring(i,j+1);
    }

    1. Hi Matt,

      The simple loop would indeed perform faster if you never had to look up other whitespace characters. While the trim_loop() function that you have posted above might turn out to be faster, it will not check for all the whitespace characters we might be concerned with when trimming. For example, str[i] == ' ' will not catch an end-of-line character or a tab character, etc. If spaces are all you really need to trim, then by all means, your code is the most appropriate.

      There is one other thing you can do with that loop to optimize it further actually. str.length (in the first loop) is a loop invariant and can be moved outside of the loop by calculating it only once instead of doing it per iteration.
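
The hoisting described above can be sketched as follows. The name `trimSpaces` is hypothetical, and the sketch assumes each loop bounds its index explicitly; like the original, it trims plain space characters only.

```javascript
// Space-only trim with str.length read once, outside the loops.
function trimSpaces(str) {
    var len = str.length, i = 0, j = len - 1;
    while (i < len && str.charAt(i) === ' ') { i++; }   // skip leading spaces
    while (j >= i && str.charAt(j) === ' ') { j--; }    // skip trailing spaces
    return str.substring(i, j + 1);
}

console.log(trimSpaces('  a  '));          // a
console.log(trimSpaces('   ') === '');     // true
```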

      Again, your code is most appropriate when you only want to trim space characters. As soon as you add a lookup table, it will probably wind up looking similar to mine.

      About the benchmarks, I will be posting updates and links soon about it and providing a repository so anybody can peek at non-wordpress-killed code.

      Cheers,
      Yesudeep. 🙂

  9. Of course! I feel very silly. Thanks for pointing all that out.

    I noticed that moving the “trim_loop” to String.prototype instead of just being a standalone function slowed it down tremendously. Do you have any thoughts on this?

    Looking forward to any updates you might have!

  10. […] further information visit: Even faster String.prototype.trim() implementation in JavaScript. Faster JavaScript Trim. Tags: RegEx, string, trim Category: Javascript […]

    1. That is jquery’s method (trim 18), and his tests indicated it was (72 +/- 24) x SLOWER than his fastest implementations.
