Faster JavaScript Trim

Since JavaScript doesn't include a trim method natively, it's included by countless JavaScript libraries – usually as a global function or appended to String.prototype. However, I've never seen an implementation which performs as well as it could, probably because most programmers don't deeply understand or care about regex efficiency issues.

After seeing a particularly bad trim implementation, I decided to do a little research towards finding the most efficient approach. Before getting into the analysis, here are the results:

Method   Firefox 2   IE 6
trim1    15ms        < 0.5ms
trim2    31ms        < 0.5ms
trim3    46ms        31ms
trim4    47ms        46ms
trim5    156ms       1656ms
trim6    172ms       2406ms
trim7    172ms       1640ms
trim8    281ms       < 0.5ms
trim9    125ms       78ms
trim10   < 0.5ms     < 0.5ms
trim11   < 0.5ms     < 0.5ms

Note 1: The comparison is based on trimming the Magna Carta (over 27,600 characters) with a bit of leading and trailing whitespace 20 times on my personal system. However, the data you're trimming can have a major impact on performance, which is detailed below.

Note 2: trim4 and trim6 are the most commonly found in JavaScript libraries today.

Note 3: The aforementioned bad implementation is not included in the comparison, but is shown later.

The analysis

Although there are 11 rows in the table above, they are only the most notable (for various reasons) of about 20 versions I wrote and benchmarked against various types of strings. The following analysis is based on testing in Firefox 2.0.0.4, although I have noted where there are major differences in IE6.

  1. return str.replace(/^\s\s*/, '').replace(/\s\s*$/, '');
    All things considered, this is probably the best all-around approach. Its speed advantage is most notable with long strings — when efficiency matters. The speed is largely due to a number of optimizations internal to JavaScript regex interpreters which the two discrete regexes here trigger. Specifically, these are the pre-check of required character and start-of-string anchor optimizations, possibly among others.
  2. return str.replace(/^\s+/, '').replace(/\s+$/, '');
    Very similar to trim1 (above), but a little slower since it doesn't trigger all of the same optimizations.
  3. return str.substring(Math.max(str.search(/\S/), 0), str.search(/\S\s*$/) + 1);
    This is often faster than the following methods, but slower than the above two. Its speed comes from its use of simple, character-index lookups.
  4. return str.replace(/^\s+|\s+$/g, '');
    This commonly thought up approach is easily the most frequently used in JavaScript libraries today. It is generally the fastest implementation of the bunch only when working with short strings which don't include leading or trailing whitespace. This minor advantage is due in part to the initial-character discrimination optimization it triggers. While this is a relatively decent performer, it's slower than the three methods above when working with longer strings, because the top-level alternation prevents a number of optimizations which could otherwise kick in.
  5. str = str.match(/\S+(?:\s+\S+)*/);
    return str ? str[0] : '';

    This is generally the fastest method when working with empty or whitespace-only strings, due to the pre-check of required character optimization it triggers. Note: In IE6, this can be quite slow when working with longer strings.
  6. return str.replace(/^\s*(\S*(\s+\S+)*)\s*$/, '$1');
    This is a relatively common approach, popularized in part by some leading JavaScripters. It's similar in approach (but inferior) to trim8. There's no good reason to use this in JavaScript, especially since it can be very slow in IE6.
  7. return str.replace(/^\s*(\S*(?:\s+\S+)*)\s*$/, '$1');
    The same as trim6, but a bit faster due to the use of a non-capturing group (which doesn't work in IE 5.0 and lower). Again, this can be slow in IE6.
  8. return str.replace(/^\s*((?:[\S\s]*\S)?)\s*$/, '$1');
    This uses a simple, single-pass, greedy approach. In IE6, this is crazy fast! The performance difference indicates that IE has superior optimization for quantification of "any character" tokens.
  9. return str.replace(/^\s*([\S\s]*?)\s*$/, '$1');
    This is generally the fastest with very short strings which contain both non-space characters and edge whitespace. This minor advantage is due to the simple, single-pass, lazy approach it uses. Like trim8, this is significantly faster in IE6 than Firefox 2.
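
As a quick sanity check, the nine regex-based variants above can be wrapped as functions and run against a few sample strings to confirm that they agree. This is only a sketch: the trimmers object, the samples list, and findDisagreements are illustrative names that are not part of the original test page, and the check covers correctness on ASCII whitespace only, not speed.

```javascript
// The nine regex-based variants from the list above, keyed by name.
var trimmers = {
	trim1: function (str) { return str.replace(/^\s\s*/, '').replace(/\s\s*$/, ''); },
	trim2: function (str) { return str.replace(/^\s+/, '').replace(/\s+$/, ''); },
	trim3: function (str) {
		return str.substring(Math.max(str.search(/\S/), 0), str.search(/\S\s*$/) + 1);
	},
	trim4: function (str) { return str.replace(/^\s+|\s+$/g, ''); },
	trim5: function (str) {
		var match = str.match(/\S+(?:\s+\S+)*/);
		return match ? match[0] : '';
	},
	trim6: function (str) { return str.replace(/^\s*(\S*(\s+\S+)*)\s*$/, '$1'); },
	trim7: function (str) { return str.replace(/^\s*(\S*(?:\s+\S+)*)\s*$/, '$1'); },
	trim8: function (str) { return str.replace(/^\s*((?:[\S\s]*\S)?)\s*$/, '$1'); },
	trim9: function (str) { return str.replace(/^\s*([\S\s]*?)\s*$/, '$1'); }
};

// Sample inputs paired with the expected trimmed result (ASCII whitespace only).
var samples = [
	['', ''],
	['   ', ''],
	['a', 'a'],
	['  a  ', 'a'],
	[' a b ', 'a b'],
	['\t\nfoo bar\r\n', 'foo bar']
];

// Returns a description of every variant/sample pair that disagrees
// with the expected output; an empty array means they all agree.
function findDisagreements() {
	var failures = [];
	for (var name in trimmers) {
		for (var i = 0; i < samples.length; i++) {
			if (trimmers[name](samples[i][0]) !== samples[i][1]) {
				failures.push(name + ' failed on ' + JSON.stringify(samples[i][0]));
			}
		}
	}
	return failures;
}

console.log(findDisagreements());
```

Agreement on a handful of short strings says nothing about performance, which is where the variants actually differ.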

Since I've seen the following additional implementation in one library, I'll include it here as a warning:

return str.replace(/^\s*([\S\s]*)\b\s*$/, '$1');

Although the above is sometimes the fastest method when working with short strings which contain both non-space characters and edge whitespace, it performs very poorly with long strings which contain numerous word boundaries, and it's terrible (!) with long strings comprised of nothing but whitespace, since that triggers a rapidly growing amount of backtracking (roughly quadratic in the string's length). Do not use.
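
To see the backtracking cost for yourself, a rough timing sketch along the following lines may help. The names badTrim, spaces, and timeBadTrim are illustrative, the chosen lengths are arbitrary, and the actual timings depend entirely on the browser or engine, so no particular numbers are claimed here. The expectation is only that the time climbs much faster than the input length does.

```javascript
// The implementation shown above as a warning; do not use it in real code.
function badTrim(str) {
	return str.replace(/^\s*([\S\s]*)\b\s*$/, '$1');
}

// Illustrative helper: builds a string of n spaces.
function spaces(n) {
	return new Array(n + 1).join(' ');
}

// Times a single call on whitespace-only input of the given length (in ms).
// A whitespace-only string contains no \b position, so the match must fail,
// but only after the engine has tried every way of splitting the input
// between the leading \s* and the [\S\s]* group.
function timeBadTrim(length) {
	var input = spaces(length);
	var start = new Date().getTime();
	badTrim(input);
	return new Date().getTime() - start;
}

var lengths = [1000, 2000, 4000];
for (var i = 0; i < lengths.length; i++) {
	console.log(lengths[i] + ' spaces: ' + timeBadTrim(lengths[i]) + 'ms');
}
```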

A different endgame

There are two methods in the table at the top of this post which haven't been covered yet. For those, I've used a non-regex approach and a hybrid approach.

After comparing and analyzing all of the above, I wondered how an implementation which used no regular expressions would perform. Here's what I tried:

function trim10 (str) {
	var whitespace = ' \n\r\t\f\x0b\xa0\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u200b\u2028\u2029\u3000';
	for (var i = 0; i < str.length; i++) {
		if (whitespace.indexOf(str.charAt(i)) === -1) {
			str = str.substring(i);
			break;
		}
	}
	for (i = str.length - 1; i >= 0; i--) {
		if (whitespace.indexOf(str.charAt(i)) === -1) {
			str = str.substring(0, i + 1);
			break;
		}
	}
	return whitespace.indexOf(str.charAt(0)) === -1 ? str : '';
}

How does that perform? Well, with long strings which do not contain excessive leading or trailing whitespace, it blows away the competition (except against trim1/2/8 in IE, which are already insanely fast there).

Does that mean regular expressions are slow in Firefox? No, not at all. The issue here is that although regexes are very well suited for trimming leading whitespace, apart from the .NET library (which offers a somewhat-mysterious "backwards matching" mode), they don't really provide a method to jump to the end of a string without even considering previous characters. However, the non-regex-reliant trim10 function does just that, with the second loop working backwards from the end of the string until it finds a non-whitespace character.

Knowing that, what if we created a hybrid implementation which combined a regex's universal efficiency at trimming leading whitespace with the alternative method's speed at removing trailing characters?

function trim11 (str) {
	str = str.replace(/^\s+/, '');
	for (var i = str.length - 1; i >= 0; i--) {
		if (/\S/.test(str.charAt(i))) {
			str = str.substring(0, i + 1);
			break;
		}
	}
	return str;
}

Although the above is a bit slower than trim10 with some strings, it uses significantly less code and is still lightning fast. Plus, with strings which contain a lot of leading whitespace (which includes strings comprised of nothing but whitespace), it's much faster than trim10.

In conclusion…

Since the differences between the implementations cross-browser and when used with different data are both complex and nuanced (none of them is faster than all the others with every kind of data you can throw at it), here are my general recommendations for a trim method:

  • Use trim1 if you want a general-purpose implementation which is fast cross-browser.
  • Use trim11 if you want to handle long strings exceptionally fast in all browsers.

To test all of the above implementations for yourself, try my very rudimentary benchmarking page. Background processing can cause the results to be severely skewed, so run the test a number of times (regardless of how many iterations you specify) and only consider the fastest results (since averaging the cost of background interference is not very enlightening).
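
The "run it several times and keep the fastest result" advice can be sketched roughly as follows. This is not the code behind the benchmarking page; timeRun, fastestOf, and the sample data are illustrative names and values chosen for this sketch, and Date-based timing is coarse, so treat the numbers as rough.

```javascript
// Runs fn(input) the given number of times and returns the elapsed time in ms.
function timeRun(fn, input, iterations) {
	var start = new Date().getTime();
	for (var i = 0; i < iterations; i++) {
		fn(input);
	}
	return new Date().getTime() - start;
}

// Repeats the timed run several times and keeps only the fastest result,
// on the assumption that slower runs mostly reflect background interference.
function fastestOf(fn, input, iterations, runs) {
	var best = Infinity;
	for (var r = 0; r < runs; r++) {
		best = Math.min(best, timeRun(fn, input, iterations));
	}
	return best;
}

// Illustrative test data: a long body with a bit of edge whitespace.
var body = new Array(2001).join('lorem ipsum ');
var sample = '  \t\n' + body + '\n\t  ';

function trim1(str) {
	return str.replace(/^\s\s*/, '').replace(/\s\s*$/, '');
}

console.log('trim1: ' + fastestOf(trim1, sample, 20, 5) + 'ms');
```

Swapping in other implementations and other kinds of sample data (whitespace-only, no edge whitespace, very short strings) is the interesting part, since the ranking changes with the data.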

As a final note, although some people like to cache regular expressions (e.g. using global variables) so they can be used repeatedly without recompilation, IMO this does not make much sense for a trim method. All of the above regexes are so simple that compiling them takes a negligible amount of time. Additionally, some browsers automatically cache the most recently used regexes, so a typical loop which uses trim and doesn't contain a bunch of other regexes might not encounter recompilation anyway.
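
For reference, this is roughly what the two styles look like side by side. The names LEADING_WS, TRAILING_WS, trimCached, and trimInline are made up for this sketch. Both functions return the same result; whether either one is measurably faster depends on the engine.

```javascript
// Cached variant: the regexes are created once and reused on every call.
var LEADING_WS = /^\s\s*/;
var TRAILING_WS = /\s\s*$/;

function trimCached(str) {
	return str.replace(LEADING_WS, '').replace(TRAILING_WS, '');
}

// Inline variant: the regex literals appear inside the function body.
function trimInline(str) {
	return str.replace(/^\s\s*/, '').replace(/\s\s*$/, '');
}

console.log(trimCached('  abc  ') === trimInline('  abc  ')); // true
```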


Edit (2008-02-04): Shortly after posting this I realized trim10/11 could be better written. Several people have also posted improved versions in the comments. Here's what I use now, which takes the trim11-style hybrid approach:

function trim12 (str) {
	str = str.replace(/^\s\s*/, '');
	var ws = /\s/,
		i = str.length;
	while (ws.test(str.charAt(--i)));
	return str.slice(0, i + 1);
}
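
A few edge cases are worth checking with this style of loop, in particular empty and whitespace-only input, where the index runs past the start of the string. The sketch below repeats the function so it can be run on its own; the sample inputs are arbitrary.

```javascript
function trim12(str) {
	str = str.replace(/^\s\s*/, '');
	var ws = /\s/,
		i = str.length;
	// charAt returns '' for an out-of-range index (including -1), and
	// /\s/ does not match '', so the loop stops even for an empty string.
	while (ws.test(str.charAt(--i)));
	return str.slice(0, i + 1);
}

console.log(JSON.stringify(trim12('')));        // ""
console.log(JSON.stringify(trim12('   ')));     // ""
console.log(JSON.stringify(trim12('  a b  '))); // "a b"
```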


175 thoughts on “Faster JavaScript Trim”

  1. Steve, I have a couple of questions about the 2008-02-04 update:

    1) What happens in the empty string case? Seems like “i” is -1 and bad stuff should happen?

    2) What’s the licensing around this? I’d like to use it in GWT (we’re Apache 2.0).

    Thanks,
    Scott

  2. @Scott Blum:

    1. An i value of -1 works fine because charAt returns an empty string if the provided index is out of range.

    2. Any code I post on my blog is under the MIT license unless otherwise specified. But something this simple I’d consider public domain. You’re certainly very welcome to use it in GWT.

  3. Hello, nice article. Using the benchmark page with Opera 9.5 beta snapshot build 9755 on Windows, I got some interesting results.

    With 20 iterations the methods 1-8 were between 150 and 450 ms, trim9 took almost 4 seconds! trim10 and trim12 were 0 ms every time, but trim11 was at 15 ms.

    With 100 iterations, methods 1-8 behaved much as they did with 20 iterations, with values about five times larger, naturally. Similarly, trim9 took almost 19 seconds (slow!). trim10 still gave a nice result of 0 ms, while both trim11 and trim12 were at 47 ms.

    With 200 iterations I finally managed to make trim10 take more than 0 ms of time, it took 15 ms in one run, which itself took almost a minute to run, thanks to trim9’s slow 39 seconds of execution time! Here, however, trim12 was slower than trim11 for some reason, while with 20 iterations it was always a lot faster.

    So, at least with Opera 9.5 it seems like the method 10 is clearly superior to all of the other methods.

  4. @Joonas Lehtolahti, that may be true with the provided test data, but this post makes it clear that none of the methods are fastest with all types of strings.

    Trim9 is slow in Opera because that browser is particularly bad at lazy quantification.

  5. It seems to me that the right trim of #1 is a little out of order. The current #1 trim shows:

    str.replace(/^\s\s*/, '').replace(/\s\s*$/, '');

    Shouldn’t it be changed to:

    str.replace(/^\s\s*/, '').replace(/\s*\s$/, '');

    It is symmetric with the left trim and it would seem that a pre-check optimization could potentially be more easily performed by using only the last character. [I admit that I might be over-thinking this…]

  6. @Chris Akers, you are assuming that such a pre-check would account for position relative to particular anchors within the string. That is unlikely. But more importantly, you would potentially be making the second regex take longer to fail at positions prior to string-ending whitespace. That tiny bit of extra time (assuming the absence of certain types of other potential internal optimizations) would add up over the course of checking every position in a long string.

  7. What about something like this?

    String.prototype.trim = function(){
    var s = /\s*([\S+\s*]*\S+)+\s*/i.exec(this);
    return (!s) ? '' : s[1];
    }

  8. @Sean, I think you fundamentally misunderstand how your regular expression works. For starters, you should read up on regex character class syntax. With all whitespace, your regular expression manages to be orders of magnitude worse performing than every other regular expression on this page, including the one that was so bad I had to leave it out of the speed tests.

  9. @Luca Guidi, very nice list of browser results. However, the copy of my benchmark page you’re providing changes the test in a subtle but important way — it removes all the leading whitespace, and all but the very last trailing whitespace character. That means the test methods in your while loops run fewer times.

    The more leading whitespace you have, the slower that testing each character will be when compared to removing it all in one shot with replace. It’s not accurate to claim (without qualifiers) that your implementation is “the fastest one,” because that doesn’t hold true with certain types of subject data. According to my brief test in Firefox 2.0.0.14, with enough leading whitespace your implementation becomes much slower than all the others.

  10. @Steve, I’m very sorry for this, I just downloaded your page and added my version of the trim function.
    I didn’t notice the missing whitespace; maybe the browser stripped it.
    I’ll run again all the test suite, then update the results.
    Thanks for the hint!

  11. I liked trim10!
    I was going to mention the i >= 0 in the second loop.
    There is no bug with i > 0; that is how the algorithm was designed. You are now checking character 0 twice.
    Thanks for this beautiful page!

  12. I’ve found that your hybrid solution, along with a few of the others, is flawed in IE6. A single space ' ' will not be trimmed in IE6 but will in FF3. I’m a little stumped right now so I don’t have any fixes for you. I can’t see why the first replace, str.replace(/^\s\s*/, '') isn’t picking it up in IE6

  13. Hi, folks!

    This is a very good work, thanks for sharing! I’ve just tweaked it a little, so that you can call the trim function as if it were a member of String. You just have to add this:

    String.prototype.trim = function()
    {
    // Declare everything with var so ws and i do not leak as globals.
    var str = this.replace(/^\s\s*/, ''),
    ws = /\s/,
    i = str.length;
    while (ws.test(str.charAt(--i)));
    return str.slice(0, i + 1);
    }

    This way, you can do like:

    var myString = " bla bla bla ";
    alert(myString.trim());

  14. Can you try this method :

    function trim(str) {
    return str.replace(/^\s*(\b.*\b|)\s*$/, "$1");
    }

    I found this method in Rialto framework.

    Thanks,

  15. I tested my version of trim() and found that it is as much as 3 times faster than the code here.

    String.whiteSpace =
    {
    "\u0009" : true, "\u000a" : true, "\u000b" : true, "\u000c" : true, "\u000d" : true, "\u0020" : true, "\u0085" : true,
    "\u00a0" : true, "\u1680" : true, "\u180e" : true, "\u2000" : true, "\u2001" : true, "\u2002" : true, "\u2003" : true,
    "\u2004" : true, "\u2005" : true, "\u2006" : true, "\u2007" : true, "\u2008" : true, "\u2009" : true, "\u200a" : true,
    "\u200b" : true, "\u2028" : true, "\u2029" : true, "\u202f" : true, "\u205f" : true, "\u3000" : true
    };

    /*
    * Trim spaces from a string on the left and right.
    */

    trim13 = function(str)
    {
    str = String(str);

    var n = str.length;
    var s;
    var i;

    if (!n)
    return str;
    s = String.whiteSpace;
    i = 0;
    if (n && s[str.charAt(n-1)])
    {
    do
    {
    --n;
    }
    while (n && s[str.charAt(n-1)]);
    if (n && s[str.charAt(0)])
    do
    {
    ++i;
    }
    while (i < n && s[str.charAt(i)]);
    return str.substring(i, n);
    }
    if (n && s[str.charAt(0)])
    {
    do
    {
    ++i;
    }
    while (i < n && s[str.charAt(i)]);
    return str.substring(i, n);
    }
    return str;
    };

  16. I have significantly improved the previous version. I also added tests for strings with no leading / trailing spaces because it is extremely common to trim strings that do not need trimming. The improved version is faster than trim1 – trim12 on all tested browsers (chrome, ie 6,7,8, ff 2,3,3b, opera 9.62, safari 3.1.2) for all test cases except that FF 3x browsers sometimes are slightly faster with a regexp when spaces have to be trimmed. However, it is not always the SAME regexp. In many cases, the speedup is as much as a factor of 20 over the regexp methods.

    String.whiteSpace = [];
    String.whiteSpace[0x0009] = true;
    String.whiteSpace[0x000a] = true;
    String.whiteSpace[0x000b] = true;
    String.whiteSpace[0x000c] = true;
    String.whiteSpace[0x000d] = true;
    String.whiteSpace[0x0020] = true;
    String.whiteSpace[0x0085] = true;
    String.whiteSpace[0x00a0] = true;
    String.whiteSpace[0x1680] = true;
    String.whiteSpace[0x180e] = true;
    String.whiteSpace[0x2000] = true;
    String.whiteSpace[0x2001] = true;
    String.whiteSpace[0x2002] = true;
    String.whiteSpace[0x2003] = true;
    String.whiteSpace[0x2004] = true;
    String.whiteSpace[0x2005] = true;
    String.whiteSpace[0x2006] = true;
    String.whiteSpace[0x2007] = true;
    String.whiteSpace[0x2008] = true;
    String.whiteSpace[0x2009] = true;
    String.whiteSpace[0x200a] = true;
    String.whiteSpace[0x200b] = true;
    String.whiteSpace[0x2028] = true;
    String.whiteSpace[0x2029] = true;
    String.whiteSpace[0x202f] = true;
    String.whiteSpace[0x205f] = true;
    String.whiteSpace[0x3000] = true;

    /*
    * Trim spaces from a string on the left and right.
    */

    trim13 = function(str)
    {
    var n = str.length;
    var s;
    var i;

    if (!n)
    return str;
    s = String.whiteSpace;
    if (n && s[str.charCodeAt(n-1)])
    {
    do
    {
    --n;
    }
    while (n && s[str.charCodeAt(n-1)]);
    if (n && s[str.charCodeAt(0)])
    {
    i = 1;
    while (i < n && s[str.charCodeAt(i)])
    ++i;
    }
    return str.substring(i, n);
    }
    if (n && s[str.charCodeAt(0)])
    {
    i = 1;
    while (i < n && s[str.charCodeAt(i)])
    ++i;
    return str.substring(i, n);
    }
    return str;
    };

  17. function trimstr(str)
    {
    var whitespace= " \n\r\t\f";

    for( var i= 0; i < str.length; i++ )
    if( whitespace.indexOf( str.charAt(i) ) < 0 )
    break;

    for( var j= str.length - 1; j >= i; j-- )
    if( whitespace.indexOf( str.charAt(j) ) < 0 )
    break;

    return str.substring( i,j+1 );
    }

  18. Here’s an even faster implementation of trim() based on Michael Finney’s lookup table.

    function trim(str){
    var len = str.length;
    if (len){
    var whiteSpace = String.whiteSpace;
    while (whiteSpace[str.charCodeAt(--len)]);
    if (++len){
    var i = 0;
    while (whiteSpace[str.charCodeAt(i)]){ ++i; }
    }
    str = str.substring(i, len);
    }
    return str;
    }

  19. WOW thats a nice piece of info… i used the first one… and its working fine… thanks for the snippet…

  20. Steve,
    I seem to have some problem with HTMLEncoding since function trim12(str) returns values like:

    <img display=”image.gif” />

    instead of what I expect which is:

    Over other implementations, your function is working extremely well!!!

  21. Trim? TRIM!?!?

    Here it is, the 21st Century, and we’re talking about implementations of TRIM for g–sakes! Why isn’t such a boneheadely simple, useful tool part of the JS language itself? It’s like getting excited over the best way to implement screen – been around since what? 1970s?

    Javascript is “the new way” but its shortcomings are soooo painful.

  22. Seems this won’t be needed in the future as ECMAScript Ed. 5 includes String.prototype.trim. And of course all browser vendors will rush to implement it immediately. 🙂

    Otherwise, an interesting post. It may appear to be premature optimisation; however, it shows that while some algorithms are very fast in some browsers, they may also be very slow in others. It’s good to use an algorithm that is reasonably fast in a representative cross-section.

  23. I read this article some years ago and found it very useful, I will use trim12 because of shorter code and still good performance, thanks.
