Faster JavaScript Trim
Since JavaScript doesn't include a trim method natively, it's included by countless JavaScript libraries, usually as a global function or appended to String.prototype. However, I've never seen an implementation which performs as well as it could, probably because most programmers don't deeply understand or care about regex efficiency issues.
After seeing a particularly bad trim implementation, I decided to do a little research towards finding the most efficient approach. Before getting into the analysis, here are the results:
Method | Firefox 2 | IE 6 |
---|---|---|
trim1 | 15ms | < 0.5ms |
trim2 | 31ms | < 0.5ms |
trim3 | 46ms | 31ms |
trim4 | 47ms | 46ms |
trim5 | 156ms | 1656ms |
trim6 | 172ms | 2406ms |
trim7 | 172ms | 1640ms |
trim8 | 281ms | < 0.5ms |
trim9 | 125ms | 78ms |
trim10 | < 0.5ms | < 0.5ms |
trim11 | < 0.5ms | < 0.5ms |
Note 1: The comparison is based on trimming the Magna Carta (over 27,600 characters) with a bit of leading and trailing whitespace 20 times on my personal system. However, the data you're trimming can have a major impact on performance, which is detailed below.
Note 2: trim4 and trim6 are the most commonly found in JavaScript libraries today.
Note 3: The aforementioned bad implementation is not included in the comparison, but is shown later.
The analysis
Although there are 11 rows in the table above, they are only the most notable (for various reasons) of about 20 versions I wrote and benchmarked against various types of strings. The following analysis is based on testing in Firefox 2.0.0.4, although I have noted where there are major differences in IE6.
1. trim1
   return str.replace(/^\s\s*/, '').replace(/\s\s*$/, '');
   All things considered, this is probably the best all-around approach. Its speed advantage is most notable with long strings, when efficiency matters. The speed is largely due to a number of optimizations internal to JavaScript regex interpreters which the two discrete regexes here trigger. Specifically, the pre-check of required character and start of string anchor optimizations, possibly among others.
2. trim2
   return str.replace(/^\s+/, '').replace(/\s+$/, '');
   Very similar to trim1 (above), but a little slower since it doesn't trigger all of the same optimizations.
3. trim3
   return str.substring(Math.max(str.search(/\S/), 0), str.search(/\S\s*$/) + 1);
   This is often faster than the following methods, but slower than the above two. Its speed comes from its use of simple, character-index lookups.
4. trim4
   return str.replace(/^\s+|\s+$/g, '');
   This commonly thought up approach is easily the most frequently used in JavaScript libraries today. It is generally the fastest implementation of the bunch only when working with short strings which don't include leading or trailing whitespace. This minor advantage is due in part to the initial-character discrimination optimization it triggers. While this is a relatively decent performer, it's slower than the three methods above when working with longer strings, because the top-level alternation prevents a number of optimizations which could otherwise kick in.
5. trim5
   str = str.match(/\S+(?:\s+\S+)*/);
   return str ? str[0] : '';
   This is generally the fastest method when working with empty or whitespace-only strings, due to the pre-check of required character optimization it triggers. Note: In IE6, this can be quite slow when working with longer strings.
6. trim6
   return str.replace(/^\s*(\S*(\s+\S+)*)\s*$/, '$1');
   This is a relatively common approach, popularized in part by some leading JavaScripters. It's similar in approach (but inferior) to trim8. There's no good reason to use this in JavaScript, especially since it can be very slow in IE6.
7. trim7
   return str.replace(/^\s*(\S*(?:\s+\S+)*)\s*$/, '$1');
   The same as trim6, but a bit faster due to the use of a non-capturing group (which doesn't work in IE 5.0 and lower). Again, this can be slow in IE6.
8. trim8
   return str.replace(/^\s*((?:[\S\s]*\S)?)\s*$/, '$1');
   This uses a simple, single-pass, greedy approach. In IE6, this is crazy fast! The performance difference indicates that IE has superior optimization for quantification of "any character" tokens.
9. trim9
   return str.replace(/^\s*([\S\s]*?)\s*$/, '$1');
   This is generally the fastest with very short strings which contain both non-space characters and edge whitespace. This minor advantage is due to the simple, single-pass, lazy approach it uses. Like trim8, this is significantly faster in IE6 than Firefox 2.
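Before comparing speed, it can help to confirm that the nine regex-based variants actually agree with each other. The following is only a sketch: the function bodies are the ones listed above, but the `samples` array and the `findDisagreements` helper are hypothetical names introduced here, and the sample strings are not the benchmark data.

```javascript
// The nine regex-based bodies from the list above, wrapped as functions.
var trims = {
  trim1: function (str) { return str.replace(/^\s\s*/, '').replace(/\s\s*$/, ''); },
  trim2: function (str) { return str.replace(/^\s+/, '').replace(/\s+$/, ''); },
  trim3: function (str) {
    return str.substring(Math.max(str.search(/\S/), 0), str.search(/\S\s*$/) + 1);
  },
  trim4: function (str) { return str.replace(/^\s+|\s+$/g, ''); },
  trim5: function (str) {
    var m = str.match(/\S+(?:\s+\S+)*/);
    return m ? m[0] : '';
  },
  trim6: function (str) { return str.replace(/^\s*(\S*(\s+\S+)*)\s*$/, '$1'); },
  trim7: function (str) { return str.replace(/^\s*(\S*(?:\s+\S+)*)\s*$/, '$1'); },
  trim8: function (str) { return str.replace(/^\s*((?:[\S\s]*\S)?)\s*$/, '$1'); },
  trim9: function (str) { return str.replace(/^\s*([\S\s]*?)\s*$/, '$1'); }
};

// Hypothetical sample inputs: empty, whitespace-only, single character,
// multi-line, and a string with no edge whitespace.
var samples = ['', '   ', 'a', ' a ', '\t a b\n c \r\n', 'no edge whitespace'];

// Returns a description of every case where a variant differs from trim1.
function findDisagreements() {
  var bad = [];
  for (var name in trims) {
    for (var i = 0; i < samples.length; i++) {
      if (trims[name](samples[i]) !== trims.trim1(samples[i])) {
        bad.push(name + ' on ' + JSON.stringify(samples[i]));
      }
    }
  }
  return bad;
}

console.log(findDisagreements().length === 0 ? 'all variants agree' : findDisagreements());
```

A check like this says nothing about speed, but it guards against benchmarking functions that are not equivalent.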
Since I've seen the following additional implementation in one library, I'll include it here as a warning:
return str.replace(/^\s*([\S\s]*)\b\s*$/, '$1');
Although the above is sometimes the fastest method when working with short strings which contain both non-space characters and edge whitespace, it performs very poorly with long strings which contain numerous word boundaries, and it's terrible (!) with long strings comprised of nothing but whitespace, since that triggers an exponentially increasing amount of backtracking. Do not use.
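Speed aside, the `\b` version also seems to behave differently from the other implementations on some inputs, because `\b` only matches next to a word character. This is a minimal sketch with hypothetical function names (`trimWordBoundary`, and `trim1` wrapped for comparison), not a claim about how any particular library shipped it.

```javascript
// The \b-based regex shown above, wrapped as a function for experimentation.
function trimWordBoundary(str) {
  return str.replace(/^\s*([\S\s]*)\b\s*$/, '$1');
}

// trim1 from earlier in the post, for comparison.
function trim1(str) {
  return str.replace(/^\s\s*/, '').replace(/\s\s*$/, '');
}

// Ends in a word character: both functions agree.
console.log(JSON.stringify(trimWordBoundary(' foo ')));   // "foo"

// Ends in punctuation: no word boundary sits directly before the trailing
// whitespace, so the regex fails to match and the input comes back unchanged.
console.log(JSON.stringify(trimWordBoundary(' foo! ')));  // " foo! "
console.log(JSON.stringify(trim1(' foo! ')));             // "foo!"

// Whitespace only: there is no word boundary anywhere, so nothing is removed.
console.log(JSON.stringify(trimWordBoundary('   ')));     // "   "
```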
A different endgame
There are two methods in the table at the top of this post which haven't been covered yet. For those, I've used a non-regex and hybrid approach.
After comparing and analyzing all of the above, I wondered how an implementation which used no regular expressions would perform. Here's what I tried:
function trim10 (str) {
  var whitespace = ' \n\r\t\f\x0b\xa0\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u200b\u2028\u2029\u3000';
  for (var i = 0; i < str.length; i++) {
    if (whitespace.indexOf(str.charAt(i)) === -1) {
      str = str.substring(i);
      break;
    }
  }
  for (i = str.length - 1; i >= 0; i--) {
    if (whitespace.indexOf(str.charAt(i)) === -1) {
      str = str.substring(0, i + 1);
      break;
    }
  }
  return whitespace.indexOf(str.charAt(0)) === -1 ? str : '';
}
How does that perform? Well, with long strings which do not contain excessive leading or trailing whitespace, it blows away the competition (except against trim1/2/8 in IE, which are already insanely fast there).
Does that mean regular expressions are slow in Firefox? No, not at all. The issue here is that although regexes are very well suited for trimming leading whitespace, apart from the .NET library (which offers a somewhat-mysterious "backwards matching" mode), they don't really provide a method to jump to the end of a string without even considering previous characters. However, the non-regex-reliant trim10 function does just that, with the second loop working backwards from the end of the string until it finds a non-whitespace character.
Knowing that, what if we created a hybrid implementation which combined a regex's universal efficiency at trimming leading whitespace with the alternative method's speed at removing trailing characters?
function trim11 (str) {
  str = str.replace(/^\s+/, '');
  for (var i = str.length - 1; i >= 0; i--) {
    if (/\S/.test(str.charAt(i))) {
      str = str.substring(0, i + 1);
      break;
    }
  }
  return str;
}
Although the above is a bit slower than trim10 with some strings, it uses significantly less code and is still lightning fast. Plus, with strings which contain a lot of leading whitespace (which includes strings comprised of nothing but whitespace), it's much faster than trim10.
In conclusion…
Since the differences between the implementations cross-browser and when used with different data are both complex and nuanced (none of them are faster than all the others with any data you can throw at it), here are my general recommendations for a trim method:
- Use trim1 if you want a general-purpose implementation which is fast cross-browser.
- Use trim11 if you want to handle long strings exceptionally fast in all browsers.
To test all of the above implementations for yourself, try my very rudimentary benchmarking page. Background processing can cause the results to be severely skewed, so run the test a number of times (regardless of how many iterations you specify) and only consider the fastest results (since averaging the cost of background interference is not very enlightening).
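The "run it several times and keep only the fastest result" advice can be sketched roughly as follows. This is not the code behind the benchmarking page: `timeOnce`, `fastestOf`, and the `sample` string are hypothetical names and data made up for illustration, and `Date`-based timing only has millisecond resolution.

```javascript
// Times `iterations` calls of fn(input) and returns the elapsed milliseconds.
function timeOnce(fn, input, iterations) {
  var start = new Date().getTime();
  for (var i = 0; i < iterations; i++) {
    fn(input);
  }
  return new Date().getTime() - start;
}

// Repeats the timing `runs` times and keeps only the fastest run, on the
// assumption that slower runs mostly reflect background interference.
function fastestOf(fn, input, iterations, runs) {
  var best = Infinity;
  for (var r = 0; r < runs; r++) {
    var elapsed = timeOnce(fn, input, iterations);
    if (elapsed < best) {
      best = elapsed;
    }
  }
  return best;
}

// trim1 from the post, used here as the function under test.
function trim1(str) {
  return str.replace(/^\s\s*/, '').replace(/\s\s*$/, '');
}

// Hypothetical test data: a long string with a little edge whitespace.
var sample = '   ' + new Array(1001).join('lorem ipsum ') + '   ';

console.log('trim1, fastest of 5 runs: ' + fastestOf(trim1, sample, 20, 5) + 'ms');
```

Taking the minimum rather than the mean is a deliberate choice here: interference can only make a run slower, so the fastest run is the closest available estimate of the undisturbed cost.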
As a final note, although some people like to cache regular expressions (e.g. using global variables) so they can be used repeatedly without recompilation, IMO this does not make much sense for a trim method. All of the above regexes are so simple that they typically take no more than a nanosecond to compile. Additionally, some browsers automatically cache the most recently used regexes, so a typical loop which uses trim and doesn't contain a bunch of other regexes might not encounter recompilation anyway.
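For readers who want to measure the caching question themselves, here are the two shapes side by side. This is only a sketch: `trimInline` and `trimCached` are hypothetical names, and the closure is just one way to hold the regexes without resorting to global variables.

```javascript
// Variant A: regex literals written inline, as in the implementations above.
function trimInline(str) {
  return str.replace(/^\s\s*/, '').replace(/\s\s*$/, '');
}

// Variant B: the same regexes created once and reused through a closure.
var trimCached = (function () {
  var leading = /^\s\s*/;
  var trailing = /\s\s*$/;
  return function (str) {
    return str.replace(leading, '').replace(trailing, '');
  };
})();

console.log(trimInline('  a b  ') === trimCached('  a b  '));
```

Both variants return the same results; any difference is purely a matter of timing in a given engine. Note that neither regex uses the `g` flag, so the shared objects carry no `lastIndex` state between calls.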
Edit (2008-02-04): Shortly after posting this I realized trim10/11 could be better written. Several people have also posted improved versions in the comments. Here's what I use now, which takes the trim11-style hybrid approach:
function trim12 (str) {
  var str = str.replace(/^\s\s*/, ''),
      ws = /\s/,
      i = str.length;
  while (ws.test(str.charAt(--i)));
  return str.slice(0, i + 1);
}
New library: Are you a JavaScript regex master, or want to be? Then you need my fancy XRegExp library. It adds new regex syntax (including named capture and Unicode properties); s, x, and n flags; powerful regex utils; and it fixes pesky browser inconsistencies. Check it out!
Comment by Shady on 8 June 2007:
Great work Steve. Grats on blowing the pants off the competition. Parsing the Magna Carta 20 times in less than a millisecond is flagrantly badass indeed.
Pingback by Pruebas de rendimiento a Trim() en javascript | aNieto2K on 9 June 2007:
[…] 11 trim() functions put to the test yield interesting statistics that let us choose the one that suits us best. […]
Comment by Steve on 9 June 2007:
Thanks, Shady. But really, the competition is regular expression engines themselves and how to best take advantage of them. I doubt many JavaScripters who’ve written a trim function have spent much time considering alternative approaches.
Comment by SeriousSam on 10 June 2007:
After looking at your comparison of trim methods (which in my opinion is quite interesting) I found minor bugs in your trim10 and trim11 methods. It seems that your array indices and arguments to substr are off by 1. Consider this, for a hopefully correct trim11:
function trim( str ) {
str = str.replace(/^\s+/, '');
for( var i = str.length-1; i > 0; --i ) {
if( /\S/.test( str[i] ) ) {
str = str.substring( 0, i+1 );
break;
}
}
return str;
}
Comment by Steve on 10 June 2007:
Fixed. Thanks.
By the way, I’d recommend avoiding treating strings as arrays to look up character indexes like that (e.g., str[i]). It’s not part of the ECMA-262 3rd Edition standard and doesn’t work correctly in IE. Better to stick with the charAt method for accurate lookups.
Pingback by All in a days work… on 12 June 2007:
[…] “Trim” the Holdup What if we created a hybrid implementation which combined a regex’s universal efficiency at trimming leading whitespace with the alternative method’s speed at removing trailing characters? (tags: RegEx JavaScript) […]
Comment by Doeke Zanstra on 6 July 2007:
Cool! Great work. This kind of research should be done more often.
I found this article, because Dean Edwards used the research for the base2 project (http://code.google.com/p/base2/). I actually did some tests on the Mac and made even another implementation, see my website.
Comment by Steve on 6 July 2007:
Thanks, Doeke! And thanks for mentioning that Dean Edwards referenced this in Base2, as I hadn’t known that previously. (By the way, through referrer logs I’ve so far discovered this post also referenced in implementations by or discussions with the developers of jQuery, Ext, Rails, and CFJS.)
For other readers, see Doeke Zanstra’s blog post for performance times of the above trim functions in several WebCore, Gecko, and Presto based browsers on Mac OS, and a better version of trim10.
Pingback by Dylan Schiemann » Blog Archive » a better way to trim on 9 July 2007:
[…] Levithan has put together by far the most extensive analysis of JavaScript and trim I have seen (thanks for the tip Dean). We’ll certainly use this knowledge in Dojo 0.9 and […]
Comment by Karl on 9 July 2007:
Just wanted to let you know Steve, that since being pointed at your blog by Dean Edwards, that its been added to the list of RSS feeds for our News Aggregator (Planet Dojo) at http://dojotoolkit.org
Great article and very interesting articles all around on your blog. Keep up the great work!
-Karl
Comment by Steve on 9 July 2007:
Karl, that’s awesome! Thanks for the heads up.
Comment by Jeff on 12 July 2007:
Interesting. My firefox actually ran trim10 the fastest. Constantly return near 0ms times. A bug perhaps, or was it just that speedy?
If I cranked it up to 60 iterations, it would show 10ms.
Regardless, very nice work.
Comment by Steve on 12 July 2007:
Jeff, it’s just that speedy. 🙂 With the given test of trimming the Magna Carta with small amounts of whitespace at both ends, trim10 certainly should be the fastest of those shown here.
But as this blog post explains in some detail, none of these are the one-true fastest. Their comparative speeds depend largely on the data they’re fed, although some are more consistent than others cross-browser and/or with any given data.
Trim10’s strength is that, apart from edge whitespace, the length of the string has very little impact on its performance. Its weakness is strings which contain large amounts of leading or trailing whitespace, since a loop over each character won’t traverse that nearly as fast as a simple regex. That’s why I recommended trim11 over trim10, as although it’s a little slower than trim10 with many types of strings, it removes one of trim10’s weaknesses vs. the regex-based functions (long, leading whitespace).
Comment by Jeff on 20 July 2007:
True, but in all honesty, I’ve never really run into cases in the wild where the trimming that needed to be done was that out of hand.
Sure, if I make up data to be cleaned and stuff the snot outta it with spaces, it pokes a hole in the looping method — but I’ve never encountered that IRL.
So for me, I’ll happily gank trim10 😀 Thanks!
Comment by Robbert Broersma on 3 November 2007:
Shortest notation:
function trim(str)
{
str = str.replace(/^\s+/, '')
for (var i = str.length; i--;)
if (/\S/.test(str.charAt(i)))
return str.substring(0, ++i)
return str
}
Comment by Ariel Flesler on 8 November 2007:
Hi Steven, great article!
I gave it a try and wrote my own trim.. if you are interested, here’s an implementation, I tried to make it work as fast as yours.
In my PC it is sometimes a bit faster, sometimes equal.
I created a 27k long string, with 300 spaces on each side, using less than that, both were always giving 0ms….
Here’s the example: http://www.freewebs.com/flesler/Trim/
And here’s the piece of code:
var trim = (function(){
var ws = {},
chars = ' \n\r\t\v\f\u00a0\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u200b\u2028\u2029\u3000';
for(var i = 0; i < chars.length; i++ )
ws[chars.charAt(i)] = true;
return function( str ){
var s = -1,
e = str.length;
while( ws[str.charAt(--e)] );
while( s++ !== e && ws[str.charAt(s)] );
return str.substring( s, e+1 );
};
})();
Comment by Steve on 8 November 2007:
@Ariel: Nicely done. 🙂
Comment by Ariel Flesler on 9 November 2007:
Thanks! and thank you for clearing up my mess 😀
Comment by Nacho on 7 December 2007:
Hi,
Good thinking and nice research!
I was curious about already existing trim implementations and your article sums them up very nicely. Still, playing myself a little bit and being the regex lover that I am, I came up with the simplest implementation that I could think of and I was wondering what you think about it.
String.prototype.trim = function() {
return this.replace(/^\s*(\S.*\S)?\s*$/, '$1');
}
I haven’t tested for speed or browser compatibility issues (I have just to support Firefox 2 & IE7 at the time being) and I’m not *very* concerned about a couple of ms difference so it could very well not be an alternative at all. Still, I’d love to hear what you have to say about it 😀
Comment by Steve on 7 December 2007:
Thanks, Nacho! Re: your implementation, it is not equivalent to the others shown (in other words it’s broken). Two reasons:
– It will not work with strings which have just one non-whitespace character.
– Since JavaScript doesn’t have a “single-line” mode you need to replace the dot with [\S\s] (or similar).
In any case, /^\s+|\s+$/g is already more “simple” if you measure by readability (arguably) or number of characters.
Comment by Nacho on 9 December 2007:
You completely got me there. I’m glad I posted it 🙂
It’s of course true, it wouldn’t validate a one non-whitespace character string. Silly that it could escape my attention.
About the “single line” mode I’m not so sure I follow you. I thought JavaScript had a flag (m) for multiline matching and therefore assumed that when the flag is not present matching only takes place on a single line strings … then again I do not know the internals as far as you do …
I guess I’ll have to give 1 or 11 a go 😉
Thanks for the comment!
Comment by Steve on 9 December 2007:
The regex terms single-line and multi-line are confusing for many people, which is why I shun them in RegexPal in favor of the more descriptive “^$ match at line breaks” (instead of “multi-line”) and “dot matches newline” (instead of “single-line”). However, “dot matches newline” mode is not available natively in JavaScript… it’s provided by XRegExp.
In other words, without XRegExp, JavaScript provides no way for the regex dot to match all characters including newlines. Multi-line mode changes the meaning of the “^” and “$” tokens — it has nothing to do with what the dot matches.
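Both problems raised in this exchange are easy to reproduce. The sketch below uses hypothetical function names (`trimDot` for Nacho's pattern, `trimAnyChar` for a variant that reuses the trim8 regex from the post); it illustrates the dot-versus-`[\S\s]` point and is not a recommended implementation.

```javascript
// Without any flags, the dot stops at a line break; [\S\s] does not.
console.log(/^.*$/.test('a\nb'));       // false
console.log(/^[\S\s]*$/.test('a\nb'));  // true

// Nacho's pattern from the comment above, as a plain function.
function trimDot(str) {
  return str.replace(/^\s*(\S.*\S)?\s*$/, '$1');
}

// The trim8 regex from the post: [\S\s] instead of the dot, and a group
// that also accepts a single non-whitespace character.
function trimAnyChar(str) {
  return str.replace(/^\s*((?:[\S\s]*\S)?)\s*$/, '$1');
}

console.log(JSON.stringify(trimDot(' ab ')));        // "ab"
console.log(JSON.stringify(trimDot(' a ')));         // " a "     (no match, unchanged)
console.log(JSON.stringify(trimDot(' a\nb ')));      // " a\nb "  (no match, unchanged)
console.log(JSON.stringify(trimAnyChar(' a ')));     // "a"
console.log(JSON.stringify(trimAnyChar(' a\nb ')));  // "a\nb"
```

When the regex fails to match, `replace` returns the original string untouched, which is why the failures show up as untrimmed output rather than as errors.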
Pingback by Now Direction » JavaScript Trim Function on 12 December 2007:
[…] know my audience doesn’t care much for Web Development content, but this was a fascinating article about the various methods of trimming the white space from a text […]
Pingback by jQuery Minute™ » Performance Tuning Regular Expressions in JavaScript on 18 January 2008:
[…] Doug D started a thread on the google groups jQuery developer list about a faster trim method he ran across and how counter intuitively it was better to run two expressions rather than one. Matt Kruse then provided a link to a great article on the subject: https://blog.stevenlevithan.com/archives/faster-trim-javascript […]
Comment by Scott Trenda on 20 January 2008:
Hey Steve, excellent analysis. Thought I’d throw in my two cents here; after slogging through it, I saw Ariel already posted a near-verbatim version of the trim10 redux I’d written. 🙂
I did a bit of extensive testing based on your test example using variations of #10 and #11. Like you said, while #10 is nice and zippy when there’s no leading space, it’s just too slow when there’s any significant leading space. So I went with #11, and here’s what I ended up with. (Brevity first, as always! 🙂 )
function trim13 (str)
{
var str = str.replace(/^\s*/, ""), s = /\s/, i = str.length;
while (s.test(str.charAt(--i)));
return str.substring(0, i + 1);
}
Just putting the /\S/ regex (from #11) outside of the loop sped it up for starters, about twice as fast on runs of 10000 times. A notable quirk I found with the /^\s+/ regex: it performs much worse (3400ms vs. 470ms) on strings with no leading whitespace, compared to strings with even a single leading space. Perhaps you could explain that better, but it seems /^\s*/ keeps a consistent performance in both cases.
On a different note, we seem to share the same taste in alcohol. 🙂 Vodka + Red Bull is my mainstay at any bar, but my vodka of choice happens to be Grey Goose. And I drink entirely too much Red Bull – 4 to 6 cans on a usual workday. Keeps you running, no? ~_^
Comment by Scott Trenda on 20 January 2008:
Oh, and one more thing about the trim10 function as you have it posted now. IE doesn’t recognize ‘\v’ as a metacharacter in Javascript strings, so a literal ‘v’ ends up in the whitespace string. Try trimming a string starting with ‘v’ – it’ll chop the leading ‘v’ as well. (That one drove me nuts for a full 20 minutes.) Replace it with \x0b and all’s well. ^_^
Comment by Steve on 20 January 2008:
@Scott Trenda, interesting about IE not interpreting \v as a vertical tab when embedded in a string literal, especially since it is handled correctly in regexes (/\v/.test("\x0b") == true). I’ll fix that in trim10.
I’ve actually been using something very similar to your trim13 recently. The only (edge case) problem is that browsers interpret \s differently (see JavaScript, Regex, and Unicode, and the test page). I’ve left the very quick and dirty trim10/11 up there since their ugliness seems to have inspired others to improve them.
As for your observations about ^\s* vs. ^\s+, that depends on the implementation. Another alternative to consider is ^\s\s*. The difference stems from internal optimizations like whether or not a pre-check of required characters is performed, and the relative cost of success vs. failure to match.
Pingback by Ajaxian » JavaScript Trim Optimizations on 3 February 2008:
[…] Simon found this gem. Steven Levithan wrote about optimizing a JavaScript trim. […]
Comment by Mark on 3 February 2008:
The method that GWT uses to translate String.trim():
public native String trim() /*-{
var r1 = this.replace(/^(\s*)/, '');
var r2 = r1.replace(/\s*$/, '');
return r2;
}-*/;
Comment by Alexey on 4 February 2008:
Is really actual on “MS Windows” and “Internet Explorer”? you cannot fix bugs by javascript.
Comment by Daniel Steigerwald on 4 February 2008:
Hi,
nice work. But one thing I’m missing is a pointer to the different \s handling in browsers.
ECMAScript specifies \s as [\t\n\v\f\r], Firefox added [\u00A0\u2028\u2029] to the list.
Opera happens to match with \s like Firefox. Safari behaves like IE and doesn’t match a non-breaking space (\u00A0) with \s.
more: http://dev.mootools.net/ticket/646
I didn’t test \u2028 and \u2029, but I think /[\s\u00A0\u2028\u2029]+/g should fix this for all browsers, as it adds Firefox’s additions to \s.
Comment by Steve on 4 February 2008:
@Daniel Steigerwald, that’s not entirely correct. See my post on JavaScript, regex, and Unicode for more information about what \s should and does match.
Pingback by Javascript News » JavaScript Trim Optimizations on 4 February 2008:
[…] Simon found this gem. Steven Levithan wrote about optimizing a JavaScript trim. […]
Comment by Tiziano on 4 February 2008:
ERROR in function 10 and 11:
for (var i = str.length - 1; i > 0; i--) {
=> for (var i = str.length - 1; i >= 0; i--) {
Comment by Steve on 4 February 2008:
@Tiziano, that is not an error. The way they are written, there is no reason for the backwards loops to be concerned about the character at index 0.
Comment by Aristotle Pagaltzis on 4 February 2008:
It seems to me that your loop does far more work than necessary. Why shorten the string one character at a time? Keep the loop counter around and you can do all the shortening at once. That’s bound to be faster.
var i = str.length - 1;
while ( i >= 0 && /\s/.test(str.charAt(i)) ) --i;
str = str.substring(0, i + 1);
Also, it might be worth pulling the regex construction out of the loop; that depends on how well Javascript compilers optimise.
var ws = /\s/;
while ( i >= 0 && ws.test(str.charAt(i)) ) --i;
Comment by Aristotle Pagaltzis on 4 February 2008:
True, but anyone who ever modifies the code needs to be aware that there’s an inactive bug in the code, lest they accidentally activate it. And it doesn’t cost anything at all to make the check correct. So there’s no reason not to fix it.
Comment by Steve on 4 February 2008:
@Aristotle Pagaltzis, putting the regex outside the loop shouldn’t matter according to ECMA-262 3rd Edition since the spec states that regex literals cause only one object to be created at runtime for a script or function. However, most implementations don’t respect that (Firefox does), and in any case the behavior is proposed to be changed in ECMAScript 4.
Regarding changing the loop counter to allow an extra iteration… I’ve realized that Tiziano was correct. It was in fact an error in the case where there is whitespace to the right and only one non-whitespace character. I’ve fixed it, but the trim10/11 implementations are ugly anyway, as you’ve pointed out. Although I’ve intentionally been avoiding this for some time, I’ve gone ahead and edited the post to show a cleaner version of the trim11 approach at the end (which is nearly identical to what Scott Trenda posted earlier).
Pingback by A faster JavaScript Trim | foojam.com on 4 February 2008:
[…] is an older article, but Steven Levithan has an article posted on his site regarding how to do a faster trim function in JavaScript. His demo page has eleven different trim implementations and some example text to test them out […]
Comment by Yves on 4 February 2008:
Has anybody tried str.lastIndexOf() to find the trailing blanks?
Pingback by SiNi Daily » JavaScript Trim Optimizations on 4 February 2008:
[…] JavaScript Trim Optimizations February 4, 2008 – 2:32 pm | by SiNi Simon found this gem. Steven Levithan wrote about optimizing a JavaScript trim. […]
Comment by Aristotle Pagaltzis on 4 February 2008:
@Yves:
You can’t use lastIndexOf for this problem. That method only gives you a way to ask for the last appearance of a specific character, but what we need is a way to ask for the last appearance of any other character than a space. Additionally, \s in a regex doesn’t find just space characters, but a number of other whitespace characters as well. You can’t do that with lastIndexOf at all.
@Steve:
Now that I think of it, how does the following version fare?
return str.replace(/^\s+/, '').replace(/.*\s+$/, '');
This should be faster than any of the regex approaches you showed above. A class like [\s\S] is kinda silly – “match anything that’s whitespace or is not whitespace” is just a long-winded way to say “match anything,” except that non-IE browsers should be able to optimise it as well, and in fact many regex engines have special optimisations for .* built in. This should gobble up the entire string immediately and then do the same backtrack-loop as the explicit Javascript code does, except without crossing back and forth between the JS VM and the RE engine at every backtracking step in order to involve the JS VM dispatcher.
But that’s just theory-based hypothesis – benchmarking is in order to confirm (or disprove) it.
Comment by Aristotle Pagaltzis on 4 February 2008:
Oh! D’uh. Disregard the above suggestion. That won’t work for obvious reasons.
I got confused because I do this in conjunction with \zs in Vim all the time. In Vim you could write s/.*\zs\s\s*$// and it would replace just the part after the \zs. In Perl 5.10 you can do the same using the \K escape. But Javascript has neither extension, so… yeah.
Comment by Steve on 4 February 2008:
\K would be very nice to have, especially since JavaScript has no lookbehind. But yeah, you can’t do that. Note that something like [\S\s] is necessary because JavaScript has no “single-line” (dot matches all) mode.
Comment by Aristotle Pagaltzis on 4 February 2008:
OK, I think I’ve unbrainfarted myself enough to actually try my idea of using a greedy match and RE engine backtracking. Sorry for all the noise. Here’s trim12:
function trim12 (str) {
  var str = str.replace(/^\s\s*/, ''),
      len = str.length;
  if (len && /\s/.test(str.charAt(len - 1))) {
    var re = /.*\S/g;
    re.test(str);
    str = str.slice(0, re.lastIndex);
  }
  return str;
}
The trick here is as follows. The inner regex is run only if the string is non-empty, which means there must be non-whitespace characters in it, because otherwise the first substitution would have left an empty string, and only if the last character is whitespace. In that case, we run a global match that first gobbles up the entire string using .*, then backtracks until it can match \S. We know it must match because at this point we know the string ends with whitespace and we know it has non-whitespace characters in it. After the match, because it is global (/g flag), the position of the character after the end of the match will be recorded in the lastIndex property of the regex object.
So we just use that to return the portion of the string before it.
Please benchmark this. I’ve tested it and I know it works; now the question is how fast it is.
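The lastIndex trick described above can be tried in isolation. In this sketch, `endOfContent` and `trimRight` are hypothetical helper names, and `[\S\s]` is swapped in for the dot as an assumption (following the earlier note that the dot does not match line breaks), so it is a variation on the posted function rather than a copy of it.

```javascript
// Returns the index just past the last non-whitespace character, letting the
// regex engine do the backtracking from the end of the string.
function endOfContent(str) {
  var re = /[\S\s]*\S/g;  // assumption: [\S\s] instead of the dot
  re.lastIndex = 0;       // be explicit in case the engine shares regex literals
  return re.test(str) ? re.lastIndex : 0;
}

// Hypothetical right-trim built on the helper above.
function trimRight(str) {
  return str.slice(0, endOfContent(str));
}

console.log(endOfContent('abc  '));                 // 3
console.log(endOfContent('a\nb  '));                // 3
console.log(JSON.stringify(trimRight('a\nb \t')));  // "a\nb"

// With the dot, the match cannot get past the first line break:
var dotRe = /.*\S/g;
dotRe.test('a\nb  ');
console.log(dotRe.lastIndex);                       // 1
```

The `/g` flag matters here: on a non-global regex, `test` leaves `lastIndex` at 0.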
Comment by Aristotle Pagaltzis on 4 February 2008:
Argh, now I see that. I guess explicitness would demand [-\uFFFF], but that’s clearly more cumbersome to type and read than [\s\S]. Sigh.
(Hopefully I will stop spamming your comments now. Sorry again.)
Comment by jag on 4 February 2008:
Is there any performance gain from using /\s*\s$/ instead of /\s+$/ or /\s*$/?
Pingback by links for 2008-02-05 « Simply… A User on 4 February 2008:
[…] Faster JavaScript Trim Since JavaScript doesn’t include a trim method natively, it’s included by countless JavaScript libraries — usually as a global function or appended to String.prototype. (tags: javascript performance trim regex string optimization regexp tips **) […]
Pingback by afongen » links for 2008-02-05 on 5 February 2008:
[…] Faster JavaScript Trim (tags: javascript regex) […]
Pingback by Trim in Javascript / Melodycode.com - Life is a flash on 5 February 2008:
[…] for visiting! Steven Levithan felt the need (well done!) to study how to optimize the trim function, which in JavaScript is usually included via an external library. The conclusion is the following ([…]
Comment by Haoest on 5 February 2008:
Funny, IE beats Firefox most of the rounds. If only they have spent as much effort on adhering to the standard…
Pingback by benstraw.com » links for 2008-02-05 on 5 February 2008:
[…] https://blog.stevenlevithan.com/archives/faster-trim-javascript (tags: benchmark development javascript optimization trim regex string) […]
Comment by Scott Blum on 5 February 2008:
Steve, I have a couple of questions about the 2008-02-04 update:
1) What happens in the empty string case? Seems like “i” is -1 and bad stuff should happen?
2) What’s the licensing around this? I’d like to use it in GWT (we’re Apache 2.0).
Thanks,
Scott
Comment by Steve on 5 February 2008:
@Scott Blum:
1. An i value of -1 works fine because charAt returns an empty string if the provided index is out of range.
2. Any code I post on my blog is under the MIT license unless otherwise specified. But something this simple I’d consider public domain. You’re certainly very welcome to use it in GWT.
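The empty-string case from the question above can be checked directly. The function below follows the trim12 from the post's 2008-02-04 edit, except that it assigns to the parameter instead of redeclaring it with var (a cosmetic change made here); the sample inputs are made up for illustration.

```javascript
// trim12 from the post's edit: regex for the leading side, backwards loop
// for the trailing side.
function trim12(str) {
  str = str.replace(/^\s\s*/, '');
  var ws = /\s/,
      i = str.length;
  // charAt() returns '' for an out-of-range index, and /\s/ does not match
  // '', so the loop stops on its own once i reaches -1.
  while (ws.test(str.charAt(--i)));
  return str.slice(0, i + 1);
}

console.log(JSON.stringify(''.charAt(-1)));        // ""
console.log(/\s/.test(''));                        // false
console.log(JSON.stringify(trim12('')));           // ""
console.log(JSON.stringify(trim12('   ')));        // ""
console.log(JSON.stringify(trim12('  a b  ')));    // "a b"
```

For an empty or whitespace-only string, `i` ends at -1 and `slice(0, 0)` returns the empty string, so no explicit bounds check is needed.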
Comment by Joonas Lehtolahti on 6 February 2008:
Hello, nice article. Using the benchmark page with Opera 9.5 beta snapshot build 9755 on Windows, I got some interesting results.
With 20 iterations the methods 1-8 were between 150 and 450 ms, trim9 took almost 4 seconds! trim10 and trim12 were 0 ms every time, but trim11 was at 15 ms.
With 100 iteration the methods 1-8 were quite similar as with 20 iterations, but the values five times larger naturally. Similarly trim9 was almost 19 seconds (slow!). Still trim10 gave a nice result of 0 ms, while both trim11 and trim12 were 47 ms.
With 200 iterations I finally managed to make trim10 take more than 0 ms of time, it took 15 ms in one run, which itself took almost a minute to run, thanks to trim9’s slow 39 seconds of execution time! Here, however, trim12 was slower than trim11 for some reason, while with 20 iterations it was always a lot faster.
So, at least with Opera 9.5 it seems like the method 10 is clearly superior to all of the other methods.
Comment by Steve on 6 February 2008:
@Joonas Lehtolahti, that may be true with the provided test data, but this post makes it clear that none of the methods are fastest with all types of strings.
Trim9 is slow in Opera because that browser is particularly bad at lazy quantification.
Comment by Chris Akers on 21 February 2008:
It seems to me that the right trim of #1 is a little out of order. The current #1 trim shows:
str.replace(/^\s\s*/, '').replace(/\s\s*$/, '');
Shouldn’t it be changed to:
str.replace(/^\s\s*/, '').replace(/\s*\s$/, '');
It is symmetric with the left trim and it would seem that a pre-check optimization could potentially be more easily performed by using only the last character. [I admit that I might be over-thinking this…]
Comment by Steve on 22 February 2008:
@Chris Akers, you are assuming that such a pre-check would account for position relative to particular anchors within the string. That is unlikely. But more importantly, you would potentially be making the second regex take longer to fail at positions prior to string-ending whitespace. That tiny bit of extra time (assuming the absence of certain types of other potential internal optimizations) would add up over the course of checking every position in a long string.
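Both patterns remove the same trailing whitespace, so the difference is only in how much work the engine does at positions where no match is possible. A rough sketch for checking this on your own engine follows; the helper names and the test string are made up for illustration, and timings will vary by browser and data.

```javascript
// Hypothetical helpers for comparing the two right-trim patterns discussed above.
function rtrimA(str) { return str.replace(/\s\s*$/, ''); } // pattern used in trim1
function rtrimB(str) { return str.replace(/\s*\s$/, ''); } // the suggested variant

// Run fn on input `iterations` times and return the elapsed milliseconds.
function time(fn, input, iterations) {
  var start = new Date().getTime();
  for (var i = 0; i < iterations; i++) fn(input);
  return new Date().getTime() - start;
}

// A long string with a little trailing whitespace (made-up test data).
var sample = new Array(5001).join('word ') + '   ';

console.log(rtrimA(sample) === rtrimB(sample)); // true: same result either way
console.log('\\s\\s*$ :', time(rtrimA, sample, 20), 'ms');
console.log('\\s*\\s$ :', time(rtrimB, sample, 20), 'ms');
```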
Pingback by News » Faster JavaScript Trim on 23 February 2008:
[…] Faster JavaScript Trim. Neat optimisation post–it turns out that while regular expressions are great for removing leading whitespace you can do a lot better at trailing whitespace by manually looping backwards from the end of the string. […]
Comment by Sean on 24 February 2008:
What about something like this?
String.prototype.trim = function(){
var s = /\s*([\S+\s*]*\S+)+\s*/i.exec(this);
return (!s) ? '' : s[1];
}
Pingback by Well Read & Misinformed» String.prototype Fun! on 24 February 2008:
[…] note: I found this nifty site that compares a variety of String.trim methods ( I wonder how mine compares ), it’s worth reading : Faster JavaScript Trim […]
Comment by Steve on 26 February 2008:
@Sean, I think you fundamentally misunderstand how your regular expression works. For starters, you should read up on regex character class syntax. With all whitespace, your regular expression manages to be orders of magnitude worse performing than every other regular expression on this page, including the one that was so bad I had to leave it out of the speed tests.
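On the character class point: inside square brackets, + and * are literal characters rather than quantifiers, and a class containing both \S and \s matches any character at all. A quick check, using throwaway sample strings:

```javascript
// Inside a character class, + and * lose their quantifier meaning.
console.log(/^[a+]$/.test('+'));   // true: "+" is matched literally
console.log(/^[a+]$/.test('aa'));  // false: the class still matches exactly one character

// [\S+\s*] therefore means "non-whitespace, or +, or whitespace, or *",
// which covers every possible character.
var anyChar = /^[\S+\s*]$/;
console.log(anyChar.test('x'));    // true
console.log(anyChar.test(' '));    // true
console.log(anyChar.test('\n'));   // true
```

Because that class can match anything, the quantifiers wrapped around it give the engine many overlapping ways to try the same text before it can give up.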
Comment by Luca Guidi on 28 May 2008:
function trim13 (str) {
var ws = /\s/, _start = 0, end = str.length;
while(ws.test(str.charAt(_start++)));
while(ws.test(str.charAt(--end)));
return str.slice(_start - 1, end + 1);
}
You can find the benchmark results on my related post:
http://lucaguidi.com/2008/5/28/faster-javascript-trim
Comment by Steven Levithan on 28 May 2008:
@Luca Guidi, very nice list of browser results. However, the copy of my benchmark page you’re providing changes the test in a subtle but important way: it removes all the leading whitespace, and all but the very last trailing whitespace character. That means the test methods in your while loops run fewer times.
The more leading whitespace you have, the slower that testing each character will be when compared to removing it all in one shot with replace. It’s not accurate to claim (without qualifiers) that your implementation is “the fastest one,” because that doesn’t hold true with certain types of subject data. According to my brief test in Firefox 2.0.0.14, with enough leading whitespace your implementation becomes much slower than all the others.
Comment by Luca Guidi on 28 May 2008:
@Steve, I’m very sorry for this, I just downloaded your page and added my version of the trim function.
I didn’t notice the missing whitespace, maybe the browser stripped it.
I’ll run again all the test suite, then update the results.
Thanks for the hint!
Comment by Luca Guidi on 29 May 2008:
I updated test results on my post.
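The effect discussed in this exchange depends on the shape of the input, so a fair comparison needs test strings with different amounts of leading whitespace. The sketch below is one way to set that up; makeInput, loopTrim, and regexTrim are hypothetical names, and the sizes are arbitrary.

```javascript
// Hypothetical test-data builder: `lead` and `trail` spaces around a fixed body.
function makeInput(lead, trail) {
  var body = new Array(1001).join('lorem ipsum ') + 'end';
  return new Array(lead + 1).join(' ') + body + new Array(trail + 1).join(' ');
}

// Per-character test() loops, in the style of the loop-based trims in this thread.
function loopTrim(str) {
  var ws = /\s/, start = 0, end = str.length;
  while (start < end && ws.test(str.charAt(start))) start++;
  while (end > start && ws.test(str.charAt(end - 1))) end--;
  return str.slice(start, end);
}

// Two anchored replaces, as in trim1.
function regexTrim(str) {
  return str.replace(/^\s\s*/, '').replace(/\s\s*$/, '');
}

// Run fn on input `iterations` times and return the elapsed milliseconds.
function time(fn, input, iterations) {
  var t0 = new Date().getTime();
  for (var i = 0; i < iterations; i++) fn(input);
  return new Date().getTime() - t0;
}

var inputs = {
  'little leading whitespace': makeInput(2, 2),
  'lots of leading whitespace': makeInput(20000, 2)
};
for (var label in inputs) {
  console.log(label + ': loop ' + time(loopTrim, inputs[label], 20) +
              ' ms, regex ' + time(regexTrim, inputs[label], 20) + ' ms');
}
```

Whichever approach wins on one input may lose on the other, which is why results from a single test string should be quoted with qualifiers.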
Comment by Alejandro Lapeyre on 28 June 2008:
I liked trim10!
I was going to mention the i >= 0 in the second loop.
There is no bug in i > 0, it is how the algorithm was thought. You are now checking twice for character (0).
Thanks for this beautiful page!
Comment by Ryan Nauman on 23 July 2008:
I’ve found that your hybrid solution, along with a few of the others, is flawed in IE6. A single space ' ' will not be trimmed in IE6 but will in FF3. I’m a little stumped right now so I don’t have any fixes for you. I can’t see why the first replace, str.replace(/^\s\s*/, ''), isn’t picking it up in IE6.
Comment by Ryan Nauman on 23 July 2008:
I lied. It isn’t a single space that’s the problem; it’s because I have the HTML entity,
Pingback by O2 Blog » Les chaînes de caractères on 4 August 2008:
[…] so let’s simplify this task once and for all. The method used was taken from Flagrant Badassery. String.prototype.trim = function ( str ) { return str.replace(/^ss*/, […]
Comment by GOVO on 1 September 2008:
Hi, I have built a page to test the methods above, just for fun. I hope you’ll like it. Thanks.
http://guitarbean.com/topic/javascript-trim/
Comment by Phillippe on 7 September 2008:
Hi, folks!
This is a very good work, thanks for sharing! I’ve just tweaked it a little, so that you can call the trim function as if it were a member of String. You just have to add this:
String.prototype.trim = function()
{
// Declare everything in one var statement so ws and i do not leak as globals.
var str = this.replace(/^\s\s*/, ''),
ws = /\s/,
i = str.length;
while (ws.test(str.charAt(--i)));
return str.slice(0, i + 1);
}
This way, you can do like:
var myString = " bla bla bla ";
alert(myString.trim());
Comment by bryanayrb on 17 October 2008:
you’re like a hacker personality, steve..
phillipe -> your benchmark is great..
Comment by blr on 21 October 2008:
Can you try this method :
function trim(str) {
return str.replace(/^\s*(\b.*\b|)\s*$/, "$1");
}
I found this method in Rialto framework.
Thanks,
Comment by Michael Lee Finney on 2 November 2008:
I tested my version of trim() and found that it is as much as 3 times faster than the code here.
String.whiteSpace =
{
"\u0009" : true, "\u000a" : true, "\u000b" : true, "\u000c" : true, "\u000d" : true, "\u0020" : true, "\u0085" : true,
"\u00a0" : true, "\u1680" : true, "\u180e" : true, "\u2000" : true, "\u2001" : true, "\u2002" : true, "\u2003" : true,
"\u2004" : true, "\u2005" : true, "\u2006" : true, "\u2007" : true, "\u2008" : true, "\u2009" : true, "\u200a" : true,
"\u200b" : true, "\u2028" : true, "\u2029" : true, "\u202f" : true, "\u205f" : true, "\u3000" : true
};
/*
* Trim spaces from a string on the left and right.
*/
trim13 = function(str)
{
str = String(str);
var n = str.length;
var s;
var i;
if (!n)
return str;
s = String.whiteSpace;
i = 0;
if (n && s[str.charAt(n-1)])
{
do
{
--n;
}
while (n && s[str.charAt(n-1)]);
if (n && s[str.charAt(0)])
do
{
++i;
}
while (i < n && s[str.charAt(i)]);
return str.substring(i, n);
}
if (n && s[str.charAt(0)])
{
do
{
++i;
}
while (i < n && s[str.charAt(i)]);
return str.substring(i, n);
}
return str;
};
Comment by Michael Lee Finney on 2 November 2008:
I have significantly improved the previous version. I also added tests for strings with no leading / trailing spaces because it is extremely common to trim strings that do not need trimming. The improved version is faster than trim1 – trim12 on all tested browsers (chrome, ie 6,7,8, ff 2,3,3b, opera 9.62, safari 3.1.2) for all test cases except that FF 3x browsers sometimes are slightly faster with a regexp when spaces have to be trimmed. However, it is not always the SAME regexp. In many cases, the speedup is as much as a factor of 20 over the regexp methods.
String.whiteSpace = [];
String.whiteSpace[0x0009] = true;
String.whiteSpace[0x000a] = true;
String.whiteSpace[0x000b] = true;
String.whiteSpace[0x000c] = true;
String.whiteSpace[0x000d] = true;
String.whiteSpace[0x0020] = true;
String.whiteSpace[0x0085] = true;
String.whiteSpace[0x00a0] = true;
String.whiteSpace[0x1680] = true;
String.whiteSpace[0x180e] = true;
String.whiteSpace[0x2000] = true;
String.whiteSpace[0x2001] = true;
String.whiteSpace[0x2002] = true;
String.whiteSpace[0x2003] = true;
String.whiteSpace[0x2004] = true;
String.whiteSpace[0x2005] = true;
String.whiteSpace[0x2006] = true;
String.whiteSpace[0x2007] = true;
String.whiteSpace[0x2008] = true;
String.whiteSpace[0x2009] = true;
String.whiteSpace[0x200a] = true;
String.whiteSpace[0x200b] = true;
String.whiteSpace[0x2028] = true;
String.whiteSpace[0x2029] = true;
String.whiteSpace[0x202f] = true;
String.whiteSpace[0x205f] = true;
String.whiteSpace[0x3000] = true;
/*
* Trim spaces from a string on the left and right.
*/
trim13 = function(str)
{
var n = str.length;
var s;
var i;
if (!n)
return str;
s = String.whiteSpace;
if (n && s[str.charCodeAt(n-1)])
{
do
{
--n;
}
while (n && s[str.charCodeAt(n-1)]);
if (n && s[str.charCodeAt(0)])
{
i = 1;
while (i < n && s[str.charCodeAt(i)])
++i;
}
return str.substring(i, n);
}
if (n && s[str.charCodeAt(0)])
{
i = 1;
while (i < n && s[str.charCodeAt(i)])
++i;
return str.substring(i, n);
}
return str;
};
Comment by Ariel Flesler on 3 November 2008:
I posted a follow up on this good ol’ thread:
http://flesler.blogspot.com/2008/11/fast-trim-function-for-javascript.html
Pingback by Javascript string trim function implementations and comparison « Hao’s Blog on 9 November 2008:
[…] Javascript string trim function implementations and comparison https://blog.stevenlevithan.com/archives/faster-trim-javascript […]
Comment by blr on 25 November 2008:
Hi,
Did you try the trim function of Firefox 3.1?
It’s a non-standard function …
https://developer.mozilla.org/en/Firefox_3.1_for_developers
in the Javascript section.
Comment by Devis on 5 December 2008:
Impressive! Good work Steve
Pingback by Trim functions | keyongtech on 18 January 2009:
[…] functions There is an interesting examination of various trim functions here: <URL: https://blog.stevenlevithan.com/archi…rim-javascript > The best all-round function seems to be trim3 when strings have lots of leading and trailing […]
Comment by karol on 29 January 2009:
Because of the regexp internals, the fastest trim is /^[^\S]+/ and /^[^\S]+$/, trust me.
Comment by karol on 29 January 2009:
The second should be of course /[^\S]+$/. They both work in O(n), no wasteful backtracking.
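For what it’s worth, [^\S] is a negated class of non-whitespace, which matches the same characters as \s, so these patterns should behave like the \s+ versions; any speed difference would come down to how a given engine compiles them. A small check, using made-up sample strings and helper names:

```javascript
// Hypothetical helpers comparing the [^\S] spelling with plain \s.
function trimNegated(str) {
  return str.replace(/^[^\S]+/, '').replace(/[^\S]+$/, '');
}
function trimPlain(str) {
  return str.replace(/^\s+/, '').replace(/\s+$/, '');
}

var samples = ['  a b  ', '\t\nx', 'y \r\n', 'no-padding', '   ', ''];
for (var i = 0; i < samples.length; i++) {
  var s = samples[i];
  // Prints each sample followed by true when both spellings agree.
  console.log(JSON.stringify(s), trimNegated(s) === trimPlain(s));
}
```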
Pingback by » trim javascript faster than the Magna Carta | breaker of stuff, destroyer of things on 5 February 2009:
[…] blog.stevenlevithan.com/archives/faster-trim-javascript Since JavaScript doesn’t include a trim method natively, it’s included by countless […]
Comment by Kristhian on 10 March 2009:
Thanks man! great work!
Comment by Tom on 27 March 2009:
function trimstr(str)
{
var whitespace= " \n\r\t\f";
for( var i= 0; i < str.length; i++ )
if( whitespace.indexOf( str.charAt(i) ) < 0 )
break;
for( var j= str.length - 1; j >= i; j-- )
if( whitespace.indexOf( str.charAt(j) ) < 0 )
break;
return str.substring( i,j+1 );
}
Pingback by John Resig - ECMAScript 5 Strict Mode, JSON, and More on 22 May 2009:
[…] Levithan has discussed the trim method in great […]
Pingback by Extremely fast String.prototype.trim() implementation in JavaScript. « o.O on 30 July 2009:
[…] are well-crafted, but I think plain old loops can do better. Back in 2007, Steve Levithan covered some really fast implementations of trim() followed by a few more articles by others, including Luca […]
Comment by Yesudeep Mangalapilly on 30 July 2009:
Here’s an even faster implementation of trim() based on Michael Finney’s lookup table.
function trim(str){
var len = str.length;
if (len){
var whiteSpace = String.whiteSpace;
while (whiteSpace[str.charCodeAt(--len)]);
if (++len){
var i = 0;
while (whiteSpace[str.charCodeAt(i)]){ ++i; }
}
str = str.substring(i, len);
}
return str;
}
Pingback by Benjamin A. Shelton | Blog » Blog Archive » Links: August 5th on 5 August 2009:
[…] Levithan (what is it with cool-sounding names today?) covered in 2007 several issues relating to various trim() implementations using regex in JavaScript, why they work well (or […]
Comment by Deepak on 26 August 2009:
WOW thats a nice piece of info… i used the first one… and its working fine… thanks for the snippet…
Comment by Oz on 10 September 2009:
Thanks for the 411! I’m using the 1st on my project.
Pingback by Fast Trim Function for Javascript W3C Tag on 20 October 2009:
[…] Steven Levithan’s old post about string […]
Comment by Elmer Carandang on 23 October 2009:
Steve,
I seem to have some problem with HTMLEncoding since function trim12(str) returns values like:
<img display="image.gif" />
instead of what I expect which is:
Over other implementations, your function is working extremely well!!!
Comment by Benjamin Smith on 28 October 2009:
Trim? TRIM!?!?
Here it is, the 21st Century, and we’re talking about implementations of TRIM for g–sakes! Why isn’t such a boneheadedly simple, useful tool part of the JS language itself? It’s like getting excited over the best way to implement screen – been around since what? 1970s?
Javascript is “the new way” but its shortcomings are soooo painful.
Pingback by Trimming trim by razing arrays (JavaScript) | Out Of What Box? on 4 November 2009:
[…] in 2007, Steve Levithan compared the speed of different implementations for the missing JavaScript String.trim() function. Steve’s blog […]
Comment by Andrew Garrett on 26 November 2009:
This looks suspiciously like premature optimisation, I’m not sure how many cases there are where trim() is really performance-critical.
Comment by RobG on 16 December 2009:
Seems this won’t be needed in the future as ECMAScript Ed. 5 includes String.prototype.trim. And of course all browser vendors will rush to implement it immediately. 🙂
Otherwise, an interesting post. It may appear to be premature optimisation, however it shows that while some algorithms are very fast in some browsers, they may also be very slow in others. It’s good to use an algorithm that is reasonably fast in a representative cross-section.
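With a native String.prototype.trim on the way, a library that adds its own can check for the native method first. A minimal sketch, assuming the two-regex approach (trim1) from the post as the fallback; the name fallbackTrim is made up here:

```javascript
// Fallback based on the two-regex approach (trim1) from the post.
// The name fallbackTrim is hypothetical.
function fallbackTrim(str) {
  return String(str).replace(/^\s\s*/, '').replace(/\s\s*$/, '');
}

// Only define String.prototype.trim when the engine does not already provide one.
if (typeof String.prototype.trim !== 'function') {
  String.prototype.trim = function () {
    return fallbackTrim(this);
  };
}

console.log('[' + '  hello world  '.trim() + ']'); // [hello world]
```

A native method and a regex fallback may not agree on every Unicode whitespace character in older engines, so results can differ at the edges.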
Pingback by Elixir » Blog Archive » Javascript Trim on 24 December 2009:
[…] Credit. I made it available to the String prototype. Read […]
Comment by Dan on 3 January 2010:
I read this article some years ago and found it very useful, I will use trim12 because of shorter code and still good performance, thanks.