JavaScript split Inconsistencies & Bugs: Fixed!
The String.prototype.split method is very handy, so it's a shame that when you use a regular expression as its delimiter, the results differ so wildly across browsers that odds are you've just introduced bugs into your code (unless you know precisely what kind of data you're working with and are able to avoid the issues). Here's one example of other people venting about the problems. These are the cross-browser inconsistencies when using regexes with split:
- Internet Explorer excludes almost all empty values from the resulting array (e.g., when two delimiters appear next to each other in the data, or when a delimiter appears at the start or end of the data). This doesn't make any sense to me, since IE does include empty values when using a string as the delimiter.
- Internet Explorer and Safari do not follow the ECMA standard which requires that the values of capturing parentheses be spliced into the returned array (this functionality can be useful with simple parsers, etc.).
- Firefox does not follow the ECMA standard of splicing undefined values into the returned array to represent non-participating capturing parentheses (as opposed to grouping-only parentheses and parentheses which participate in the match but capture nothing).
- All of the browsers I've tested (IE, Firefox, Safari, Opera, and Swift) have their own corner-case bugs where they do not follow the ECMA split spec (which is actually quite complex).
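To make the list above concrete, here is a small sketch of what a spec-conforming split should return for each kind of case. The variable names and sample strings are illustrative only and are not taken from the test page.

```javascript
// Expected results per the ECMAScript split algorithm.
// Variable names and inputs are illustrative assumptions.

// Empty values are kept between adjacent delimiters and at both ends.
var adjacent = "a,,b".split(/,/);      // ["a", "", "b"]
var edges = ",a,".split(/,/);          // ["", "a", ""]

// Text matched by capturing parentheses is spliced into the result.
var withDelims = "1:2:3".split(/(:)/); // ["1", ":", "2", ":", "3"]

// A capturing group that does not participate in the match
// contributes undefined rather than being skipped.
var nonParticipating = "A-B".split(/(-)|(\+)/); // ["A", "-", undefined, "B"]
```

A browser with the problems described above would drop the empty strings, the captured delimiters, or the undefined entry.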
The situation is so bad (particularly as a result of item one, above) that I've simply avoided using regex-based splitting in the past.
That ends now.
The following script provides a fast, uniform cross-browser implementation of String.prototype.split, and attempts to precisely follow the relevant spec (ECMA-262v3 §15.5.4.14, pp.103,104).
I've also created a (fairly quick and dirty) page where you can test the results of about 55 usages of the split method, and quickly compare your browser's results with the correct implementation.
The above screenshot shows the results from Safari 3.0.1 beta for Windows (I've used Safari here simply because it's pretty). (Update 2007-04-08: Here's a screenshot from Safari 3.1, showing that some of the old bugs have been fixed, and some new ones have taken their place.) The pink lines in the third column highlight incorrect results from the native split method. The rightmost column shows the results of the below script. It's all green in every browser I've tested (IE 5.5 – 7, Firefox 2.0.0.4, Opera 9.21, Safari 3.0.1 beta, and Swift 0.2).
Here's the script:
/*
    Cross-Browser Split 0.2.1
    By Steven Levithan <http://stevenlevithan.com>
    MIT license
*/

var nativeSplit = nativeSplit || String.prototype.split;

String.prototype.split = function (s /* separator */, limit) {
    // If separator is not a regex, use the native split method
    if (!(s instanceof RegExp))
        return nativeSplit.apply(this, arguments);

    /* Behavior for limit: If it's...
     - Undefined: No limit
     - NaN or zero: Return an empty array
     - A positive number: Use limit after dropping any decimal
     - A negative number: No limit
     - Other: Type-convert, then use the above rules */
    if (limit === undefined || +limit < 0) {
        limit = false;
    } else {
        limit = Math.floor(+limit);
        if (!limit)
            return [];
    }

    var flags = (s.global ? "g" : "") + (s.ignoreCase ? "i" : "") + (s.multiline ? "m" : ""),
        s2 = new RegExp("^" + s.source + "$", flags),
        output = [],
        lastLastIndex = 0,
        i = 0,
        match;

    if (!s.global)
        s = new RegExp(s.source, "g" + flags);

    while ((!limit || i++ <= limit) && (match = s.exec(this))) {
        var zeroLengthMatch = !match[0].length;

        // Fix IE's infinite-loop-resistant but incorrect lastIndex
        if (zeroLengthMatch && s.lastIndex > match.index)
            s.lastIndex = match.index; // The same as s.lastIndex--

        if (s.lastIndex > lastLastIndex) {
            // Fix browsers whose exec methods don't consistently return undefined for non-participating capturing groups
            if (match.length > 1) {
                match[0].replace(s2, function () {
                    for (var j = 1; j < arguments.length - 2; j++) {
                        if (arguments[j] === undefined)
                            match[j] = undefined;
                    }
                });
            }

            output = output.concat(this.slice(lastLastIndex, match.index), (match.index === this.length ? [] : match.slice(1)));
            lastLastIndex = s.lastIndex;
        }

        if (zeroLengthMatch)
            s.lastIndex++;
    }

    return (lastLastIndex === this.length) ?
        (s.test("") ? output : output.concat("")) :
        (limit ? output : output.concat(this.slice(lastLastIndex)));
};
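The limit rules listed in the script's comment can be easier to follow when pulled out on their own. The sketch below restates them as a standalone function. The name normalizeLimit and its return convention (Infinity for "no limit", 0 for "return an empty array") are assumptions made for illustration and are not part of the script.

```javascript
// Hypothetical helper that mirrors the limit rules in the script's comment.
// Returns Infinity for "no limit", 0 for "return an empty array",
// and a positive integer otherwise.
function normalizeLimit(limit) {
    // Undefined or negative: no limit
    if (limit === undefined || +limit < 0) {
        return Infinity;
    }
    // Other values: type-convert, then drop any decimal
    var n = Math.floor(+limit);
    // NaN and zero are both falsy, so both collapse to 0
    return n ? n : 0;
}
```

For instance, normalizeLimit(2.7) gives 2, normalizeLimit("3") gives 3, and normalizeLimit(NaN) gives 0, the same outcomes the script reaches inline with its false-or-number convention.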
Please let me know if you find any problems. Thanks!
Update: This script has been incorporated into my XRegExp library, which includes many other JavaScript regular expression goodies and cross-browser compatibility fixes.

Comment by carmen on 9 July 2007:
yeah so i found your page after wondering why tf a very simple regex was returning different results in Opera and Mozilla - i thought i was going insane, until finding posts like this on your blog - when i see stuff like “runs on cmucl, allegro, sbcl, LispWorks, OpenMCL”, i wonder…what did the LISP guys do that browser guys have such trouble with..
Comment by Brian on 31 July 2007:
Thanks. I used your script and it saved me a huge headache with IE not treating splits like other browsers do. This script was very well done, and I liked your validation page, also very useful.
Pingback by Grabbing Code by the 00’s on 31 July 2007:
[…] Long story short, if you’re running into problems with your split method in any browser, chances are this script fixes it […]
Comment by Marcel on 10 August 2007:
Hey this code is great, excellent job! I made some optimizations for my particular use cases because I was worried about performance using this implementation versus the native one. One thing I do often is split many (hundreds or thousands) strings with the same RegExp object and your code is reconstructing separator up to two times per split. To make this faster I added some object caching on the separator parameter so it will only reconstruct the regex the first time you split with it. Also, since cross-browser behavior with string separators is consistent I just made it use the native implementation if separator isn’t an instance of RegExp. It still passes your test page with flying colors, though I only tested Firefox 2, Safari 3, and IE6. Drop me a line if you’d like to check out the changes and possibly absorb them into your copy.
Comment by Steve on 11 August 2007:
Marcel, I’m interested. I’d already planned to change this to use the native split method for non-regex separators if I ever got around to updating it. As for caching to avoid regex recompilation, some browsers might do that automagically, so I’d be interested in testing exactly how it affects each of the major browsers before making such a change. Finally, I believe my script might fail the test page in KHTML (as opposed to WebKit) -based browsers such as Konqueror. If that’s the case, I’d want to look into how to address that (if at all possible) before re-releasing. I’ll send you an email.
Comment by John on 16 August 2007:
Just want to say thanks for the script, it works perfectly.
Comment by Trev on 21 August 2007:
Thank you, just, thank you.
Comment by Luke on 12 September 2007:
Thank you very much for this work. This keeps my simple cross browser project simple.
Comment by Mike Cowan on 27 September 2007:
Dude,
You saved my bacon with this one! Been fighting this for a couple of days and ran across your script this morning. Fired it off and BAM! worked the first time with a RegExp that worked great in FireFox but was tanking in IE.
Thanks again.
Comment by Steve on 27 September 2007:
I’m happy to hear that this has helped you all!
I’ve just modified the script to use the native split method when non-regex separators are provided, in order to run a little faster in such cases. No other significant changes were made.
Comment by Ariel Flesler on 9 November 2007:
I’d replace:
var nativeSplit = nativeSplit || String.prototype.split;
with
String.prototype._split = String.prototype._split || String.prototype.split;
So you don’t pollute the window with globals.
Hope that helps
Comment by Steve on 9 November 2007:
@Ariel Flesler:
Moving the namespace pollution from the window object to the String.prototype object (which is also available globally) makes things worse, IMO. And while you could wrap all of the code in an anonymous function to avoid adding any global variables, I think there is some benefit to keeping the native version available to other code, for testing purposes if nothing else. As for the name “_split”, I intentionally avoided that because I think it’s more likely to collide with other libraries which might do something similar.
For the record, the reason I do nativeSplit = nativeSplit || String.prototype.split instead of just nativeSplit = String.prototype.split is because otherwise, running the code twice would break the reference to the native global.
Comment by Julian on 11 December 2007:
Have you considered trying to get this implemented in one of the framework libraries?
Comment by Steve on 11 December 2007:
Well, it’s out there, and MIT licensed. Other libraries are welcome to use it if they’d like to. Incidentally, a slightly modified version of this code will be included in the next version of my XRegExp library.
Comment by Dale Janssen on 2 February 2008:
Can you give some examples of how you call this? It is just not clear how to implement.
Many Thanks
Comment by Steve on 3 February 2008:
@Dale, it just overrides the native split method, so you can use it as simply as something like this:
var numbers = "1:2:3".split(/:/); // -> ["1","2","3"]
or…
var numbers = "1:2:3".split(/(:)/); // -> ["1",":","2",":","3"]
Refer to the Mozilla Developer Center for more info.