JavaScript split Bugs: Fixed!
The String.prototype.split method is very handy, so it's a shame that if you use a regular expression as its delimiter, the results can be so wildly different cross-browser that odds are you've just introduced bugs into your code (unless you know precisely what kind of data you're working with and are able to avoid the issues). Here's one example of other people venting about the problems. Following are the inconsistencies cross-browser when using regexes with split:
- Internet Explorer excludes almost all empty values from the resulting array (e.g., when two delimiters appear next to each other in the data, or when a delimiter appears at the start or end of the data). This doesn't make any sense to me, since IE does include empty values when using a string as the delimiter.
- Internet Explorer and Safari do not splice the values of capturing parentheses into the returned array (this functionality can be useful with simple parsers, etc.)
- Firefox does not splice undefined values into the returned array as the result of non-participating capturing groups.
- Internet Explorer, Firefox, and Safari have various additional edge-case bugs where they do not follow the split specification (which is actually quite complex).
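For reference, the cases in the list above can be sketched as follows. The sample strings and variable names are mine, and the values in the comments assume an engine whose split follows the spec.

```javascript
// Adjacent, leading, and trailing delimiters produce empty strings.
var adjacent = "a,,b".split(/,/);      // ["a", "", "b"]
var edges = ",a,".split(/,/);          // ["", "a", ""]

// Captured text is spliced into the result, which suits simple tokenizers.
var tokens = "3+4*2".split(/([+*])/);  // ["3", "+", "4", "*", "2"]

// A capturing group that takes no part in the match contributes `undefined`.
var npcg = "a-b".split(/(x)?-/);       // ["a", undefined, "b"]
```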
The situation is so bad that I've simply avoided using regex-based splitting in the past.
That ends now.
The following script provides a fast, uniform cross-browser implementation of String.prototype.split, and attempts to precisely follow the relevant spec (ECMA-262 v3 §15.5.4.14, pp.103,104).
I've also created a fairly quick and dirty page where you can test the result of more than 50 usages of JavaScript's split method, and quickly compare your browser's results with the correct implementation. On the test page, the pink lines in the third column highlight incorrect results from the native split method. The rightmost column shows the results of the below script. It's all green in every browser I've tested (IE 5.5 – 7, Firefox 2.0.0.4, Opera 9.21, Safari 3.0.1 beta, and Swift 0.2).
Run the tests in your browser.
Here's the script:
/*!
 * Cross-Browser Split 1.1.1
 * Copyright 2007-2012 Steven Levithan <stevenlevithan.com>
 * Available under the MIT License
 * ECMAScript compliant, uniform cross-browser split method
 */

/**
 * Splits a string into an array of strings using a regex or string separator. Matches of the
 * separator are not included in the result array. However, if `separator` is a regex that contains
 * capturing groups, backreferences are spliced into the result each time `separator` is matched.
 * Fixes browser bugs compared to the native `String.prototype.split` and can be used reliably
 * cross-browser.
 * @param {String} str String to split.
 * @param {RegExp|String} separator Regex or string to use for separating the string.
 * @param {Number} [limit] Maximum number of items to include in the result array.
 * @returns {Array} Array of substrings.
 * @example
 *
 * // Basic use
 * split('a b c d', ' ');
 * // -> ['a', 'b', 'c', 'd']
 *
 * // With limit
 * split('a b c d', ' ', 2);
 * // -> ['a', 'b']
 *
 * // Backreferences in result array
 * split('..word1 word2..', /([a-z]+)(\d+)/i);
 * // -> ['..', 'word', '1', ' ', 'word', '2', '..']
 */
var split;

// Avoid running twice; that would break the `nativeSplit` reference
split = split || function (undef) {

    var nativeSplit = String.prototype.split,
        compliantExecNpcg = /()??/.exec("")[1] === undef, // NPCG: nonparticipating capturing group
        self;

    self = function (str, separator, limit) {
        // If `separator` is not a regex, use `nativeSplit`
        if (Object.prototype.toString.call(separator) !== "[object RegExp]") {
            return nativeSplit.call(str, separator, limit);
        }
        var output = [],
            flags = (separator.ignoreCase ? "i" : "") +
                    (separator.multiline  ? "m" : "") +
                    (separator.extended   ? "x" : "") + // Proposed for ES6
                    (separator.sticky     ? "y" : ""), // Firefox 3+
            lastLastIndex = 0,
            // Make `global` and avoid `lastIndex` issues by working with a copy
            separator = new RegExp(separator.source, flags + "g"),
            separator2, match, lastIndex, lastLength;
        str += ""; // Type-convert
        if (!compliantExecNpcg) {
            // Doesn't need flags gy, but they don't hurt
            separator2 = new RegExp("^" + separator.source + "$(?!\\s)", flags);
        }
        /* Values for `limit`, per the spec:
         * If undefined: 4294967295 // Math.pow(2, 32) - 1
         * If 0, Infinity, or NaN: 0
         * If positive number: limit = Math.floor(limit); if (limit > 4294967295) limit -= 4294967296;
         * If negative number: 4294967296 - Math.floor(Math.abs(limit))
         * If other: Type-convert, then use the above rules
         */
        limit = limit === undef ?
            -1 >>> 0 : // Math.pow(2, 32) - 1
            limit >>> 0; // ToUint32(limit)
        while (match = separator.exec(str)) {
            // `separator.lastIndex` is not reliable cross-browser
            lastIndex = match.index + match[0].length;
            if (lastIndex > lastLastIndex) {
                output.push(str.slice(lastLastIndex, match.index));
                // Fix browsers whose `exec` methods don't consistently return `undefined` for
                // nonparticipating capturing groups
                if (!compliantExecNpcg && match.length > 1) {
                    match[0].replace(separator2, function () {
                        for (var i = 1; i < arguments.length - 2; i++) {
                            if (arguments[i] === undef) {
                                match[i] = undef;
                            }
                        }
                    });
                }
                if (match.length > 1 && match.index < str.length) {
                    Array.prototype.push.apply(output, match.slice(1));
                }
                lastLength = match[0].length;
                lastLastIndex = lastIndex;
                if (output.length >= limit) {
                    break;
                }
            }
            if (separator.lastIndex === match.index) {
                separator.lastIndex++; // Avoid an infinite loop
            }
        }
        if (lastLastIndex === str.length) {
            if (lastLength || !separator.test("")) {
                output.push("");
            }
        } else {
            output.push(str.slice(lastLastIndex));
        }
        return output.length > limit ? output.slice(0, limit) : output;
    };

    // For convenience
    String.prototype.split = function (separator, limit) {
        return self(this, separator, limit);
    };

    return self;

}();
Please let me know if you find any problems. Thanks!
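The script's handling of the limit argument leans on the unsigned right shift operator, which applies the ToUint32 conversion. Here is a minimal sketch of what that conversion yields for a few inputs; toLimit and the limits object are names I made up for illustration and are not part of the script.

```javascript
// Hypothetical helper mirroring the script's `limit` expression.
function toLimit(limit) {
    return limit === undefined ? -1 >>> 0 : limit >>> 0;
}

var limits = {
    omitted: toLimit(undefined),             // 4294967295
    zero: toLimit(0),                        // 0
    infinite: toLimit(Infinity),             // 0
    notANumber: toLimit(NaN),                // 0
    fractional: toLimit(2.7),                // 2
    tooLarge: toLimit(Math.pow(2, 32) + 1),  // 1
    negative: toLimit(-2),                   // 4294967294
    numericString: toLimit("3")              // 3
};
```

Note that very large or infinite limits wrap around to small values instead of meaning "no limit".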
Update: This script has become part of my XRegExp library, which includes many other JavaScript regular expression cross-browser compatibility fixes.
Comment by carmen on 9 July 2007:
yeah so i found your page after wondering why tf a very simple regex was returning different results in Opera and Mozilla – i thought i was going insane, until finding posts like this on your blog – when i see stuff like “runs on cmucl, allegro, sbcl, LispWorks, OpenMCL”, i wonder…what did the LISP guys do that browser guys have such trouble with..
Comment by Brian on 31 July 2007:
Thanks. I used your script and it saved me a huge headache with IE not treating splits like other browsers do. This script was very well done, and I liked your validation page, also very useful.
Pingback by Grabbing Code by the 00’s on 31 July 2007:
[…] Long story short, if you’re running into problems with your split method in any browser, chances are this script fixes it […]
Comment by Marcel on 10 August 2007:
Hey this code is great, excellent job! I made some optimizations for my particular use cases because I was worried about performance using this implementation versus the native one. One thing I do often is split many (hundreds or thousands) strings with the same RegExp object and your code is reconstructing separator up to two times per split. To make this faster I added some object caching on the separator parameter so it will only reconstruct the regex the first time you split with it. Also, since cross-browser behavior with string separators is consistent I just made it use the native implementation if separator isn’t an instance of RegExp. It still passes your test page with flying colors, though I only tested Firefox 2, Safari 3, and IE6. Drop me a line if you’d like to check out the changes and possibly absorb them into your copy.
Comment by Steve on 11 August 2007:
Marcel, I’m interested. I’d already planned to change this to use the native split method for non-regex separators if I ever got around to updating it. As for caching to avoid regex recompilation, some browsers might do that automagically, so I’d be interested in testing exactly how it affects each of the major browsers before making such a change. Finally, I believe my script might fail the test page in KHTML (as opposed to WebKit) -based browsers such as Konqueror. If that’s the case, I’d want to look into how to address that (if at all possible) before re-releasing. I’ll send you an email.
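The caching idea discussed in the two comments above might look something like the sketch below. This is not Marcel's code, which isn't shown here; getGlobalCopy and globalCopyCache are hypothetical names, and the cache is keyed on the regex source and flags as one possible approach.

```javascript
// Hypothetical cache of global copies, keyed by "/source/flags".
var globalCopyCache = {};

function getGlobalCopy(separator) {
    var flags = (separator.ignoreCase ? "i" : "") +
                (separator.multiline ? "m" : "");
    var key = "/" + separator.source + "/" + flags;
    var copy = globalCopyCache[key];
    if (!copy) {
        // Build the global copy only the first time this pattern is seen.
        copy = globalCopyCache[key] = new RegExp(separator.source, flags + "g");
    }
    copy.lastIndex = 0; // A reused global regex keeps state between calls
    return copy;
}

var colonCopy = getGlobalCopy(/:/);
var sameCopy = getGlobalCopy(/:/); // Returns the cached object
```

Whether this beats an engine's own regex caching would need measuring per browser, as the reply above points out.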
Comment by John on 16 August 2007:
Just want to say thanks for the script, it works perfectly.
Comment by Trev on 21 August 2007:
Thank you, just, thank you.
Comment by Luke on 12 September 2007:
Thank you very much for this work. This keeps my simple cross browser project simple.
Comment by Mike Cowan on 27 September 2007:
Dude,
You saved my bacon with this one! Been fighting this for a couple of days and ran across your script this morning. Fired it off and BAM! worked the first time with a RegExp that worked great in FireFox but was tanking in IE.
Thanks again.
Comment by Steve on 27 September 2007:
I’m happy to hear that this has helped you all!
I’ve just modified the script to use the native split method when non-regex separators are provided, in order to run a little faster in such cases. No other significant changes were made.
Comment by Ariel Flesler on 9 November 2007:
I’d replace:
var nativeSplit = nativeSplit || String.prototype.split;
for
String.prototype._split = String.prototype._split || String.prototype.split;
So you don’t pollute the window with globals..
Hope that helps
Comment by Steve on 9 November 2007:
@Ariel Flesler:
Moving the namespace pollution from the window object to the String.prototype object (which is also available globally) makes things worse, IMO. And while you could wrap all of the code in an anonymous function to avoid adding any global variables, I think there is some benefit to keeping the native version available to other code, for testing purposes if nothing else. As for the name “_split”, I intentionally avoided that because I think it’s more likely to collide with other libraries which might do something similar.
For the record, the reason I do nativeSplit = nativeSplit || String.prototype.split instead of just nativeSplit = String.prototype.split is because otherwise, running the code twice would break the reference to the native global.
Comment by Julian on 11 December 2007:
Have you considered trying to get this implemented in one of the framework libraries?
Comment by Steve on 11 December 2007:
Well, it’s out there, and MIT licensed. Other libraries are welcome to use it if they’d like to. Incidentally, a slightly modified version of this code will be included in the next version of my XRegExp library.
Comment by Dale Janssen on 2 February 2008:
Can you give some examples of how you call this? It is just not clear how to implement.
Many Thanks
Comment by Steve on 3 February 2008:
@Dale, it just overrides the native split method, so you can use it as simply as something like this:
var numbers = "1:2:3".split(/:/);
// -> ["1","2","3"]
or…
var numbers = "1:2:3".split(/(:)/);
// -> ["1",":","2",":","3"]
Refer to the Mozilla Developer Center for more info.
Pingback by JavaScript split problems « News from MathTran on 19 June 2008:
[…] expressions as delimiters. Other people have had this problem before and Steven Levithan has a nice article about the topic including a script which fixes the inconsistencies between the different browsers, […]
Comment by Tim Lavelle on 1 July 2008:
Afternoon Steve,
Thanks for this great script! Spent a few hours trying to get a split and regex working properly… after many google searches I found this site. I linked your script, used the proper syntax and *POOF it worked precisely as I needed it to!
Thanks man!!
You dont have a donate box, otherwise I would donate some $$ for your efforts!
Comment by Steven Levithan on 26 July 2008:
@Tim Lavelle, thanks!
I’ve just upgraded this script to v0.3. Hallvord Steen of Opera helped me spot an issue with the previous version. When using String.prototype.split, if the last match of the separator within the subject string ended at the end of the string, and the separator was capable of matching an empty string (e.g. with /a?/), a trailing empty string value was not appended to the result array even when the separator did not match an empty string in that last case. This followed Firefox’s native handling, but not the spec (which at least Opera follows correctly).
The new version of the script fixes this error. I’ve also updated the test page accordingly.
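To make the fixed case concrete, here is a small sketch using the /a?/ separator mentioned above. The sample strings are my own, and the values in the comments assume a spec-compliant split.

```javascript
// The last match ("a") ends at the end of the string and is not empty,
// so a trailing empty string belongs in the result.
var trailing = "ba".split(/a?/);  // ["b", ""]

// The same separator matching at the start yields a leading empty string.
var leading = "ab".split(/a?/);   // ["", "b"]

// Only empty matches here, so nothing is split off.
var none = "b".split(/a?/);       // ["b"]
```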
Comment by Rone on 8 August 2008:
Please forgive me for my ignorance, but how do I use this script and call the function? It looks like it’ll solve all my problems, but I’m missing how to use this to “split” out my variable?
Thanks.
Comment by 19 on 18 September 2008:
i really appreciate it!!
Comment by opensource on 30 October 2008:
I am trying to filter the values from an a textarea and pass it into a new line of array so that it populate the of a box. This works fine on mozilla and IE 7.0 but with lower version of IE it wouldnt work , rather it just selects the first line and rejects the rest.
This is the content of the textarea , I want to make it check for new lines and split with \n then add to an array
9845747594
4545454545
5454656565
6565656566
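One possible approach to the question above, as a sketch: split on any of the common line-ending sequences rather than on \n alone, since a textarea value may contain \r\n or \r depending on the browser. The textareaValue string is an assumed stand-in for the real field value.

```javascript
// Assumed sample value mixing the three line-ending styles.
var textareaValue = "9845747594\r\n4545454545\n5454656565\r6565656566";

// Try \r\n first so a Windows line ending is not split twice.
var lines = textareaValue.split(/\r\n|\r|\n/);
// -> ["9845747594", "4545454545", "5454656565", "6565656566"]
```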
Pingback by Wackylabs.Net : Regex Split bug in JScript? on 19 November 2008:
[…] https://blog.stevenlevithan.com/archives/cross-browser-split […]
Comment by JMJimmy on 31 March 2009:
Please note that this script fails (in IE of course) if the split is done on a ~
It runs native as it’s not an instance of RegEx and adding an exception for ~ causes IE to spit out thousands of splits instead of 2 and causes FireFox’s to fail completely (though it was working fine prior). I’m going to spend some time with it, see if I can’t figure out the problem. I’ll let you know.
Comment by Steven Levithan on 14 April 2009:
@JMJimmy, I can’t reproduce the issue in IE8. Which version of IE are you using? Can you provide a script to reproduce the issue?
As you mentioned, if you split on matches of the string “~” this code will just pass the handling off to the native String.prototype.split. So, if there is a problem, it’s likely an issue that would occur in IE anyway.
Comment by Victor on 24 April 2009:
Thanks, good script!
Why you use concat method of Array? using “push” instead may improve performance a little.
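For comparison, the two ways of appending captured groups can be sketched like this. The version of the script shown in the post uses the push form; the sample arrays below are made up for illustration.

```javascript
var output = ["a"];
// Shaped like an exec() result for /(:)/: the full match, then group 1.
var match = [":", ":"];

// concat builds and returns a new array, leaving `output` untouched.
var viaConcat = output.concat(match.slice(1)); // ["a", ":"]

// push.apply appends to the existing array in place.
Array.prototype.push.apply(output, match.slice(1)); // output is now ["a", ":"]
```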
Comment by Steven Levithan on 3 July 2009:
I’ve just updated this script from version 0.3 to 1.0. The new version includes significant refactoring, and fixes a bug where the limit argument was not always followed consistently.
Comment by David W on 30 August 2009:
I really, truly cannot thank you enough for this. Of all the bullshit we have to put up with in Javascript, rewriting String.split must be up there with the worst. 🙂
Seriously, I owe you a pint, you just saved me a few hours… 🙂
David
Comment by Ken on 10 September 2009:
Carmen, it’s not just us Lisp guys, though we do like to brag about it. Most C libraries will compile on all the major C compilers (GNU, Microsoft, Intel, etc.) as well. And it’s not because of gratuitous CPP macros, either: Plan 9 builds on all platforms without any #if/#ifdef at all (and in fact the native CPP doesn’t even *have* #if).
What’s left to say? People who write web browsers are really creative — they found ways for things to break that nobody in 50 years of computing had thought of. 🙂
Comment by jey350 on 12 November 2009:
Thanks, it’s a must have
Comment by Christopher on 27 November 2009:
Saved my bacon – thanks a million for sharing this with us all. I’ve just finished a little Javascript site – http://www.nathaliemiquel-bijoux.fr – that reads a csv table the owner can modify to update the content, and it worked fine in Firefox, Opera, Safari and Chrome, but didn’t even load in IE. Just linked to your split.js file before mine in the header and it works perfectly everywhere!
Comment by d1m1 on 16 May 2010:
I know this post is old (although the latest update to the code, according to the comments, was almost 1 year ago), but still, this script saved me from spending the rest of the day trying to figure out why my code doesn’t work (and then to find out that it’s IE’s fault). Thanks a lot!
Pingback by Comportament ciudat al functiei JavaScript split in Internet Explorer - kandrei.ro on 21 May 2010:
[…] still, to see whether anyone else had run into this case, and it seems I was not the only unlucky one. In the process I also came across an extension that corrects the behavior of the split function in several […]
Comment by Newbie-me on 19 June 2010:
Great! Great! Great! Your the Man… Thanks a lot
Comment by Clare on 15 July 2010:
Thanks. Great post. It helped me identify why I had a bug in Chrome.
Comment by Johnny on 20 July 2010:
Thanks for the sanity check (and reliable solution).
-j
Comment by Jerry on 5 August 2010:
Excellent solution! Top notch! Instantly solved an issue I was having using split() with Firefox.
Thank you Sir!
Comment by MT on 15 September 2010:
This is awesome. Works great on fixing “split” (which was my immediate issue), and I wonder how many other incompatibilities I’m never even going to see now that I’ve dropped in your script. Thank you!
Comment by Joco on 26 September 2010:
Thanks, thanks. Opera it’s doing the job but IE and others must have this fix.
Thank you.
Comment by Terry S on 15 October 2010:
You rock. That’s all I can say….and thank you!
Comment by jonathan on 11 February 2011:
Seriously this script is amazing – saved me such a headache – simply attached to my document and my reg ex split magically worked in ie – que fist pump and virtual high 5 !
Pingback by Fun with jQuery Templating and AJAX - Tutorial Plus on 28 February 2011:
[…] we can use an excellent JavaScript patch created by Steven Levithan. It can be downloaded from: https://blog.stevenlevithan.com/archives/cross-browser-split and can be included in the page using a conditional comment in the same way that we added the […]
Pingback by ?? IE split ???? | Wang Jun's Blog on 3 March 2011:
[…] ????/(,)/ ?????? , ?????IE???https://blog.stevenlevithan.com/archives/cross-browser-split???????????2???????????????? […]
Pingback by jQuery ??????? ? AJAX | NET Tuts on 14 March 2011:
[…] ???? JavaScript ???????? Steven Levithan. ?? ????? ???? ???????? ?: https://blog.stevenlevithan.com/archives/cross-browser-split ? ????? ???? ???????? ? ???????? ? ??????? ????????? […]
Comment by thanhnv on 2 April 2011:
Thanks for you tip, mate 🙂
Comment by Mark on 14 May 2011:
Thanks for this code — awesome!
You probably already know this, but I note that IE9 produces correct results on every test. Compatibility mode produces the ‘correct’ incorrect results, as well.
Not much use to us while there are still so many non-compliant browsers out there, but kudos to Microsoft where it’s due (for once).
Comment by Jason on 28 July 2011:
Thanks for the posting. It helped me a lot!
Comment by Dave Merrill on 18 August 2011:
Another happy guy with his IE8 mystery solved. Great stuff, too bad it’s needed.
Comment by Dave Merrill on 18 August 2011:
!too (~_~)
Comment by Dave on 13 September 2011:
Wow, I was just running into compatibility issues between IE and chrome (chrome passes all the tests correctly). I was dreading having to write my own split, and you’ve already done it brilliantly. Thank you so much.
Pingback by Javascript Split | Mark Design on 26 September 2011:
[…] JavaScript split Bugs: Fixed! I’ve also created a fairly quick and dirty page where you can test the result of more than 50 usages of JavaScript’s split method, and quickly … […]
Comment by Bora on 30 September 2011:
Do you have fix for Regexp exec method on IE? It returns empty string on a failed match instead of undefined.
Comment by Bora on 30 September 2011:
Never mind XRegExp has solved it all. Thanks for the great regexp tool.
Pingback by Regex Mysterium « am530: Der Blog von Axel Michel on 6 November 2011:
[…] regular expression so different between two methods? In his article JavaScript split Bugs: Fixed!, Steven Levithan describes a whole series of bugs (which do not only affect Internet Explorer), and provides […]
Pingback by Bottle Cap-O-Rama Blog on 15 December 2011:
[…] This script helped deal with empty csv values in IE: […]
Comment by Mike Nelson on 5 January 2012:
Beautiful! Thank you!
Comment by Alex Barron on 30 January 2012:
Thanks for this, it is just what I needed. I was splitting a date range string like so: “my_field:[2012-01-03T00:00:00Z TO 2012-01-24T23:59:59.999Z]” on the first occurrence of a colon with /\:(.*)?/ and was working in non-IE browsers. IE was simply splitting on the first colon and discarding the rest of the string.
cbSplit fixes this and I am eternally grateful.
Alex
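The split described in this comment can be sketched as follows, using the string and regex quoted above. The values in the comments assume a spec-compliant split, and the variable names are mine.

```javascript
var field = "my_field:[2012-01-03T00:00:00Z TO 2012-01-24T23:59:59.999Z]";

// The separator matches the first colon plus everything after it,
// and the capturing group hands the remainder back in the result.
var parts = field.split(/\:(.*)?/);
// parts[0] -> "my_field"
// parts[1] -> "[2012-01-03T00:00:00Z TO 2012-01-24T23:59:59.999Z]"
// parts[2] -> "" (the match ran to the end of the string)
```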
Comment by Tsutomu Kawamura on 11 March 2012:
Thank you for your great job!
I ported it to CoffeeScript.
https://gist.github.com/2015450
Comment by Steven Levithan on 12 March 2012:
@Tsutomu Kawamura, cool. 🙂
I’ve just upgraded this script from v1.0.1 to v1.1. This fixes how the script handles very large numbers provided as the limit argument (e.g., Infinity or Math.pow(2,32)+1). They are now converted to very small numbers, per the spec rules. (Issue reported by Brian O.)
I’ve also changed the function name from cbSplit to split.
Pingback by Javascript split() not working in IE | Easy jQuery | Free Popular Tips Tricks Plugins API Javascript and Themes on 21 May 2012:
[…] [Edit] From Steven Levithan’s Blog: […]
Comment by inirtiems on 29 May 2012:
Who and where to arrange this summer on festival, share your information.
Comment by AS on 20 June 2012:
OK. This works! Great job.
This is the first time I’ve ever encountered a Javascript bug. I’ve dealt with so many cross-browser inconsistencies arising from HTML and CSS that I honestly believed they were the only kind out there. Nary a single Javascript bug.
Now I have to Google “Javascript bugs” and find out all the unpleasantries I’ve been missing out on 🙁
PS: To anybody with a useful site that serves that purpose, please provide a link
Comment by Monir on 21 June 2012:
Thank you dude. good job!!
Comment by Xander on 30 July 2012:
Hi! Thanks for the script.
I have a bug though in IE8. Im doing a background position animation.
It splits background-position in x and y.
— preview —
x = ele.css(‘background-position’),
y = x.split(‘ ‘),
z = parseInt(y[1].replace(/px/, ”))+157;
—
Your script fails in helping me solve this. Do you know why?
Comment by Steven Levithan on 30 July 2012:
@Xander, your example code isn’t helpful in determining what the issue with the split is, if any. All that is relevant is the actual value of the target (the value of your x variable, which is not shown), the separator, the optional limit argument, and the output (again, not shown in your code). Also note that to use the latest version of this function, you need to call the global split function itself, not the native split method of strings (which is not overridden). I.e., you should call split(x, ' '). Finally, note that the split function simply passes to the native String.prototype.split method if the separator is not a regex.
Comment by www.tonitech.com??? on 11 October 2012:
woo???Thank you very much?It works?My program could work in IE?
Comment by Ian on 11 October 2012:
Amazing piece of work!!
Pingback by JavaScript?split???IE????????????? | Tony????? on 11 October 2012:
[…] ??????????????????JavaScript???split???????????????????????????????IE????????????????https://blog.stevenlevithan.com/archives/cross-browser-spli ?????????????? […]
Comment by Jussi on 8 December 2012:
Hello!
I’m getting following issue on IE9:
SCRIPT5007: String.prototype.split: ‘this’ is null or undefined
test.js, line 44 character 13
Line 44 looks like this:
return nativeSplit.call(str, separator, limit);
Works OK on Firefox and Chrome.
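One plausible cause, though the report above does not confirm it, is that the subject passed in was null or undefined: the native method refuses such a this value. A sketch of that behavior and of one possible guard follows; splitOrEmpty is a hypothetical helper, and treating a missing subject as an empty string is my assumption.

```javascript
// Calling the native method with an undefined `this` throws a TypeError.
var threw = false;
try {
    String.prototype.split.call(undefined, ",");
} catch (e) {
    threw = e instanceof TypeError;
}

// Hypothetical guard: treat a missing subject as an empty string.
function splitOrEmpty(str, separator, limit) {
    var subject = (str === null || str === undefined) ? "" : String(str);
    return subject.split(separator, limit);
}

var guarded = splitOrEmpty(undefined, ","); // [""]
```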
Comment by generic on 30 April 2014:
Hello!
Comment by Max White on 25 May 2014:
Thankyou very much. This saved me time. 🙂
Comment by Carlos L on 11 August 2014:
Thank you!!! BTW. The link to download is broken. I just copied the text provided in this page.
Comment by ilmiweb.com on 2 December 2014:
Very Nice. It helped. Thanks.
Pingback by WTFJS: Regex Mysterium - Web and App development - blog on 8 February 2016:
[…] All this because of a(nother) buggy implementation in some browsers. Steven Levithan describes a whole series of errors regarding the split functionality (not only concerning the Internet Explorer) and offers […]
Pingback by Replacing the nth instance of a regex match in Javascript [ANSWERED] - Tech ABC to XYZ on 19 February 2016:
[…] incorrectly splits strings using a regex, as discussed here. [shakes fist at IE7]. I believe that this is the solution; if you need to support IE7, good […]
Comment by Jun Aruga on 25 July 2016:
Hello,
I am contributing https://github.com/lautis/uglifier recently.
May I ask you a question about your split.js?
Could you upload the split.js to your github or another git repository?
Because the module below appears to use split.js, slightly modified.
I want to watch the split.js’s update, and manage your split.js as this module’s submodule.
https://github.com/lautis/uglifier/blob/master/lib/split.js
Thanks.
Best regards,
Jun Aruga
Comment by Pablo on 22 September 2016:
Hi!
Thanks for this, but keep in mind that if this is loaded twice (maybe because some other third-party on the website is actually using it) it would end up in a infinite recursive loop:
Uncaught RangeError: Maximum call stack size exceeded
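One way a double load can recurse forever is when the second run saves the already-patched method as if it were the native one. The toy sketch below shows that mechanism on a plain object instead of String.prototype; makeHost, install, and the saved property are hypothetical names, and this is only a guess at what happened in the case reported above.

```javascript
// Toy stand-in for String.prototype plus a shared `nativeSplit` variable.
function makeHost() {
    return {
        saved: undefined,
        split: function (s) { return "native:" + s; }
    };
}

function install(host, guarded) {
    // Guarded: keep the first saved reference. Unguarded: overwrite it.
    host.saved = guarded ? (host.saved || host.split) : host.split;
    host.split = function (s) {
        return host.saved(s);
    };
}

var unguarded = makeHost();
install(unguarded, false);
install(unguarded, false); // `saved` now points at a wrapper that calls `saved`

var guardedHost = makeHost();
install(guardedHost, true);
install(guardedHost, true); // `saved` still points at the original

var unguardedFailed = false;
try {
    unguarded.split("x");
} catch (e) {
    unguardedFailed = true; // The call stack was exhausted
}
var guardedResult = guardedHost.split("x"); // "native:x"
```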
Pingback by regex - JavaScript: split no funciona en IE? on 18 January 2019:
[…] this page is of use: blog.stevenlevithan.com/archives/cross-browser-split I wonder, are those kinds of idiotic things, that is, are they bugs or features that […]
Pingback by regex - JavaScript: split ne fonctionne pas sous IE? on 18 March 2019:
[…] that this page is of use: blog.stevenlevithan.com/archives/cross-browser-split I wonder, are these kinds of idiotic things in IE bugs or features […]
Pingback by javascript split regex bug in IE7 – Config9.com on 7 August 2019:
[…] this old blog post for a possible solution to the variation in handling of captured groups in .split() […]
Pingback by ???????????????_javascript?? on 21 October 2020:
[…] This is the link: http://blog.stevenlevithan.com/archives/cross-browser-split […]