parseUri 1.2: Split URLs in JavaScript
I've just updated parseUri. If you haven't seen the older version, parseUri is a function which splits any well-formed URI into its parts, all of which are optional. Its combination of accuracy, flexibility, and brevity is unrivaled.
Highlights:
- Comprehensively splits URIs, including splitting the query string into key/value pairs. (Enhanced)
- Two parsing modes: loose and strict. (New)
- Easy to use (returns an object, so you can do, e.g., parseUri(uri).anchor).
- Offers convenient, pre-concatenated components (path = directory and file; authority = userInfo, host, and port; etc.)
- Change the default names of URI parts without editing the function, by updating parseUri.options.key. (New)
- Exceptionally lightweight (1 KB before minification or gzipping).
- Released under the MIT License.
Try the demo, but make sure to come back and read the details below.
Details:
Older versions of this function used what's now called loose parsing mode (which is still the default in this version). Loose mode deviates slightly from the official generic URI spec (RFC 3986), but by doing so allows the function to split URIs in a way that most end users would expect intuitively. Specifically, in loose mode, directories don't need to end with a slash (e.g., the "dir" in "/dir?query" is treated as a directory rather than a file name), and the URI can start with an authority without being preceded by "//" (which means that the "yahoo.com" in "yahoo.com/search/" is treated as the host, rather than part of the directory path). However, these same details preclude loose mode from properly handling relative paths which do not start from root (e.g., "../file.html" or "dir/file.html"). Strict mode, on the other hand, attempts to split URIs according to RFC 3986.
Since I've assumed that most developers will consistently want to use one mode or the other, the parsing mode is not specified as an argument when running parseUri, but rather as a property of the parseUri function itself. Simply run the following line of code to switch to strict mode:
parseUri.options.strictMode = true;
From that point forward, parseUri will work in strict mode (until you turn it back off).
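To make the difference between the two modes concrete, here is a minimal sketch that applies the two regexes from parseUri.options.parser (copied unchanged from the code in this post) to the examples above. The helper name splitWith is made up for this illustration and is not part of parseUri; it simply avoids flipping the global strictMode flag back and forth.

```javascript
// Regexes and key names copied unchanged from parseUri.options.
var parsers = {
    strict: /^(?:([^:\/?#]+):)?(?:\/\/((?:(([^:@]*)(?::([^:@]*))?)?@)?([^:\/?#]*)(?::(\d*))?))?((((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)/,
    loose: /^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/)?((?:(([^:@]*)(?::([^:@]*))?)?@)?([^:\/?#]*)(?::(\d*))?)(((\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/
};
var keys = ["source", "protocol", "authority", "userInfo", "user", "password", "host",
            "port", "relative", "path", "directory", "file", "query", "anchor"];

// Hypothetical helper: split `str` with the given mode ("strict" or "loose").
function splitWith(mode, str) {
    var m = parsers[mode].exec(str), uri = {}, i = keys.length;
    while (i--) uri[keys[i]] = m[i] || "";
    return uri;
}

// "/dir?query": loose treats "dir" as a directory, strict as a file.
var looseDir = splitWith("loose", "/dir?query");    // directory "/dir", file ""
var strictDir = splitWith("strict", "/dir?query");  // directory "/", file "dir"

// "yahoo.com/search/": loose sees a host, strict sees only a relative path.
var looseHost = splitWith("loose", "yahoo.com/search/");    // host "yahoo.com", path "/search/"
var strictHost = splitWith("strict", "yahoo.com/search/");  // host "", directory "yahoo.com/search/"

// "dir/file.html": the relative path that loose mode gets wrong ("dir" becomes the host).
var looseRel = splitWith("loose", "dir/file.html");    // host "dir", file "file.html"
var strictRel = splitWith("strict", "dir/file.html");  // directory "dir/", file "file.html"
```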
The code:
// parseUri 1.2.2
// (c) Steven Levithan <stevenlevithan.com>
// MIT License

function parseUri (str) {
    var o   = parseUri.options,
        m   = o.parser[o.strictMode ? "strict" : "loose"].exec(str),
        uri = {},
        i   = 14;

    while (i--) uri[o.key[i]] = m[i] || "";

    uri[o.q.name] = {};
    uri[o.key[12]].replace(o.q.parser, function ($0, $1, $2) {
        if ($1) uri[o.q.name][$1] = $2;
    });

    return uri;
};

parseUri.options = {
    strictMode: false,
    key: ["source","protocol","authority","userInfo","user","password","host","port","relative","path","directory","file","query","anchor"],
    q: {
        name: "queryKey",
        parser: /(?:^|&)([^&=]*)=?([^&]*)/g
    },
    parser: {
        strict: /^(?:([^:\/?#]+):)?(?:\/\/((?:(([^:@]*)(?::([^:@]*))?)?@)?([^:\/?#]*)(?::(\d*))?))?((((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)/,
        loose: /^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/)?((?:(([^:@]*)(?::([^:@]*))?)?@)?([^:\/?#]*)(?::(\d*))?)(((\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/
    }
};
You can download it or run the test suite.
parseUri has no dependencies, and has been tested in IE 5.5–7, Firefox 2.0.0.4, Opera 9.21, Safari 3.0.1 beta for Windows, and Swift 0.2.


Comment by Iván Montes on 29 June 2007:
Nice function. I miss the ability to get the query string as a list of key/value pairs, with a second argument to specify the parameter separator (with the ampersand by default). It’ll make the code a bit longer but the function would become much more complete.
Comment by Steve on 29 June 2007:
Hi Iván, thanks. parseUri already returns key/value pairs for the query string in an object called queryKey. For example, to access the value of a query key called “search” you could write parseUri(uri).queryKey.search
You can see this in action on the test page, when you click the Parse button. Does that meet your needs, or were you thinking of something different?
Comment by Iván Montes on 29 June 2007:
Sorry Steve, I didn’t notice it. That was exactly what I was talking about!
However, a minor improvement would be to accept as optional second parameter a string which holds in each char a possible argument separator, most people use the ampersand but according to the RFC any char except ‘?’ and ‘#’. See my post http://blog.netxus.es/blog/url-argument-separator
I use myself the semi-colon ‘;’ as argument separator in some projects, since there is no need to escape it when used in XML/XHTML documents.
Comment by Steve on 29 June 2007:
Iván, I have no plans to support arbitrary delimiters in the query string because that would significantly complicate the code for the benefit of probably less than 0.1% of developers, and as you noted in your blog post, server side languages like PHP, ASP.net, etc. generally don’t support delimiters other than “&” without special configuration, if at all.
However, it would be easy to manually change the code to support both “&” and “;” delimiters in your personal copy. Just change
/&?([^&=]*)=?([^&]*)/g from within the uri.query.replace() statement to /[&;]?([^&;=]*)=?([^&;]*)/g (or, to support only semicolons, use /;?([^;=]*)=?([^;]*)/g).
In any case, this does not affect the main URI parsing (only the splitting of the query string into key/value pairs), so you could also implement a separate function to specifically work with the query string with maximum flexibility.
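As a rough sketch of the "separate function" idea, the following standalone helper splits a query string on either "&" or ";". The regex is an adaptation (an assumption, not code from parseUri) of the query regex used in version 1.2, with ";" added to each character class; the name splitQuery is hypothetical.

```javascript
// Query regex from parseUri 1.2, adapted so that ";" also separates pairs.
var semicolonAwareParser = /(?:^|[&;])([^&;=]*)=?([^&;]*)/g;

// Hypothetical helper: returns an object of key/value pairs for a raw query
// string (the part after "?" and before "#").
function splitQuery(query) {
    var pairs = {};
    // replace() is used only to visit every match; its return value is ignored.
    query.replace(semicolonAwareParser, function ($0, $1, $2) {
        if ($1) pairs[$1] = $2;
    });
    return pairs;
}

var mixed = splitQuery("a=1;b=2&c=3");
// mixed.a is "1", mixed.b is "2", mixed.c is "3"
```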
Comment by Iván Montes on 29 June 2007:
Steve,
I get your point and quite agree with it. What about placing the query RE in the options object so it can be easily modified by those with especial needs?
Comment by kangax on 29 June 2007:
wow, that’s one crazy regexp right there…
Comment by Steve on 30 June 2007:
@Iván Montes,
Good call. I’ve gone ahead and moved the query regex as well as the query object’s name into the options object and upped the version number from 1.1 to 1.2. Thanks for the suggestion.
Pingback by links for 2007-07-01 « [[ the sirens of titan ]] on 1 July 2007:
[...] parseUri 1.2: Split URLs in JavaScript easy url parsing (tags: code library parsing programming javascript url web webdev uri) [...]
Pingback by Вот как-то не пишется.. « О PHP и о жизни… on 1 July 2007:
[...] ParseURL in JavaScript – ParseURL in JavaScript
demo [...]
Trackback by dev2 - webfejlesztés on 2 July 2007:
parseUri 1.2: JavaScript URL parser…
We already know PHP's parse_url function. Now let's get to know the same thing for JavaScript.
Steven Levithan: parseUri 1.2: Split URLs in JavaScript
Script:
/* parseUri 1.2; MIT License
By Steven Levithan <http://stevenlevithan.com> */
var pars…
Pingback by links for 2007-07-02 | IndianGeek on 2 July 2007:
[...] parseUri 1.2: Split URLs in JavaScript (tags: javascript parsing programming opensource) [...]
Pingback by Daily misery » Blog Archive » Links for 6.29.2007 through 7.2.2007 on 2 July 2007:
[...] parseUri 1.2: Split URLs in JavaScript [...]
Comment by Thomas Messier on 3 July 2007:
Hey Steven,
I couldn’t find any contact info so I’m just leaving a message here. I’m the maintainer of the CFJSON project and I’m trying to fix some things and would benefit from some regexp help and I know you’re quite good with them. If you think you could give me a hand shoot me an email (I put my email in the comment form) and I’ll tell you what I’m trying to fix. Something tells me you’ll be able to solve my problem without too much effort. Thanks in advance.
Comment by Steve on 3 July 2007:
Thomas, I just sent you an email. I’ll try to help if I can.
Pingback by 17 Links Today (2007-10-31) on 31 October 2007:
[...] parseUri 1.2: Split URLs in JavaScript awesome [...]
Comment by Ariel Flesler on 9 November 2007:
Just as a possibility, if you create an anchor, assign the url as href, then you can access the anchor’s host,port,protocol,search,hash. This worked in FF dunno if it will work in the rest. I’m just saying this because it might make your code shorter
I hope it helped
Comment by Steve on 9 November 2007:
@Ariel Flesler:
It definitely would not make the code shorter, if you wanted to keep the same functionality as is currently provided. But still, that’s an interesting idea, if it works.
Comment by Anil Gulati on 11 March 2008:
It looks like it doesn’t support the correct / standard URL parameter delimiter which is actually the ampersand entity ‘&amp;’ not a raw ampersand ‘&’. That’s precisely the functionality I am looking for as I’m getting an annoying error cropping up with one of my scripts that just uses a plain javascript split on ‘&’.
That said, here are some of the top 10 xhtml errors:
1. The use of a raw ampersand in a link query string. The w3c validator reports this as “cannot generate system identifier for general entity” because you’ve tried to create a new entity &xxxxxxx and not an encoded & amp ; in the string. Replace all & with &amp; in urls.
http://elliottback.com/wp/archives/2005/08/14/ten-steps-to-valid-html/
Comment by Steve on 11 March 2008:
Um, no. This code deals with URIs, not HTML. And of course, &amp; is not the only HTML entity to deal with.
Comment by Mark van Leeuwen on 12 March 2008:
As Safari 2.0 users may have noticed, the ‘queryString’ part isn’t filled because this version does not support a function as second parameter of String.prototype.replace()
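One possible workaround for engines without function replacers is to walk the matches with RegExp.prototype.exec instead. The sketch below is untested on Safari 2.0 itself, and the function name is made up for illustration; it uses the same query regex as parseUri and should produce the same pairs.

```javascript
// Hypothetical replacement for the replace()-with-callback step in parseUri.
function splitQueryWithExec(query) {
    var parser = /(?:^|&)([^&=]*)=?([^&]*)/g,
        pairs = {},
        m;
    while ((m = parser.exec(query)) !== null) {
        // A zero-length match would leave lastIndex unchanged and loop forever,
        // so step past it manually.
        if (m[0] === "") parser.lastIndex++;
        if (m[1]) pairs[m[1]] = m[2];
    }
    return pairs;
}

var viaExec = splitQueryWithExec("q1=0&&test1&test2=value");
// viaExec.q1 is "0", viaExec.test1 is "", viaExec.test2 is "value"
```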
Comment by zcrux on 1 May 2008:
I get the following error using your code
o has no properties
[Break on this error] m = o.parser[o.strictMode ? "strict" : "loose"].exec(str),
Please let me know.
Thanks in advance!
Neo
Comment by Steve on 1 May 2008:
@zcrux, does the demo page work for you?
Comment by Raj on 20 May 2008:
Hey there,
Is the online demo using the same version of the JS that is available for download?
I have a URL that will parse in the demo just fine, but returns undefined when doing this: document.write(parseUri(urls).queryKey.q);
This is the URL btw: http://search.yahoo.com/search?p=flavor+flav&fr=yfp-t-501&toggle=1&cop=mss&ei=UTF-8
Comment by Steven Levithan on 20 May 2008:
@Raj, yes, the demo uses the same code, which you could have easily verified yourself (the source files are uncompressed). Your URL does not contain a q key in the query, so the line of code you posted above is working correctly.
Comment by Raj on 20 May 2008:
Steven,
My apologies. Initially, I felt some apprehension about browsing straight to the JS files in the demo. Just being respectful.
Then I got over it.
Raj
Comment by Hat on 29 May 2008:
Thank you thank you thank you! In a short few weeks I’ve built at least two functions on top of this little gem and all my pages have components that will depend on them. Such a blessing!
Comment by Kyle Simpson on 9 July 2008:
I *love* this function, it is so incredibly powerful and helpful! Thank you so much for it.
For an open-source project I’m working on, I needed this same functionality inside of a Flash SWF. So, I’ve ported your 1.2.1 code to this regular AS3 function (not an object/class, though that would be easy to get from what I’ve done, too!).
Since escaping all that reg-ex stuff to post here in the comments would be ridiculous, I’m going to post a URL here that can be used to retrieve a text file with the code in it.
http://www.flensed.com/parseUri-AS3.txt
Steven, if want to, grab that text file and place the formatted code somewhere on this page or in this comment, that way people who come here later won’t have to go to my site to find it.
Comment by Steven Levithan on 9 July 2008:
@Kyle Simpson and @Hat, thanks! Kyle, I’ll post your AS3 port here for posterity, but there’s no reason people shouldn’t get it from your site!
// ****************************
// Ported by Kyle Simpson from Javascript to AS3 from:
// parseUri 1.2.1
// (c) 2007 Steven Levithan <stevenlevithan.com>
// MIT License
// ****************************
public function parseUri(str:String, strictMode:Boolean=false):Object {
    var o:Object = new Object();
    o.strictMode = strictMode;
    o.key = new Array("source","protocol","authority","userInfo","user","password","host","port","relative","path","directory","file","query","anchor");
    o.q = new Object();
    o.q.name = "queryKey";
    o.q.parser = /(?:^|&)([^&=]*)=?([^&]*)/g
    o.parser = new Object();
    o.parser.strict = /^(?:([^:\/?#]+):)?(?:\/\/((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?((((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)/
    o.parser.loose = /^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/)?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/
    var m:Object = o.parser[o.strictMode ? "strict" : "loose"].exec(str);
    var uri:Object = new Object();
    var i:int = 14;
    while (i--) uri[o.key[i]] = m[i] || "";
    uri[o.q.name] = new Object();
    uri[o.key[12]].replace(o.q.parser, function ($0, $1, $2) {
        if ($1) uri[o.q.name][$1] = $2;
    });
    return uri;
}
Comment by Antony on 14 July 2008:
Big thanks for that amazing function! It helps me a lot with my website programming!
Comment by Andrew Zitnay on 8 August 2008:
Nice job Steve, very useful… I thought I’d point out a real-world example of a slight problem I saw while using your code, though. For a URL like:
http://www.zitnay.com/stuff/badurl.php?param=test@test
it detects everything before the “@” as the user, thus screwing up the rest of the parse.
I realize the “@” should really be URL encoded to “%40”, but like I said, I found a real-world example of this on a live website. So, you might consider adding support for this case, at least in the loose version.
Comment by Brendan on 11 August 2008:
Brilliant! Thanks, this code is powerful yet very friendly allowing non programmers (like myself) to implement it with ease
Comment by lucky on 21 August 2008:
Thanks, nice code!
I just changed the query string parse function. It now decodes query parameters and distinguishes ‘key&’ from ‘key=&’: the first gets true as its value, the second an empty string.
[code]
uri[o.key[12]].replace(o.q.parser, function ($0, $1, $2, $3) {
    if ($1) uri[o.q.name][decodeURIComponent($1)] = $2 ? decodeURIComponent($3) : true;
});
...
parseUri.options = {
    ...
    q: {
        name: "queryKey",
        parser: /(?:^|[&])([^&=]*)(=?)([^&]*)/g
    },
[/code]
And a little sugar for strings:
[code]
// usage: "http://blog.stevenlevithan.com/archives/parseuri".parseUri().
String.prototype.parseUri = function () { return parseUri(this.valueOf()); };
[/code]
Comment by Mark Perkins on 29 August 2008:
Hi Steven,
Just to let you know I have put together a jQuery plugin based on your URI parser, which includes a bit of added functionality.
You can check it out here – http://projects.allmarkedup.com/jquery_url_parser/
Let me know if you have any suggestions/improvements etc! And thanks for the excellent work.
Comment by Robert on 29 August 2008:
Steve,
I was integrating your parser and noticed that the query parser function is “lossy”. For example, given the query string:
“a=1&b=2&c=1&c=2&c=3”
Parsing will result in this query hash: { a: 1, b: 2, c: 3 }
Here is a rewritten parser function:
if ($1) {
    if (! b9j.isValue(queryHash[$1])) {
        queryHash[$1] = $2;
    }
    else if (b9j.isArray(queryHash[$1])) {
        queryHash[$1].push($2);
    }
    else {
        queryHash[$1] = [ queryHash[$1], $2 ];
    }
}
You can find a full implementation here: http://appengine.bravo9.com/b9j/documentation/uri.html
Comment by Robert on 29 August 2008:
…oh yeah, and great parser, thanks!
Comment by Raju on 5 September 2008:
Awesome!!!! kudos Steven for such an elegant program! It served my purpose exactly the way i needed it!
Thanks again….
Raju
Comment by Brendan on 11 September 2008:
Hi
I’m new to jQuery and stumbled across your script whilst doing a uni assignment. I’m a bit lost as to how to recover array query strings. ie url?=campus[]=blahland.
If I do parseUri(location).queryKey.campus%5B%5D it reports a JS error in Firebug.
Is there a way to strip the [] from the query string so it’ll just be campus?
Cheers,
Brendan
Comment by Rob on 12 September 2008:
I’ve posted an interactive example of my
JavaScript URI object:
http://appengine.bravo9.com/b9j/example/uri/
Comment by Steven Levithan on 13 September 2008:
@Brendan, what’s an array query string? Assuming the URI you’re feeding this function is actually “url?campus[]=blahland”, you could access the value via parseUri(uri).queryKey["campus[]"]. Them’s JavaScript rules.
@Robert, good point about e.g. “?c=1&c=2&c=3”. I’ll have to consider how to handle such cases in the next version of this function. Your approach (inserting an array into queryKey) seems pretty reasonable.
Pingback by Java: Matching URLs with Regex Wildcards » Leghumped on 3 November 2008:
[...] After someone suggested a way to match URLs and protocols with wildcards in LockCrypt, I started work implementing a URL which accepted wildcard (*) characters. The result is a class which takes a URL string as a constructor and breaks it apart into it’s component parts. The class is based on a JavaScript regex from Steve Levithan. [...]
Pingback by links for 2008-11-24 « denny on 24 November 2008:
[...] parseUri 1.2: Split URLs in JavaScript (tags: javascript parse regex url uri) [...]
Comment by Lex on 7 December 2008:
Nice script! Could you please add support for domain/subdomain?
Comment by kvz on 25 January 2009:
Hello Steven,
We would like to use your excellent code in our project over at
http://kevin.vanzonneveld.net/techblog/article/phpjs_licensing/
and in the near future at:
http://phpjs.org
We already noticed your code was MIT, but if you would like to be credited differently or have another comment, please drop a line okay?
Comment by Gui on 15 March 2009:
chrome?
Comment by Carl Armbruster on 19 March 2009:
awesome! – nuff said!
Comment by msznapka on 2 June 2009:
Cool script, but I am missing one important function. Function which creates URI back from uri object. My scenario is: parse URI, change URI (query string), write URI back to <a href=”…
Comment by Matt Ruby on 3 June 2009:
Great script! This is working really well for us with one exception. In Safari and Chrome the following URL will not parse:
(Edit: Long URL removed.)
I know it’s crazy… It appears to begin working when we limit the length to 450.
Thanks again for the great script!
Comment by Steven Levithan on 3 June 2009:
@Matt Ruby, I haven’t done any related testing, but the problem may result from the portions of the regexes that deal with user info (user name, password). Those parts can result in a lot of backtracking with long URLs that don’t contain an @ sign, since JavaScript doesn’t have features such as possessive quantifiers, atomic groups, or duplicate subpattern numbers that would help me deal with the backtracking issues.
If you don’t need support for the user and password properties returned by this script, one easy way to work around the issue is to change the following part of the regex (in both the strict and loose version):
(?:(([^:@]*):?([^:@]*))?@)?
To this:
(?:([^:@]*:[^:@]*|[^:@]*)?@)?
Then remove the “user” and “password” values from the parseUri.options.key array. If you try this out, please let me know if it solves the issue for you.
Comment by Steven Levithan on 3 June 2009:
@msznapka, on the demo page, if you look at the source, in the demo.js there’s a function called format that does something similar, and may be useful as a starting point for you.
Comment by Matt Ruby on 3 June 2009:
Works like a charm!
I also changed i = 14; to i = 12;
and uri[o.key[12]]… to uri[o.key[10]]…
Thanks for your help!
-Ruby
Comment by Steven Levithan on 4 June 2009:
@Matt Ruby, cool, thanks for reporting back. After giving this a few more minutes of thought, here’s a way you can get rid of the backtracking problem while keeping the user and password properties around. Replace the pattern I identified earlier (in both the strict and loose regexes) with this:
(?:(([^:@]*)(?::([^:@]*))?)?@)?
Everything else should be left the same compared to the original script. When I have some time to more fully review all of parseUri, I’ll include this change in the next version (after the current v1.2.1).
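A small sketch comparing the old and new user-info patterns in isolation (anchored with ^ here so they can be run on their own, which is an addition for this illustration). Both capture the same userInfo, user, and password values; the difference is that in the old pattern the two adjacent [^:@]* runs can share a colon-free string in many ways before the match fails at the missing "@", whereas the new pattern only tries the second run after a literal ":".

```javascript
var oldUserInfo = /^(?:(([^:@]*):?([^:@]*))?@)?/;
var newUserInfo = /^(?:(([^:@]*)(?::([^:@]*))?)?@)?/;

// Hypothetical helper: returns [userInfo, user, password], with "" for
// groups that did not participate in the match.
function userInfoCaptures(pattern, str) {
    var m = pattern.exec(str);
    return [m[1] || "", m[2] || "", m[3] || ""];
}

var samples = ["usr:pwd@www.test.com", "usr@www.test.com", "www.test.com"];
var sameResults = true;
for (var s = 0; s < samples.length; s++) {
    var before = userInfoCaptures(oldUserInfo, samples[s]).join("|");
    var after = userInfoCaptures(newUserInfo, samples[s]).join("|");
    if (before !== after) sameResults = false;
}
// sameResults stays true for these three inputs
```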
Comment by Matt Ruby on 5 June 2009:
Thanks again! I’ve made your suggested change and things are still working well.
I look forward to next version.
-Ruby
Comment by LudoO on 18 June 2009:
I did a PHP port of this amazing function.
I added 2 features.
Hope it could be useful !!
Have fun !
LudoO
<?php
/*
PhpParseUri 1.0
PHP Port of parseUri 1.2.1
- added file:///
- added : windows drive detection
PHP Port: LudoO 2009 <pitaso.com>
Original JS : (c) 2007 Steven Levithan <stevenlevithan.com>
MIT License
*/
function parseUri($str) {
    global $parseUri_options;
    $o = $parseUri_options;
    $r = $o['parser'][$o['strictMode'] ? "strict" : "loose"];
    preg_match($r, $str, $m);
    $uri = array();
    $i = 15;
    // preg_match() omits trailing groups that did not participate, so default them to "".
    while ($i--) $uri[$o['key'][$i]] = isset($m[$i]) ? $m[$i] : "";
    $uri[$o['q']['name']] = array();
    // PREG_SET_ORDER yields one array(full match, key, value) per query pair.
    preg_match_all($o['q']['parser'], $uri[$o['key'][13]], $n, PREG_SET_ORDER);
    foreach ($n as $v) {
        if ($v[1] !== "") $uri[$o['q']['name']][$v[1]] = isset($v[2]) ? $v[2] : "";
    }
    return $uri;
}
$parseUri_options = array(
    'strictMode' => false,
    'key' => array("source","protocol","authority","userInfo","user","password","host","port","relative","path","drive","directory","file","query","anchor"),
    'q' => array(
        'name' => "queryKey",
        'parser' => '/(?:^|&)([^&=]*)=?([^&]*)/'
    ),
    'parser' => array(
        'strict' => '/^(?:([^:\/?#]+):)?(?:\/\/\/?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?(((?:\/(\w:))?((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)/',
        'loose' => '/^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((?:\/(\w:))?(\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/'
    )
);
?>
Pingback by Parsing URLs in Javascript | Pressing the Red Button on 29 June 2009:
[...] Javascript does not have a built in method for parsing a url (to get the individual parts that make up a url). Here is a link to a great little function that does all you need, called parseURI. [...]
Comment by j on 25 July 2009:
this function does not work for urls with ipv6 addresses in them
an example url:
http://2001:4860:a003::68]/search?q=parseUri+ipv6
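For context, the host pattern in both regexes stops at the first ":", which is why a bracketed IPv6 literal gets cut short. A minimal sketch of a bracket-aware host/port split is shown below; the function name and pattern are assumptions for illustration, they expect the authority without user info, and they do not validate the address itself.

```javascript
// Hypothetical helper: split "host[:port]" where host may be a bracketed IPv6 literal.
function splitHostPort(authority) {
    // Either "[...]" as a whole, or a run of non-colon characters, then an optional ":port".
    var m = /^(\[[^\]]*\]|[^:]*)(?::(\d*))?$/.exec(authority);
    return m ? { host: m[1], port: m[2] || "" } : null;
}

var v6 = splitHostPort("[2001:4860:a003::68]:8080");
// v6.host is "[2001:4860:a003::68]", v6.port is "8080"
var named = splitHostPort("www.test.com:81");
// named.host is "www.test.com", named.port is "81"
```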
Comment by Jeff Jackson on 17 September 2009:
Thanks for this! One small issue: if a URL has multiple instances of the same parameter name (as can occur if multiple checkboxes are checked on a form) then queryKey will only contain the last value associated with this name. So if the query string is
x=a&x=b
then queryKey.x ends up as ‘b’, and the x=a pair is lost. It might be nicer if, say, an array of strings rather than a single string was assigned to the corresponding element of queryKey in such cases.
Comment by Ildar on 2 October 2009:
Hi Steven,
Once, i have implemented the URI parser in JavaScript. It looks like the method of String prototype, so you are able to parse input strings in JavaScript-style:
var url = ‘http://blog.stevenlevithan.com/archives/parseuri’;
var url_parsed = url.parseUrl();
The source is located here:
http://with-love-from-siberia.blogspot.com/2009/07/url-parsing-in-javascript.html
Pingback by Trial and Erorr | Backseat Surfer on 14 December 2009:
[...] doesn’t offer you any solutions for converting relative paths into absolute ones. There are even several very good solutions. But do I really want to write 100 lines of code just to [...] links [...]
Comment by Mike on 29 January 2010:
I was using your parseUri class in JavaScript and noticed that the path “file.ext” is not considered a file by your script. It looks like it is expecting a preceding slash. I don’t know if this is a bug or just a feature that was left out. My work around was just some extra conditioning tests. Nice work on this class and thanks for posting it.
Comment by Steven Levithan on 3 February 2010:
@Mike, in non-strict mode, a URL starting with a filename (i.e., no preceding slash) is treated as a host, for reasons explained elsewhere. Switch to strict mode and you should be OK.
Comment by Carter Cole on 10 February 2010:
your code is AWESOME! thanks for your stuff… i stole your dump function too
ive been looking for something like it for awhile
Comment by Anup Chatterjee on 22 April 2010:
Great and simple parser. But only one issue I have here – the open social type of url links do not work with your parser, e.g.
http://blog.stevenlevithan.com/user/@me/folder/@root
Pingback by Golingo: a great Titanium mobile Web game, open sourced for you | Intertech on 19 May 2010:
[...] Uri-parser [...]
Comment by Scott on 27 May 2010:
Hi Steven,
Thanks for the great code! If you’re still maintaining it, here’s an example of a url that has issues:
http://www.contemporaryartdaily.com/wp-content/uploads/2010/05/4.-AC-2010-InstallShot@Maccarone-NorthGallery-150×150.jpg
Comment by Steven Levithan on 28 May 2010:
@Scott, that URL is invalid. The @ sign is supposed to be URL encoded as “%40”. If you make this change, parseUri will handle it fine.
FYI, you’re not the first person to request different handling for invalid uses of “@” (see Anup Chatterjee and Andrew Zitnay’s comments here), so perhaps it’s worth looking at changes to “loose” parsing mode, at least, in future versions of this script.
Comment by James on 2 June 2010:
Awesome code. Ultimate Regex.
I found this letter >a< knocking around, which i present to you for insertion into the appropriate position in your surname.
Pingback by dovapour?blog » JavaScript??URL on 10 June 2010:
[...] stevenlevithan?parseUri 1.2: Split URLs in JavaScript?????? [...]
Comment by Brandon Sterne on 11 June 2010:
Do you think it’s a bug that jQuery.url.setUrl(“script.js”).attr(“host”) returns script.js?
PHP’s parse_url, for example, is able to parse that string as the path.
Comment by Brandon Sterne on 11 June 2010:
Sorry, strict mode seems to address that, but the jQuery plugin doesn’t seem to use strict mode properly. I will investigate on that end. Thanks for the great parser!
Comment by Neeta on 2 July 2010:
Nice Function. It works 99% of the times..
I tried it with google search results url as given below and it broke.. query is null!!
http://www.google.com/#hl=en&source=hp&q=dvd+player&aq=f&aqi=g10&aql=&oq=&gs_rfai=CHmF57nEuTJ-4E5-yMcr6vYgKAAAAqgQFT9CPQHY&fp=7e78d8b98f604090
Comment by Steven Levithan on 4 July 2010:
@James, lol.
@Neeta, parseUri returns the correct result. Everything after the # sign is the URI fragment (aka anchor). There is no query part.
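If the key/value pairs inside the fragment are what you are after, one option is to run the anchor through the same query regex. The sketch below is a standalone illustration (the function name is made up, and it locates the fragment with a plain indexOf rather than calling parseUri); with parseUri itself you would start from parseUri(uri).anchor instead.

```javascript
var queryParser = /(?:^|&)([^&=]*)=?([^&]*)/g;

// Hypothetical helper: treat everything after the first "#" as "&"-separated pairs.
function splitFragmentPairs(uri) {
    var hashAt = uri.indexOf("#"),
        pairs = {};
    if (hashAt === -1) return pairs; // no fragment at all
    uri.slice(hashAt + 1).replace(queryParser, function ($0, $1, $2) {
        if ($1) pairs[$1] = $2;
    });
    return pairs;
}

var fragmentPairs = splitFragmentPairs("http://www.google.com/#hl=en&source=hp&q=dvd+player&aq=f");
// fragmentPairs.q is "dvd+player" (values are not decoded)
```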
Comment by Rien Broekstra on 13 July 2010:
Dear Mr. Levithan, hello Steven,
Thanks a lot for your javascript magic.
I have a feature suggestion if that’s appropriate. Since parseUri is returning an object, would it be a decent idea to add a method which would reassemble the URL back to a string from its parts?
That could be quite handy for actually manipulating URL’s. One could then alter parts of the URL (add or modify parts of the query, the user information, whatsoever) without doing any regexp magic in their own code.
Cheers,
–
Rien
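A rough sketch of what such a reassembly function might look like is below. The name formatUri and its behaviour are assumptions for illustration only; this is not the format function from the demo page. It rebuilds the string from protocol, user, password, host, port, path, queryKey, and anchor, so edits to queryKey are picked up, but it is lossy: a key that had no "=" comes back as "key=", and pair order follows property enumeration order.

```javascript
// Hypothetical inverse of parseUri: build a URI string from a parts object.
function formatUri(uri) {
    var out = "", pairs = [], key;
    if (uri.protocol) out += uri.protocol + ":";
    if (uri.host) {
        out += "//";
        if (uri.user) out += uri.user + (uri.password ? ":" + uri.password : "") + "@";
        out += uri.host;
        if (uri.port) out += ":" + uri.port;
    }
    out += uri.path || "";
    for (key in uri.queryKey) {
        if (Object.prototype.hasOwnProperty.call(uri.queryKey, key)) {
            pairs.push(key + "=" + uri.queryKey[key]);
        }
    }
    if (pairs.length) out += "?" + pairs.join("&");
    if (uri.anchor) out += "#" + uri.anchor;
    return out;
}

// A hand-written parts object, shaped like parseUri's return value.
var parts = {
    protocol: "http", user: "usr", password: "pwd", host: "www.test.com", port: "81",
    path: "/dir/index.htm", queryKey: { q1: "0", test2: "value" }, anchor: "top"
};
parts.queryKey.q1 = "1"; // change one query value, then rebuild
var rebuilt = formatUri(parts);
// rebuilt is "http://usr:pwd@www.test.com:81/dir/index.htm?q1=1&test2=value#top"
```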
Comment by Johann on 17 July 2010:
@Rien,
there is a source property in the returned object that contains the original string.
Thanks Steven for this script, I use it together with a Punycode encoder on some proxies to support IDN domains and parseUri has helped a lot keeping the code small.
Comment by Leechael on 12 August 2010:
It breaks on URLs like http://www.blahblah.com/@foo/bar ….
Comment by ???? ???? on 29 August 2010:
Thank you very much. Your code is awesome.
Comment by Adam on 1 September 2010:
Very nice! It does break on the URL http://www.blahblah.com/@foo/bar as Leechael noted, but still, for how simple it is, I am duly impressed. My MUCH longer version of a strict RFC-3986 parser written in C is over at github (http://github.com/ajrisi/fsm). I didn’t use regex like you, I used a hand-rolled finite state machine. The output isn’t quite as readable as yours either. Still, might be worth something to someone!
Pingback by Parsowanie URL w JavaScript at Jakub Laskowski on 10 November 2010:
[...] here the parseUri function created by Steven Levithan comes to the rescue. Its brevity is downright astonishing. Usage is very simple: we call the function [...]
Comment by ridgerunner on 11 November 2010:
Actually, the ‘@’ sign is a perfectly valid character for the path, query and fragment portions of a URI according to RFC3986 and does not need to be encoded as ‘%40’.
Look at the ABNF definition for ‘pchar’ in Appendix A of RFC3986.
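One way to see why "@" in the path need not confuse a parser is to delimit the authority first and only then look for "@" inside it. The sketch below uses the generic splitting regex given in RFC 3986, Appendix B (with slashes escaped for a JavaScript literal); the function name and the returned property names are made up for this illustration and do not match parseUri's keys.

```javascript
// Regex from RFC 3986, Appendix B: scheme = $2, authority = $4, path = $5,
// query = $7, fragment = $9.
var rfc3986 = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

// Hypothetical helper: the authority ends at the first "/", "?" or "#",
// so an "@" later in the URI can never be mistaken for user info.
function splitGeneric(uri) {
    var m = rfc3986.exec(uri),
        authority = m[4] || "",
        at = authority.lastIndexOf("@");
    return {
        scheme: m[2] || "",
        userInfo: at === -1 ? "" : authority.slice(0, at),
        hostAndPort: at === -1 ? authority : authority.slice(at + 1),
        path: m[5] || "",
        query: m[7] || "",
        fragment: m[9] || ""
    };
}

var atInPath = splitGeneric("http://www.blahblah.com/@foo/bar");
// atInPath.hostAndPort is "www.blahblah.com", atInPath.path is "/@foo/bar"
var atInQuery = splitGeneric("http://www.zitnay.com/stuff/badurl.php?param=test@test");
// atInQuery.hostAndPort is "www.zitnay.com", atInQuery.query is "param=test@test"
```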
Comment by Steven Levithan on 12 November 2010:
@ridgerunner, thanks for the details. I will correct for that in future versions on this script.
Comment by User on 1 February 2011:
Just wanna say: AWESOME WORK!
Comment by Jens Weiermann on 1 April 2011:
Hi,
I’ve taken Robert’s idea of creating an array of values for parameters that were given multiple times. Unlike him, I did so without using a third party library. Here’s the code if anyone’s interested:
Replace the line
if ($1) uri[o.q.name][$1] = $2;
with
if ($1) {
    if (uri[o.q.name][$1] === undefined) {
        uri[o.q.name][$1] = $2;
    } else if (Object.prototype.toString.call(uri[o.q.name][$1]) === '[object Array]') {
        uri[o.q.name][$1].push($2);
    } else if (typeof uri[o.q.name][$1] === 'string') {
        uri[o.q.name][$1] = [ uri[o.q.name][$1], $2];
    }
}
Pingback by javascript?URL???????????parseUri? « kawama.jp on 5 April 2011:
[...] http://blog.stevenlevithan.com/archives/parseuri var pu = new parseUri(window.location.href); alert(pu.path); ???????????? [...]
Pingback by MikeCann.co.uk » Blog Archive » URI Parser For HaXe on 11 April 2011:
[...] I was in need of a way to split a URL into its various parts. To do this in previous versions of ChromeCrawler I used a ready built one I found on the web. [...]
Comment by jojo on 7 May 2011:
great stuff.
i just love regexp.
very cool
Comment by Brad on 25 May 2011:
Evil corner case >:)
http://www.test.com/path?__proto__=1
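The reason this is a corner case: queryKey is a plain {} object, and assigning a string to its "__proto__" property goes through the inherited accessor, which silently ignores non-object values, so the key is lost. A sketch of one possible mitigation, assuming an ES5 environment, is to collect the pairs in an object created with Object.create(null); the helper name here is hypothetical.

```javascript
// Hypothetical helper: fill `target` with the pairs found in `query`.
function splitQueryInto(target, query) {
    query.replace(/(?:^|&)([^&=]*)=?([^&]*)/g, function ($0, $1, $2) {
        if ($1) target[$1] = $2;
    });
    return target;
}

// Plain object: "__proto__" hits the inherited setter and no own key is created.
var plain = splitQueryInto({}, "__proto__=1&a=2");

// Prototype-less object: "__proto__" is stored like any other key.
var bare = splitQueryInto(Object.create(null), "__proto__=1&a=2");
```

The trade-off is that a prototype-less object has no hasOwnProperty or toString methods of its own, so callers need Object.prototype.hasOwnProperty.call(obj, key) style checks.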
Pingback by Javascript URI parser | Jeff Wang's Blog on 27 June 2011:
[...] Refer to http://blog.stevenlevithan.com/archives/parseuri [...]
Comment by Niall Smart on 16 July 2011:
If you need to go the other way (i.e., object spec to URI string), makeUri() can help:
https://gist.github.com/1073037
Comment by Yaffle on 18 July 2011:
javascript absolutize URL : https://gist.github.com/1088850
Pingback by jquery url parser????????? | ???????? on 18 July 2011:
[...] the??parser functionality is based on the?????regex parser by steven levithan???. [...]
Comment by Adam on 27 July 2011:
Can you remove the maxlength on the input box? Thanks.
Comment by 84 on 5 September 2011:
Hello…nice site…http://blog.stevenlevithan.com/archives/parseuri is The Best! Please keep it up webmaster….great job…thumbs up!
Comment by hepsignman on 3 October 2011:
Newbie question.
We have a job application form that uses the document.referrer to identify which job they are applying for. And, I want to add this info into the subject line of the email sent to our hr person. I came across your code that will parse the url.
how do I take what your parser produces so I can add it to the subject line?
Comment by sosoflickr on 26 December 2011:
great stuff.
thanks!
Comment by Norbert Klasen on 12 January 2012:
Hi,
I’ve enhanced the loose mode a bit to support and tokenize literal IPv4 and IPv6 addresses as well as splitting an FQDN into hostname and domain.
Thanks
Norbert
String.prototype.parseUri = function() {
    var o = String.prototype.parseUri.options;
    var m = o.parser.ipv6.exec(this);
    var uri = {};
    var i = 18;
    while (i--)
        uri[o.key[i]] = m[i] || "";
    uri[o.q.name] = {};
    uri[o.key[16]].replace(o.q.parser, function($0, $1, $2) {
        if ($1)
            uri[o.q.name][$1] = $2;
    });
    if (uri.ipv4 != "") {
        uri.ip = uri.ipv4;
    }
    else
    if (uri.ipv6 != "") {
        uri.ip = uri.ipv6;
    }
    return uri;
};
String.prototype.parseUri.options = {
    // strictMode : false,
    key : [ "source", "protocol", "authority", "userInfo", "user", "password",
            "host", "ipv4", "ipv6", "basename", "domain", "port", "relative",
            "path", "directory", "file", "query", "anchor" ],
    q : {
        name : "queryKey",
        parser : /(?:^|&)([^&=]*)=?([^&]*)/g
    },
    parser : {
        ipv6 : /^(?:(?![^:@]+:[^:@\/]*@)([^[:\/?#.]+):)?(?:\/\/)?((?:(([^:@]*)(?::([^:@]*))?)?@)?((?:(\d+\.\d+\.\d+\.\d+)|\[([a-fA-F0-9:]+)\]|([^.:\/?#]*))(?:\.([^:\/?#]*))?)(?::(\d*))?)(((\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/
    }
};
Comment by Pablo Pazos on 11 February 2012:
Hi, great piece of code.
BTW, is there any function for doing the opposite operation? (I mean getting the string URL from the parsed one).
It could be useful when you need to change some URL params in the query string or any other part of the URL. (That’s indeed what I need to do on a CMS I’m building: http://code.google.com/p/yupp-cms/)
Thanks a lot!
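A minimal sketch of the reverse operation, assuming the input object uses parseUri's default key names (protocol, userInfo, host, port, path, queryKey, anchor). The name buildUri is hypothetical and not part of parseUri, and query values are assumed to be URL-encoded already:

```javascript
// Hypothetical helper, not part of parseUri: rebuilds a URI string from an
// object that uses parseUri's default key names.
function buildUri(parts) {
  var uri = "";
  if (parts.protocol) uri += parts.protocol + ":";
  if (parts.host) {
    uri += "//";
    if (parts.userInfo) uri += parts.userInfo + "@";
    uri += parts.host;
    if (parts.port) uri += ":" + parts.port;
  }
  uri += parts.path || "";

  // Serialize queryKey in insertion order. Values are emitted as-is, and a
  // key with an empty value is written bare, because the parsed object no
  // longer records whether the source was "key" or "key=".
  var pairs = [];
  var queryKey = parts.queryKey || {};
  for (var name in queryKey) {
    if (Object.prototype.hasOwnProperty.call(queryKey, name)) {
      pairs.push(queryKey[name] === "" ? name : name + "=" + queryKey[name]);
    }
  }
  if (pairs.length) uri += "?" + pairs.join("&");
  if (parts.anchor) uri += "#" + parts.anchor;
  return uri;
}

// Usage: change one query parameter, then rebuild the string.
var parts = {
  protocol: "http", host: "www.test.com", port: "81",
  path: "/dir/index.htm", queryKey: { q1: "0", test1: "" }, anchor: "top"
};
parts.queryKey.q1 = "1";
var rebuilt = buildUri(parts);
```

Because the query is rebuilt from queryKey rather than from the original query string, duplicate keys and empty pairs such as "&&" are not preserved.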
Comment by Doug on 16 February 2012:
Steve,
I am trying to use your code with server-side script in Domino XPages. Everything works great, except that the query parameter values are returned as numbers instead of the correct strings. All other values are returned correctly.
Using http://usr:[email protected]:81/dir/dir.2/index.htm?q1=0&&test1&test2=value#top
uri.queryKey.q1=0
uri.queryKey.test1=5
uri.queryKey.test2=11
Thanks for contributing.
Comment by Doug on 16 February 2012:
Steve,
FYI- I used a Domino function to grab from $0 everything to the right of the = sign instead of using $2 and it now returns the correct values. Thanks again for contributing. Great code.
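The values 5 and 11 reported above match the offsets at which the test1 and test2 pairs begin in that query string. That suggests (an assumption, not something confirmed in this thread) that the environment passes the replace callback its arguments in a different layout. A minimal sketch that avoids callback arguments altogether by looping with exec over the same q.parser regex; parseQuery is a hypothetical name:

```javascript
// Hypothetical helper: splits a query string into key/value pairs without
// relying on the argument layout of a String.prototype.replace callback.
function parseQuery(query) {
  var parser = /(?:^|&)([^&=]*)=?([^&]*)/g; // same pattern as parseUri.options.q.parser
  var result = {};
  var match;
  while ((match = parser.exec(query)) !== null) {
    if (match[0] === "") {
      // A zero-length match would never advance; step past it manually.
      parser.lastIndex++;
      continue;
    }
    if (match[1]) result[match[1]] = match[2];
  }
  return result;
}

var pairs = parseQuery("q1=0&&test1&test2=value");
```

Here pairs.test2 holds the string "value" and pairs.test1 holds an empty string, regardless of how the host environment orders callback arguments.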
Comment by Sean Bannister on 11 March 2012:
The semicolon at the end of the parseUri function isn’t required as mentioned in section 13 of the spec http://ecma262-5.com/ELS5_HTML.htm#Section_13
Comment by ?? ???? on 7 April 2012:
Nice post really , Thanks for sharing.
Comment by JAY on 9 April 2012:
Hi author, I would like to ask how I could use this code in JavaScript to display the incoming search term on my site. For example, when someone searches in Google and arrives at my site, the URL of the referrer is parsed and displayed on my website as “incoming search term (keyword)”. Any suggestions for doing this in JavaScript?
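A rough sketch of one way to do this, assuming the search engine puts the term in a q query parameter and that the browser actually sends the full referrer (neither is guaranteed, and some engines may omit the term entirely). The function name searchTermFromReferrer is hypothetical:

```javascript
// Hypothetical helper: pulls one query parameter (default "q") out of a
// referrer URL and decodes it for display. Returns "" when it is absent.
function searchTermFromReferrer(referrer, paramName) {
  paramName = paramName || "q";
  var queryStart = referrer.indexOf("?");
  if (queryStart === -1) return "";
  // Drop the fragment, then walk the key=value pairs.
  var query = referrer.slice(queryStart + 1).split("#")[0];
  var pairs = query.split("&");
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf("=");
    var name = eq === -1 ? pairs[i] : pairs[i].slice(0, eq);
    if (name !== paramName) continue;
    var raw = eq === -1 ? "" : pairs[i].slice(eq + 1);
    try {
      return decodeURIComponent(raw.replace(/\+/g, " "));
    } catch (e) {
      return raw; // malformed escape sequence: show the raw text
    }
  }
  return "";
}

var term = searchTermFromReferrer("http://www.google.com/search?q=parse+uri%20js&hl=en");
```

In a page, the argument would be document.referrer, and the result should be inserted as text (not as HTML) since it comes from an untrusted URL.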
Comment by Erik Dubbelboer on 17 April 2012:
Hi Steven,
As some others have pointed out already your code doesn’t work for urls with a @ in the path or query part.
For example http://www.adperium.com/campaigns/[email protected]/93f92b1c will return example.com as the host.
According to rfc 3986 section-3.3, @ is a valid path character.
Since adblock for chrome uses your code to parse urls it currently blocks parts of sites that shouldn’t be blocked.
Pingback by JavaScript or Query library to work with paths/URIs | Easy jQuery | Free Popular Tips Tricks Plugins API Javascript and Themes on 21 May 2012:
[...] a look at the parseURI function of Flagrant Badassery’s blog. It can parse any well formed URIs. Tagged: [...]
Comment by Francis Cagney on 9 July 2012:
This was just what I wanted except: I’m passing some javascript code with spaces and funny chars in the string. So I wrote a little extension to decode these:
So I replaced the line:
if ($1) uri[o.q.name][$1] = $2
with
if ($1) uri[o.q.name][$1] = $2.replace(o.q.decode, function ($3, $4) {
    return ($4) ? String.fromCharCode(parseInt($4, 16)) : " ";
});
and added this to the q structure.
decode: /(?:\+)|(?:%(..))/g
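An alternative sketch of the same decoding step using the built-in decodeURIComponent. The replacement above converts each %XX escape to a character on its own, so a multi-byte UTF-8 sequence such as %C3%A9 would come out as two separate characters; decodeURIComponent handles those sequences as a unit. The name decodeQueryValue is hypothetical:

```javascript
// Hypothetical helper: decodes one query-string value.
function decodeQueryValue(value) {
  // "+" stands for a space in form-encoded query strings.
  var spaced = value.replace(/\+/g, " ");
  try {
    return decodeURIComponent(spaced);
  } catch (e) {
    // Malformed escape (for example a lone "%"): keep the undecoded text.
    return spaced;
  }
}

var decoded = decodeQueryValue("alert%28%22hi+there%22%29");
```

It could be called in place of the inline replace, for example as uri[o.q.name][$1] = decodeQueryValue($2).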
Comment by yonathan garti on 29 November 2012:
hello,
love your work!!!
Found a bug: parsing fails for this URL:
http://www.test.com/?email=[email protected]
The problem is the ‘@’ character in the query string value; it breaks the parsed result.
Comment by Steven Levithan on 10 December 2012:
@yonathan garti, the easiest way to deal with this is to URL-encode the @ sign as %40. You can see a few people including myself discussing this issue in earlier comments. I plan to correct this in future versions of the script, but haven’t gotten around to it yet. You could try simply changing each instance of [^:@] to [^:@?]. That might do the trick, though I haven’t fully evaluated its impact.
Comment by Radu Coravu on 8 February 2013:
Links like “mailto:[email protected]” are not properly parsed.
Pingback by 30 most useful jQuery plugins | Developer Drive on 4 March 2013:
[...] parameters, fragment parameters and more. The core parser functionality is based on the Regex URI parser by Steven Levithan, and the query string parsing is handled by a modified version [...]
Pingback by 30 mest användbara jQuery plugins | Appar till Apple on 11 March 2013:
[...] parametrar, parametrar fragment och mycket mer. Kärnan tolken funktionaliteten bygger på Regex URI parser av Steven Levithan , och frågesträngen parsningen hanteras av en modifierad version av nod-QueryString [...]
Comment by scutwukai on 21 March 2013:
It doesn’t work correctly if a URI contains an “@” character in a parameter, like this:
http://aaa.bbb.com/index?email=[email protected]
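The workaround suggested earlier in this thread, URL-encoding the @ sign as %40, can be sketched as a pre-processing step applied before calling parseUri. This is only an illustration: the name encodeStrayAtSigns is hypothetical, and it assumes the authority ends at the first "/", "?", or "#" after the optional "scheme://" prefix:

```javascript
// Hypothetical helper: percent-encodes every "@" that appears after the
// authority, so only a genuine userInfo delimiter is left unencoded.
function encodeStrayAtSigns(url) {
  // Skip an optional "scheme://" (or bare "//") prefix.
  var prefix = /^(?:[^:\/?#]+:)?\/\//.exec(url);
  var start = prefix ? prefix[0].length : 0;
  var rest = url.slice(start);
  // The authority is assumed to end at the first "/", "?", or "#".
  var authorityEnd = rest.search(/[\/?#]/);
  if (authorityEnd === -1) return url; // nothing after the authority
  return url.slice(0, start + authorityEnd) +
    rest.slice(authorityEnd).replace(/@/g, "%40");
}

var safe = encodeStrayAtSigns("http://aaa.bbb.com/index?email=me@example.com");
```

The email address in the usage line is made up for illustration. A URL with real user info, such as "http://usr:pwd@host/dir", keeps its first @ because it sits inside the authority.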
Comment by Yannick Albert on 23 March 2013:
What about data-uri’s?
Comment by Joshua Logsdon on 10 April 2013:
I needed to get the file name and extension so I added this before returning the info in the function.
var file_pieces = uri[o.key[11]].split('.');
if ( file_pieces.length > 1 )
{
    uri.file_ext = file_pieces[file_pieces.length - 1];
    file_pieces.splice(file_pieces.length - 1, 1);
}
uri.file_name = file_pieces.join('.');
Comment by Ramesh Krishnan on 29 April 2013:
Hi Steven,
I have tried replacing [^:@] with [^:@?] in both strict and loose mode, and I was able to parse the URL containing @ correctly.
Thanks for the script.
–Ramesh
Comment by website on 16 May 2013:
Howdy! Do you use Twitter? I’d like to follow you if that would be okay. I’m definitely
enjoying your blog and look forward to new updates.
Comment by Ashlee on 17 May 2013:
Thanks , I’ve just been searching for info approximately this subject for ages and yours is the greatest I have found out till now. However, what concerning the conclusion? Are you positive about the source?
Comment by Lavonne on 18 May 2013:
I take pleasure in, result in I discovered exactly what I was having a look for.
You’ve ended my 4 day long hunt! God Bless you man. Have a great day. Bye