Flagrant Badassery

A JavaScript and regular expression centric blog

Unicode

XRegExp Updates

A few days ago, I posted a long-overdue XRegExp bug fix release (version 1.5.1). This was mainly to address an IE issue that a number of people have written to me and blogged about. Specifically, RegExp.prototype.exec no longer throws an error in IE when it is simultaneously provided a nonstring argument and called on a […]

Unicode Plugin for XRegExp

Update: Many of the details described below are now out of date. Get the latest version of the Unicode plugin for XRegExp. I've released a simple plugin for XRegExp (my JavaScript regex library) that adds support for Unicode properties and blocks to JavaScript regular expressions. It uses the Unicode 5.1 character database, which is the […]

JavaScript, Regex, and Unicode

Not all shorthand character classes and other JavaScript regex tokens are Unicode-aware. In some cases it can be important to know exactly what certain tokens match, and that's what this post will explore. According to ECMA-262 3rd Edition, \s, \S, ., ^, and $ use Unicode-based interpretations of whitespace and newline, while \d, \D, \w, […]
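
A quick way to see the split between Unicode-aware and ASCII-only tokens is to test a few of them against non-ASCII characters. The following is a minimal sketch, not code from the post: the probe list and the `results` name are made up for illustration, and it assumes a spec-conformant engine without the `u`, `v`, or `s` flags (some older browsers are known to deviate on \s).

```javascript
// Hypothetical probes: each pairs one regex token with a single
// non-ASCII character that the token might or might not match.
const probes = [
  { label: "\\s vs. U+00A0 (no-break space)",          regex: /^\s$/, input: "\u00A0" },
  { label: "\\d vs. U+0663 (Arabic-Indic digit three)", regex: /^\d$/, input: "\u0663" },
  { label: "\\w vs. U+00E9 (e with acute accent)",      regex: /^\w$/, input: "\u00E9" },
  { label: ".   vs. U+2028 (line separator)",           regex: /^.$/,  input: "\u2028" },
];

// Run every probe and record whether the token matched.
const results = probes.map(({ label, regex, input }) => ({
  label,
  matched: regex.test(input),
}));

for (const { label, matched } of results) {
  console.log(`${label}: ${matched}`);
}
// \s matches the no-break space (Unicode-based whitespace):   true
// \d does not match the Arabic-Indic digit (ASCII 0-9 only):  false
// \w does not match the accented letter (ASCII word chars):   false
// . does not match U+2028 (treated as a line terminator):     false
```

The last probe shows the Unicode-based newline handling from the other direction: the dot refuses U+2028 precisely because the engine counts it as a line terminator.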
