How many times have you needed to run multiple replacement operations on the same string? It's not too bad, but can get a bit tedious if you write code like this a lot.
str = str.
    replace( /&(?!#?\w+;)/g , '&amp;'    ).
    replace( /"([^"]*)"/g   , '“$1”'     ).
    replace( /</g           , '&lt;'     ).
    replace( />/g           , '&gt;'     ).
    replace( /…/g           , '&hellip;' ).
    replace( /“/g           , '&ldquo;'  ).
    replace( /”/g           , '&rdquo;'  ).
    replace( /‘/g           , '&lsquo;'  ).
    replace( /’/g           , '&rsquo;'  ).
    replace( /—/g           , '&mdash;'  ).
    replace( /–/g           , '&ndash;'  );
A common trick to shorten such code is to look up replacement values using an object as a hash table. Here's a simple implementation of this.
var hash = {
    '<' : '&lt;'     ,
    '>' : '&gt;'     ,
    '…' : '&hellip;' ,
    '“' : '&ldquo;'  ,
    '”' : '&rdquo;'  ,
    '‘' : '&lsquo;'  ,
    '’' : '&rsquo;'  ,
    '—' : '&mdash;'  ,
    '–' : '&ndash;'
};
str = str.
    replace( /&(?!#?\w+;)/g , '&amp;' ).
    replace( /"([^"]*)"/g   , '“$1”'  ).
    replace( /[<>…“”‘’—–]/g , function ( $0 ) {
        return hash[ $0 ];
    });
However, this approach has some limitations.
- Search patterns are repeated in the hash table and the regular expression character class.
- Both the search and replacement are limited to plain text. That's why the first and second replacements had to remain separate in the above code. The first replacement used a regex search pattern, and the second used a backreference in the replacement text.
- Replacements don't cascade. This is another reason why the second replacement operation had to remain separate. I want text like "this" to first be replaced with “this”, and eventually end up as &ldquo;this&rdquo;.
- It doesn't work in Safari 2.x and other old browsers that don't support using functions to generate replacement text.
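To make the cascading point concrete, here's a small sketch using a made-up two-entry table (`singlePassHash` and the sample string are purely illustrative). With the callback approach each character is looked up once, so the output of one replacement is never fed into another. Chained calls, by contrast, see the results of the earlier calls.

```javascript
// Hypothetical lookup table: 'a' becomes 'b', and 'b' becomes 'c'.
var singlePassHash = { 'a': 'b', 'b': 'c' };

// Single pass with a callback: the 'b' produced from 'a' is not rescanned.
var singlePassResult = 'ab'.replace( /[ab]/g, function ( $0 ) {
    return singlePassHash[ $0 ];
});
// singlePassResult is 'bc'

// Chained calls: the second replace also sees the 'b' created by the first.
var chainedResult = 'ab'.replace( /a/g, 'b' ).replace( /b/g, 'c' );
// chainedResult is 'cc'
```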
With a few lines of String.prototype sugar, you can deal with all of these issues.
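String.prototype.multiReplace = function ( hash ) {
    // Each key is a regex source string; each value is its replacement text.
    // String( this ) guarantees a primitive string comes back, even for an empty hash.
    var str = String( this ), key;
    for ( key in hash ) {
        str = str.replace( new RegExp( key, 'g' ), hash[ key ] );
    }
    return str;
};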
Now you can use code like this:
str = str.multiReplace({
    '&(?!#?\\w+;)' : '&amp;'    ,
    '"([^"]*)"'    : '“$1”'     ,
    '<'            : '&lt;'     ,
    '>'            : '&gt;'     ,
    '…'            : '&hellip;' ,
    '“'            : '&ldquo;'  ,
    '”'            : '&rdquo;'  ,
    '‘'            : '&lsquo;'  ,
    '’'            : '&rsquo;'  ,
    '—'            : '&mdash;'  ,
    '–'            : '&ndash;'
});
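One thing to keep in mind: every key is handed to `new RegExp`, so keys are regex source rather than literal text (which is why the backslash in `\\w` is doubled above). If you want to search for literal text that contains regex metacharacters, you'd need to escape it first. Here's a rough sketch; `escapeRegExp` is a hypothetical helper name, not part of the method itself.

```javascript
// Hypothetical helper: backslash-escape characters that are special in a regex.
function escapeRegExp( text ) {
    return text.replace( /[.*+?^${}()|[\]\\]/g, '\\$&' );
}

// Unescaped, '.' is a regex wildcard and matches every character.
var unescapedResult = 'a.b+c'.replace( new RegExp( '.', 'g' ), '!' );
// unescapedResult is '!!!!!'

// Escaped, it only matches a literal dot.
var escapedResult = 'a.b+c'.replace( new RegExp( escapeRegExp( '.' ), 'g' ), '!' );
// escapedResult is 'a!b+c'
```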
If you care about the order of replacements, you should be aware that the current JavaScript specification does not require a particular enumeration order when looping over object properties with for..in. However, recent versions of the big four browsers (IE, Firefox, Safari, Opera) all use insertion order, which allows this to work as described (from top to bottom). ECMAScript 4 proposals indicate that the insertion-order convention will be formally codified in that standard.
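If you'd rather not depend on enumeration order at all, one option is to pass the replacements as an array of pairs, so the order is explicit. The following is a minimal sketch of that idea; `multiReplaceOrdered` is a hypothetical name and a standalone function, not the method described in this post. It also avoids a quirk of current engines, which enumerate integer-like keys such as '1' before other keys regardless of insertion order.

```javascript
// Hypothetical variant: takes an array of [ pattern, replacement ] pairs,
// applied strictly in array order.
function multiReplaceOrdered( str, pairs ) {
    for ( var i = 0; i < pairs.length; i++ ) {
        // Each pattern is a regex source string, compiled with the global flag.
        str = str.replace( new RegExp( pairs[ i ][ 0 ], 'g' ), pairs[ i ][ 1 ] );
    }
    return str;
}

var orderedResult = multiReplaceOrdered( '"this" & <that>', [
    [ '&(?!#?\\w+;)' , '&amp;'   ],
    [ '"([^"]*)"'    , '“$1”'    ],
    [ '<'            , '&lt;'    ],
    [ '>'            , '&gt;'    ],
    [ '“'            , '&ldquo;' ],
    [ '”'            , '&rdquo;' ]
]);
// orderedResult is '&ldquo;this&rdquo; &amp; &lt;that&gt;'

// Integer-like keys jump the queue in for..in on current engines.
var enumeratedKeys = [];
for ( var k in { 'b': 'x', '1': 'y' } ) {
    enumeratedKeys.push( k );
}
// enumeratedKeys is [ '1', 'b' ]
```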
If you need to worry about rogue properties that show up when people mess with Object.prototype, you can update the code as follows:
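String.prototype.multiReplace = function ( hash ) {
    // String( this ) guarantees a primitive string comes back, even for an empty hash.
    var str = String( this ), key;
    for ( key in hash ) {
        // Skip properties inherited through the prototype chain.
        if ( Object.prototype.hasOwnProperty.call( hash, key ) ) {
            str = str.replace( new RegExp( key, 'g' ), hash[ key ] );
        }
    }
    return str;
};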
Calling the hasOwnProperty method on Object.prototype rather than on the hash object directly allows this method to work even when you're searching for the string "hasOwnProperty".
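Here's a quick illustration of why that matters, using a made-up hash (`hashWithShadow` is just for demonstration). When the hash has its own "hasOwnProperty" key, that key shadows the inherited method, so calling it directly tries to invoke a string.

```javascript
// Hypothetical hash whose search pattern shadows the inherited method.
var hashWithShadow = {
    'hasOwnProperty' : 'hOP'
};

// Calling the method on the hash itself fails: the property is now a string.
var directCallFailed = false;
try {
    hashWithShadow.hasOwnProperty( 'hasOwnProperty' );
} catch ( e ) {
    directCallFailed = e instanceof TypeError;
}
// directCallFailed is true

// Borrowing the method from Object.prototype is unaffected by the shadowing.
var borrowedCallResult = Object.prototype.hasOwnProperty.call( hashWithShadow, 'hasOwnProperty' );
// borrowedCallResult is true
```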
Lemme know if you think this is useful.