REMatch (ColdFusion)
Following are some UDFs I wrote recently to make using regexes in ColdFusion a bit easier. The biggest deal here is my reMatch() function.
reMatch(), in its most basic usage, is similar to JavaScript's String.prototype.match() method. Compare getting the first number in a string using reMatch() vs. built-in ColdFusion functions:
- reMatch:
<cfset num = reMatch("\d+", string) />
- reReplace:
<cfset num = reReplace(string, "\D*(\d+).*", "\1") />
- reFind:
<cfset match = reFind("\d+", string, 1, TRUE) />
<cfset num = mid(string, match.pos[1], match.len[1]) />
All of the above would return the same result, unless a number wasn't found in the string, in which case the reFind()-based method would throw an error since the mid() function would be passed a start value of 0. I think it's pretty clear from the above which approach is easiest to use for a situation like this, and it would be easy to envision scenarios where this functionality could more drastically improve code brevity.
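For comparison, the reFind()-based approach can be made safe by checking the returned position before calling mid(). This is only a minimal sketch: the sample input and the choice of an empty string as the "no match" result are assumptions made for illustration.

```cfm
<!--- Sample input (hypothetical) --->
<cfset string = "Order 66 shipped in 3 boxes" />

<!--- Ask reFind() for the position and length of the first match --->
<cfset match = reFind("\d+", string, 1, TRUE) />

<cfif match.pos[1] GT 0>
    <!--- A match was found, so pos and len are safe to pass to mid() --->
    <cfset num = mid(string, match.pos[1], match.len[1]) />
<cfelse>
    <!--- No match: reFind() reports pos[1] = 0, which mid() would reject --->
    <cfset num = "" />
</cfif>
```

The guard roughly doubles the amount of code, which is the brevity cost a dedicated match function avoids.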
Still, that's just the beginning of what reMatch() can do. Change the scope argument from the default of "ONE" to "ALL" (to follow the convention used by reReplace(), etc.), and the function will return an array of all matches. Finally, set the returnLenPos argument to TRUE and the function will return either a struct or array of structs (based on the value of scope) containing the len, pos, AND value of each match. This is very different from how the returnSubExpressions argument of reFind() works. When using returnSubExpressions, you get back a struct containing arrays of the len and pos (but not value) of each backreference from the first match.
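The difference might look roughly like this in practice. The sketch assumes the reMatch() UDF from this post is already defined on the page, and it uses named arguments (regex, string, scope, returnLenPos) because the positional order of the optional arguments is not spelled out here. The sample text and variable names are made up for illustration.

```cfm
<cfset text = "10 apples, 20 pears" />

<!--- Built-in reFind() with returnSubExpressions: one struct holding
      parallel pos and len arrays, for the first match only --->
<cfset found = reFind("(\d+) (\w+)", text, 1, TRUE) />
<!--- Index 1 describes the whole match; indexes 2 and 3 describe the
      two backreferences. The value must be extracted by hand. --->
<cfset firstWord = mid(text, found.pos[3], found.len[3]) />

<!--- reMatch() UDF (assumed loaded): array of every matched value --->
<cfset allNums = reMatch(regex="\d+", string=text, scope="ALL") />

<!--- reMatch() UDF with returnLenPos: array of structs, one per match,
      each carrying that match's len, pos, and value --->
<cfset details = reMatch(regex="\d+", string=text, scope="ALL", returnLenPos=TRUE) />
```

In short, reFind() describes the pieces of a single match, while reMatch() describes each match in the string.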
Here's the code, with four additional UDFs (reMatchNoCase(), match(), matchNoCase(), and reEscape()) added for good measure:
See the demo and get the source code.
Now that I've got a deeply featured match function, all I need Adobe to add to ColdFusion in the way of regex support is lookbehinds, atomic groups, possessive quantifiers, conditionals, balancing groups, etc., etc.…
Comment by Anonymous on 20 May 2007:
Hey Steve,
After playing with your REMatch method (which has helped me more than once) I made a little change. I added ‘SUB’ to the scope argument, which will loop over each match and return sub-matches. You can read all about it here. I don’t have code snippets in the blog yet, but there is a download available. If you have any suggestions please let me know.
Comment by Andrew Duckett on 20 May 2007:
Didn’t mean to be Anon on that last one!
Comment by Steve on 21 May 2007:
Andrew,
Glad to hear this helped you. That is a potentially very useful modification.
By the way, when I wrote this, I wasn’t aware that you could use underlying Java regex methods in ColdFusion. If I ever get around to releasing an updated version of REMatch and Adobe doesn’t include something similar natively in CF8, I’ll use the Java methods, which offer better performance and more powerful regular expression syntax (e.g., lookbehind). That would be my main suggestion for your CFC… use Java.
Thanks for posting!
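A minimal sketch of what calling the Java regex classes from ColdFusion can look like, using a lookbehind as the example. It is not the code from either CFC. The sample text, the variable names, and the pos/len/value struct layout are assumptions for illustration.

```cfm
<cfset text = "Price: $42, qty 7" />

<!--- java.util.regex supports lookbehind: match digits preceded by a dollar sign --->
<cfset pattern = createObject("java", "java.util.regex.Pattern").compile("(?<=\$)\d+") />
<cfset matcher = pattern.matcher(text) />

<cfset matches = arrayNew(1) />
<cfloop condition="matcher.find()">
    <cfset item = structNew() />
    <cfset item.value = matcher.group() />
    <!--- start() is zero-based in Java; add 1 for ColdFusion's one-based positions --->
    <cfset item.pos = matcher.start() + 1 />
    <cfset item.len = len(item.value) />
    <cfset arrayAppend(matches, item) />
</cfloop>
```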
Comment by Andrew on 22 May 2007:
Hey Steve, me again. I gave the java.util.regex package a shot, and was able to get a basic version working. Check it out here
Comment by Boyan on 4 June 2007:
Steve,
There seems to be a bug with your reMatch function. I am doing an HTTP request to Google for some search results from IMDb (sample URL is http ://www.google.com/search?q=imdb+Police+Academy&ie=utf-8 &oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a)
and then using your function to match the found links from IMDb:
this returns the right number of matches (2) but both at the same position and with the same length. I tried the same thing (with scope “all”) with Andrew’s mod and his works as expected. Just FYI.
Comment by Steve on 9 June 2007:
Thanks for the report. Does it work as you expect in Andrew’s CF version, Java-based version, or both? I’ll have to look at this later, but for now you might want to use the mod if it’s working correctly.
Comment by Todd Sharp on 24 August 2007:
Hey Boyan, you do know that reMatch is in CF 8, don’t you?
Comment by Steve on 25 August 2007:
Todd, I’m sure Boyan is aware of it now. I believe his comment was posted before the CF8 beta was available.
Comment by boybles on 23 January 2008:
I get the following error when I try to implement it:
“The names of user-defined functions cannot be the same as built-in ColdFusion functions.
The name reMatch is the name of a built-in ColdFusion function.
The CFML compiler was processing:
A cffunction tag beginning on line 1, column 2.
The error occurred in C:\Inetpub\wwwroot\test\regex3.cfm: line 1
1 : <cffunction name="reMatch" output="false">
2 : <cfargument name="regex" type="string" required="yes" />
3 : <cfargument name="string" type="string" required="yes" />
”
Any ideas?
Comment by Steve on 23 January 2008:
That means you’re using ColdFusion 8, which includes a (much less flexible) reMatch function natively. I posted this well before any official word about CF8. Does the native reMatch not work for your needs?
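For reference, a short sketch of the native function, assuming ColdFusion 8 or later. It takes the regex and the string and returns an array of every matched substring, so getting only the first match needs a length check. The sample text and variable names are hypothetical.

```cfm
<cfset text = "a1b22c333" />

<!--- Native reMatch() (ColdFusion 8+) returns an array of all matched substrings --->
<cfset matches = reMatch("\d+", text) />

<cfif arrayLen(matches)>
    <!--- At least one match: take the first element --->
    <cfset firstNum = matches[1] />
<cfelse>
    <!--- No matches: the array is empty, so fall back to an empty string --->
    <cfset firstNum = "" />
</cfif>
```

To keep using the UDF on ColdFusion 8, one option would be to rename it to something that does not collide with a built-in function.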