Regexes suck at math. To a regex engine, the characters `0` through `9` are no more special than any others.

I should mention that there are a couple exceptions. Perl and PCRE allow dynamic code to be run at any point during the matching process, which presents a great deal of extra potential. Perl does this with code embedded in the regex; PCRE with callouts to external functions. But those regex flavors are the exceptions, and even with them the extended capabilities only take you so far. Generally, math-related problems such as matching numeric ranges (useful for tasks like matching a set of years within longer text) are a pain in the a** to deal with, when they're possible at all.

However, the power and expressiveness of even basic regular expression syntax can lead to some nifty tricks. Things like matching only non-prime-length strings! The primality regex is somewhat famous now, but another hack that might surprise you is using regexes to solve a simple class of linear equations. I stumbled on the idea for this pattern while messing around with RegexBuddy's awesome debugger. The implementation itself is dead simple and should work pretty much universally, with the exception of *strict* POSIX ERE implementations or other esoteric flavors which don't allow backreferences. Here's the template:

`^(.*)\1{A−1}(.*)\2{B−1}$`

That lets us solve for `x` and `y` with an equation like `17x + 12y = 51`. `A` and `B` are placeholders for constants that in this case are `17` and `12`. So, the regex becomes `^(.*)\1{16}(.*)\2{11}$`. We subtract one from values `A` and `B` because we're repeating backreferences to subpatterns that have already matched once before. If you run that regex against a 51-character string, the length of `$1` (backreference one) will be `3` (which tells us that `x = 3`), and the length of `$2` (backreference two) will be `0` (meaning that `y = 0`). Indeed, `17×3 + 12×0 = 51`. If the problem is unsolvable, the regex will not match the string. If there are multiple possible solutions, the one with the highest value of `x` will be returned since `x` is handled earlier in the regex.
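If you want to poke at this without a regex tester, here is a minimal sketch in Python, assuming the standard `re` module (which supports quantified backreferences). The names `A`, `B`, `TOTAL`, and `subject` are just illustrative choices, not anything from the template itself.

```python
import re

# Coefficients and total from the example 17x + 12y = 51.
A, B, TOTAL = 17, 12, 51

# Fill in the template: each group matches once, then its backreference
# is repeated A-1 (or B-1) more times.
pattern = re.compile(r"^(.*)\1{%d}(.*)\2{%d}$" % (A - 1, B - 1))

# Only the length of the subject matters, so any repeated character works.
subject = "x" * TOTAL

match = pattern.match(subject)
if match:
    x, y = len(match.group(1)), len(match.group(2))
    print(f"x = {x}, y = {y}")  # x = 3, y = 0
else:
    print("no solution")
```

Because `.*` is greedy, the engine tries the longest candidate for group one first, which is why the solution with the highest `x` comes back.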

Try it out. You can use as many variables as you'd like as long as the equation follows the same form. E.g., `11x + 2y + 5z = 115` can be solved with `^(.*)\1{10}(.*)\2{1}(.*)\3{4}$` and a 115-character subject string (the result: `11×10 + 2×0 + 5×1 = 115`). Run `^(.*)\1{12}$` against a 247-character string and you'll get back a 19-character value for backreference one, demonstrating that `13×19 = 247`. Keep in mind that as the integers and string lengths get higher and the number of variables increases, the amount of backtracking by the regex engine will also increase. At some threshold this pattern will slow to the point that it's unusable. But I don't really care; it's still cool.
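Building the pattern by hand gets tedious with more variables, so here is a rough sketch of a helper that assembles it from a list of coefficients. It assumes Python's `re` module and positive integer coefficients; the function name `solve` is made up for this example.

```python
import re

def solve(coefficients, total):
    """Return one non-negative integer solution as a tuple, or None."""
    if any(c < 1 for c in coefficients):
        raise ValueError("coefficients must be positive integers")
    # One "(.*)\N{c-1}" piece per variable, where N is the group number.
    pieces = [r"(.*)\%d{%d}" % (i, c - 1)
              for i, c in enumerate(coefficients, start=1)]
    pattern = "^" + "".join(pieces) + "$"
    # The subject is just `total` copies of an arbitrary character.
    match = re.match(pattern, "x" * total)
    if match is None:
        return None
    # Each variable's value is the length of its capturing group.
    return tuple(len(group) for group in match.groups())

print(solve([11, 2, 5], 115))  # (10, 0, 1)
print(solve([13], 247))        # (19,)
print(solve([17, 12], 5))      # None, since 17x + 12y = 5 has no solution
```

As with the hand-written patterns, earlier variables are maximized first, and run time can blow up as the total and the number of variables grow.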

Cool. Just to be anal: you have *one* solution of many for the 17x + 12y = 51; 😀

You need x number of equations for y number of variables. I’d go into the details, but it’s 3AM and I have a deadline at 3pm! oo noes.

Great article.

Cool. I wrote a python function to do this based on your method:

http://xix.org/misc/solve.py

To match all solutions in Perl:

    local our @solutions;
    /
        …regex from article…
        (?{ push @solutions, [$1,$2,$3] })
        (?!) # Force backtracking.
    /x

Have hammer. Seems like a nail. *Swing*
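For anyone without Perl handy: Python's `re` module has no embedded-code construct like `(?{ })`, so the forced-backtracking trick does not carry over directly. A rough substitute, and a different technique from the Perl above, is to pin every variable except the last to a fixed length with `(.{n})` and let the regex solve for the remaining one. The function name `all_solutions` is hypothetical, and this is only a sketch assuming positive integer coefficients.

```python
import itertools
import re

def all_solutions(coefficients, total):
    """List every non-negative integer solution, highest values first."""
    *head, last = coefficients
    subject = "x" * total
    solutions = []
    # Candidate values for every variable except the last, largest first.
    ranges = [range(total // c, -1, -1) for c in head]
    for fixed in itertools.product(*ranges):
        # Pin each leading variable to an exact length with (.{n}).
        pieces = [r"(.{%d})\%d{%d}" % (n, i, c - 1)
                  for i, (n, c) in enumerate(zip(fixed, head), start=1)]
        # The last variable is left free, as in the original template.
        pieces.append(r"(.*)\%d{%d}" % (len(coefficients), last - 1))
        match = re.match("^" + "".join(pieces) + "$", subject)
        if match:
            solutions.append(tuple(len(g) for g in match.groups()))
    return solutions

print(all_solutions([2, 3], 12))  # [(6, 0), (3, 2), (0, 4)]
```

This compiles one pattern per combination of pinned values, so it is even slower than the single-regex version and is only meant for small inputs.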

This seems similar (but more complex): Diophantine Equation Solver. I’m not very familiar with Perl though (aside from its regex flavor), and haven’t delved into the code.

This is a good idea. Thank you.

Cool. But really slow.

Hi… I was looking for something like this *-* for school work.

I'll test it =)

Ty

matrix???

Exactly. I almost always just use algebraic equations or a matrix to get a RegExp when coding.