<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Remove Nested Patterns with One Line of JavaScript</title>
	<atom:link href="http://blog.stevenlevithan.com/archives/reverse-recursive-pattern/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.stevenlevithan.com/archives/reverse-recursive-pattern</link>
	<description>A JavaScript and regular expression centric blog</description>
	<lastBuildDate>Thu, 09 Feb 2012 10:18:35 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
	<item>
		<title>By: Henry Ford</title>
		<link>http://blog.stevenlevithan.com/archives/reverse-recursive-pattern/comment-page-1#comment-130973</link>
		<dc:creator>Henry Ford</dc:creator>
		<pubDate>Tue, 30 Aug 2011 09:08:13 +0000</pubDate>
		<guid isPermaLink="false">http://blog.stevenlevithan.com/archives/reverse-recursive-pattern#comment-130973</guid>
		<description>s/loop&#039;s/loops</description>
		<content:encoded><![CDATA[<p>s/loop&#8217;s/loops</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: marty</title>
		<link>http://blog.stevenlevithan.com/archives/reverse-recursive-pattern/comment-page-1#comment-121084</link>
		<dc:creator>marty</dc:creator>
		<pubDate>Sat, 30 Jul 2011 20:56:59 +0000</pubDate>
		<guid isPermaLink="false">http://blog.stevenlevithan.com/archives/reverse-recursive-pattern#comment-121084</guid>
		<description>Dear Steven Levithan,

Thanks to this powerful line of code, I have been able to build the kernel of my wiki parser engine (http://epsilonwiki.free.fr), as explained on this page: http://epsilonwiki.free.fr/index.php?view=parser .

Thank you very much.
Alain Marty</description>
		<content:encoded><![CDATA[<p>Dear Steven Levithan,</p>
<p>Thanks to this powerful line of code, I have been able to build the kernel of my wiki parser engine (<a href="http://epsilonwiki.free.fr" rel="nofollow">http://epsilonwiki.free.fr</a>), as explained on this page: <a href="http://epsilonwiki.free.fr/index.php?view=parser" rel="nofollow">http://epsilonwiki.free.fr/index.php?view=parser</a>.</p>
<p>Thank you very much.<br />
Alain Marty</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: patorjk.com &#187; Extendible BBCode Parser in JavaScript</title>
		<link>http://blog.stevenlevithan.com/archives/reverse-recursive-pattern/comment-page-1#comment-100443</link>
		<dc:creator>patorjk.com &#187; Extendible BBCode Parser in JavaScript</dc:creator>
		<pubDate>Sat, 07 May 2011 19:58:17 +0000</pubDate>
		<guid isPermaLink="false">http://blog.stevenlevithan.com/archives/reverse-recursive-pattern#comment-100443</guid>
		<description>[...] [*] tag (which doesn&#8217;t have a closing tag). Luckily, I came across a neat blog post on finding nested patterns in JavaScript, which came in handy for isolating tag pairs, from the inner-most on up. Taking the idea from that [...]</description>
		<content:encoded><![CDATA[<p>[...] [*] tag (which doesn&#8217;t have a closing tag). Luckily, I came across a neat blog post on finding nested patterns in JavaScript, which came in handy for isolating tag pairs, from the inner-most on up. Taking the idea from that [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Roverandom</title>
		<link>http://blog.stevenlevithan.com/archives/reverse-recursive-pattern/comment-page-1#comment-65651</link>
		<dc:creator>Roverandom</dc:creator>
		<pubDate>Thu, 02 Dec 2010 13:49:29 +0000</pubDate>
		<guid isPermaLink="false">http://blog.stevenlevithan.com/archives/reverse-recursive-pattern#comment-65651</guid>
		<description>Hi,

I would like to create a regular expression for syntax highlighting. In my code, variables are written inside brackets, like [var]. (Variables can be nested inside other variables, like [var[i]].)

I have some problems with nested brackets.

I&#039;m using /\[(.*?)\]/g, but for [[%1]stack[i]], &#039;stack&#039; and the last bracket are not coloured.

Any help will be appreciated.

Thanks in advance</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>I would like to create a regular expression for syntax highlighting. In my code, variables are written inside brackets, like [var]. (Variables can be nested inside other variables, like [var[i]].)</p>
<p>I have some problems with nested brackets.</p>
<p>I&#8217;m using /\[(.*?)\]/g, but for [[%1]stack[i]], &#8216;stack&#8217; and the last bracket are not coloured.</p>
<p>Any help will be appreciated.</p>
<p>Thanks in advance</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ridgerunner</title>
		<link>http://blog.stevenlevithan.com/archives/reverse-recursive-pattern/comment-page-1#comment-58701</link>
		<dc:creator>ridgerunner</dc:creator>
		<pubDate>Sat, 28 Aug 2010 21:17:27 +0000</pubDate>
		<guid isPermaLink="false">http://blog.stevenlevithan.com/archives/reverse-recursive-pattern#comment-58701</guid>
		<description>Concerning the case where the delimiters are more than one char...
The efficiency of these regexes can be improved (at the expense of a bit more complexity) by implementing Jeffrey Friedl&#039;s &lt;em&gt;unrolling-the-loop&lt;/em&gt; technique. The following are improved versions written in JavaScript syntax:

Unrolling Steve&#039;s &lt;&lt;example&gt;&gt; regex...
&lt;code&gt;/&lt;&lt;[^&lt;&gt;]*(?:(?:(?!&lt;&lt;)&lt;&#124;(?!&gt;&gt;)&gt;)[^&lt;&gt;]*)*&gt;&gt;/
&lt;/code&gt;
Unrolling Scott&#039;s &lt;tag&gt;example&lt;/tag&gt; regex...
&lt;code&gt;while (str != (str = str.replace(/&lt;(\w+)\b[^&gt;]*&gt;([^&lt;]*(?:(?!&lt;\/?\1\b)&lt;[^&lt;]*)*)&lt;\/\1&gt;/g, &quot;$2&quot;)));&lt;/code&gt;

But more importantly, these fix &lt;em&gt;a rather nasty PHP bug...&lt;/em&gt;
The regex constructs recommended in the original post (and those in the comments that follow) lead directly to segmentation faults when parsing moderately long strings with PHP. This type of construct causes many recursive calls inside the PCRE library, which quickly overflows the stack (and segfaults) on a long subject string. The subject length that triggers the overflow depends directly on the stack size of the executable linked with the PCRE library: typically PHP.EXE (the command line utility) or HTTPD.EXE (Apache web server) on a Windows box. PHP.EXE is typically built with an 8MB stack and HTTPD.EXE with a 256KB stack. My testing reveals that Scott&#039;s HTML tag matching regex and Steve&#039;s improved version both overflow the stack and segfault on a mere 673-byte subject string when the PHP executable has a 256KB stack (i.e. Apache running on Windows). When the PHP executable has an 8MB stack (the typical value for Apache on Linux boxes), these regexes blow up on a 22999-byte string. Steve&#039;s original &lt;&lt;example&gt;&gt; regex also blows up with similarly sized strings. The improved regexes I provided above do not suffer from this limitation and happily work with subject strings up to 26MB regardless of the executable stack size.

The Drupal community recently wrangled with this problem (which is how I became intimately familiar with the issue) - you can read about that here: &lt;a href=&quot;http://drupal.org/node/444228&quot; rel=&quot;nofollow&quot;&gt;Optimize CSS option causes php cgi to segfault in pcre function &quot;match&quot;&lt;/a&gt;. For details on the PCRE stack usage issue see: &lt;a href=&quot;http://www.manpagez.com/man/3/pcrestack/&quot; rel=&quot;nofollow&quot;&gt;PCRE DISCUSSION OF STACK USAGE&lt;/a&gt;.</description>
		<content:encoded><![CDATA[<p>Concerning the case where the delimiters are more than one char&#8230;<br />
The efficiency of these regexes can be improved (at the expense of a bit more complexity) by implementing Jeffrey Friedl&#8217;s <em>unrolling-the-loop</em> technique. The following are improved versions written in JavaScript syntax:</p>
<p>Unrolling Steve&#8217;s &lt;&lt;example&gt;&gt; regex&#8230;<br />
<code>/&lt;&lt;[^&lt;&gt;]*(?:(?:(?!&lt;&lt;)&lt;|(?!&gt;&gt;)&gt;)[^&lt;&gt;]*)*&gt;&gt;/<br />
</code><br />
Unrolling Scott&#8217;s &lt;tag&gt;example&lt;/tag&gt; regex&#8230;<br />
<code>while (str != (str = str.replace(/&lt;(\w+)\b[^&gt;]*&gt;([^&lt;]*(?:(?!&lt;\/?\1\b)&lt;[^&lt;]*)*)&lt;\/\1&gt;/g, "$2")));</code></p>
<p>But more importantly, these fix <em>a rather nasty PHP bug&#8230;</em><br />
The regex constructs recommended in the original post (and those in the comments that follow) lead directly to segmentation faults when parsing moderately long strings with PHP. This type of construct causes many recursive calls inside the PCRE library, which quickly overflows the stack (and segfaults) on a long subject string. The subject length that triggers the overflow depends directly on the stack size of the executable linked with the PCRE library: typically PHP.EXE (the command line utility) or HTTPD.EXE (Apache web server) on a Windows box. PHP.EXE is typically built with an 8MB stack and HTTPD.EXE with a 256KB stack. My testing reveals that Scott&#8217;s HTML tag matching regex and Steve&#8217;s improved version both overflow the stack and segfault on a mere 673-byte subject string when the PHP executable has a 256KB stack (i.e. Apache running on Windows). When the PHP executable has an 8MB stack (the typical value for Apache on Linux boxes), these regexes blow up on a 22999-byte string. Steve&#8217;s original &lt;&lt;example&gt;&gt; regex also blows up with similarly sized strings. The improved regexes I provided above do not suffer from this limitation and happily work with subject strings up to 26MB regardless of the executable stack size.</p>
<p>The Drupal community recently wrangled with this problem (which is how I became intimately familiar with the issue) &#8211; you can read about that here: <a href="http://drupal.org/node/444228" rel="nofollow">Optimize CSS option causes php cgi to segfault in pcre function &#8220;match&#8221;</a>. For details on the PCRE stack usage issue see: <a href="http://www.manpagez.com/man/3/pcrestack/" rel="nofollow">PCRE DISCUSSION OF STACK USAGE</a>.</p>
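<p>As a rough illustration of the unrolled shape described in this comment (open, normal*, (special normal*)*, close), here is a minimal sketch. It assumes hypothetical {{ and }} delimiters instead of the ones in the post, and the names innermost and removeNested are made up for this example. It has only been reasoned about for JavaScript, not tested against PCRE.</p>

```javascript
// Unrolled-loop pattern for an innermost {{...}} pair (hypothetical delimiters).
//   normal  = [^{}]   any character that is not a brace
//   special = a lone "{" that does not start "{{",
//             or a lone "}" that does not start "}}"
var innermost = /\{\{[^{}]*(?:(?:(?!\{\{)\{|(?!\}\})\})[^{}]*)*\}\}/g;

// Hypothetical helper: strip pairs from the inside out until nothing changes.
function removeNested(str) {
  var prev;
  do {
    prev = str;
    str = str.replace(innermost, "");
  } while (str !== prev);
  return str;
}
```

<p>Each pass removes only pairs that contain no further pair, so nested input is consumed from the innermost level outward, and a single brace inside a pair is treated as ordinary content.</p>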
]]></content:encoded>
	</item>
</channel>
</rss>

