http://qs321.pair.com?node_id=558868

On the Web and in various books, anybody can find something like "common regex lists" containing dozens of real-life regular expressions simple enough to be understood by everybody.

Being quite simple, these regexes are generally used to solve the routine problems every Perl programmer meets.

But some tasks require much more complex regular expressions.

I want to make a list of complicated (obfuscated, odd, etc.) regular expressions used to solve difficult real problems (and then I plan to make it available online somewhere outside this thread :) ). I would be very obliged if you posted examples of your most interesting regexes here, together with chunks of the data they were intended to match.

My own favourite (it is combined from two regexes, one of which is recursive):

$brackets_pattern = qr{    # recursive pattern to search brackets like [mmm[ hh[f]]ll]
    \[
    (?:
        (?>[^\[\]]+ )            # non-brackets
        |
        (??{$brackets_pattern})  #new pattern for inside brackets
    )*
    \]
}x;

my $pat = qr/(?-xism:(?-xism:[ab?x][DLSRX?]Glc(?:[pfa?]|-ol|)N\(1\-4\))(?x-ism:\[(?:(?>[^\[\]]+)|(??{$brackets_pattern}))*\])*(?x-ism:\[(?:(?>[^\[\]]+)|(??{$brackets_pattern}))*\])*\[(?x-ism:(?:(?>[^\[\]]+?)|(??{$brackets_pattern}))*)(?:t\)|(?<![\])]))(?x-ism:\[(?:(?>[^\[\]]+)|(??{$brackets_pattern}))*\])*(?-xism:[ab?x][DLSRX?]Glcp\(1\-6\))\](?x-ism:\[(?:(?>[^\[\]]+)|(??{$brackets_pattern}))*\])*(?-xism:[ab?x][DLSRX?]GalpN(?=\(|$)))/;

Of course, I didn't type the second regex myself; it is generated by my substructure search engine for the Bacterial Carbohydrate Structure Database in response to a typical request. That's why I used the word "created" instead of "wrote" in the title: some interesting regexes are never typed, but are used intensively :)

Sample data to match against:
-6)[xR3HOBut(1-3)]aDGlcpN(1-4)[aDGlcp(1-6),Ac(1-2)]aDGalpN(1-3)[Ac(1-2)]bDGalpN(1-2)aDGlcp(1-P-
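
To see the recursive part on its own, here is a minimal sketch that runs $brackets_pattern against the sample data above. The `our` declaration and the names `$sample` and `@groups` are assumptions added for illustration; the generated second regex is not involved.

```perl
use strict;
use warnings;

# Package variable (an assumption of this sketch) so that the (??{ }) block
# can look the pattern up again at match time.
our $brackets_pattern;
$brackets_pattern = qr{
    \[
    (?:
        (?> [^\[\]]+ )              # a run of non-bracket characters
        |
        (??{ $brackets_pattern })   # or a nested, balanced [...] group
    )*
    \]
}x;

my $sample = '-6)[xR3HOBut(1-3)]aDGlcpN(1-4)[aDGlcp(1-6),Ac(1-2)]aDGalpN(1-3)[Ac(1-2)]bDGalpN(1-2)aDGlcp(1-P-';

# The pattern has no capturing groups, so a list-context //g match
# returns every complete bracketed group.
my @groups = $sample =~ /$brackets_pattern/g;
print "$_\n" for @groups;

# Nested brackets, as in the comment of the original pattern.
my ($nested) = 'x[mmm[ hh[f]]ll]y' =~ /($brackets_pattern)/;
print "$nested\n";
```

On Perl 5.10 or later the same recursion can also be written without a code block, using (?R) or a numbered group such as (?1).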

I hope to see your examples described the way I described mine :)

UPDATE:

A short list of the best ones (IMHO) I found in the replies, ordered by the time each comment was posted:


([^e]|e([^s]|s([^\.]|\.([^c]|c([^o]|o([^m]|m([^p]|p([^\.]|\.([^o]|o([^s]|s([^\.]|\.([^l]|l([^i]|i([^n]|n([^u]|u[^x])))))))))))))))
by Hue-Bond
A simple grep -E regex meant to detect cross-posts between certain newsgroups.
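
Reading the pattern, each nesting level accepts either a character that differs from the next letter of es.comp.os.linux, or that letter followed by the next level, so the whole thing matches text that diverges from that name somewhere within its 16 characters. grep -E has no lookahead, which is presumably why the negation is spelled out this way. Below is a rough sketch of how a pattern of this shape could be generated in Perl. The helper name not_literal is hypothetical, the non-capturing groups are a substitution for the original capturing ones, and anchoring with ^ is an assumption about how the regex was used.

```perl
use strict;
use warnings;

# Hypothetical helper: build a nested alternation that matches text which
# deviates from $word at some position.
sub not_literal {
    my ($word) = @_;
    my @chars = split //, $word;
    my $last  = quotemeta pop @chars;
    my $re    = "[^$last]";            # innermost level: the last character must differ
    for my $c (reverse @chars) {
        my $q = quotemeta $c;
        $re = "(?:[^$q]|$q$re)";       # differ here, or match and go one level deeper
    }
    return $re;
}

my $re = not_literal('es.comp.os.linux');

for my $group ('es.comp.os.linux', 'es.comp.lang.perl', 'es.comp.os.linux.redes') {
    printf "%-24s %s\n", $group, ($group =~ /^$re/ ? 'differs' : 'same prefix');
}
```

Note that this is not quite the same as a negative lookahead such as (?!es\.comp\.os\.linux): the nested form needs an actual differing character to consume, so a string that is only a proper prefix of the name (for example "es.comp") does not match it.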


#!/usr/bin/perl -l
"AB~ACFI~ADGJ~AE~BCDE~BFHJ~BI~EGHI~EJ~IJ" =~ /([^~])[^~]*([^~]).*~[^~]*([^~])[^~]*([^~])(?{local$z=$1 and local$y=$2 and local$x=$1 eq$3?$4:$1 eq$4?$3:($z=$2)&&($y=$1)&&$2 eq$3?$4:$2 eq$4?$3:0}).*~[^~]*((??{$y})[^~]*(??{$x})|(??{$x})[^~]*(??{$y}))(?{$x{join" - ",sort$x,$y,$z}++})(?!)/;
print for sort(keys %x), keys(%x) . " triangles found";
by !1
The regex (to be strict, this mix of regex and Perl code ;)) at the heart of this short script finds all triangles for this quest and puts them all in the %x hash.

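
The script ends its pattern with a (?{ ... }) code block followed by (?!), which always fails. That combination forces the engine to backtrack through every candidate and run the code block for each one. Here is a much smaller sketch of the same idiom, collecting every substring of a short word; the string "abc" and the array name @found are made up for illustration.

```perl
use strict;
use warnings;

our @found;   # package variable, visible inside the code block

# (\w+) captures a candidate, the (?{ }) block records it, and (?!) always
# fails, so the engine backtracks and tries the next candidate.
"abc" =~ /(\w+)(?{ push @found, $1 })(?!)/;

print "$_\n" for @found;
print scalar(@found), " substrings found\n";
```

The match as a whole never succeeds; all the useful work happens as a side effect inside the code block, just as the triangles end up in %x in the script above.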
URL matching RegEx by abigail
Author's comment:
This does only a subset of the possible URLs:
I had to put it under <spoiler> because of its length :)
forking regular expression by Ovid
As this is a complete Perl script (the forking regex makes no sense on its own), I have put it under a spoiler too.
An abridged (due to the incredible size of the original) version of ikegami's generated regex to solve Sudoku puzzles:
The regexes become stranger and stranger :) Whose will be the next? ;)