PerlMonks
More sexeger
by darksym (Beadle) on Apr 23, 2002 at 14:15 UTC ( #161313=perlmeditation )
OK, I think far too little attention has been devoted to the
sexeger(tm) (coined just a while back, it was recently
trademarked by a numbered company out of New York in 2001 as
applied to computer software -- though I'm not sure who would
try to market a product with a name like that! ;) ... Also
a patent is pending in the United States -- what's confusing
is the application talks about "artistic license" and has a
strange demonic looking camel stamp in the applicant space -
odd really.)

As explained on an unnamed saint's website, and previously brought to the attention of the monks, sexeger are primarily useful as a way to speed up regular expression matches. It is demonstrated there that a speed increase of many orders of magnitude is possible through the proper application of sexeger.
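For anyone who has not met the trick before, here is a minimal sketch of the basic idea: reverse the string, match with a reversed pattern, then reverse what was captured. The sample string and variable names are made up for illustration, and this snippet makes no speed claim of its own.

```perl
use strict;
use warnings;

# Hypothetical input: we want the LAST run of digits in the string.
my $str = "order 12 shipped, order 345 pending, order 6789 lost";

# Forward approach: the greedy .* runs to the end of the string and then
# backtracks until \b(\d+) can match.
my ($forward) = $str =~ /.*\b(\d+)/s;

# Sexeger approach: reverse the string, grab the FIRST run of digits,
# then reverse the capture to restore its original order.
my $rts = reverse $str;              # scalar context reverses the string
my ($captured) = $rts =~ /(\d+)/;
my $sexeger = reverse $captured;

print "forward: $forward\n";
print "sexeger: $sexeger\n";
```

Both approaches print 6789 here; the reversed form simply never has to backtrack from the end of the string.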
However, others may simply find it useful, in cases where
obfuscation is desired or they wish to make their code
even less maintainable, a sort of perl programming poison
pill strategy. It's easy to demonstrate your mastery by
listing your code backward, but not _exactly_ backward.
See, in sexeger speak, regular expressions preserve their
grouping and other logical elements while reversing the
strings in each element and, where appropriate, ordering them
likewise in reverse. One thing that should not be reversed is the
character class. Because the members of a class have
commutative-like properties, there is no need to reverse them. A better
obfuscation strategy would be to take recurring classes and
randomly permute their elements so as to make them harder to
decipher at first glance, and/or break them out into sub-expressions.

Now, as explained, there are many cases where sexeger are apropos and helpful. However, it has also come to my attention that there may similarly be cases where employing them would benefit the cause of obfuscation more so than any performance increase:
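To make the reversal rules concrete, here is a small sketch with a made-up pattern (an illustration of the rules, not a listing from the original post). The pattern, the sample strings and the variable names are all assumptions. The order of the elements is reversed, the literals inside each element are reversed, and the character class is left untouched.

```perl
use strict;
use warnings;

# Forward pattern: literal, alternation group, character class, digit.
my $forward = qr/\Afoo(?:bar|baz)[xyz]+\d\z/;

# Hand-reversed pattern: element order is reversed and the literals inside
# each element are reversed; the group stays a group, the quantifier stays
# on its class, and the class [xyz] is left exactly as it was.
my $sexeger = qr/\A\d[xyz]+(?:rab|zab)oof\z/;

my @samples = qw(foobarx1 foobazzyx9 foobar1 barfoox1 foobaqx1);
my ($agree, $matched) = (0, 0);
for my $s (@samples) {
    my $r   = reverse $s;                 # scalar context reverses the string
    my $fwd = $s =~ $forward ? 1 : 0;     # forward regex on forward string
    my $rev = $r =~ $sexeger ? 1 : 0;     # reversed regex on reversed string
    $agree++   if $fwd == $rev;
    $matched++ if $fwd;
    printf "%-12s forward=%d sexeger=%d\n", $s, $fwd, $rev;
}
print "agreed on $agree of ", scalar(@samples), " samples\n";
```

The two forms should agree on every sample, matching strings and non-matching strings alike.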
One can analyse the above matches as follows. Each element in a match group with sub-groups can be seen as a root, or lowest common factor, of the group. Thus, if the match finds this common root of the sub-group, it knows it is on track to find one of the sub-group's elements, and if not, it can discard all of them from the solution set immediately. The fewer such factors at the base level of a match group, the better the performance usually is.

An equation might summarise the total time required to perform a match, and I'll leave this as an open challenge, as I haven't perfected this to a science yet. So far, I've just been doing trial-and-error benchmarking using the well-known Benchmark module by Jarkko Hietaniemi and my common-sense knowledge (I'm no master regexer) of the regex engine. Also consider:
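A sketch of the sort of trial-and-error comparison described above, using the core Benchmark module. The word list, the haystack and the labels are hypothetical, and the numbers will vary with the perl version (newer perls optimise literal alternations on their own), so treat the output as something to experiment with rather than as a result.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical word list sharing the common root "inter".
my $flat     = qr/\b(?:internal|internet|interval|interview)\b/;
my $factored = qr/\binter(?:n(?:al|et)|v(?:al|iew))\b/;

# Hypothetical haystack: filler text with a single hit at the very end.
my $text = ("lorem ipsum dolor sit amet " x 200) . "an interview";

# Sanity check before timing: both forms must find the same thing.
my ($hit_flat)     = $text =~ /($flat)/;
my ($hit_factored) = $text =~ /($factored)/;
die "patterns disagree" unless $hit_flat eq $hit_factored;

# Run each form for about one CPU second and print a comparison table.
cmpthese(-1, {
    flat     => sub { my $n = () = $text =~ /$flat/g },
    factored => sub { my $n = () = $text =~ /$factored/g },
});
```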
In the above it is obvious that the latter is a good sexeger due to its suffix-heavy commonality. Remember that in a sexeger a prefix becomes a suffix and a suffix a prefix.

So far there is no way to automatically generate a sexeger from an arbitrary regex. However, on the site above, the unnamed saint has indicated that work is in progress on just such a tool. Until then, for complex regexes, hand reversal can prove to be both instructional and fun. It usually takes very little time, once you get over the initial disorientation and the accidental typing of (:?) instead of (?:). My strategy now is as follows:
Yes indeed, it is all very fun. I could do this for hours on end, and make a day out of it. Joy! ... umm.. But I just had to cheat. :) As the above code blurbs might have led you to gather, I have been using a helper script to create these different forms of the same search list. As a new way to create word-search regexes employing sexeger, I'll show how expressions like these can be automatically generated, like so, from helper.pl:
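As a hypothetical stand-in for that kind of helper, the sketch below hand-rolls the idea in core Perl only: reverse each word, put the reversed words in a trie, and emit a regex that factors out the shared prefixes (which were suffixes before reversal). It does not call Regex::PreSuf and is not meant to reproduce its output; the function names and the word list are assumptions made for illustration.

```perl
use strict;
use warnings;

# Insert a word into a nested-hash trie; the '' key marks the end of a word.
sub trie_add {
    my ($trie, $word) = @_;
    my $node = $trie;
    for my $ch (split //, $word) {
        $node->{$ch} ||= {};
        $node = $node->{$ch};
    }
    $node->{''} = 1;
}

# Turn a trie node into a regex fragment with shared prefixes factored out.
sub trie_to_regex {
    my ($node) = @_;
    my (@alts, $optional);
    for my $ch (sort keys %$node) {
        if ($ch eq '') { $optional = 1; next }   # a word may end here
        push @alts, quotemeta($ch) . trie_to_regex($node->{$ch});
    }
    return '' unless @alts;
    my $body = (@alts == 1 && !$optional)
        ? $alts[0]
        : '(?:' . join('|', @alts) . ')';
    return $optional ? "$body?" : $body;
}

# Build a sexeger (a pattern for the REVERSED text) from a word list.
sub sexeger_for_words {
    my %trie;
    trie_add(\%trie, scalar reverse $_) for @_;
    return trie_to_regex(\%trie);
}

my @words   = qw(running jumping singing);
my $pattern = sexeger_for_words(@words);
print "$pattern\n";                      # gni(?:gnis|nnur|pmuj)

# Use it: reverse the text, match, then reverse the capture.
my $rtext = reverse "she was singing";
my $found = '';
$found = reverse $1 if $rtext =~ /\b($pattern)\b/;
print "found: $found\n";                 # found: singing
```

The common suffix "ing" becomes the common prefix "gni" of the reversed words, so it is tested once up front instead of at the end of each alternative.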
helper.pl makes use of Regex::PreSuf, ALSO by the Finnish perl hacker Jarkko Hietaniemi <jhi@iki.fi>. Regex compression is a new way of looking at multiword searching: instead of iterating over a list, try using an optimal regex match for the word list. You may be pleasantly surprised by the results. Just wait a sEcond though:

And now I'll give credit where it's due. Only one Person is insane enough to come up with such an intentionally conFusing way of doing things... He is the bringer of obFuscated code, short perl quips, and eye-straining regexes. Yes that's right, sexeger was coined by this Perl hacker:

... well you can probably guess who it is by now.

--darksym

Edit by dws to add <readmore> tag