Re: Whats your favorite nonstandard regex quote char?
by broquaint (Abbot) on Apr 22, 2003 at 15:36 UTC
|
So what are the monks at large opinion on this ever so trivial a subject? Which alternate regex delimiter do you favour? And what are the arguments behind your opinion (if any)?
My quote character of choice when substituting is the curly brace pair, e.g.
s{ ( \w+ ) [ ] ( \w+ ) }($2 $1)xg;
And for anything but the simplest of regexes (that can't be fielded off to the likes of index()) the /x modifier is a must. I also use parentheses for the replacement part of a substitution.
My reasoning is that curly braces look like a block and parentheses look like execution (or at least that's how it mnemonically maps in my head).
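For anyone who wants to try this, here is a minimal sketch of the same substitution. The sample strings and the names $pair and $list are assumptions made up for illustration, not part of the post above. Under /x the only significant space in the pattern is the one inside [ ].

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration.
my $pair = 'hello world';
my $list = 'one two three four';

# Curly braces delimit the pattern, parentheses delimit the replacement.
# Under /x the spaces around the groups are ignored; [ ] matches one space.
$pair =~ s{ ( \w+ ) [ ] ( \w+ ) }($2 $1)xg;
$list =~ s{ ( \w+ ) [ ] ( \w+ ) }($2 $1)xg;

print "$pair\n";   # world hello
print "$list\n";   # two one four three
```

The replacement part is not affected by /x, so the single space between $2 and $1 ends up in the result.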
HTH
_________ broquaint | [reply] [d/l] |
Re: Whats your favorite nonstandard regex quote char?
by The Mad Hatter (Priest) on Apr 22, 2003 at 15:34 UTC
|
I tend to favor either ! or |. Occasionally I'll use ' as well. As for a reason, I dunno...it looks good.
s'{.*?}'{$sub[$n++]}'g;
s|{.*?}|{$sub[$n++]}|g;
| [reply] [d/l] |
|
$s ='25 {fred and barney} text 2.36 12.0 {bam bam} text {pebbles}';
$n=0; $s =~ s'{.*?}'{$sub[$n++]}'g;
print $s;
25 {$sub[$n++]} text 2.36 12.0 {$sub[$n++]} text {$sub[$n++]}
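The output above comes from using ' as the delimiter: with single quotes, neither the pattern nor the replacement is interpolated. A small side-by-side sketch, assuming a made-up @sub array (the values 'A', 'B', 'C' and the names $literal and $interpolated are mine), with the braces escaped to stay on the safe side with newer perls:

```perl
use strict;
use warnings;

my @sub  = ('A', 'B', 'C');   # hypothetical replacement values
my $text = '25 {fred and barney} text {bam bam} text {pebbles}';

# Single quotes as delimiters: the replacement is NOT interpolated.
my $n = 0;
(my $literal = $text) =~ s'\{.*?\}'{$sub[$n++]}'g;

# Another delimiter: the replacement interpolates like a double-quoted string.
$n = 0;
(my $interpolated = $text) =~ s|\{.*?\}|{$sub[$n++]}|g;

print "$literal\n";        # 25 {$sub[$n++]} text {$sub[$n++]} text {$sub[$n++]}
print "$interpolated\n";   # 25 {A} text {B} text {C}
```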
Examine what is said, not who speaks.
1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
3) Any sufficiently advanced technology is indistinguishable from magic.
Arthur C. Clarke.
| [reply] [d/l] |
|
Ooh! Thank you for pointing out that caveat. It will no doubt someday save me hours of debugging time. ;-)
| [reply] |
|
It just looks clean to me, especially since it spans the height of the whole line. In my editor, it stands out quite nicely, so that isn't a problem for me. If it does start to get confusing though, I will use a different character.
| [reply] |
Re: Whats your favorite nonstandard regex quote char?
by Mr. Muskrat (Canon) on Apr 22, 2003 at 15:39 UTC
|
It all depends on what I'm doing... If it's a script that I'll use more than once, I tend to lean towards readability. If it's an obfuscation, I go for confusion of course.
For what should be obvious reasons, these are some of my personal favorites:
s@{.*?}@{$sub[$n++]}@g;
s${.*?}${$sub[$n++]}$g;
s%{.*?}%{$sub[$n++]}%g;
s={.*?}={$sub[$n++]}=g;
| [reply] [d/l] [select] |
|
s;{.*?};{$sub[$n++]};g;
:-)
---
demerphq
<Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
| [reply] [d/l] [select] |
|
's' is a fun delimiter, especially with a /s modifier and some variables named 's' for good measure. :)
ss{.*?}s{$s[$s++]}ss;
-Matt | [reply] [d/l] |
|
I wanted to leave some for the imagination :)
| [reply] |
|
Yes I can see those being favorites for obfuscation - in general work it's likely better to avoid the use of overly meaningful characters. I'd include sigils at the top of that list, then other things that are likely to confuse (like the pound character). I waver between the various bracketed characters - curly braces, square braces, chevrons and in last place parentheses. Which one I actually use is predicated on what the content of the expression is and whether it conflicts with my delimiters.
| [reply] |
|
Which one I actually use is predicated on what the content of the expression is and whether it conflicts with my delimiters.
Naturally. But assuming that the only character that is out of bounds is the / which would you use and why?
---
demerphq
| [reply] [d/l] |
|
s\{.*?}\{$sub[$n++]}\g;
*evil grin*
Makeshifts last the longest. | [reply] [d/l] |
Re: Whats your favorite nonstandard regex quote char?
by LAI (Hermit) on Apr 22, 2003 at 16:20 UTC
|
s|{.*?}|{$sub[$n++]}|g;
It has good visibility, is simple, yaddax3. I toyed around with using #, and it works well as a visual cue, because it registers as 'dark' among visually 'lighter' characters. But accidental obfuscation, as well as inadequate editor syntax highlighting, made me drop that.
I'm not especially fond of ,.`"', because I like for the whole height of the line to be covered. Makes the character look more like a delimiter. Paired delimiters such as []{}()<> do that, but I can't get used to them. They make a lot of sense, and can make certain regexps nice and tidy... but even on multiline things I like my pipe:
s|
{.*?}
|
{$sub[$n++]}
|gx;
LAI
__END__ | [reply] [d/l] [select] |
|
$s =~ s>{.*?}
>{$sub[$n++]}>gx;
$s =~ s){.*?}
){$sub[$n++]})gx;
Personally I tend to use ! because it is rare to have to match a ! in a string, or to need negative look-ahead/behind assertions. I do like the way the pipe looks but tend to avoid it because it is the character for alternation. -enlil | [reply] [d/l] [select] |
|
$s =~ s>{.*?}
>{$sub[$n++]}
>gx;
...but TIMTOWTDI. And as for the pipe being the character for alternation... you're right of course. But not many characters are used infrequently enough that they make good delimiters. So I just use whichever I feel looks good until there's a conflict with the content of the regex.
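To make the alternation point concrete, a tiny sketch with ! as the delimiter, so that | inside the pattern keeps its usual meaning. The sample string and the names $s and $t are assumptions for illustration only. What an escaped \| does inside s|...| is a separate question that this example deliberately sidesteps.

```perl
use strict;
use warnings;

my $s = 'cat and dog';   # hypothetical sample text

# With ! as the delimiter, | is free to act as alternation.
(my $t = $s) =~ s!cat|dog!pet!g;

print "$t\n";   # pet and pet
```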
LAI
__END__ | [reply] [d/l] [select] |
|
I like using paired delimiters in this context, usually curlies.
Speaking of quote delimiters, there's a lovely piece of code in Perl_yylex() (in toke.c) to find the matching balanced delimiter of a quoted string:
if (term && (tmps = strchr("([{< )]}> )]}>",term)))
term = tmps[5];
Pop quiz: why does this work? Why are the closing delimiters repeated twice?
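Spoiler for the quiz, as a rough sketch in Perl rather than C. The helper name closing_delimiter and the variable $table are mine and do not appear in toke.c; the only thing taken from the snippet is the string and the offset of 5.

```perl
use strict;
use warnings;

my $table = "([{< )]}> )]}>";

# Hypothetical helper mirroring the C logic:
# find the character, then step 5 positions to the right.
sub closing_delimiter {
    my ($term) = @_;
    return $term if !defined $term || $term eq '';   # mirrors the `term &&` guard
    my $pos = index($table, $term);                   # first occurrence, like strchr
    return $term if $pos < 0;                         # not in the table: unchanged
    return substr($table, $pos + 5, 1);
}

for my $c ('(', '[', '{', '<', ')', '>', ' ', '!') {
    printf "'%s' -> '%s'\n", $c, closing_delimiter($c);
}
```

Each opener sits five characters to the left of its closer. A closer (or the space) is found at its first occurrence, and five characters further on sits a copy of itself, so it maps to itself. That appears to be why the closing delimiters are in the string twice.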
_____________________________________________ Come to YAPC::Europe 2003 in Paris, 23-25 July 2003.
| [reply] [d/l] |
Re: Whats your favorite nonstandard regex quote char?
by Abigail-II (Bishop) on Apr 22, 2003 at 20:36 UTC
|
I prefer my delimiters to be tall and skinny, just like /. | is tall and skinny, but that leads often to a clash with the special regex character. So, I usually use !, which only gives a problem if you want to use (?!) or (?<!), but they are uncommon enough for ! to be useful.
And I often use the balanced delimiters, the four sets of braces ({ } being my favourites). ! I seldom use for matching, only for substitutions, or any of the q* operators, but I prefer / or { } for them.
Only in one liners, vi or IRC, I sometimes use a period when doing substitution, but never in code that's stored in a file.
I don't like to use #, @ or similar characters as delimiters. I know they are popular, but they are too black to my taste, and draw the eye away from the important regex towards the unimportant delimiters.
Abigail | [reply] [d/l] [select] |
Re: Whats your favorite nonstandard regex quote char?
by jmcnamara (Monsignor) on Apr 22, 2003 at 16:09 UTC
|
The following is also possible (if not entirely legible):
s g{.*?}g{$sub[$n++]}gg;
--
John.
| [reply] [d/l] |
Re: Whats your favorite nonstandard regex quote char?
by BrowserUk (Patriarch) on Apr 22, 2003 at 21:30 UTC
|
I almost universally use s[...][...]g; for regexes, the exception being when I'm using the /e modifier, in which case I tend to use s[...]{....}ge; as the curlies give a visual indication that the right-hand side is active rather than passive. In this specific case, the same regex in my test code is coded as
s[[{].*?[}]][{$sub[$n++]}]g;
which (I think) looks fine under the syntax highlighting in my editor, but looked altogether too confusing when rendered in the b&w of the preview page. So I looked for some way of rendering it more clearly in this environment. I tried various options and settled on
s_\{.*?\}_{$sub[$n++]}_g;
as the least bad.
As you pointed out, perl is very clever about deciding whether a metachar is or is not being used as a metachar, but I still tend to escape them as my brain/eyes are less adept at the art.
I've recently started using [metachar] (unashamedly stolen from some of Abigail-II's posts) instead of \metachar for escaping metachars in regexes, as it allows me to use a consistent method for single and multiple alternatives, and I'm finding that consistency is the key to easy visual parsing of code.
I use m[...] almost universally in preference to /.../, and map{...}; grep{...}; even when the blocking isn't strictly required, for similar reasons. I find the visual consistency and ease of extensibility far outweigh the minor performance penalty.
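A small sketch of the two habits described above: [metachar] in place of \metachar, and s[...]{...}ge with curlies around the code part. The sample string, the capture group and the use of uc are assumptions added for illustration; they are not from the test code mentioned in the post.

```perl
use strict;
use warnings;

my $s = 'x {one} y {two}';   # hypothetical sample text

# Backslash escapes, plain slashes as delimiters.
(my $slashed = $s) =~ s/\{(.*?)\}/uc $1/ge;

# Bracket "escapes", [] around the pattern, {} around the code.
(my $bracketed = $s) =~ s[[{](.*?)[}]]{ uc $1 }ge;

print "$slashed\n";     # x ONE y TWO
print "$bracketed\n";   # x ONE y TWO
```

Both forms do the same work here; the bracketed one relies on perl matching nested [ ] pairs when it looks for the end of the pattern.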
I've never yet posted a set of personal style rules as I'm still formulating mine. Very little of my style (or lack thereof) is yet fixed in stone, I tend to see things that other people are doing that seem particularly clear/neat/concise/cool and try them for a while and see what sticks. Those that don't bug me to type, hinder my reading or cause any problems in other ways tend to stay.
There are some which I use in my own code that I religiously remove from the code when posting. E.g.
.... #! This is a comment
Try as hard as I might to tailor my syntax highlighter definition, it still confuses everything on a line after
$#array
with a comment and highlights it accordingly. So I changed the comment card spec to being #!. This means that all the comments and the shebang lines are displayed in a muted green, which suits me. However, I started removing (mostly) the ! from the comments when posting, as a private msg from someone said that every time he looked at the code I posted with them in, he saw multiple shebang lines and freaked.
I guess I should get around to setting up PerlTidy to filter code to my preferences on input and back to something "more normal" on output, but that probably wouldn't help much as I tend to c&p directly from the editor. I also have a set of macros that do most of the transformations--tab width, curly positioning etc--already in my editor. To use perltidy I would have to save the code out to a file, load it in another editor to view it in the "more normal" form, and c&p from there, which would just be a pain.
| [reply] [d/l] [select] |
|
"$\@<!#.*"
which only matches a # not preceded by a $.
kelan
Perl6 Grammar Student | [reply] [d/l] |
Re: Whats your favorite nonstandard regex quote char?
by demerphq (Chancellor) on Apr 22, 2003 at 16:41 UTC
|
<Shameless_Plug>
Incidentally for anyone who cares, you can let Text::Quote figure out the best quote char for you if you like. :-)
</Shameless_Plug>
---
demerphq
| [reply] [d/l] |
Re: Whats your favorite nonstandard regex quote char?
by Juerd (Abbot) on Apr 22, 2003 at 18:58 UTC
|
So what are the monks at large opinion on this ever so trivial a subject? Which alternate regex delimiter do you favour? And what are the arguments behind your opinion (if any)?
I prefer []. Except for the RHS with s///e, because I prefer {} there.
[] is VERY easy to read, B::Deparse (originally posted as Data::Dumper) uses it a lot, and it will make even more sense when we have Perl 6's regexes, where [] is used for grouping.
{} often delimits code. It makes sense with s///e because the RHS is code, and creates scope, etc etc.
s[\{.*?\}][{$sub[$n++]}]gx;
I escaped the {} to be on the safe side when something is added in front of it in a future version.
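A quick illustration of why the escaping is the safe choice: once an atom stands in front of it, an unescaped {n} is read as a quantifier. The test strings and variable names below are made up for the example.

```perl
use strict;
use warnings;

# After an atom, {2} is a quantifier: "exactly two x".
my $quantified = 'xx'   =~ /^x{2}$/   ? 1 : 0;

# Escaped, the braces are literal characters.
my $literal    = 'x{2}' =~ /^x\{2\}$/ ? 1 : 0;

# The unescaped pattern does not match the literal text.
my $mismatch   = 'x{2}' =~ /^x{2}$/   ? 1 : 0;

print "$quantified $literal $mismatch\n";   # 1 1 0
```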
Also note that syntax highlighting helps more than choosing the right delimiter.
Juerd
- http://juerd.nl/
- spamcollector_perlmonks@juerd.nl (do not use).
| [reply] [d/l] [select] |
|
I don't understand your reference to Data::Dumper in this context. What's that all about?
Neither do I, but that is because I meant B::Deparse, not Data::Dumper. Sorry :)
B::Deparse uses q['] instead of '\'', for example. I like that.
Juerd
| [reply] [d/l] [select] |
Re: Whats your favorite nonstandard regex quote char?
by belg4mit (Prior) on Apr 22, 2003 at 16:31 UTC
|
| [reply] |
Re: Whats your favorite nonstandard regex quote char?
by PodMaster (Abbot) on Apr 23, 2003 at 07:28 UTC
|
I don't think about it much anymore, and mostly use // or {}.
Once upon a time I did fancy the following ( ¡ = 0161, ¿ = 168, º = 167 -- using ALT+DOWN -> NUMPAD -> ALT+UP)
s¡¡¡g
s¿¿¿g
sºººg
I really fancy qw[ ] though.
MJD says you
can't just make shit up and expect the computer to know what you mean, retardo!
I run a Win32 PPM
repository for perl 5.6x+5.8x. I take requests.
** The Third rule of perl club is a statement of fact: pod is sexy.
|
| [reply] [d/l] [select] |
Re: Whats your favorite nonstandard regex quote char?
by John M. Dlugosz (Monsignor) on Apr 22, 2003 at 19:12 UTC
|
How about ¸ (U+00B8) which looks like a fancy comma?
| [reply] |
Re: Whats your favorite nonstandard regex quote char?
by parv (Parson) on Apr 23, 2003 at 05:19 UTC
|
I used to use s###, but lately i find that
that makes the following characters harder to read. So i
use s!!! in short/simple ones;
s{}// or s{}() otherwise.
I absolutely cannot take any of
[;,.] as all three seem to enjoy
intermingling w/ rest of the text.
| [reply] [d/l] [select] |
Re: Whats your favorite nonstandard regex quote char?
by Aristotle (Chancellor) on Apr 26, 2003 at 02:54 UTC
|
Nearly universally the bang for me, when the standard slash won't do. If that still doesn't help, it means the pattern is convoluted and calls for /x, which works best with paired delimiters (curlies for me). I tend to do something like
s{ foo }{ bar }x
which nicely exposes the delimiters to skimming eyes. Add more whitespace if necessary. I'm not keen on splitting the pattern across lines, and prefer to avoid it if it isn't really necessary.
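One caveat worth a sketch: /x only relaxes whitespace in the pattern, not in the replacement, so padding inside the second pair of curlies ends up in the result. The sample string and the names $padded and $tight are assumptions for illustration.

```perl
use strict;
use warnings;

my $s = 'a foo b';   # hypothetical sample text

(my $padded = $s) =~ s{ foo }{ bar }x;   # replacement is " bar "
(my $tight  = $s) =~ s{ foo }{bar}x;     # replacement is "bar"

print "[$padded]\n";   # [a  bar  b]
print "[$tight]\n";    # [a bar b]
```

When the replacement is a literal string, keeping it tight against its delimiters avoids the extra spaces; with /e the replacement is code, and whitespace around it is harmless.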
Makeshifts last the longest. | [reply] [d/l] |