Re: Regex To remove text between parentheses
by davorg (Chancellor) on Jul 10, 2001 at 18:16 UTC
|
| [reply] |
Re: Regex To remove text between parentheses
by japhy (Canon) on Jul 10, 2001 at 18:59 UTC
|
$re = qr{
\( (?{ local $N = 1 })
(?:
(?(?{ !$N })(?!))
(?:
\( (?{ local $N = $N + 1 })
|
\) (?{ local $N = $N - 1 })
|
[^()]+
)
)+
(?(?{ $N })(?!)) # fixed, thanks to Hofmator
}x;
$text =~ s/$re//g;
japhy --
Perl and Regex Hacker | [reply] [d/l] |
|
$re = qr{
\(
(?:
(?> [^()]+ ) # Non-parens without backtracking
|
(??{ $re }) # Group with matching parens
)*
\)
}x;
$text =~ s/$re//g;
-- Hofmator
| [reply] [d/l] |
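For anyone reading this on a newer Perl: here is a minimal sketch of the same recursive idea, assuming Perl 5.10 or later, where `(?1)` recurses into capture group 1 and the pattern no longer has to refer to its own variable through `(??{ $re })`. The names `$balanced` and `strip_balanced` are made up for illustration and are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread.
# Assumes Perl 5.10+, where (?1) means "recurse into capture group 1".
my $balanced = qr{
    (                       # group 1: one balanced (...) group
        \(
        (?:
            (?> [^()]+ )    # non-parens, without backtracking
          |
            (?1)            # a nested balanced group
        )*
        \)
    )
}x;

sub strip_balanced {
    my ($text) = @_;
    $text =~ s/$balanced//g;    # remove every balanced group
    return $text;
}

print strip_balanced("a (b (c) d) e (f)"), "\n";
```

Unbalanced parentheses are simply left alone, since the pattern only matches a complete group.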
Re: Regex To remove text between parentheses
by Hofmator (Curate) on Jul 10, 2001 at 18:21 UTC
|
To give you some hints (if you haven't been quick enough to
read the first version of this node :)
- match a literal opening parenthesis - you have to escape it because of its special meaning
- then match anything in between with a non-greedy quantifier
- match the corresponding closing parenthesis
- you hopefully don't have nested parens because then it gets tricky
Update: OK, so now I probably spoiled
the learning already as davorg and jeroenes remarked.
I generally agree with it so I put the code now in the
'spoiler' section and altered the explanation into a hint list - but
probably too late anyway ;-).
-- Hofmator
| [reply] [d/l] |
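The hint list above could be sketched roughly like this. It is only one way to read the hints, it assumes there are no nested parentheses, and the sub name `strip_nongreedy` is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper following the hints: escaped parens with a
# non-greedy quantifier in between. Assumes no nested parentheses.
# Note that . does not match a newline unless the /s modifier is added.
sub strip_nongreedy {
    my ($text) = @_;
    $text =~ s/\(.*?\)//g;
    return $text;
}

print strip_nongreedy("foo (bar) baz (qux)"), "\n";   # both groups removed
print strip_nongreedy("a (b (c) d) e"), "\n";         # nested: " d)" is left behind
```

The second call shows why nesting gets tricky: the match stops at the first closing paren it finds, not at the one that belongs to the opening paren.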
Re: Regex To remove text between parentheses
by jeroenes (Priest) on Jul 10, 2001 at 18:22 UTC
|
| [reply] |
Re: Regex To remove text between parentheses
by beretboy (Chaplain) on Jul 10, 2001 at 18:25 UTC
|
s/\((.*)\)//g;
but this gets rid of the whole string :-(
"Sanity is the playground of the unimaginative"
-Unknown | [reply] [d/l] |
|
You might want to look at Death to Dot Star! to see why it is getting rid of the whole thing. Try this instead:
s/\([^)]+\)//g;
(Unless you have nested parentheses, as Hofmator pointed out.) And if you have an empty pair (), the above won't remove it unless you change the + to a *.
Also, any reason why you put the contents between the ()'s into $1? Are you going to use it later?
| [reply] [d/l] |
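A small sketch of the + versus * difference mentioned above, assuming non-nested input. The sub names are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helpers, not from the thread.
# With +, at least one character is required between the parens.
sub strip_nonempty {
    my ($text) = @_;
    $text =~ s/\([^)]+\)//g;
    return $text;
}

# With *, an empty pair () is removed as well.
sub strip_any {
    my ($text) = @_;
    $text =~ s/\([^)]*\)//g;
    return $text;
}

print strip_nonempty("x (a) y () z"), "\n";   # the empty () survives
print strip_any("x (a) y () z"), "\n";        # the empty () is removed too
```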
|
(*) Microsoft Internet Exploder (which calls itself "Mozilla") - a product of a convicted violator of the Sherman Act (wow)
That greedy quantifier will snatch up that * after the opening paren, and not stop till it gets to the ) that follows the "wow" (that IS the longest substring that matches your RE).
As to the suggestion you use a character class (hint: match one or more things that aren't closing parentheses), see Death to Dot Star! for a discussion of why not to use .*
HTH!
perl -e 'print "How sweet does a rose smell? "; chomp ($n = <STDIN>); $rose = "smells sweet to degree $n"; *other_name = *rose; print "$other_name\n"'
| [reply] [d/l] [select] |
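To see the greediness on that sample line, here is a quick sketch comparing the original substitution with a negated character class. The variable names are just for the demo.

```perl
use strict;
use warnings;

# The sample line from the post above, split only to keep the source narrow.
my $sample = '(*) Microsoft Internet Exploder (which calls itself "Mozilla") - '
           . 'a product of a convicted violator of the Sherman Act (wow)';

# Greedy .* runs from the first ( to the last ), which here is the whole line.
(my $greedy = $sample) =~ s/\((.*)\)//g;

# The negated class cannot cross a closing paren, so each group goes separately.
(my $negated = $sample) =~ s/\([^)]+\)//g;

print "greedy:  [$greedy]\n";
print "negated: [$negated]\n";
```

The brackets in the output are only there to make leading and trailing whitespace visible.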
|
You're being bitten by the "greediness" of regular
expressions. They try to match as much of the string as
possible - which, in this case, is all of it.
To make the regex "non-greedy" put a '?' after the
greedy quantifier, so that .* becomes .*?
You might also like to take a look at Death to Dot
Star!
--
<http://www.dave.org.uk>
Perl Training in the UK <http://www.iterative-software.com>
| [reply] |
|
As a quick, kludgy fix, try (.+) instead of (.*)
Update: OK, so this headache is affecting my mental powers -- ignore that previous bit and instead look at the answers involving negated classes ([^)]*) and the like. In my defence, I said that was what you're supposed to do below:
As a better fix, create a character class that doesn't include brackets, and use that instead...
--
RatArsed, in search of enlightenment and aspirin.
| [reply] |
Re: Regex To remove text between parentheses
by the_slycer (Chaplain) on Jul 10, 2001 at 18:44 UTC
|
This works fairly well. It can also get rid of nested parentheses like this: (begin(next(third))), though not ones like (begin(next)third), where more text follows an inner closing paren.
s/\([^)]+\)+//g;
| [reply] [d/l] |
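A sketch of both cases described above, plus one possible alternative that is not from this thread: repeatedly stripping the innermost group until nothing is left to remove. The sub names are invented for the example.

```perl
use strict;
use warnings;

# The substitution from the post above, wrapped in a hypothetical helper.
sub strip_slycer {
    my ($text) = @_;
    $text =~ s/\([^)]+\)+//g;
    return $text;
}

# Alternative (an assumption, not from the thread): remove innermost
# groups, which contain no parens at all, until no more are found.
sub strip_inside_out {
    my ($text) = @_;
    1 while $text =~ s/\([^()]*\)//g;
    return $text;
}

print "[", strip_slycer("(begin(next(third)))"), "]\n";   # everything removed
print "[", strip_slycer("(begin(next)third)"), "]\n";     # "third)" is left over
print "[", strip_inside_out("(begin(next)third)"), "]\n"; # everything removed
```

The loop version makes several passes over the string, so it trades a little speed for handling any nesting depth without a recursive pattern.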