Re: Replacing a given character starting with the xth occurence in a string
by quent (Beadle) on May 21, 2001 at 03:44 UTC
|
my $nth = 3;
my $rep = 1;
my $chr = 'e';
my $i = 1;
$_ = "Terence and Philip are sweet";
s/($chr)/$i++<$nth?$1:$rep/eg;
print;
| [reply] [d/l] |
(bbfu) (another way) Re: Replacing a given character starting with the xth occurence in a string
by bbfu (Curate) on May 21, 2001 at 04:16 UTC
|
my $nth = 3;
my $rep = 1;
my $chr = 'e';
my $str = "Terence and Philip are sweet\n";
my $pos = 0;
$pos = index($str, $chr, $pos+1) for(1..$nth);
substr($str, $pos) =~ s/$chr/$rep/g;
print $str;
You could, of course, move the $pos finding part into a do block within the substr expression but it's a bit unwieldy.
Update: Well, my method's a little bit more wordy then quent's but it's ever-so-slightly faster, almost definately because it avoids doing the string-eval. I don't think it's a significant difference, though. I was just curious. :-)
my $s = "Terence and Philip are sweet\n";
my $c = 'e';
my $r = 1;
my $n = 3;
sub mine {
my ($str, $chr, $rep, $nth) = @_;
my $pos = 0;
$pos = index($str, $chr, $pos+1) for(1..$nth);
substr($str, $pos) =~ s/$chr/$rep/g;
return $str;
}
sub quents {
my ($str, $chr, $rep, $nth) = @_;
my $i = 1;
$str =~ s/($chr)/$i++<$nth?$1:$rep/eg;
return $str;
}
use Benchmark qw( timethese cmpthese );
cmpthese(500000, {
"Mine" => sub { mine ($s, $c, $r, $n) },
"Quent's" => sub { quents ($s, $c, $r, $n) },
});
__END__
Benchmark: timing 500000 iterations of Mine, Quent's...
Mine: 14 wallclock secs (14.08 usr + 0.00 sys = 14.08 CPU) @ 35
+511.36/s (n=500000)
Quent's: 19 wallclock secs (18.58 usr + 0.03 sys = 18.61 CPU) @ 26
+867.28/s (n=500000)
Rate Quent's Mine
Quent's 26867/s -- -24%
Mine 35511/s 32% --
bbfu
Seasons don't fear The Reaper.
Nor do the wind, the sun, and the rain.
We can be like they are. | [reply] [d/l] [select] |
|
Your solution has a few subtle bugs, try it with:
print mine ("Terence and Philip are sweet", 'c', 'Z', 3);
This happens because $pos is set to -1 after not finding a second 'c', and since you are indexing from $pos + 1, which has become 0, index searches from the start of the string, again.
You should also initialize $pos to -1, not zero, for similar reasons (otherwise it ignores the first character in the string).
Getting this problem right, without boundary or fencepost erros, employing manual indexing and positioning is remarkably tricky. So much so, that I'm not even sure the following fixed-up code is error-free:
sub crep {
my ($str, $chr, $rep, $nth) = @_;
my $pos = 0;
while (--$nth > 0) {
$pos = index $str, $chr, $pos;
last if $pos < 0;
$pos++;
}
substr ($str, $pos) =~ s/$chr/$rep/g if $pos >= 0;
return $str;
}
I would really like to see if this can be improved upon, assuming the same method is used. In the meantime, here's a slight variant, which I believe to be correct:
sub crep {
my ($str, $chr, $rep, $num) = @_;
my $tstr = '';
$tstr .= substr $str, 0, (1 + index $str, $chr), '' while --$num;
$str =~ s/$chr/$rep/g;
$tstr.$str;
}
update: I'm beginning to believe that this problem is the poster-child for unit testing! A more promising possibility... caveat emptor as always:
sub crep {
my ($str, $chr, $rep, $num) = @_;
$str !~ /$chr/g and return $str while $num--;
substr ($str, -1 + pos $str) =~ s/$chr/$rep/g;
return $str;
}
Of course, this one dances around the index problem with a regex.
update2: Improving on the original fix...
sub crep {
my ($str, $chr, $rep, $nth) = @_;
my $pos = 0;
($pos = index $str, $chr, $pos)++ < 0 and return $str while --$nth;
substr ($str, $pos) =~ s/$chr/$rep/g;
return $str;
}
MeowChow
s aamecha.s a..a\u$&owag.print | [reply] [d/l] [select] |
|
here comes another way :)
#!/usr/local/bin/perl -w
use strict;
my $s = "Terence and Philip are sweet\n";
my $c = 'e';
my $r = 1;
my $n = 3;
sub mine
{
my ($str,$chr,$new,$nbr)=@_;
my $reg="(.+?)$chr" x --$nbr;
$_=$str;
(/$reg/)&&($str=$&)&&($_=$')&&(s/$chr/$new/g);
return $str.$_;
}
print mine($s,$c,$r,$n);
The code for benchmarking agains BBFU's proc and Quent's
#!/usr/local/bin/perl -w
use strict;
my $s = "Terence and Philip are sweet\n";
my $c = 'e';
my $r = 1;
my $n = 3;
sub mine
{
my ($str,$chr,$new,$nbr)=@_;
my $reg="(.+?)$chr" x $nbr;
$_=$str;
(/$reg/)&&($str=$&)&&($_=$')&&(s/$chr/$new/g);
return $str.$_;
}
sub bbfus {
my ($str, $chr, $rep, $nth) = @_;
my $pos = 0;
$pos = index($str, $chr, $pos+1) for(1..$nth);
substr($str, $pos) =~ s/$chr/$rep/g;
return $str;
}
sub quents {
my ($str, $chr, $rep, $nth) = @_;
my $i = 1;
$str =~ s/($chr)/$i++<$nth?$1:$rep/eg;
return $str;
}
use Benchmark qw( timethese cmpthese );
cmpthese(500000, {
"BBFU'S" => sub { bbfus ($s, $c, $r, $n) },
"Quent's" => sub { quents ($s, $c, $r, $n) },
"MINE" => sub { mine ($s,$c,$r,$n) },
});
AND THE BENCHMARKS :)
Benchmark: timing 500000 iterations of BBFU'S, MINE, Quent's...
BBFU'S: 20 wallclock secs (19.45 usr + 0.00 sys = 19.45 CPU) @ 25
+706.94/s (n=500000)
MINE: 19 wallclock secs (19.87 usr + 0.02 sys = 19.88 CPU) @ 25
+146.69/s (n=500000)
Quent's: 25 wallclock secs (24.97 usr + 0.00 sys = 24.97 CPU) @ 20
+026.70/s (n=500000)
Rate Quent's MINE BBFU'S
Quent's 20027/s -- -20% -22%
MINE 25147/s 26% -- -2%
BBFU'S 25707/s 28% 2% --
| [reply] [d/l] [select] |
Re: Replacing a given character starting with the xth occurence in a string
by sachmet (Scribe) on May 21, 2001 at 04:28 UTC
|
Method without using temporary variables:
my $x = 3;
my $chr = 'e';
my $newchr = '1';
my $str = "Terence and Philip are sweet!\n";
print $str;
$x--;
1 while ($str =~ s/(([^$chr]*$chr){$x}([^$chr]*))$chr/$1$newchr/g);
print $str;
| [reply] [d/l] |
Re: Replacing a given character starting with the xth occurence in a string
by ZZamboni (Curate) on May 21, 2001 at 04:16 UTC
|
$n=2; # x-1
$chr="e";
$rep="1";
$a="Terence and Philip are sweet";
while ($a=~s/^((?:.*?$chr){$n})(.*?)$chr/$1$2$rep/) {}
--ZZamboni
Update: At ar0n's suggestion, the last line can
also be written as:
1 while ($a=~s/^((?:.*?$chr){$n})(.*?)$chr/$1$2$rep/);
which results in almost exactly what sachmet wrote. I like
sachmet's answer better too because it uses only one capturing
set of parenthesis. | [reply] [d/l] [select] |
Re: Replacing a given character starting with the xth occurence in a string
by tachyon (Chancellor) on May 21, 2001 at 04:33 UTC
|
Here is a different way using progressive matching and the
\G assertion to continue matching from that point. TIMTOWTDI
tachyon
my $nth = 3;
my $rep = 1;
my $chr = 'e';
my $str = "Terence and Philip are sweet";
$str =~ m/$chr/gc for (1..$nth); # eat up first $nth "$chr"s
$str =~ s/\G(.*?)$chr/$1$rep/g; # replace the rest
print $str;
| [reply] [d/l] |
Re: Replacing a given character starting with the xth occurence in a string
by japhy (Canon) on May 21, 2001 at 05:10 UTC
|
sub adv_tr {
my ($str, $x, $from, $to) = @_;
my $pos = 0;
# find index of Xth "$from"
($pos = index($str, $from, $pos)) == -1 and return while $x--;
eval "substr(\$str, $pos) =~ tr/\Q$from\E/\Q$to\E/";
}
japhy --
Perl and Regex Hacker | [reply] [d/l] |
|
sub adv_tr {
my ($str, $x, $from, $to) = @_;
my $pos = -1;
($pos = index($str, $from, $pos + 1)) == -1 and return $str while $x
+--;
eval "substr(\$str, $pos) =~ tr/\Q$from\E/\Q$to\E/";
$str;
}
MeowChow
s aamecha.s a..a\u$&owag.print | [reply] [d/l] |
|
| [reply] |
(boo) Re: Replacing a given character starting with the xth occurence in a string
by boo_radley (Parson) on May 21, 2001 at 19:24 UTC
|
I love all of the answers to this question, especially ZZamboni's, satchmet's and tachyon's.
I went over and said to @coworker, "lookie this!", and they replied "what the hell does that do?"
I explained what the topic was.
$coworker[1] says "ah, that's faboo, but how does it do it?" and it took me about 5 minutes to puzzle through ZZamboni's. He explained that the idea's useful, but the implementation was a bit opaque for him. We chatted for a few more, and then I suggested the following, which is a much different take.
use strict;
my $s=n_sub ("Toy boats are for the little boys" ,2,"b","g");
$s=n_sub ($s,4,"o","i");
$s=n_sub ($s,2,"y","rl");
print $s;
sub n_sub {
my ($os, $xth, $ic, $oc) = @_;
my @el= split /$ic/,$os, $xth;
$el[-1] =~ s/$ic/$oc/g;
return join ($ic, @el);
}
I don't suggest that this may be a faster implementation, or a better one, but for those that I showed, it's more understandable.
I'm not knocking anyone's reg-fu, of course, but even some basic concepts like ?: confound some, and I wanted to show TIMTOWTDI.
on a related subject, isn't this concept called lookbehind, as in the owl, pg 229 and 230? | [reply] [d/l] [select] |
Re: Replacing a given character starting with the xth occurence in a string
by sharle (Acolyte) on May 22, 2001 at 03:30 UTC
|
Well, I guess I deserved to lose an experience point for my hastily posted "solution" yesterday. In an effort to redeem myself, here is yet another solution to this problem:
#!/usr/bin/perl -w reg.pl
my ($p, @q, $matchchar, $nummatch, $rep, $count, $q, $out);
$p = 'Terrence and Phillip are sweet';
$count = 0;
$matchchar = "e";
$nummatch = 3;
$repchar = "1";
@q = split (/(.*?)/, $p);
for($i = 0; $i < $#q + 1; $i++) {
$_ = $q[$i];
$count += 1 if (/$matchchar/);
if (/$matchchar/ && ($count > $nummatch)) {
$q[$i] = $repchar;
}
}
$out = join ("", @q[0 .. $#q]);
print "out === $out";
This one works, and solves the correct problem.
sharle | [reply] [d/l] |
|
I can see that you put some effort in this program, so I have
++'d your post. Now I'll offer some comments as constructive
criticism, and a rewriting of your program to show some
more Perlish ways of doing things:
- The following line:
@q = split(/(.*?)/, $p);
Is doing a lot more work than it should. You are actually
splitting on empty strings (that's what .*? will always
evaluate to) and storing the delimiters (the empty strings),
so the string "foo"
gets split as ("f", "", "o", "", "o"). If you want to split
in individual characters, it's better to do:
@q = split(//,$p);
which splits on an empty string, but without the regex,
and does not store the delimiters, which you do not need
anyway, and does not affect the subsequent code.
- A matter of style and possibly efficiency: I would
much prefer using $i <= $#q as the termination
condition in your for. But the really Perlish way of
doing it would be:
foreach (@q) {
which also automatically assigns $_ for you on each iteration.
- You don't really need to use regular expressions to do
the matching, you could use eq instead, both for clarity
and possibly for efficiency.
- $count++ could be used instead of $count+=1.
- When you are using $_ inside of a foreach, it becomes
an implicit reference to each element of the array, so in
this case it has an associated side benefit: you can assign
to $_ to modify the array, instead of assigning to $q[$i].
- The logic could be rearranged to only check against
$matchchar once.
- As per the original specification of the problem, you
are to replace starting with the Nth occurrence, so the check
should be $count >= $nummatch.
- You don't need to use @q[0 .. $#q]! Saying @q
by itself represents the whole array.
So here's my first rewrite of the main section of your program:
@q = split (//, $p);
foreach (@q) {
$count++ if $_ eq $matchchar;
if ($_ eq $matchchar && ($count >= $nummatch)) {
$_ = $repchar;
}
}
$out = join ("", @q);
print "out === $out";
Restructuring the insides of the loop, we can get:
foreach (@q) {
if ($_ eq $matchchar) {
if (++$count >= $nummatch) {
$_ = $repchar;
}
}
}
Now compressing the two if's, we get:
foreach (@q) {
if ($_ eq $matchchar && ++$count >= $nummatch) {
$_ = $repchar;
}
}
Now, notice that we are assigning one value to $_ when a
certain condition is satisfied, and another (actually leaving
its old value) when it's not. So we could use the
conditional operator to eliminate the if altogether:
foreach (@q) {
$_ = ($_ eq $matchchar && ++$count >= $nummatch)?$repchar:$_;
}
And now, notice that we are using the foreach to compute
a value based on each element of @q. Ideal use of map!
@q = map { ($_ eq $matchchar &&
++$count >= $nummatch)?$repchar:$_ } @q;
And now we don't need to initially asign the result of
the split to @q, because all we are doing with it is
passing it as argument to map, so we can do:
@q = map { ($_ eq $matchchar &&
++$count >= $nummatch)?$repchar:$_ }
split(//, $p);
And finally, we can eliminate @q altogether because we can
pass the result of the map directly to the join:
$out = join ("",
map { ($_ eq $matchchar &&
++$count >= $nummatch)?$repchar:$_ }
split (//, $p));
Proof that any program can be transformed to
a one-liner in Perl :-)
Man that was fun :-)
--ZZamboni
| [reply] [d/l] [select] |
|
This was great! Thanks!
I spent a huge amount of time trying to figure out how to split on characters, I tried everything except what you used. (I was at work, and so only had the man pages to work from).
All of your tips were greatly appreciated. I didn't know that foreach automatically assigned $_. Useful knowledge, that.
I like the result much better than my original, and it was very cool to see how to use map.
Still Learning,
sharle
| [reply] |
Re: Replacing a given character starting with the xth occurence in a string
by Zaxo (Archbishop) on May 22, 2001 at 07:21 UTC
|
Here's another way that avoids a lot of fuss. It's rather like quent's but avoids unneeded churning on the counter.
It also works with multicharacter substitutions and, with some odd results unless the start count < 1, the target can be a regex.
#!/usr/bin/perl -w # -*-Perl-*-
use strict;
sub zaxo {
my ($str,$chr,$new,$nbr)=@_;
$str=~s/$chr/(--$nbr>0)?$chr:$new/geo;
$str;
}
my $s = "Terence and Philip are sweet\n";
my $c = 'e';
my $r = 1;
my $n = 3;
print zaxo( $s,$c,$r,$n);
exit;
Against the bbfu benchmark posted above (corrected &mine output) I get:
Benchmark: timing 500000 iterations of BBFU'S, MINE, Quent's, Zaxo's..
+.<BR/>
BBFU'S: 12 wallclock secs (12.18 usr + 0.05 sys = 12.23 CPU)
MINE: 13 wallclock secs (12.32 usr + 0.06 sys = 12.38 CPU)
Quent's: 13 wallclock secs (13.40 usr + 0.03 sys = 13.43 CPU)
Zaxo's: 10 wallclock secs (10.59 usr + 0.00 sys = 10.59 CPU)
I must to agree with MeowChow that this is a litmussy kind of question.
After Compline,
Zaxo
| [reply] [d/l] [select] |
Re: Replacing a given character starting with the xth occurence in a string
by Dr. Mu (Hermit) on May 22, 2001 at 10:53 UTC
|
The following works without using any explicit
iterations. I haven't clocked its execution against the
others, though.
$s = 'abc nbc cbs fox hbo sho cnn rox';
($orig, $num, $subst) = ('n', 2, '-');
$s =~ s/^(([^$orig]*$orig){$num})(.*)/$1/;
($t = $3) =~ s/$orig/$subst/g;
$s .= $t;
print $s
This prints:
abc nbc cbs fox hbo sho cn- rox
It works by cutting the string into two parts, applying the
substitution to the second part (if there is one) and
rejoining them. | [reply] [d/l] [select] |
Re: Replacing a given character starting with the xth occurence in a string
by sharle (Acolyte) on May 21, 2001 at 10:09 UTC
|
Um, I'm pretty new to perl, but your problem is very simple really. I think everyone has tried to make it much more complicated a problem than it really is.
Try this:
#!/usr/perl -w
my p$ = 'Terrence and Phillip are sweet';
p$ =~ s/w/1/g;
p$ will then contain "T1rr1nc1 and Phillip ar1 sw11t", I think you'll find.
sharle | [reply] [d/l] |
|
You wouldn't be the first person here to misread (and therefore
mis-answer) a question, so we might all want to be lenient on that
account (but do go back and read the original question again, it
isn't quite a simple substitution question). But everyone should take
the time to test the code that they post at least minimally. Your
code does not compile because the variable should be $p not p$, and
making that change doesn't give the result you stated because your
code replaces w's instead of e's. It looks as if you are trying to be
careful by using -w and my() variables, so maybe you tested code and
then retyped it for submission, making typos as you went. Copy and
paste is a much safer way to participate in online forums.
| [reply] |
|
You're right of course, my code should have been:
#!/usr/bin/perl -w
my $p = 'Terrence and Philip are sweet';
$p =~ s/e/1/g;
print "$p";
which is a direct copy and paste, but still doesn't give the desired result, I see after re-reading. I'll have to think about that a little harder. Now I see why it wasn't as easy as I thought at first go.
sharle | [reply] [d/l] |
|
Re: Replacing a given character starting with the xth occurence in a string
by tachyon (Chancellor) on May 22, 2001 at 18:13 UTC
|
Complete bastard shows how monk's code can be shoretened by
50+% and still do same.......
# ------------- short summary
my $p = 'Terrence and Phillip are sweet';
my ($count,$matchchar,$nummatch,$repchar) = qw(0 e 3 1);
my @q = split//,$p;
for(@q) {
next unless /$matchchar/;
$count ++;
$_ = $repchar if $count > $nummatch;
}
print 'out === ',@q
# ------------- end short summary.
# First you are not using strict which would have pointed
# out the fact that you declare a var $rep but use a var called
# $repchar and forget to declare $i with my.
#
#!/usr/bin/perl -w reg.pl
# my ($p, @q, $matchchar, $nummatch, $rep, $count, $q, $out);
# $p = 'Terrence and Phillip are sweet';
# $count = 0;
# $matchchar = "e";
# $nummatch = 3;
# $repchar = "1";
#
# Bulk my declaration at top kind of looses the scoping value of
# my as all these vars are effectively global, although it makes
# no difference in your script I like to keep my 'mys' local so
# as to speak
# $rep is unused, this is supposed to be $repchar but with no
# strict you have not allowed perl to tell you.
# I have declared and assigned vars almost all at once here
#
my $p = 'Terrence and Phillip are sweet';
my ($count,$matchchar,$nummatch,$repchar) = qw(0 e 3 1);
#
# @q = split (/(.*?)/, $p);
#
# Your split syntax is an odd way to split on null, how about
#
my @q = split//,$p;
#
# for($i = 0; $i < $#q + 1; $i++) {
# $_ = $q[$i];
# $count += 1 if (/$matchchar/);
# if (/$matchchar/ && ($count > $nummatch)) {
# $q[$i] = $repchar;
# }
# }
#
# You can clean this iteration up heaps.
# You could write
# for my $i(0..$#q) {
# which would shorten it down and cure the missing my but there
# is more. All you need is a
# for(@q) {
# within this loop each element of the @q array is aliased to the
# magical variable $_ If we modify $_ we modify that array element.
# Also your logic can be improved as you test for a match to
# $matchchar twice wich is unecessary. I use a next unless
# construct as it makes it obvious what this loop does - if we do
# not /$matchchar/ we do the next iteration, read no further!
# So without further ado, let's just do:
#
for(@q) {
next unless /$matchchar/;
$count ++;
$_ = $repchar if $count > $nummatch;
}
#
# $out = join ("", @q[0 .. $#q]);
# print "out === $out";
#
# join'',@q is much shorter than join ("", @q[0 .. $#q]) and does
# the same but but this is shorter.
# print "out ===", join'',@q;
# has the same effect as these two lines and skips the unecessary
# asignment to $out
# As print @foo is the same as print join '', @foo;
# we can shorten this further to:
#
print 'out === ',@q;
#
# BTW
# print "@foo";
# is the same as:
# print join $", @foo;
# The output record seperator $" is set to ' ' by default, but you
# can set it to anything you want. So $"='';print "out === @q"; is
# yet another variation.
# Here is my version of your code, which is remarkably like the
# code I did not post in the first place as the Camel book says
# when you think you want to chop a string up into substrings
# what you really eant is the \G asertion.
=pod
#!/usr/bin/perl
use strict;
my $p = 'Terrence and Phillip are sweet';
my ($count,$matchchar,$nummatch,$repchar) = qw(0 e 3 1);
my @q = split//,$p;
for(@q) {
next unless /$matchchar/;
$count ++;
$_ = $repchar if $count > $nummatch;
}
print 'out === ',@q;
=cut
| [reply] [d/l] |
Re: Replacing a given character starting with the xth occurence in a string
by zeidrik (Scribe) on May 22, 2001 at 16:58 UTC
|
Here is possibly the most compact way (not the fastest yet)
my $s = "Terence and Philip are sweet\n";
my $c = 'e';
my $r = 1;
my $n = 3;
($s=~/((.*?)$c){$n}/)&&(($s,$_)=($&,$'))&&(s/$c/$r/g)&&(print $s,$_);
| [reply] [d/l] |
|
| [reply] [d/l] [select] |