note
Roy Johnson
Your main loop had a couple of features that struck me as odd:
the anchoring of the pattern, and the destruction of one string
to build another. The if structure also had redundant branching
(you're either going to tag or you're not, so you only need one if).
<P>
In pondering how I'd do it, I discovered why you destroy the string
and create a new one: walking through the string and changing it
becomes complicated. You end up getting lost. (The anchor and
the if-structure points are still valid.)
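<P>
To make the "getting lost" part concrete, here is a rough sketch of the forward walk done in place. The sub name <tt>tag_forward</tt> and the <tt>$coin</tt> callback (a stand-in for <tt>int rand 2</tt>, so a run can be made repeatable) are my own inventions for illustration, not anything from your code:

```perl
use strict;
use warnings;

# Hypothetical helper: walk forward, indexing from the front.
# $coin is an assumed callback standing in for (int rand 2).
sub tag_forward {
    my ($text, $open, $close, $coin) = @_;
    $coin ||= sub { int rand 2 };
    my $pos = 0;
    while ($pos < length $text) {
        my $char = substr($text, $pos, 1);
        if ($char =~ /\S/ and $coin->()) {
            my $tagged = "$open$char$close";
            substr($text, $pos, 1) = $tagged;
            # Without this jump we would start re-reading our own tags.
            $pos += length $tagged;
        }
        else {
            ++$pos;
        }
    }
    return $text;
}

# Tag everything, to show the bookkeeping holds up.
print tag_forward("a b", "<b>", "</b>", sub { 1 }), "\n";
```

It works, but every insertion needs that extra position adjustment, which is exactly the bookkeeping that is easy to get wrong.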
<P>
By working from the end of the string and indexing from the front (or
vice versa), you can insert into the string without losing your place: every
insertion lands in the part you have already visited. A C-style
for loop is handy for this:
<code>
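# Walk from the last character back to the first
for (my $pos = length($text); $pos > 0; --$pos)
{
    my $char = substr($text, $pos-1, 1);
    if ($char =~ /\S/ and (int rand 2)) # 0 or 1
    {
        # Offsets before $pos-1 are unaffected by this insertion
        substr($text, $pos-1, 1) = "$opening_tag$char$closing_tag";
    }
}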
</code>
Another technique that came to mind is the one-line main loop:
<code>
$text =~ s/(\S)/(int rand 2) ? "$opening_tag$1$closing_tag" : $1/ge;
</code>
Trying to use a <tt>while(/\S/g)</tt> gets very messy. I was not able to
work it out.
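<P>
A possible way out, offered as a sketch rather than a recommendation, is to restore <tt>pos()</tt> by hand after each edit. As I understand perlop, modifying the target string resets the <tt>//g</tt> search position, which is what makes the naive version go wrong. The names <tt>tag_with_pos</tt> and <tt>$coin</tt> are hypothetical, and <tt>$coin</tt> is an assumed stand-in for <tt>int rand 2</tt>:

```perl
use strict;
use warnings;

# Hypothetical helper; $coin is an assumed stand-in for (int rand 2).
sub tag_with_pos {
    my ($text, $open, $close, $coin) = @_;
    $coin ||= sub { int rand 2 };
    while ($text =~ /\S/g) {
        my $end = pos($text);    # offset just past the matched character
        next unless $coin->();
        substr($text, $end - 1, 0) = $open;                # insert before it
        substr($text, $end + length($open), 0) = $close;   # insert after it
        # Editing the string resets the match position, so put it back
        # just past the closing tag we inserted.
        pos($text) = $end + length($open) + length($close);
    }
    return $text;
}

print tag_with_pos("a b", "<b>", "</b>", sub { 1 }), "\n";
```

Setting <tt>pos()</tt> past the closing tag also keeps the loop from matching the non-space characters inside the tags themselves.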
<P>
Then I started thinking about randomly opening and closing the tag, so
you don't end up having to clean up closes followed immediately by opens.
<P>
With a toggle, you're always going to get about 50% of the characters
tagged; the random number just determines how often you switch between
tagged and untagged. This time we need to walk forward through the string,
so we'll index from the end (the mirror image of the loop above):
<code>
my $tag_is_open = 0;
# $pos counts the characters remaining, so -$pos indexes from the end
for (my $pos = length($text); $pos > 0; )
{
    my $char = substr($text, -$pos, 1);
    if ($char =~ /\S/ and (int rand 2)) # 0 or 1
    {
        if ($tag_is_open) {
            substr($text, -$pos, 1) .= $closing_tag;
            $pos -= 2; # After closing a tag, skip at least one char
        }
        else {
            # $pos stays put, so this same char may also close the tag
            substr($text, -$pos, 0) = $opening_tag;
        }
        $tag_is_open ^= 1; # Toggle
    }
    else { --$pos }
}
$text .= $closing_tag if $tag_is_open; # Don't leave a tag dangling
</code>
Have you ever appended to substr()? I hadn't before, but it works.
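<P>
A minimal demonstration of that, with a made-up string, in case it looks too strange to trust:

```perl
use strict;
use warnings;

my $s = "abc";
substr($s, 1, 1) .= "XY";   # append to the one-character slice "b"
print "$s\n";               # prints "abXYc"
```

The slice is treated as an lvalue, so the append lands immediately after the selected character rather than at the end of the string.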
<P>
<div class="pmsig"><div class="pmsig-300037">
<hr>
<small>The PerlMonk <tt>tr///</tt> Advocate</small>
</div></div>
308740
308740