PerlMonks  

split string at variable position with matching

by Anonymous Monk
on Sep 11, 2021 at 19:35 UTC ( [id://11136666]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello

I have a string of variable length that looks like this:

 $string = "hello my * name is Rob * and * I am a very nice person * at least * I think * so"

I need to split the string at the last "*" character that comes before position x, where x = 30. In the example above:

$part1 = "hello my * name is Rob * and *";
$part2 = "I am a very nice person * at least * I think * so"

I suspect there must be some backwards regex in Perl. Any idea?

Replies are listed 'Best First'.
Re: split string at variable position with matching
by choroba (Cardinal) on Sep 11, 2021 at 20:11 UTC
    I'd use rindex to find the last * before position 30. Note that you don't split on /\*/ but on /\* /.

    You can use the look-behind (?<= to specify that the space is preceded by that many characters:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $string = "hello my * name is Rob * and * I am a very nice person * at least * I think * so";
    my $part1 = "hello my * name is Rob * and *";
    my $part2 = "I am a very nice person * at least * I think * so";

    sub split_before {
        my ($string, $pos) = @_;
        my $pos = rindex $string, '*', 30;
        split /(?<=^.{$pos}\*) /, $string, 2
    }

    use Test2::V0;
    is "$part1 $part2", $string;
    is [split_before($string, 30)], [$part1, $part2];
    done_testing();

    BTW,

    (split /(^.{,30}\* )/, $string, 2)[1, 2]
    passes the first test, too (but not the second one, as the space is still part of $part1).

    Update: Read below for a fix, thanks AnomalousMonk. I shouldn't work on two different tasks at the same time.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      sub split_before {
          my ($string, $pos) = @_;
          my $pos = rindex $string, '*', 30;
          split /(?<=^.{$pos}\*) /, $string, 2
      }

      The quoted code has a problem: the $pos subroutine argument (offset of '*' character of '* ' sequence to be used to split on) is masked by another my $pos definition, and then is not used at all. This is easily fixed. With many (largely imaginary) test cases (update: and there are many corner cases uncovered, but my imagination only stretches so far):

      Note that for any $pos to the left of the leftmost '*' in the string, no meaningful split is returned.
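For illustration, here is a minimal sketch of one way to apply the fix described above. It is not the code from the original reply: the variable name `$star` and the choice to return the whole string when no '*' is found are assumptions made for this example.

```perl
use strict;
use warnings;

# Split $string at the last "* " sequence whose '*' lies at or before offset $pos.
# Assumption: when no such sequence exists, the unsplit string is returned.
sub split_before {
    my ($string, $pos) = @_;
    # Use the caller's $pos instead of masking it with a hard-coded 30.
    my $star = rindex $string, '* ', $pos;   # offset of the '*', or -1 if none
    return $string if $star < 0;
    # The look-behind pins the split to the space right after that '*'.
    return split /(?<=^.{$star}\*) /, $string, 2;
}

my $string = "hello my * name is Rob * and * I am a very nice person * at least * I think * so";
print "<$_>\n" for split_before($string, 30);
```

With the example string and position 30 this should print the two parts from the question, each wrapped in angle brackets so that stray whitespace is visible.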


      Give a man a fish:  <%-{-{-{-<

      TIMTOWT-KISS ;)

      Once you have the position with rindex (which is a great idea), why not apply substr instead of a sophisticated regex in split?

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

        I wanted to use "some backwards regex" ;-)

Re: split string at variable position with matching
by tybalt89 (Monsignor) on Sep 11, 2021 at 20:19 UTC
    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11136666
    use warnings;

    my $string = "hello my * name is Rob * and * I am a very nice person * at least * I think * so";
    my $x = 30; # or one less, not sure of the exact requirements

    my ($part1, $part2) = $string =~ /^(.{0,$x}\*)\s*(.*)/;
    print "$_\n" for $part1, $part2;

    Outputs:

    hello my * name is Rob * and *
    I am a very nice person * at least * I think * so
Re: split string at variable position with matching
by LanX (Saint) on Sep 12, 2021 at 00:48 UTC
    Reusing choroba's rindex idea, here is a debugger demo using substr instead of a complicated regex (the p commands are just for printing the intermediate results):
      DB<34> $string = "hello my * name is Rob * and * I am a very nice person * at least * I think * so"

      DB<35> p $pos = rindex $string, ' * ', 30
    28
      DB<36> p $left = substr $string, 0, $pos+3, ""
    hello my * name is Rob * and * 
      DB<37> p $string
    I am a very nice person * at least * I think * so
      DB<38> p "<$left>"
    <hello my * name is Rob * and * >
      DB<39>

    please note the trailing whitespace in $left
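As a rough sketch, the same rindex/substr approach can be wrapped in a sub that leaves the caller's string untouched and drops the trailing whitespace. The name `split_at_star` and the no-separator behaviour are assumptions for this example, not part of the demo above.

```perl
use strict;
use warnings;

# Hypothetical helper: split at the last ' * ' that starts at or before $pos.
# Works on a copy, so the caller's string is not modified.
sub split_at_star {
    my ($string, $pos) = @_;
    my $at = rindex $string, ' * ', $pos;    # offset of the space before the '*'
    return ($string) if $at < 0;             # assumption: no separator, no split
    my $left  = substr $string, 0, $at + 2;  # up to and including the '*'
    my $right = substr $string, $at + 3;     # skip the space after the '*'
    return ($left, $right);
}

my $string = "hello my * name is Rob * and * I am a very nice person * at least * I think * so";
print "<$_>\n" for split_at_star($string, 30);
```

Using the three-argument form of substr here, rather than the four-argument replacement form from the demo, is what keeps the input intact.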

    Cheers Rolf

Re: split string at variable position with matching
by jwkrahn (Abbot) on Sep 12, 2021 at 01:38 UTC
    $ perl -le'$string = "hello my * name is Rob * and * I am a very nice person * at least * I think * so"; print $string; ( $left, $right ) = $string =~ / ^ ( .{1,30} \* ) \s* ( .+ ) /x; print for $left, $right'
    hello my * name is Rob * and * I am a very nice person * at least * I think * so
    hello my * name is Rob * and *
    I am a very nice person * at least * I think * so
Re: split string at variable position with matching
by Anonymous Monk on Sep 12, 2021 at 14:45 UTC

    Caveat: any regex solution that uses look-behind will not work if the width of the look-behind is more than 255 characters -- at least not for Perls 5.34.0 and earlier.
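One way around that limit, sketched here under the assumption that capturing both halves is acceptable, is to avoid look-behind altogether, much like the capture-based regexes earlier in the thread. The helper name `split_by_capture` is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: capture both halves instead of using look-behind,
# so an offset beyond 255 is not a problem for the look-behind limit.
# (The {min,max} quantifier has its own, much larger, upper bound.)
sub split_by_capture {
    my ($string, $pos) = @_;
    # Greedy .{0,$pos} backs off until it is followed by "* ".
    my ($left, $right) = $string =~ /\A(.{0,$pos}\*) (.*)\z/s;
    return defined $left ? ($left, $right) : ($string);
}

# The '*' characters sit at offsets 301 and 308, well past 255.
my $long = ('x' x 300) . ' * tail * end';
print length($_), "\n" for split_by_capture($long, 305);
```

Here the first part should end at the '*' at offset 301, since the next one at 308 lies beyond the requested position.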
