PerlMonks  

Split using multiple conditions

by juo (Curate)
on Jun 11, 2005 at 11:24 UTC ( [id://465789]=perlquestion )

juo has asked for the wisdom of the Perl Monks concerning the following question:

I have been trying to split a line using multiple conditions but have failed to do so. Does anybody have an idea?

FDR [62.10060.051-F] [62.10051.381] 0 1 0

For example, I want to split the above on spaces, but where there are brackets it should take the whole bracketed string as one field and ignore any spaces inside the brackets. So in total I want to have six fields. I would like to do this in one split line.

# This can only work until the first bracket
my @feeder_line = split/\s+\[/;

Replies are listed 'Best First'.
Re: Split using multiple conditions
by bart (Canon) on Jun 11, 2005 at 11:31 UTC
    That's an official FAQ: perlfaq 4: How can I split a [character] delimited string except when inside [character]?

    Personally, I'd be inclined to use the dual approach: match the stuff between brackets, or nonspaces.

    $_ = 'FDR [62.10060.051-F] [62.10051.381] [this includes spaces!] 0 1 0';
    @parts = /\[.*?\]|[^\[\]\ ]+/g;
    $\ = "\n";
    print for @parts;

    Yes, it can be that compact. Result:

    FDR
    [62.10060.051-F]
    [62.10051.381]
    [this includes spaces!]
    0
    1
    0

    A limitation is that you can't easily treat each single space as a separator this way, so two adjacent spaces never produce an empty string as a field.
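For the single-space case, one possible sketch is a split whose separator is a space that lies outside any bracket pair, detected with a negative lookahead. This is not from the thread: it assumes the brackets are balanced and never nested, and the sub name split_outside_brackets is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: split on each single space that lies outside [...].
# Assumes brackets are balanced and not nested.
sub split_outside_brackets {
    my ($line) = @_;
    # A space counts as a separator unless the next bracket character
    # ahead of it is a closing ']' (which would mean we are inside a pair).
    # The -1 limit keeps trailing empty fields.
    return split / (?![^\[\]]*\])/, $line, -1;
}

# Two spaces after the closing bracket give one empty field.
my @parts = split_outside_brackets('FDR [this includes spaces!]  0 1');
print scalar(@parts), " fields\n";
print "<$_>\n" for @parts;
```

Replacing the leading space in the pattern with \s+ should give the more usual behaviour, where runs of whitespace count as one separator.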

Re: Split using multiple conditions
by mda2 (Hermit) on Jun 11, 2005 at 15:09 UTC
    bart gave a great response! But to address your original question: your split regex needs a quantifier:

    $_ = 'FDR [62.10060.051-F] [62.10051.381] 0 1 0';
    @f1 = split/\s+\[/;       #>> split only \s+ AND [ ...
    @f2 = split/\s+\[?/;      #>> split \s+ OR \s+[ ...
    @f3 = split/\]?\s+\[?/;   #>> split parts, without []...
    print join(" + ", @f1), "\n";
    print join(" + ", @f2), "\n";
    print join(" + ", @f3), "\n";
    __END__
    FDR + 62.10060.051-F] + 62.10051.381] 0 1 0
    FDR + 62.10060.051-F] + 62.10051.381] + 0 + 1 + 0
    FDR + 62.10060.051-F + 62.10051.381 + 0 + 1 + 0

    --
    Marco Antonio
    Rio-PM

Re: Split using multiple conditions
by ikegami (Patriarch) on Jun 11, 2005 at 16:15 UTC

    You can use a single expression like bart showed, but I find the following easier to understand (and maintain):

    # Separate the fields.
    my @feeder_line = split /\s+/;

    # Clean up the data:
    # Remove the brackets from the 2nd and 3rd fields.
    foreach (@feeder_line[1, 2]) {
        s/^\[//;
        s/\]$//;
    }

Re: Split using multiple conditions
by dws (Chancellor) on Jun 11, 2005 at 21:29 UTC

    Another alternative is to remove the brackets first, then split.

    my ($nobrackets = $_) =~ s/(\[|\])//g;
    my @feeder_line = split ' ', $nobrackets;

      Unfortunately, this doesn't do quite what the original poster asked for. Consider the input from bart's code snippet above and plug it into yours:

      $_ = 'FDR [62.10060.051-F] [62.10051.381] [this includes spaces!] 0 1 0';
      ($nobrackets = $_) =~ s/(\[|\])//g;
      @feeder_line = split ' ', $nobrackets;
      $\ = "\n";
      print for @feeder_line;
      __END__
      FDR
      62.10060.051-F
      62.10051.381
      this
      includes
      spaces!
      0
      1
      0

      N.B.: I have removed the "my"s because my ($nobrackets = $_) ... results in the error message Can't use global $_ in "my" at - line 1, near "= $_". The correct syntax is (my $nobrackets = $_) ...
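With the "my" moved inside the parentheses, the bracket-stripping variant runs under strict. A minimal sketch using the sample line from the original question; the variable $line is introduced here for illustration. As the output above shows, this approach still splits on spaces that were inside brackets, so it only suits data shaped like the original line.

```perl
use strict;
use warnings;

my $line = 'FDR [62.10060.051-F] [62.10051.381] 0 1 0';

# Copy first, then edit the copy: the "my" goes inside the parentheses.
(my $nobrackets = $line) =~ s/[\[\]]//g;

# split ' ' splits on runs of whitespace and skips leading whitespace.
my @feeder_line = split ' ', $nobrackets;

print join(' + ', @feeder_line), "\n";
```

On Perl 5.14 or later, the non-destructive substitution flag should allow the same copy-and-edit in one step: my $nobrackets = $line =~ s/[\[\]]//gr;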

Node Type: perlquestion [id://465789]
Approved by polettix