split every second word

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: split every second word by broquaint (Abbot) on Dec 04, 2003 at 15:32 UTC
If you're working with simple words then a quick match should do the trick, e.g.

    my $str = "Bob Builder Tinky Winky Bugs Bunny Mickey Mouse";
    my @names = $str =~ /(\w+ \w+)/g;
    print "[$_]\n" for @names;
    __output__
    [Bob Builder]
    [Tinky Winky]
    [Bugs Bunny]
    [Mickey Mouse]

See `perlre` and `perlop` for more info.

HTH
_________
broquaint
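
A hedged variant of the match above, for input where the spacing between words is irregular or the word count is odd. The helper name `pair_words` is made up for illustration; this is a sketch under those assumptions, not broquaint's code.

```perl
use strict;
use warnings;

# Hypothetical helper: group whitespace-separated words into pairs.
# A trailing unpaired word is returned on its own.
sub pair_words {
    my ($str) = @_;
    my @words = split ' ', $str;    # split ' ' also skips leading whitespace
    my @pairs;
    # splice removes up to two words per pass; the loop ends when none remain.
    while (my @pair = splice(@words, 0, 2)) {
        push @pairs, join ' ', @pair;
    }
    return @pairs;
}

print "[$_]\n" for pair_words("Bob Builder  Tinky Winky Bugs");
```

Unlike the `\w+ \w+` match, this keeps a leftover word instead of silently dropping it, at the cost of normalizing the whitespace inside each pair to a single space.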
Re: split every second word by shockme (Chaplain) on Dec 04, 2003 at 15:56 UTC
broquaint is pretty much right on. Given your example input, it'll work. As to removing spaces, the following will do it:

    my $str = "Bob ";
    $str =~ s/\s$//;    # or s/ $//;
    print "|$str|\n";

If things get any worse, I'll have to ask you to stop helping me.
Re: split every second word by Art_XIV (Hermit) on Dec 04, 2003 at 16:24 UTC
    use strict;

    while (<DATA>) {
        print ">", trim($_), "<\n";
    }

    sub trim {
        my ($text) = @_;
        $text =~ s/^\s+|\s+$//g;    # remove leading/trailing whitespace
        return $text;
    }

    __DATA__
    Cowboy Bebop
    Bubblegum Crisis
    Big O
    Dragonball Z

Hanlon's Razor - "Never attribute to malice that which can be adequately explained by stupidity"
Re:x2 split every second word (don't use a single s/// to trim both ends of a string) by grinder (Bishop) on Dec 04, 2003 at 17:05 UTC
    $text =~ s/^\s+|\s+$//g;

This can be hopelessly inefficient (the regexp engine gets bogged down in the middle of the string, looking for hypothetical end anchors). The longer the string gets, the better it is to write:

    $text =~ s/^\s+//g;
    $text =~ s/\s+$//g;

Consider the following, somewhat pathological cases:

    #! /usr/local/bin/perl -w

    use Benchmark qw/:all/;

    my $long  = ' aaa bbb ccc' . (' ' x 100).'ggg hhh ';
    my $short = ' aaa bbb ccc ddd eee fff ggg hhh ';

    sub one_long {
        my $s = $long;
        $s =~ s/^\s+|\s+$//g;
        $s;
    }

    sub one_short {
        my $s = $short;
        $s =~ s/^\s+|\s+$//g;
        $s;
    }

    sub two_long {
        my $s = $long;
        $s =~ s/^\s+//g;
        $s =~ s/\s+$//g;
        $s;
    }

    sub two_short {
        my $s = $short;
        $s =~ s/^\s+//g;
        $s =~ s/\s+$//g;
        $s;
    }

    print "tests:\n";
    {
        no strict 'subs';
        print "$_ [", &$_, "]\n" for qw/one_long one_short two_long two_short/;
    }

    cmpthese( shift || 1000, {
        one_long  => \&one_long,
        one_short => \&one_short,
        two_long  => \&two_long,
        two_short => \&two_short,
    } );

    __PRODUCES__
    tests:
    one_long [aaa bbb ccc + ggg hhh]
    one_short [aaa bbb ccc ddd eee fff ggg hhh]
    two_long [aaa bbb ccc + ggg hhh]
    two_short [aaa bbb ccc ddd eee fff ggg hhh]
    Benchmark: timing 100000 iterations of one_long, one_short, two_long, two_short...
      one_long:  8 wallclock secs ( 7.13 usr +  0.00 sys =  7.13 CPU) @ 14019.72/s (n=100000)
     one_short:  2 wallclock secs ( 1.62 usr +  0.00 sys =  1.62 CPU) @ 61835.75/s (n=100000)
      two_long:  1 wallclock secs ( 0.62 usr +  0.00 sys =  0.62 CPU) @ 160000.00/s (n=100000)
     two_short:  0 wallclock secs ( 0.63 usr +  0.00 sys =  0.63 CPU) @ 158024.69/s (n=100000)
                  Rate  one_long one_short two_short  two_long
    one_long   14020/s        --      -77%      -91%      -91%
    one_short  61836/s      341%        --      -61%      -61%
    two_short 158025/s     1027%      156%        --       -1%
    two_long  160000/s     1041%      159%        1%        --

It should be obvious from the results that it's good insurance to write it in the two s/// form and be done with it :)
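
For reference, here is a minimal sketch of a `trim` helper written in the two-substitution form recommended above. The sub name mirrors the one used earlier in the thread, and the sample string is invented for illustration.

```perl
use strict;
use warnings;

# Trim helper using one anchored substitution per end of the string.
sub trim {
    my ($text) = @_;
    $text =~ s/^\s+//;    # strip leading whitespace
    $text =~ s/\s+$//;    # strip trailing whitespace, including a newline
    return $text;
}

print ">", trim("  Cowboy Bebop \n"), "<\n";
```

The `/g` modifier is left off here because each anchored pattern can match at most once in a single-line string.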
Re: Re:x2 split every second word (don't use a single s/// to trim both ends of a string) by Art_XIV (Hermit) on Dec 04, 2003 at 18:11 UTC
Grinder - Thanks for the correction and the downloadable code! I have used the 'one_long' regex wayyy too many times w/o realizing there is a better way. Thanks for the enlightenment!

Hanlon's Razor - "Never attribute to malice that which can be adequately explained by stupidity"
Re: split every second word by Anonymous Monk on Dec 04, 2003 at 23:57 UTC
so if i had "hello " it would give me "hello"...will s// /; do it?

Did you try it? Observe:

    $_=1234;
    s// /;
    print
    __END__
     1234
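
To make the difference concrete, here is a small sketch contrasting the empty-pattern substitution with an anchored one. The variable names are invented for illustration, and it assumes no earlier regex has matched in the script, since an empty pattern otherwise reuses the last successful one.

```perl
use strict;
use warnings;

my $padded = "hello ";

# With no earlier successful match, the empty pattern matches the empty
# string at position 0, so this inserts a space rather than removing one.
(my $wrong = $padded) =~ s// /;

# Anchoring on trailing whitespace is what actually removes it.
(my $right = $padded) =~ s/\s+$//;

print "[$wrong]\n[$right]\n";
```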
Re: split every second word by duff (Parson) on Dec 05, 2003 at 00:21 UTC
Not that it helps your particular problem (besides, I think you've been adequately answered), but this would be easy to implement using Perl 6 in terms similar to how you framed your question:

    @names = split m:each:2nd/\s+/, $string;

I.e., split the $string on each 2nd occurrence of one or more whitespace characters.

PerlJam
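
A rough Perl 5 approximation of the same idea, offered as a sketch only. The helper name `split_every_second` is hypothetical, and unlike a real `split` it never produces an empty leading field when the string starts with whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: return the pieces between every second run of whitespace.
sub split_every_second {
    my ($string) = @_;
    # Each match is a word, optionally followed by whitespace and a second
    # word; the whitespace skipped between matches plays the role of the
    # "every 2nd" separator. The whitespace inside a pair is kept verbatim.
    return $string =~ /(\S+(?:\s+\S+)?)/g;
}

my @names = split_every_second("Bob Builder Tinky Winky Bugs Bunny");
print "[$_]\n" for @names;
```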
Re: split every second word by jweed (Chaplain) on Dec 04, 2003 at 15:43 UTC
Well, I'm not entirely sure about the nature of your text. If it is in a file, one name to each line like you say it is (well you say that they basically are):

    Bob Builder
    Tinky Winky
    etc.

then you can simply open the file and read it into an array like this:

    my @names = <FH>;

If they are all on one line like your example, it is a bit trickier. But, since this seems like it might be a homework problem, I'll have you figure it out.

Update: Well, I guess I misread your original post. You left out the "to" in "got to put them on separate lines". Don't mind me.

Who is Kayser Söze?
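
A self-contained sketch of the one-name-per-line case. It reads from an in-memory string as a stand-in for a real file (an assumption made so the example runs on its own) and adds a `chomp`, since each element read this way keeps its trailing newline.

```perl
use strict;
use warnings;

# Stand-in data; a real script would open a file on disk instead.
my $data = "Bob Builder\nTinky Winky\n";

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @names = <$fh>;    # list context reads every remaining line
close $fh;

chomp @names;         # drop the trailing newline from each element
print "[$_]\n" for @names;
```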

