Re: (jeffa) 3Re: More Variable length regex issues

(jeffa) 5Re: More Variable length regex issues by jeffa (Bishop) on Jun 09, 2003 at 04:36 UTC
You keep on using that word 'split' ... I do not think you know what it means. ;) Consider the following: `my $str = 'foo,bar,moo,cow'; my @value = $str =~ m/(\w+)\,?/g; print "@value\n"; @value = split(',',$str); print "@value\n"`
They both achieve the same result, and guess which one is easier to understand? You say you have non-repeatable fields; how does using a regex make this easier than split? What do you think split uses to split? A regex! Besides, oro has a family of split functions. You could always do a series of splits if multiple delimiters are used: `my $str = 'a,b,c:d,e,f:g,h,i'; my @part = split(':',$str); foreach my $part (@part) { my @subpart = split(',',$part); print "@subpart\n"; }`
The split functions found in the org.apache.oro package can do this; you just have to jump through more hoops. ;) Not that it matters, but one of my beefs about Java is not being able to process lists as easily as you can in Perl: `print $_,$/ for map split(',',$_), split(':', $str);`
Best of luck.
jeffa L-LL-L--L-LL-L--L-LL-L-- -R--R-RR-R--R-RR-R--R-RR B--B--B--B--B--B--B--B-- H---H---H---H---H---H--- (the triplet paradiddle with high-hat)
Re: (jeffa) 5Re: More Variable length regex issues by dextius (Monk) on Jun 09, 2003 at 05:14 UTC
I am not explaining this issue clearly. Your examples do not match my criteria because I have not fully explained my problem; I apologize. I have a string of characters that all use the same delimiter. Some of the fields are mandatory, some are optional, and some may be repeated infinitely. I want to extract those values AND validate the fields all at once within a single regular expression, and I want the values to be available to me afterward. A simple example: `use Data::Dumper; my $foo = "one,123,a s d f,a,b,c,d,e,f,g,h"; my @bar = $foo =~ /^([a-z]{3}),([0-9]{3}),([a-z\s]{1,7}),(?:([a-z]),|([a-z]$)){1,}/; print Dumper(\@bar);`
Consider everything after the 3rd element to repeat, possibly to infinity, but we need to make sure each one is a single character; otherwise I want the entire regex to fail immediately. Again, thank you for your time. You have spent more than enough of it working with me, and I very much appreciate it.
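A side note on the pattern above: in Perl, a capture group inside a quantified group keeps only the text from its last repetition, so a single regex cannot hand back every repeated field through numbered captures. The following is a minimal sketch of that behaviour; the helper name `repeated_captures` is hypothetical, and the pattern is a simplified variant of the one posted, not the original.

```perl
use strict;
use warnings;

# Hypothetical helper: match a record whose trailing single-letter
# fields repeat, and return whatever the capture groups hold.
sub repeated_captures {
    my ($str) = @_;
    # Group 4 sits inside a repeated (?:...)+, so each repetition
    # overwrites the previous one; only the last letter survives.
    return $str =~ /^([a-z]{3}),([0-9]{3}),([a-z\s]{1,7})(?:,([a-z]))+$/;
}

my @got = repeated_captures('one,123,a s d f,a,b,c,d,e,f,g,h');
print join('|', @got), "\n";    # prints: one|123|a s d f|h
```

The match succeeds and validates every trailing field, but `a` through `g` are not available afterward, which is why the replies keep pointing at split for the extraction step.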
Re: Re: (jeffa) 5Re: More Variable length regex issues by thor (Priest) on Jun 09, 2003 at 05:30 UTC
Whoa, whoa, whoa there. Why do you have the (arbitrary) requirement that everything has to be done in the regex? IMHO, long regexen are what lead to the stereotype of Perl looking like line noise. I would suggest using split and then validating the elements that you need to validate in separate statements. If you'd like, you can gather up your validation and pack it into a subroutine. Just try to think of the poor bastard who has to come behind you and maintain the code.
Also, a minor nit: `"infinite" ne "arbitrary"`. If there were an infinite number of fields, not only would you have run out of disk space by now, but you couldn't do anything with the data, since you couldn't hold it in memory. ;) Arbitrary means "as much as you want", whereas infinite means "without end".
thor
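The split-then-validate suggestion could be sketched roughly as below. The subroutine name `parse_record` and the error messages are hypothetical, and the per-field rules are assumptions lifted from the regex posted earlier in the thread (three lowercase letters, three digits, one to seven letters or spaces, then one or more single letters).

```perl
use strict;
use warnings;

# Hypothetical helper: split a record on commas, then validate each
# piece in its own statement. Returns the fields, or dies with a message.
sub parse_record {
    my ($line) = @_;
    # A limit of -1 keeps trailing empty fields, so "a,b," is caught below.
    my ($word, $num, $text, @rest) = split /,/, $line, -1;

    die "field 1 must be three lowercase letters\n"
        unless defined $word && $word =~ /^[a-z]{3}\z/;
    die "field 2 must be three digits\n"
        unless defined $num && $num =~ /^[0-9]{3}\z/;
    die "field 3 must be 1-7 lowercase letters or spaces\n"
        unless defined $text && $text =~ /^[a-z\s]{1,7}\z/;
    die "at least one trailing field is required\n" unless @rest;

    # Every remaining field must be exactly one lowercase letter.
    for my $i (0 .. $#rest) {
        die "trailing field $i is bad: '$rest[$i]'\n"
            unless $rest[$i] =~ /^[a-z]\z/;
    }
    return ($word, $num, $text, @rest);
}

my @fields = parse_record('one,123,a s d f,a,b,c');
print scalar(@fields), " fields ok\n";    # prints: 6 fields ok
```

Each rule lives on its own line, so a maintainer can change one field's format without re-reading the whole pattern, and the error message says which field failed.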
Re: Re: Re: (jeffa) 5Re: More Variable length regex issues by BrowserUk (Patriarch) on Jun 09, 2003 at 09:48 UTC
Re: Re: Re: Re: (jeffa) 5Re: More Variable length regex issues by thor (Priest) on Jun 09, 2003 at 12:28 UTC
(jeffa) 7Re: More Variable length regex issues by jeffa (Bishop) on Jun 09, 2003 at 07:42 UTC
Do you still think that you have to perform this task with one regular expression (and a horribly unreadable, broken one at that)? split is perfectly capable of stopping after it finds, say, the 3rd element. Then you can do something different with the rest: `use Data::Dumper; my $foo = 'one,123,a s d f,a,b,c,bad,e,f,g,h'; my @first = split(',',$foo,4); my @rest = split(',',pop @first); print Dumper \@first, \@rest; for (0..$#rest) { die "index $_ is bad: '$rest[$_]'" if length($rest[$_]) != 1; }`
Think in chunks. Don't try to swallow the whole pill at once.
jeffa
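If failing the whole record up front still matters, one possible middle ground is to let a capture-free regex check the overall shape and then let split do the extraction in chunks. This is only a sketch: `$SHAPE` and `chunk_record` are hypothetical names, and the field rules are assumed from the pattern posted earlier in the thread.

```perl
use strict;
use warnings;

# Assumed field rules, taken from the pattern posted earlier in the thread.
# No capture groups: this regex only answers "is the record well formed?".
my $SHAPE = qr/^[a-z]{3},[0-9]{3},[a-z\s]{1,7}(?:,[a-z])+\z/;

# Hypothetical helper: returns (\@first, \@rest), or an empty list
# when the record does not have the expected shape.
sub chunk_record {
    my ($line) = @_;
    return unless $line =~ $SHAPE;
    my @first = split /,/, $line, 4;      # three fixed fields + remainder
    my @rest  = split /,/, pop @first;    # remainder into single letters
    return (\@first, \@rest);
}

my ($first, $rest) = chunk_record('one,123,a s d f,a,b,c');
print "first: @$first\n";    # prints: first: one 123 a s d f
print "rest: @$rest\n";      # prints: rest: a b c
```

The trade-off is that a failed match says nothing about which field was wrong, whereas the per-element check in the post above reports the offending index.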
