PerlMonks
Re^2: Regex Extraction Help
by Flexx (Pilgrim) on Aug 09, 2012 at 17:02 UTC ( [id://986558]=note )
Well, I guess all the fields are variable, and what invaderzard meant was to get that second field. So I'd suggest this:
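A minimal sketch of such a pattern, pieced together from the step-by-step description in this post rather than copied from it. The sample record and the helper name `second_field` are made up for illustration; only `$line` comes from the discussion.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the second semicolon-delimited field of a
# record with surrounding whitespace removed, or undef if the pattern
# does not match (e.g. fewer than two semicolons).
sub second_field {
    my ($line) = @_;
    return $line =~ /^[^;]*;\s*([^;]*?)\s*;/ ? $1 : undef;
}

# Hypothetical sample record; the real input is not shown in this post.
my $line  = 'alpha;  second field  ;gamma;delta';
my $field = second_field($line);

print defined $field ? "second field: '$field'\n" : "no second field found\n";
```

With the sample record above this should print the second field without its leading and trailing spaces.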
Now, what am I doing here? First I say: let's start at the beginning (^). This is important, since we can't exclude the possibility that the pattern repeats in one instance of $line. Next, I say: give me zero or more non-semicolon characters ([^;]*), followed by exactly one semicolon (;). Now our "cursor" would be in the second field, so to speak.

We say, well, there might or might not be some leading space (\s*). Then comes the data we want, which is why we use parentheses to capture it. What do we wanna capture? Well, again, anything that's not a semicolon ([^;]*?), but this time non-greedily (using the *? quantifier). That's because we want any trailing space to go into the \s* that follows, instead of being captured. Lastly, we need to require that the field is terminated by exactly one semicolon (;).

If you want to capture other fields as well, then a solution using split, as has been suggested below, is a more efficient way of doing it. If you want just a few fields of a long CSV record (which this seems to be, only delimited by semicolons instead of commas), then you could also expand on the regexp above, which might be a bit more performant than split. But I didn't really check that with benchmarks. It's just an inkling I have, and it depends heavily on the length of the input and the number of fields in it.

Cheers,