Help to slurp records

Limbic~Region has asked for the wisdom of the Perl Monks concerning the following question:

Re: Help to slurp records - $/ maybe? by Joost (Canon) on May 02, 2003 at 13:38 UTC
`$/ = 'Field 1: '` should work, but it doesn't do what you want: `$/` is the input record separator, i.e. the string that ends each record read, which means that with your data the strings will contain (Update: first line was wrong): `"Field 1: ", "abc Field 2: asdfasdf Field 2: asdfasdfase Field 2: aaa Field 3: ss Field 1: ", "def Field 2: abc123 Field 3: blah Field 1: ", "asdfa"` You probably want `while (<FILE>) { next unless /^Field 1/; # ... }` to skip everything but the Field 1 lines. -- Joost downtime n. The period during which a system is error-free and immune from user input.
Re: Re: Help to slurp records - $/ maybe? by Limbic~Region (Chancellor) on May 02, 2003 at 13:48 UTC
Joost, I can see how this would work - if you know what the data is you are losing, once you slurp it in you just prepend it back on before processing it. I am not sure I want to go this route, but I will keep it in my toolbox. I am always looking for neat ways to do things. Cheers - L~R
Re: Re: Re: Help to slurp records - $/ maybe? by Joost (Canon) on May 02, 2003 at 14:19 UTC
Well, you're not losing any data (with the `$/` trick, that is). The value of the `$/` variable is also put at the end of the read 'line'. If you want to lose that data, use `chomp()`; it removes the current value of `$/` from the end of the string. The only actual problem is matching the first line. -- Joost downtime n. The period during which a system is error-free and immune from user input.
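A minimal sketch of the `$/` plus `chomp()` idea described above, assuming records always start with a line beginning `Field 1: ` and that the input ends with a newline. The helper name `read_records` and the in-memory sample data are made up for illustration; they are not from the original posts.

```perl
use strict;
use warnings;

# Hypothetical helper: read one record per <$fh> call by treating
# "\nField 1: " as the input record separator, then put the prefix back.
sub read_records {
    my ($fh) = @_;
    my $prefix = "Field 1: ";
    local $/ = "\n$prefix";          # a record ends where the next one begins
    my @records;
    my $count = 0;
    while (my $rec = <$fh>) {
        chomp $rec;                  # removes a trailing "\nField 1: " if present
        # Only the first record still has its own prefix; restore it on the rest.
        $rec = $prefix . $rec if $count++ > 0;
        $rec =~ s/\n\z//;            # the last record ends in a plain newline
        push @records, $rec;
    }
    return @records;
}

# Illustrative sample data (an assumption, not the real log format).
my $sample = join "\n",
    "Field 1: abc", "Field 2: asdfasdf", "Field 3: ss",
    "Field 1: def", "Field 2: abc123",   "Field 3: blah", "";

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @records = read_records($fh);
close $fh;

print "[$_]\n" for @records;
```

Only one record is held in memory at a time inside the loop; the sketch collects them in `@records` purely so the result is easy to inspect. For a large log you would process each `$rec` in place instead.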
Re^2: Help to slurp records - $/ maybe? by Aristotle (Chancellor) on May 02, 2003 at 14:59 UTC
Re: Help to slurp records - $/ maybe? by Aristotle (Chancellor) on May 02, 2003 at 14:38 UTC
I'd stick with the conventional approach here. All else is bound to be a dirty/bad hack. You don't like `redo`? Fine, you can use a nested loop. `local $_; do { my $rec = ''; do { $rec .= $_ = <>; } until /^Field 3/ or !defined; # ... } while defined;` Makeshifts last the longest.
Re: Help to slurp records - $/ maybe? by JaWi (Hermit) on May 02, 2003 at 13:52 UTC
You could use the following (coarse, not fully tested, yada yada yada) approach: `#!/usr/bin/perl -w use strict; use warnings; my $data; { local $/ = undef; $data = <DATA>; } print "Record: [$1]\n" while ( $data =~ /(Field 1:.*?Field 3:[^\n]+)/sg ); __DATA__ Field 1: abc Field 2: asdasdasdf Field 2: asdsaads Field 2: asdf Field 3: asfssadfsad Field 1: abc Field 2: asdf Field 3: asfssadfsad Field 1: abc` Which outputs: `Record: [Field 1: abc Field 2: asdasdasdf Field 2: asdsaads Field 2: asdf Field 3: asfssadfsad] Record: [Field 1: abc Field 2: asdf Field 3: asfssadfsad] Record: [Field 1: abc Field 2: asdasdasdf2 Field 2: asdf3 Field 3: asfssadfsad]` HTH, -- JaWi "A chicken is an egg's way of producing more eggs."
Re: Re: Help to slurp records - $/ maybe? by Limbic~Region (Chancellor) on May 02, 2003 at 13:55 UTC
JaWi, Thanks - I already know that trick. The problem is that it isn't scalable: the larger the file you are slurping, the more memory you need. I am processing logs that are in the vicinity of a gigabyte and have millions of records. Cheers - L~R
Re^3: Help to slurp records - $/ maybe? by JaWi (Hermit) on May 02, 2003 at 14:35 UTC
In that case I would go for the "flag option", which isn't as memory-consuming as slurping in the whole file. Interesting problem though... HTH, -- JaWi "A chicken is an egg's way of producing more eggs."
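For reference, a rough sketch of what such a line-by-line "flag" approach might look like, holding only the current record in memory. The thread does not show the original flag code, so the helper name `each_record`, the callback interface, and the sample data are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback once per record, where a record
# runs from one "Field 1:" line up to (not including) the next one.
sub each_record {
    my ($fh, $callback) = @_;
    my $rec;                                    # undef until the first "Field 1:" line
    while (my $line = <$fh>) {
        if ($line =~ /^Field 1:/) {
            $callback->($rec) if defined $rec;  # previous record is complete
            $rec = '';
        }
        $rec .= $line if defined $rec;          # skip anything before the first record
    }
    $callback->($rec) if defined $rec;          # flush the final record
    return;
}

# Illustrative sample data (an assumption, not the real log format).
my $sample = join "\n",
    "Field 1: abc", "Field 2: asdfasdf", "Field 2: aaa", "Field 3: ss",
    "Field 1: def", "Field 2: abc123",   "Field 3: blah", "";

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @seen;
each_record($fh, sub { push @seen, $_[0] });
close $fh;

print "[$_]", "\n" for map { s/\n\z//r } @seen;
```

Memory use here depends on the size of the largest single record, not on the size of the file, which is the property that matters for gigabyte-sized logs.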
Re: Help to slurp records - $/ maybe? by hmerrill (Friar) on May 02, 2003 at 13:38 UTC
IMHO, simpler is better - stick with the iterative approach, as it will be much easier for you and anyone else who looks at the code to understand and maintain it 6 months or a year from now when it needs to be changed. What you don't want to do is create some obfuscated code that takes someone a day to understand and another day to fix. Make it as simple as possible for any Perl person to understand. HTH.
Re: Re: Help to slurp records - $/ maybe? by Limbic~Region (Chancellor) on May 02, 2003 at 13:44 UTC
hmerrill, IMHO, it is much simpler to have `$/ = "";` than it is to have labels, redo statements, and variable flags. Cheers - L~R
Re: Re: Re: Help to slurp records - $/ maybe? by halley (Prior) on May 02, 2003 at 14:18 UTC
If `$/ = "";` works, great. But it won't for the problem as you stated. What happens when someone needs to insert a 'Field 2b' in the format six months down the road? You would have to fully rewrite a grok-at-once gimmick, but you'd only have to add the support for a new field type if it were a sensible loop. -- `[ e d @ h a l l e y . c c ]`
Re: Re: Re: Re: Help to slurp records - $/ maybe? by Limbic~Region (Chancellor) on May 02, 2003 at 14:25 UTC
Re: Re: Re: Help to slurp records - $/ maybe? by hmerrill (Friar) on May 02, 2003 at 15:38 UTC
Debatable - I do concede that the iterative approach means more code, keeping track of flags, etc. - it's not pretty. But I'd have to see a finished slurp example setting `$/ = ""` before judging that to be the better option. In all likelihood the slurp example may indeed come out on top, but its ease of understanding and maintainability would be greatly enhanced by a generous comment spelling out exactly what the thing does. Whatever you decide on as the final winner, post in this thread.
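To make the `$/ = ""` option concrete, here is a small sketch of Perl's paragraph mode. It only applies if records are separated by one or more blank lines, which the thread does not confirm for the real logs (and halley's reply suggests they are not), so the sample data below is purely an assumption. The helper name `paragraph_records` is also made up.

```perl
use strict;
use warnings;

# Hypothetical helper: read blank-line-separated records in paragraph mode.
sub paragraph_records {
    my ($fh) = @_;
    local $/ = "";          # paragraph mode: one or more blank lines end a record
    my @records;
    while (my $rec = <$fh>) {
        chomp $rec;         # in paragraph mode, chomp strips all trailing newlines
        push @records, $rec;
    }
    return @records;
}

# Assumed sample data WITH a blank line between records.
my $sample = "Field 1: abc\nField 2: asdfasdf\nField 3: ss\n"
           . "\n"
           . "Field 1: def\nField 3: blah\n";

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @paragraphs = paragraph_records($fh);
close $fh;

print "[$_]\n" for @paragraphs;
```

If the records run together with no blank line between them, paragraph mode returns the whole input as a single record, which is why it cannot replace the other approaches for the format discussed here.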
Re: Help to slurp records - $/ maybe? (or flip-flop) by broquaint (Abbot) on May 02, 2003 at 15:25 UTC
Sounds like a great use of the flip-flop operator: `my($i, @data) = 0; until(eof FILE) { $data[$i] .= $_ while ($_ = <FILE>) and /^Field 1/ .. /^Field 3/; $i++; }` See `perlop` for more info. HTH _________ broquaint
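A hedged variation on the flip-flop idea, for the case where one record follows another with nothing in between. In that situation the range operator turns true again on the very next `Field 1` line, so testing it for truth alone does not mark the record boundary. This sketch instead looks at the `E0` suffix that scalar `..` appends to its return value on the last line of a range. The helper name `flipflop_records` and the sample data are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collect records that run from a "Field 1" line
# through the next "Field 3" line, using the scalar range (flip-flop) operator.
sub flipflop_records {
    my ($fh) = @_;
    my @data;
    my $rec = '';
    while (my $line = <$fh>) {
        # True from a "Field 1" line through the next "Field 3" line.
        my $in = ($line =~ /^Field 1/) .. ($line =~ /^Field 3/);
        next unless $in;            # skip lines outside any record
        $rec .= $line;
        if ($in =~ /E0\z/) {        # "E0" suffix marks the last line of the range
            push @data, $rec;
            $rec = '';
        }
    }
    return @data;
}

# Illustrative sample data (an assumption, not the real log format).
my $sample = join "\n",
    "Field 1: abc", "Field 2: asdfasdf", "Field 3: ss",
    "Field 1: def", "Field 2: abc123",   "Field 3: blah", "";

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @data = flipflop_records($fh);
close $fh;

print scalar(@data), " records\n";
```

One caveat: the flip-flop keeps its state between calls to the sub, so if the input ends in the middle of a record, a later call would start out already "inside" a range. The sketch is meant to be run once per program, not reused across files.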
Re: Help to slurp records - $/ maybe? by jgallagher (Pilgrim) on May 02, 2003 at 14:55 UTC
This is quite ugly, but I have to go to work and this is just the first thing that popped into my head. It uses the `local $/` like you described. `#!/usr/bin/perl -w use strict; local $/ = "\nField 1: "; my $data = <DATA>; $data =~ s/Field 1: $//; chop($data); print "[$data]\n"; while (<DATA>) { s/Field 1: $//; $_ = "Field 1: $_"; chop; print "[$_]\n"; } __DATA__ Field 1: abc Field 2: asdfasdf Field 2: asdfasdfase Field 2: aaa Field 3: ss Field 1: def Field 2: abc123 Field 3: blah Field 1: asdfa` Output is: `[Field 1: abc Field 2: asdfasdf Field 2: asdfasdfase Field 2: aaa Field 3: ss] [Field 1: def Field 2: abc123 Field 3: blah] [Field 1: asdf]`

