Apply regex to entire file, not just individual lines ?

Replies are listed 'Best First'.
Re: Apply regex to entire file, not just individual lines ? by nuance (Hermit) on May 24, 2000 at 17:46 UTC
You can read the entire file into a scalar variable like this: `{ open(FILE, $filename) or die "Can't open $filename: $!\n"; local $/ = undef; $lines = <FILE>; close(FILE); }` Then you can use your normal regular expression, but you'll probably want at least one of the following modifiers (from the perlre manpage):
m Treat string as multiple lines. That is, change ``^'' and ``$'' from matching at only the very start or end of the string to the start or end of any line anywhere within the string.
s Treat string as single line. That is, change ``.'' to match any character whatsoever, even a newline, which it normally would not match.
The /s and /m modifiers both override the $* setting. That is, no matter what $* contains, /s without /m will force ``^'' to match only at the beginning of the string and ``$'' to match only at the end (or just before a newline at the end) of the string. Together, as /ms, they let the ``.'' match any character whatsoever, while yet allowing ``^'' and ``$'' to match, respectively, just after and just before newlines within the string.
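To make the difference between the two modifiers concrete, here is a minimal sketch. The sample text in `$text` is a made-up stand-in for a slurped file, and the variable names are only illustrative.

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for a slurped file.
my $text = "first line\nsecond line\nthird line\n";

# Without /m, ^ matches only at the very start of the string.
my @plain = $text =~ /^(\w+)/g;

# With /m, ^ also matches just after every newline.
my @multi = $text =~ /^(\w+)/mg;

# Without /s, . will not cross a newline; with /s it will.
my $no_s  = $text =~ /first.*third/  ? 1 : 0;
my $yes_s = $text =~ /first.*third/s ? 1 : 0;

print "plain: @plain\n";
print "multi: @multi\n";
print "no_s=$no_s yes_s=$yes_s\n";
```

Here `@plain` ends up with one word and `@multi` with three, while only the /s version of the second pattern matches.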
Re: Apply regex to entire file, not just individual lines ? by juahonen (Novice) on May 24, 2000 at 17:22 UTC
After you've opened and read the file (or web page) into an array, join all lines with join(): `open(FILE, $filename) or die "Can't open $filename: $!\n"; @lines = <FILE>; close(FILE); $content = join('', @lines);` After this, $content holds the whole file as one string (newlines included), so it is easy to apply a regexp with your existing functions.
Re: Apply regex to entire file, not just individual lines ? by vxp (Pilgrim) on Aug 16, 2002 at 16:14 UTC
You might not want to have your WHOLE file in one variable. Depending on the size of the file, it could eat a LOT of your memory. In my experience it is usually enough to set `$/ = "\n\n"` (double quotes, so that `\n` is interpolated as a real newline), which makes the record separator two newlines instead of one. I was parsing a bounce file when I did this, which was about 300 MB in size, daily. That's a LONG 300 MB line. `$/ = "\n\n";` took care of it: I ended up with smaller big lines and was able to do what I wanted without consuming a lot of RAM.
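A rough sketch of the record-at-a-time approach follows. The bounce-log contents, the `From:`/`Status:` field names, and the pattern are all invented for illustration, and an in-memory filehandle stands in for the real 300 MB file.

```perl
use strict;
use warnings;

# Hypothetical bounce-log data; the in-memory filehandle stands in for a real file.
my $data = "From: a\@example.com\nStatus: 5.1.1\n\n"
         . "From: b\@example.com\nStatus: 4.2.2\n\n"
         . "From: c\@example.com\nStatus: 5.1.1\n";
open(my $fh, '<', \$data) or die "Can't open in-memory file: $!\n";

my @failed;
{
    local $/ = "\n\n";   # double quotes, so these are real newlines
    while (my $record = <$fh>) {
        # Each $record is one blank-line-separated block, so the pattern
        # can span several lines without the whole file being in memory.
        if ($record =~ /^From: (\S+)\n.*^Status: 5\./ms) {
            push @failed, $1;
        }
    }
}
close($fh);

print "$_\n" for @failed;
```

Only one record is held in memory at a time, which is the point of changing `$/` rather than slurping. The trade-off is that a pattern can never match across a record boundary.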
Re: Apply regex to entire file, not just individual lines ? by dsb (Chaplain) on Jan 24, 2001 at 02:49 UTC
The key is to get the whole file into one scalar (the first 'while' loop). Then the 'g' modifier (the condition in the second 'while' loop) keeps the position of the last match and continues from there until no more matches are found. `open( FH, "filename" ) || die "couldn't open\n"; while ( <FH> ) { $data .= $_; } while ( $data =~ m/PATTERN/g ) { # executed code # executed code...etc. }` -kel
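As a small illustration of how /g in a `while` condition walks through the string, consider the sketch below. The contents of `$data` and the `id=` pattern are assumptions made up for the example.

```perl
use strict;
use warnings;

# Hypothetical file contents already collected into one scalar.
my $data = "id=10\nname=foo\nid=25\nid=7\n";

my @ids;
while ($data =~ /^id=(\d+)$/mg) {
    # pos() reports where the last match ended; the next
    # iteration of the loop resumes searching from there.
    printf "found %s, match ended at offset %d\n", $1, pos($data);
    push @ids, $1;
}

print "ids: @ids\n";
```

Each pass through the loop picks up one match, and the loop ends when the pattern fails, at which point the saved position is reset.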
RE: Apply regex to entire file, not just individual lines ? by KM (Priest) on May 25, 2000 at 08:28 UTC
If the only trouble you are having is that it isn't writing to a file, the cause is that you are not printing to a filehandle. Look at the open() docs (perldoc -f open) and perlopentut to learn the different ways to open a file and write to it. Cheers, KM
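For completeness, a minimal sketch of printing the modified text to a filehandle. The temporary file from File::Temp is only a stand-in for whatever output path the original poster had in mind, and the sample text and pattern are invented.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A temporary file stands in for the real output path (hypothetical).
my ($tmp_fh, $outfile) = tempfile(UNLINK => 1);
close($tmp_fh);

# Hypothetical slurped text with one unwanted line removed.
my $lines = "keep this\nunwanted stuff\nkeep that\n";
$lines =~ s/^unwanted.*\n//mg;

# Open for writing and print to the filehandle, not to STDOUT.
open(my $out, '>', $outfile) or die "Can't write $outfile: $!\n";
print {$out} $lines;
close($out) or die "Can't close $outfile: $!\n";

# Read the file back to confirm what was written.
open(my $in, '<', $outfile) or die "Can't read $outfile: $!\n";
my $written = do { local $/; <$in> };
close($in);

print $written;
```

A bare `print $lines;` would go to STDOUT; naming the handle is what sends the text to the file.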
RE: Apply regex to entire file, not just individual lines ? by perlcgi (Hermit) on May 25, 2000 at 15:09 UTC
Remember that if the unwanted stuff appears more than once per line you'll need a `/g` to match globally. `$lines =~s/^unwantedstuff//gsm`
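The effect of `/g`, and of the `^` anchor combined with `/m`, can be seen in a short sketch. The sample string and the word `foo` are placeholders, not anything from the original question.

```perl
use strict;
use warnings;

# Hypothetical slurped text in which "foo" is the unwanted stuff.
my $lines = "foo bar foo\nfoo baz\n";

# Without /g only the first occurrence in the whole string is removed.
(my $once = $lines) =~ s/foo//;

# With /g every occurrence is removed.
(my $all = $lines) =~ s/foo//g;

# With ^ and /m, /g removes it only where it begins a line.
(my $starts = $lines) =~ s/^foo//mg;

print "once:   $once";
print "all:    $all";
print "starts: $starts";
```

Note that with the `^` anchor in place, occurrences in the middle of a line are left alone, so drop the anchor if those should go too.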


	PerlMonks