Re: Regular expressions across multiple lines

Welcome to the Monastery, abcd.

The first assignment seems easy; why check for every character with the dot and not just for a newline?

perl -E "say q(found ), $count=()=qq(abcdefab\ncdefa\nbcdef) =~ /a\n?b\n?c/gm, q( [abc] occurences)"

found 3 [abc] occurences

On the other hand, the description you gave of your code does not make much sense to me (and you are probably missing use strict; and use warnings;).

 while ($line=<inputfile>){chomp $line; $string=$string.$line;}

In fact, what I understand is that you are accumulating every new line into $string and attempting the match against every string generated that way: so for a 100-line file you are actually examining 1 + 2 + ... + 100 = 5050 lines. This can be a problem.
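
If the intent is instead to join all the lines first and match only once at the end, a minimal sketch could look like the following. The sub name `count_across_lines` and the in-memory sample data are my own illustrative assumptions, not your actual code or file:

```perl
use strict;
use warnings;

# Hypothetical helper: join all chomped lines, then run the regex once.
sub count_across_lines {
    my ($fh, $pattern) = @_;
    my $string = '';
    while (my $line = <$fh>) {
        chomp $line;          # drop the trailing newline
        $string .= $line;     # append to the accumulated text
    }
    # Match once, after the loop, and count the matches.
    my $count = () = $string =~ /$pattern/g;
    return $count;
}

# Sample data read through an in-memory filehandle (assumption for the demo).
my $data = "abcdefab\ncdefa\nbcdef\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print count_across_lines($fh, qr/abc/), "\n";
close $fh;
```

Because the match runs once on the final string, each line is scanned a single time instead of once per later iteration.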

There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.


Replies are listed 'Best First'.
Re^2: Regular expressions across multiple lines by abcd (Novice) on Apr 24, 2016 at 17:44 UTC
Thanks, I will try this, but I am new to programming, so I am not sure I understand your code. What I was doing in my code was simply to chomp every line and append it to the end of a string, so I get a single string containing everything without any newlines, which I then search.
Re^3: Regular expressions across multiple lines by Marshall (Canon) on Apr 24, 2016 at 18:01 UTC
Please make this crystal clear: you wrote "but for some reason doesnt work with my actual txt file which is several hundred mb". I am presuming that "slow", maybe many minutes, even tens of minutes, is NOT the issue?
Re^4: Regular expressions across multiple lines by abcd (Novice) on Apr 24, 2016 at 18:17 UTC
No, it is not the time. If I output the string to a txt file, it creates the file in a few seconds, but when I open the file in a text editor it seems corrupt, with characters displayed one on top of another.
Re^5: Regular expressions across multiple lines by afoken (Chancellor) on Apr 24, 2016 at 19:23 UTC
Re^5: Regular expressions across multiple lines by Marshall (Canon) on Apr 24, 2016 at 18:41 UTC
Re^6: Regular expressions across multiple lines by Anonymous Monk on Apr 24, 2016 at 18:57 UTC
Re^6: Regular expressions across multiple lines by Anonymous Monk on Apr 24, 2016 at 18:57 UTC
Re^3: Regular expressions across multiple lines by Discipulus (Canon) on Apr 24, 2016 at 19:00 UTC
You're welcome, even if I'm not sure I understand your issue.
Basically, `a\n?b\n?c` means: match `a`, followed by perhaps (`?`) a newline (`\n`), followed by `b`, followed by perhaps (`?`) a newline (`\n`), and then `c`. The `m` regex modifier (unneeded in my example, because the pattern has no `^` or `$`) stands for multiline, and the `g` one means globally, i.e. all occurrences are returned.
The `$count=()=$string=~/pattern/g` idiom is used to count the occurrences of `pattern` in `$string`: in fact, `$string=~/pattern/g` with the `g` returns a list in list context; the empty list `()` provides that list context, and the scalar value of the list assignment (i.e. the number of elements) is assigned to the scalar `$count`.
For brevity I put your example data into a double-quoted string using the qq operator, `qq(abcdefab\ncdefa\nbcdef)`; the rest is only printing stuff.
If you want to slurp a file into a string you can play with `$/`, a.k.a. the input record separator; see `perlvar` and "How do I read an entire file into a string?"
L*
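A minimal sketch of the two ideas above (slurping via `$/` and the counting idiom), written as a script rather than a one-liner. The sub names `slurp` and `count_abc` and the in-memory filehandle are hypothetical, chosen only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: read everything from a filehandle into one string.
sub slurp {
    my ($fh) = @_;
    local $/;               # undef input record separator: read all at once
    return scalar <$fh>;
}

# Hypothetical helper: count "abc" even when newlines split it.
sub count_abc {
    my ($text) = @_;
    # The empty list () forces list context on the match; the list
    # assignment evaluated in scalar context yields the number of matches.
    my $count = () = $text =~ /a\n?b\n?c/g;
    return $count;
}

# Same sample data as in the one-liner, read through an in-memory filehandle.
my $data = "abcdefab\ncdefa\nbcdef";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $text = slurp($fh);
close $fh;

print "found ", count_abc($text), " [abc] occurrences\n";
```

The `local $/;` keeps the change to the input record separator confined to the sub, so other reads in the program still work line by line.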
Re^3: Regular expressions across multiple lines by pryrt (Abbot) on Apr 24, 2016 at 19:33 UTC
Assuming you're really only looking for a short string, and given the size of your file, I would be tempted to concatenate each new line only with the last few non-space characters from the previous line(s), and do the comparison on every loop iteration.
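A rough sketch of that idea, under some assumptions of mine: the search target is a fixed literal that cannot overlap itself (such as `abc`), the carried-over tail is simply the last `length - 1` characters of the buffer, and the sub name `count_streaming` is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: count a short literal that may be split across
# line breaks, without ever holding the whole file in memory.
sub count_streaming {
    my ($fh, $literal) = @_;
    my $keep  = length($literal) - 1;   # longest tail that can begin a match
    my $tail  = '';
    my $count = 0;
    while (my $line = <$fh>) {
        chomp $line;
        my $buffer = $tail . $line;     # previous tail + current line
        $count++ while $buffer =~ /\Q$literal\E/g;
        # Carry over at most $keep characters. The tail is shorter than the
        # literal, so a match already counted cannot be counted again.
        $tail = $keep == 0               ? ''
              : length($buffer) > $keep  ? substr($buffer, -$keep)
              :                            $buffer;
    }
    return $count;
}

# Sample data via an in-memory filehandle (assumption for the demo).
my $data = "abcdefab\ncdefa\nbcdef\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print count_streaming($fh, 'abc'), "\n";
close $fh;
```

Memory use stays proportional to the longest line rather than to the file size, which matters for an input of several hundred MB.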

