
a|b||d becomes a|b|\N|d
|b|c|d becomes \N|b|c|d
a|b|c| becomes a|b|c|\N
and similarly,
a|b|.|d becomes a|b|\N|d
but
.|b|c|d does not become \N|b|c|d
a|b|c|. does not become a|b|c|\N
Is that a bug?

If the above is a bug, the following regexps are probably faster:

s/\s*\|\s*/\|/g;
s/^\.?(?=\|)/\\N/;
s/(?<=\|)\.?(?=\||$)/\\N/g;
s/(?<=\d{2}:\d{2}:\d{2})\.\d+//g;
s/(?<=\d{5})-(?:\d{1,4}|\s+)//;

If the above is not a bug, the following regexps are probably faster:

s/\s*\|\s*/\|/g;
s/^(?=\|)/\\N/;
s/(?<=\|)(?=\||$)/\\N/g;
s/(?<=\|)\.(?=\|)/\\N/g;
s/(?<=\d{2}:\d{2}:\d{2})\.\d+//g;
s/(?<=\d{5})-(?:\d{1,4}|\s+)//;

I reduced the number of regexps by combining a few, shortened them by removing the spaces first (not last), and used zero-width positive lookaheads and lookbehinds to minimize the text being captured and substituted.
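To make the effect of the lookarounds concrete, here is a minimal sketch that wraps the "not a bug" set of substitutions in a sub and runs it over the sample lines from the top of this post. The sub name `normalize_line` is made up for illustration, and the sketch assumes each line has already been chomped.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the "not a bug" substitutions to one
# already-chomped line and returns the result.
sub normalize_line {
    my ($line) = @_;
    for ($line) {                               # alias $_ to $line
        s/\s*\|\s*/|/g;                         # strip spaces around pipes first
        s/^(?=\|)/\\N/;                         # empty first field
        s/(?<=\|)(?=\||$)/\\N/g;                # empty middle or last field
        s/(?<=\|)\.(?=\|)/\\N/g;                # lone "." in a middle field
        s/(?<=\d{2}:\d{2}:\d{2})\.\d+//g;       # drop fractional seconds
        s/(?<=\d{5})-(?:\d{1,4}|\s+)//;         # drop the suffix after a 5-digit code
    }
    return $line;
}

# The sample lines discussed above.
for my $in ('a|b||d', '|b|c|d', 'a|b|c|', 'a|b|.|d', '.|b|c|d', 'a|b|c|.') {
    printf "%-8s => %s\n", $in, normalize_line($in);
}
```

Because the lookarounds consume nothing, adjacent empty fields such as `a|||d` are each filled in a single pass of the `/g` substitution; a pattern that consumed the surrounding pipes would need a second pass.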

Use this in conjunction with the -p or -pi suggestion for better results.
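For reference, -p wraps the program in an implicit `while (<>) { ... } continue { print }` loop, and -i makes it edit the named files in place. The sketch below spells that loop out with the "not a bug" substitutions; the sub name `filter_handle` and the in-memory filehandles are only there to keep the example self-contained. One caveat, assuming the lines have not been chomped: the leading `\s*\|\s*` substitution will also swallow the newline after a trailing pipe, so the sketch chomps first and adds the newline back when printing (on the command line, adding -l to -p does the same).

```perl
use strict;
use warnings;

# Hypothetical helper: the loop that -p would supply, written out.
sub filter_handle {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        chomp $line;                            # keep \s* away from the newline
        for ($line) {
            s/\s*\|\s*/|/g;
            s/^(?=\|)/\\N/;
            s/(?<=\|)(?=\||$)/\\N/g;
            s/(?<=\|)\.(?=\|)/\\N/g;
            s/(?<=\d{2}:\d{2}:\d{2})\.\d+//g;
            s/(?<=\d{5})-(?:\d{1,4}|\s+)//;
        }
        print {$out} "$line\n";
    }
    return;
}

# Demo on in-memory filehandles standing in for real files.
my $input  = "a | b || d\n|b|c|\n";
my $output = '';
open my $in,  '<', \$input  or die "cannot open input: $!";
open my $out, '>', \$output or die "cannot open output: $!";
filter_handle($in, $out);
close $out;
print $output;
```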

In reply to Re: Multiple substitutions in large files by ikegami
in thread Multiple substitutions in large files by mdi
