http://qs321.pair.com?node_id=603301


in reply to regex not working properly

$row =~ s#\s+$##; looks to be the problem. The $ anchors the match, so it only removes whitespace at the end of the string. If you want to remove all whitespace:

$row =~ s/\s+//g;

# Remove only spaces in otherwise empty records
$row =~ s/\|\s+\|/||/g;

# Same, but add a zero even for empty items
$row =~ s/\|\s*\|/|0|/g;
(code is untested)
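
To see how the three substitutions differ, it may help to run each one on a copy of the same row. This is only a sketch: the sample row and the variable names are made up for illustration and are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample row: one whitespace-only field, then two adjacent empty fields.
my $sample = "a b|  |c|||d";

# 1. Remove all whitespace, including whitespace inside fields.
(my $no_ws = $sample) =~ s/\s+//g;

# 2. Remove whitespace only where it is the entire content between two pipes.
(my $trimmed = $sample) =~ s/\|\s+\|/||/g;

# 3. Put a zero between two pipes that enclose nothing but optional whitespace.
(my $zeroed = $sample) =~ s/\|\s*\|/|0|/g;

print "$no_ws\n";    # ab||c|||d
print "$trimmed\n";  # a b||c|||d
print "$zeroed\n";   # a b|0|c|0||d
```

Note that the third form fills only one of the two empty fields in "|||". Each match consumes its closing pipe, so that pipe cannot also open the next match.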

They say that time changes things, but you actually have to change them yourself.

—Andy Warhol

Re^2: regex not working properly
by Anonymous Monk on Mar 05, 2007 at 22:43 UTC
    $row =~ s/\|\s+\|/||/g;
    works perfectly. Thank you!
      If you ever end up handling a line that looks like this:
      |foo|bar|||baz|
      I think you'll want a slightly more complicated regex -- something like:
      s{ (?<! [^|] ) \s* (?! [^|] ) }{0}gx;
      That uses negative look-behind and look-ahead assertions, so that a string of zero or more spaces will match (and be replaced by "0") if it is neither preceded nor followed by some character other than a pipe symbol. (That is, if the whitespace string is preceded or followed by something other than a pipe symbol, it won't match, and won't be replaced.)

      The phrasing may seem a bit convoluted, but the point is that a pipe symbol in line-initial or line-final position should probably cause a zero to be inserted, and when three or more pipes occur in sequence, you probably want zeros between all of them. Your simpler version for removing whitespace between two pipes won't handle those cases very well.
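
As a rough check of the look-around version, the sketch below wraps it in a small sub and runs it on the line shown above. The name fill_empty_fields is hypothetical, chosen only for this example, and the second input is an extra made-up test case.

```perl
use strict;
use warnings;

# fill_empty_fields is a hypothetical helper name used only for this sketch.
# It replaces every empty or whitespace-only field with "0".
sub fill_empty_fields {
    my ($row) = @_;
    # Match a run of zero or more whitespace characters that has a pipe
    # (or the string boundary) on both sides.
    $row =~ s{ (?<! [^|] ) \s* (?! [^|] ) }{0}gx;
    return $row;
}

print fill_empty_fields("|foo|bar|||baz|"), "\n";   # 0|foo|bar|0|0|baz|0
print fill_empty_fields("foo| |bar"), "\n";         # foo|0|bar
```

Because the look-arounds do not consume the pipes, adjacent empty fields are each matched separately, which is what the two-pipe pattern cannot do.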

      Personally, I prefer split for this sort of thing:

      $row = join "|", map { s/^\s*$/0/; $_ } split( /\|/, $row, -1 );
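
Spelled out step by step, the split approach might look like the sketch below. It is a variation on the one-liner, not the same code: the sub name zero_fill_split is hypothetical, and it uses a match with a ternary instead of s/// inside map so that the split fields are not modified in place.

```perl
use strict;
use warnings;

# zero_fill_split is a hypothetical name used only for this sketch.
sub zero_fill_split {
    my ($row) = @_;
    # A limit of -1 makes split keep trailing empty fields instead of discarding them.
    my @fields = split /\|/, $row, -1;
    # Map each field to "0" when it is empty or all whitespace; pass others through.
    my @filled = map { /^\s*$/ ? '0' : $_ } @fields;
    return join '|', @filled;
}

print zero_fill_split("|foo|bar|||baz|"), "\n";   # 0|foo|bar|0|0|baz|0

# Without the -1 limit, trailing empty fields are dropped.
my @short = split /\|/, "a|b||";
print scalar(@short), "\n";                       # 2
```

The -1 is the detail that is easiest to forget: with the default limit, a row ending in one or more pipes would silently lose its last fields.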