quick glance of the source

You didn't look closely enough.

This regex:

q{([^\s]*)\s+([^\s]*)\s+([^\s]*)\s+\[(([^: ]+):([^ ]+) ([-+0-9]+))\]\s ++"(([^\s]+) ([^\s]+)( ([^\s"]*))?)"\s+([^\s]*)\s+([^\s]*)};

It won't match "-", because it requires at least two space-delimited fields within the quotes, and allows for a third.
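To see that for yourself, here is a minimal sketch that lifts just the quoted-request portion out of the regex above and runs it against three request fields. The two GET requests are sample values I made up for illustration; only "-" comes from this thread.

```perl
use strict;
use warnings;

# The quoted-request portion of the regex above, copied verbatim.
my $request_re = qr/"(([^\s]+) ([^\s]+)( ([^\s"]*))?)"/;

# The two GET requests are made-up samples; "-" is the case under discussion.
for my $field ('"GET /index.html HTTP/1.0"', '"GET /"', '"-"') {
    if ($field =~ $request_re) {
        # $5 is only set when the optional third field is present.
        my $proto = defined $5 ? $5 : '(none)';
        print "$field => method=$2 uri=$3 proto=$proto\n";
    }
    else {
        print "$field => no match\n";
    }
}
```

The first two fields match (with three and two components respectively); "-" does not, because nothing inside the quotes supplies the mandatory space.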

Note also that both ID fields are expected to match [^\s]* (I guess he's not aware of \S; and it should at least be + not *; which could be an indication of his Perl experience).
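A small sketch of the + versus * point, using anchored patterns of my own rather than anything from the module: [^\s] and \S are the same character class, but with * the pattern is satisfied by an empty field, whereas + demands at least one character.

```perl
use strict;
use warnings;

# Illustrative anchored patterns (my own, not taken from the module).
my $star = qr/^[^\s]*$/;   # what the regex above uses for the ID fields
my $plus = qr/^\S+$/;      # same character class, but at least one char

for my $id ('', '-', 'frank') {
    printf "%-8s star=%-8s plus=%s\n",
        "'$id'",
        ($id =~ $star ? 'match' : 'no match'),
        ($id =~ $plus ? 'match' : 'no match');
}
```

Only the empty string separates the two: the * form accepts it and the + form rejects it.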

So, a "proper parser" would break. Maybe it has a back-up plan for when the regex fails; but equally, it's simple to code a back-up plan for the whitespace split also.
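For what it's worth, here is a rough sketch of what such a split-based version with a special case might look like. The parse_line helper and the 192.0.2.1 host are my own inventions for illustration, and it assumes that only the quoted request field ever contains spaces; it is a starting point for a learning exercise, not a finished parser.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): parse one access-log line by
# splitting on whitespace. Assumes only the quoted request contains spaces.
sub parse_line {
    my ($line) = @_;
    my @f = split ' ', $line;
    return if @f < 7;    # not enough fields to be a log line

    # Five fields from the front: the bracketed timestamp splits in two.
    my ($host, $logname, $user, $date, $zone) = splice @f, 0, 5;
    my ($status, $bytes) = splice @f, -2;    # the last two fields

    # Whatever remains is the quoted request: one, two or three words.
    my $request = join ' ', @f;
    $request =~ s/^"|"$//g;

    # Special case: "-" (e.g. a 408 timeout) carries no method or URI.
    my ($method, $uri, $proto) = $request eq '-' ? () : split ' ', $request;

    (my $time = "$date $zone") =~ s/^\[|\]$//g;

    return {
        host    => $host,    logname => $logname, user   => $user,
        time    => $time,    request => $request, method => $method,
        uri     => $uri,     proto   => $proto,   status => $status,
        bytes   => $bytes,
    };
}

# The host address is a made-up placeholder; the rest is the line from this thread.
my $rec = parse_line('192.0.2.1 - - [18/Jun/2015:09:05:55 -0700] "-" 408 0');
print "status=$rec->{status} request=$rec->{request}\n";
```

Taking fixed fields from both ends and treating the middle as the request is what lets the "-" case fall out with a one-line special case.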

So let's review:

  1. The OP asked about using pack & unpack, and a couple of early responders posted positive-sounding confirmations.
  2. I countered by informing him that pack & unpack were completely inappropriate for the task; and suggested split as a starting point in his "personal learning experience".
  3. You pop up and, rather than trying to help the OP, you attempt to pick holes in my post; despite its purpose being to save the OP from wasting time with pack & unpack.
  4. So, I reminded you: "He did ask for a learning exercise; not a pre-solved solution.".
  5. So you come back with this guess: "(or if Apache really does go to some pains to make sure spaces never show up in the various log fields -- say by always representing them as + or %20 -- then yay, but I'm not sure this is actually true.)".

    Which is demonstrably wrong!

  6. You retort with: "which says nothing about logname and user,".

    Look at the regex above! Wrong again.

  7. And "nor does it guarantee that the HTTP command field always consists of exactly 3 space-separated components ".

    Also wrong!

  8. So then you throw " - - [18/Jun/2015:09:05:55 -0700] "-" 408 0" into the mix.

    And, as I've shown above, that would (without special handling) break most pre-solved solutions; which I'll remind you: the OP explicitly didn't want.

    And which could just as easily be handled by a special case with the split version.

    You know, as a part of the personal learning experience!

    A big part of which might be that, having tried it for himself, he decides to opt for a pre-solved solution.

    Or he might decide to write his own CPAN module that does it better than any of the existing ones.

    That's his choice.

    All I did was short circuit his learning, by informing him that pack & unpack were definitely the wrong tools to start with.

So, here we are 13 levels deep; and you've become boring. No attempt to help the OP; just banging on about stuff it seems you barely understand.

So, I'm bored and done. 'Twas fun.

Update: I forgot this little gem. You offered this wishy-washy suggestion, "or using Text::CSV or somesuch"; but then later suggested that split will break, on the grounds of "which says nothing about logname and user,"; completely oblivious to the fact that if either ID contained spaces, it would break that module also!

