
Re: regular expression (search and destroy)

by davido (Cardinal)
on Nov 12, 2003 at 20:56 UTC (#306631)

in reply to regular expression (search and destroy)

Insisting on using a Regular Expression to parse CSV data, and refusing to look at modules that have been well written for such tasks, is a bit like going to the mechanic and insisting on having your car's engine removed using sky hooks and telekinesis instead of engine lifts and elbow grease.

If the core module (that comes with Perl) for handling balanced text is unavailable to you because your professor requires that you not use it, at the very least look under its hood at its source code so that you can understand how it works, rather than trying to build your own internal combustion engine with coke cans and superglue.

"Junkyard Wars" is on a different channel, this is Perl. ;)


"If I had my life to live over again, I'd be a plumber." -- Albert Einstein

Replies are listed 'Best First'.
Re: Re: regular expression (search and destroy)
by boo_radley (Parson) on Nov 12, 2003 at 22:42 UTC
    a bit like going to the mechanic and insisting on having your car's engine removed using sky hooks and telekinesis instead of engine lifts and elbow grease.
Re: Re: regular expression (search and destroy)
by data67 (Monk) on Nov 12, 2003 at 21:13 UTC
    I actually agree with you, surprisingly. But maybe you didn't read the question properly. I was not able to install them! My suggestion is to stop looking at this question as a CSV problem. All I need to know is how to run a sub function in the middle of a regex.

    example :

    s/
      "(.*)"   # search for something that is enclosed by quotes
    /
      Now take $1 and run a sub search-and-replace on it.
    /e;

    What is so bad about doing this? I already know that if I have quoted data in my file, then it is bound to have a comma in it. Now I'll say this: if I can't do it this way, then I will have to wait for our sysadmin to come back and install the relevant modules.
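For reference, the nested substitution asked about here can be sketched roughly as follows. This is only a minimal illustration: the helper name mask_commas and the <COMMA> placeholder are made up for this example, and the pattern assumes quoted fields never contain escaped or doubled quotes.

```perl
use strict;
use warnings;

# Hypothetical helper: runs the inner search-and-replace on one quoted chunk.
sub mask_commas {
    my ($text) = @_;          # copy first, so the inner s/// cannot clobber $1
    $text =~ s/,/<COMMA>/g;   # "<COMMA>" is an arbitrary placeholder
    return $text;
}

my $line = 'alpha,"beta, gamma",delta';

# /e evaluates the replacement as Perl code; /g repeats it for every quoted chunk.
(my $masked = $line) =~ s/"([^"]*)"/'"' . mask_commas($1) . '"'/ge;

print "$masked\n";
```

Once the commas inside quotes are masked, a plain split on commas yields three fields, and the placeholder can be turned back into a comma in each field afterwards.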

      I cannot enumerate all the difficulties associated with trying to parse a string where delimiters between quotes should be treated as text and not delimiters. But here are a few problems...

      • How do you provide for escaping delimiters aside from the quoting?
      • How do you provide for allowing quotes that are acting as text items rather than as quoting mechanisms?
      • How do you allow for nested levels of quote-like characters?
      • Is ' the same as ", the same as `, the same as ....?
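To make the first two questions concrete, here is a small sketch of how simple quote-matching patterns go wrong. The sample lines are invented for illustration and are not taken from the original poster's data.

```perl
use strict;
use warnings;

# Two quoted fields on one line: greedy .* runs from the first quote
# all the way to the LAST quote on the line.
my $two_fields = 'a,"b,c",d,"e,f"';
my ($greedy) = $two_fields =~ /"(.*)"/;

# A backslash-escaped quote inside a field: a negated character class
# stops at the first quote it meets, escaped or not.
my $escaped = 'a,"he said \"hi, there\"",b';
my ($naive) = $escaped =~ /"([^"]*)"/;

print "greedy capture: $greedy\n";
print "naive capture:  $naive\n";
```

The first capture swallows two fields and the text between them; the second stops in the middle of a field, right after the backslash.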

      Those reasons and others make it much better to parse such entities using a "balanced text" module. And what sauok was telling you was that Text::Balanced is a "core" module, meaning that if you have Perl, you already have that module, without waiting for a sysadmin to install it.

      And what others have also suggested is that even if you can't use a core module (a module that comes with Perl, just like stdio.h comes with C), at the very least you can use your web browser to view the source of a module on the CPAN website. Then you can have a look at what tools are used to accomplish your objective.

      Otherwise, you're wasting your efforts and building a broken solution.
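As a rough idea of what the Text::Balanced route looks like, here is a minimal sketch using its extract_delimited function to pull one quoted field off the front of a string. The sample strings are invented, and this handles a single field only; walking a whole record would need a loop around it.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_delimited);

# In list context extract_delimited returns the extracted text (delimiters
# included), the remainder of the string, and any skipped prefix.
my ($quoted, $rest) = extract_delimited('"beta, gamma",delta', '"');

# Backslash is the default escape character, so escaped quotes stay
# inside the extracted field.
my ($quoted_esc, $rest_esc) =
    extract_delimited('"he said \"hi, there\"",b', '"');

print "field: $quoted | remainder: $rest\n";
print "field: $quoted_esc | remainder: $rest_esc\n";
```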


      "If I had my life to live over again, I'd be a plumber." -- Albert Einstein
      Just to be crystal-clear: Do you realize that a standard perl installation comes with some modules already installed? They are called 'core' modules, because they're there by default.

      Pardon me for saying so, but you don't seem to be listening to what people are telling you. Also, as hardburn correctly pointed out, installing modules in user directories is a FAQ.
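A quick way to check whether a module is already there is simply to try loading it. The sketch below assumes nothing beyond a working perl; the helper name module_available is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded by this perl.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Text::Balanced -> Text/Balanced.pm
    return eval { require $file; 1 } ? 1 : 0;
}

my $module = 'Text::Balanced';
print "$module is ", (module_available($module) ? "available" : "missing"), "\n";
```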

        The overall fear of using non-core modules among Perl novices doesn't completely surprise me; after all, it can be intimidating to install one for the first time and figure out its interface. And those people who haven't tried just don't know how easy it often is, what they're missing, and how much time it can save, not to mention the robustness gained by using one of the more popular modules that has withstood the refining fires of common use and frequent updates.

        But what really baffles me is how one can scoff at the use of a core module. A C programmer almost instinctively puts

        #include <stdio.h>
        #include <stdlib.h>

        at the top of his program (except where he doesn't need them). Sure, one can write a custom-made standard input/output library for C, but the one that comes with C is ubiquitously used, and seldom shunned just on the basis of "not wanting to use any libraries".

        Perl handles standard input and output internally, without the inclusion of a separate module. But there are many modules included with Perl that help to make a programmer's life easier, just as stdio and stdlib save C programmers from having to re-write their own input/output and common-tools libraries over and over again, possibly in non-standard ways, for no good reason. Text::Balanced, (for CGI, unrelated to this question), and many other of the Core modules come to mind as being fantastic resources included with every complete distribution of Perl.

        My personal feeling is that if you choose to not use those modules, you should do so only if you know that the module doesn't contribute what you need. In that case, you would do as a C programmer might do who needs a different behavior than stdio provides; reinvent a wheel. Or more appropriately, you might at that point scan CPAN to see if someone else has already tackled that project and invented a good solution for you.


        "If I had my life to live over again, I'd be a plumber." -- Albert Einstein
