PerlMonks  

Writing a parser

by Anonymous Monk
on Apr 24, 2003 at 20:02 UTC ( [id://252985]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Text parsing, and the parser it rides in on. I've never written one before but have need of one now, to parse the output of a software tool I use in my research. Parse::RecDescent is way overkill, and besides I think it would be a good learning experience to write my own. So, this question goes out to anyone who has written a fairly robust and effective parser module: are there any perls of wisdom that you will share concerning the kinds of things to keep in mind, pitfalls, tests to conduct, features to include, implementation-independent steps to take etc. in constructing a module for parsing a text file?

Best regards.

Replies are listed 'Best First'.
Re: Writing a parser
by eduardo (Curate) on Apr 24, 2003 at 20:09 UTC
    I have written a parser or two. I have a few suggestions if you plan on writing your own parser.
    1. Prepare to devote the next few months if you plan to write a real, generalized, marginally useful parser. This is hard.
    2. Buy This book. Ingest all of it. Then, you are ready to write a useful parser.
    Now, if you plan on writing a simple method to "parse files for your research", get our fellow monk davorg's book Data Munging with Perl. You won't really leave with the knowledge required to choose between LR and recursive descent parsing depending on where your grammar sits in the Chomsky hierarchy... but you will probably be able to do something quite useful. If you are planning on parsing, you shouldn't dismiss Parse::RecDescent so quickly as overkill; it's pretty sweet. And, with the advent of Perl 6, it will pretty much be "in core"...
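
    For a sense of what a hand-written recursive descent parser involves, here is a minimal sketch for arithmetic expressions. The grammar and every sub name in it are illustrative assumptions, not taken from this thread and not Parse::RecDescent's API. It evaluates as it parses instead of building a tree, which keeps it short.

```perl
use strict;
use warnings;

# Illustrative grammar (an assumption, not from the thread):
#   expr   := term   (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'

# Split the input into number, operator and parenthesis tokens.
sub tokenize {
    my ($text) = @_;
    my @tokens;
    pos($text) = 0;
    while (pos($text) < length $text) {
        if    ($text =~ /\G\s+/gc)             { next }
        elsif ($text =~ /\G(\d+(?:\.\d+)?)/gc) { push @tokens, $1 }
        elsif ($text =~ /\G([-+*\/()])/gc)     { push @tokens, $1 }
        else  { die "Unexpected character at offset " . pos($text) . "\n" }
    }
    return \@tokens;
}

# Each grammar rule becomes one sub that consumes tokens from the front.
sub parse_expr {
    my ($tokens) = @_;
    my $value = parse_term($tokens);
    while (@$tokens and $tokens->[0] =~ /^[-+]$/) {
        my $op  = shift @$tokens;
        my $rhs = parse_term($tokens);
        $value = $op eq '+' ? $value + $rhs : $value - $rhs;
    }
    return $value;
}

sub parse_term {
    my ($tokens) = @_;
    my $value = parse_factor($tokens);
    while (@$tokens and $tokens->[0] =~ /^[*\/]$/) {
        my $op  = shift @$tokens;
        my $rhs = parse_factor($tokens);
        $value = $op eq '*' ? $value * $rhs : $value / $rhs;
    }
    return $value;
}

sub parse_factor {
    my ($tokens) = @_;
    my $token = shift @$tokens;
    die "Unexpected end of input\n" unless defined $token;
    if ($token eq '(') {
        my $value = parse_expr($tokens);    # the recursion happens here
        my $close = shift @$tokens;
        die "Expected ')'\n" unless defined $close and $close eq ')';
        return $value;
    }
    die "Expected a number, got '$token'\n" unless $token =~ /^\d/;
    return $token;
}

# Tokenize, parse, and reject any input left over after the expression.
sub evaluate {
    my ($text) = @_;
    my $tokens = tokenize($text);
    my $value  = parse_expr($tokens);
    die "Unexpected token '$tokens->[0]'\n" if @$tokens;
    return $value;
}

print evaluate("2 * (3 + 4) - 5"), "\n";
```

    Even at this size the tokenizer, the grammar rules and the error handling are separate concerns, and each needs its own tests; that is roughly where the months go once the grammar grows.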
Re: Writing a parser
by BrowserUk (Patriarch) on Apr 25, 2003 at 13:12 UTC

    The first thing to do would be to assess the complexity of the data to be parsed, along with the volumes and performance requirements.

    Using Parse::RecDescent would probably be too slow, and overkill, if your need is to parse large quantities of CSV data, but it would be perfect for tokenising a complex language with recursive elements.
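
    At the simple end of that range, flat records often need nothing more than split. The following is a minimal sketch, assuming comma-separated lines with a header row and no quoted fields or embedded commas; the sample data and the parse_records name are hypothetical. Real CSV with quoting is better left to a dedicated module such as Text::CSV.

```perl
use strict;
use warnings;

# Hypothetical sample data: a header row followed by plain records.
# Assumption: no quoted fields and no commas inside values.
my @lines = (
    "sample,temperature,reading\n",
    "a1,20.5,0.031\n",
    "\n",
    "a2,21.0,0.027\n",
);

# Turn the lines into a list of hashes keyed by the header names.
sub parse_records {
    my @lines = @_;
    chomp @lines;
    my @header = split /,/, shift @lines;
    my @records;
    for my $line (@lines) {
        next if $line =~ /^\s*$/;             # skip blank lines
        my @fields = split /,/, $line, -1;    # -1 keeps trailing empty fields
        die "Expected " . @header . " fields, got " . @fields . ": $line\n"
            unless @fields == @header;
        my %record;
        @record{@header} = @fields;           # hash slice pairs names with values
        push @records, \%record;
    }
    return \@records;
}

my $records = parse_records(@lines);
printf "%s: %s\n", $_->{sample}, $_->{reading} for @$records;
```

    Checking the field count on every line is the cheapest robustness test available here: malformed input fails loudly instead of producing silently shifted columns.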

    An indication of those requirements, and a small sample of the data to be parsed, would get you a better range of responses and possible solutions.


    Examine what is said, not who speaks.
    1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible.
    3) Any sufficiently advanced technology is indistinguishable from magic.
    Arthur C. Clarke.

Node Type: perlquestion [id://252985]
Approved by lacertus