PerlMonks  

It's not clear (at least, not to me) what you mean by "cleaning whitespace from source code". The first interpretation that comes to my mind would be something like "removing all whitespace that is not syntactically significant to the compiler/interpreter" for each programming language in question.

If this is what you mean, there's still the question of what you want to do with comments in the source code (remove them entirely, or just normalize whitespace?). Presumably, you'll also need to be able to identify the beginnings and endings of quoted strings, so you can leave the enclosed whitespace as is (assuming you don't want to change the output of the program as a side effect of "cleaning" the source code). In any case, you're probably going to need something like Parse::RecDescent, which fills roughly the role for Perl that "yacc" fills for C, although it generates recursive-descent parsers rather than the LALR parsers that yacc produces.

It'll be a challenge, and I wish you luck, but once you work out how the rules are stated, and figure out the rules you need for each language, switching from one rule set to another should be pretty trivial.
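
To make the "one rule set per language" idea concrete, here is a rough sketch. It is written in Python only to keep it short, and it uses a plain regex tokenizer as a stand-in for a real grammar, so it is not what Parse::RecDescent would give you. The names RULES, clean and keep_comments are made up for illustration, and the rules are deliberately incomplete (no Perl heredocs, q// quoting or regex literals, and a "#" in Perl code such as $#array would be mistaken for a comment). It copies quoted strings verbatim, drops or keeps comments, and collapses every other run of whitespace to a single space or newline.

```python
import re

# Hypothetical rule sets: one regex per language, with named groups for
# quoted strings, comments, and whitespace runs. Illustration only.
QUOTED = r'''"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*\''''
RULES = {
    "c": re.compile(
        r"(?P<string>" + QUOTED + r")"
        r"|(?P<comment>/\*.*?\*/|//[^\n]*)"
        r"|(?P<ws>\s+)",
        re.DOTALL,
    ),
    "perl": re.compile(
        r"(?P<string>" + QUOTED + r")"
        r"|(?P<comment>#[^\n]*)"
        r"|(?P<ws>\s+)",
        re.DOTALL,
    ),
}


def clean(source, lang, keep_comments=False):
    """Collapse whitespace outside strings; optionally drop comments."""
    out = []
    pos = 0
    for m in RULES[lang].finditer(source):
        if m.start() > pos:
            out.append(source[pos:m.start()])  # ordinary code, copied as is
        pos = m.end()
        if m.lastgroup == "string":
            out.append(m.group())              # whitespace inside is kept
        elif m.lastgroup == "comment":
            if keep_comments:
                out.append(m.group())
        else:
            # A run containing a newline becomes one newline, else one space.
            ws = "\n" if "\n" in m.group() else " "
            if out and out[-1] in (" ", "\n"):
                # Merge with whitespace left before a removed comment.
                ws = "\n" if "\n" in (out.pop(), ws) else " "
            out.append(ws)
    out.append(source[pos:])
    return "".join(out).strip() + "\n"


sample = 'int   x = 1;   /* set x */\n\n\nchar *s = "a   b";  // keep\n'
print(clean(sample, "c"), end="")
```

Switching languages is then just a matter of passing a different key, e.g. clean($perl_source_as_text, "perl") in spirit. A real tool would need a proper grammar per language, because a tokenizer this simple cannot tell which of the remaining single spaces are actually required.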

If you mean something else by "cleaning whitespace", it would be hard for me to guess what that might be.


In reply to Re: Cleaning Whitespace from Source Code by graff
in thread Cleaning Whitespace from Source Code by hackdaddy
