PerlMonks  

Extracting Comments

by nofernandes (Beadle)
on Jun 20, 2003 at 08:16 UTC ( [id://267481]=perlquestion )

nofernandes has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone...
I need your wise advice and suggestions!!

I need to write a program whose main objective is to extract the comments from source code, in order to see whether the programmers complied with one of my company's development principles: that code should be commented, so that it is easier to understand when it is later upgraded or bug-fixed!

In order to do that I must be able to extract the comments from any type of source code (Perl, Java, PL/SQL, ProC, C++, etc.)!! I have tested a module (Regexp::Common::comment, http://search.cpan.org/author/ABIGAIL/Regexp-Common-2.113/lib/Regexp/Common/comment.pm) that is a very nice and helpful tool for this kind of work!! But the problem is that the regexes don't anticipate all the cases!

For example in Perl, a comment starts with #.

    #This is a comment
    print "# This is not a comment";
    qw/ # Neither is this/
    @array= ('#', "or this");
    $#array #or this

And this module matches all of the above examples as comments!!
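To make the problem concrete, here is a minimal sketch of a scanner that consumes quoted strings before it looks for `#`, so a `#` inside a string is never reported. The helper name `extract_hash_comments` is hypothetical, and the sketch is far from a Perl parser: it does not understand `q//`, `qq//`, `qw//`, regex literals, heredocs or POD, so the `qw/ # Neither is this/` case above would still be misreported.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): returns the '#' comments
# found in a string of Perl source. Assumption: only plain "..." and '...'
# strings and the last-index sigil need to be skipped.
sub extract_hash_comments {
    my ($source) = @_;
    my @comments;
    # Walk the source one token at a time; \G anchors each match where
    # the previous one ended.
    while ($source =~ m{
          \G (?:
              "(?:\\.|[^"\\])*"     # double-quoted string, skipped
            | '(?:\\.|[^'\\])*'     # single-quoted string, skipped
            | \$\#                  # the last-index sigil, not a comment
            | (\#[^\n]*)            # a comment: captured up to end of line
            | .                     # any other single character
          )
    }gcxs) {
        push @comments, $1 if defined $1;
    }
    return @comments;
}
```

Because every alternative is tried at the current position, a string that starts before a `#` always wins over the comment branch. Handling the remaining quote-like operators is where a real tokenizer becomes necessary.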
Another problem is the multiline comments in Java or C/C++:

    /* This is a comment*/
    /*This is also a comment */

Or

    /** This is
     * a
     * comment for javadoc!
     **/
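For the C-style case, a similar sketch is possible: match block comments non-greedily with `.` allowed to cross newlines, and skip string and character literals first so that a `/*` inside a string is not mistaken for a comment. The name `extract_c_comments` is hypothetical, and the sketch assumes well-formed source; it ignores preprocessor tricks, trigraphs and line continuations.

```perl
use strict;
use warnings;

# Hypothetical helper: returns /* ... */ and // comments from C, C++ or
# Java source, in the order they appear.
sub extract_c_comments {
    my ($source) = @_;
    my @comments;
    while ($source =~ m{
          \G (?:
              "(?:\\.|[^"\\])*"     # string literal, skipped
            | '(?:\\.|[^'\\])*'     # character literal, skipped
            | (/\*.*?\*/)           # block comment, may span several lines
            | (//[^\n]*)            # line comment, up to end of line
            | .                     # any other single character
          )
    }gcxs) {
        # At most one of the two capture groups is set per iteration.
        push @comments, grep { defined } $1, $2;
    }
    return @comments;
}
```

A javadoc comment such as the `/** ... **/` example is returned as one multi-line string, since it is just a block comment that happens to start with an extra `*`.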
Another idea is to have the following structure:

    > program.pl source.pl ParserPl.pl

where the source file is the file to be analyzed, ParserPl.pl is the file that holds the regexes to match the comments, and program.pl is the file that processes the results of the parsing...
This idea seems good because this way I can build different parsers for each language!!

    ex: > program.pl source.c ParserC.pl
        > program.pl source.java ParserJava.pl
        > etc...
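One way to read this per-language idea is as a lookup table keyed on file extension rather than a separate parser script per language. The sketch below is only an illustration: the table `%comment_re_for` and the function `comments_for_file_type` are hypothetical names, and the patterns are deliberately naive (they do not skip string literals, which is exactly the weakness described earlier).

```perl
use strict;
use warnings;

# Hypothetical table: file extension => regex that captures one comment.
# Assumption: each language's comments can be described by a single regex.
my %comment_re_for = (
    pl   => qr{(\#[^\n]*)},
    c    => qr{(/\*.*?\*/)}s,
    java => qr{(/\*.*?\*/|//[^\n]*)}s,
    sql  => qr{(/\*.*?\*/|--[^\n]*)}s,
);

# Picks the pattern from the file name and returns every comment it finds.
sub comments_for_file_type {
    my ($filename, $source) = @_;
    my ($ext) = $filename =~ /\.(\w+)\z/
        or die "no extension on '$filename'\n";
    my $re = $comment_re_for{lc $ext}
        or die "no comment parser registered for '.$ext'\n";
    my @found = $source =~ /$re/g;   # list context: all captured comments
    return @found;
}
```

Supporting another language then means adding one table entry, and a pattern can later be swapped for a proper tokenizer without touching the calling code.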
My question is: do you have any suggestions for writing this kind of program??
Thanks for your help, and sorry for my poor English!!
Best Regards

2003-06-21 edit ybiC: <code> and evidently intended <p> tags

Replies are listed 'Best First'.
Re: Extracting Comments
by particle (Vicar) on Jun 18, 2003 at 18:38 UTC

    well, for perl, you could try Pod::Coverage, but they'd have to use pod to comment. you might get some mileage with Acme::Comment. its original intent is to filter out comments of many different styles, although you may be able to extract the comments instead of the source.

    i'd be interested in seeing your work, in progress, and completed. neat project, good luck!

    ~Particle *accelerates*

Re: Extracting Comments
by aquarium (Curate) on Jun 20, 2003 at 08:56 UTC
    i don't think you can judge a book by its cover, ie just because some program is not commented doesn't mean it's not understandable. For example, my code is very simple, as I come from a C background and modular functions school. If anyone can't understand my Perl code, it's because they don't understand Perl, and shouldn't be trying to evaluate it on that basis. What you should have in the first place (hopefully getting written and updated as a program is created) are standard design docs: requirements spec, any implementation spec eg flowchart/functional decomposition/data flow diagram, and a data dictionary. If you haven't got these then how can you tell if a program is doing what it's meant to? With code, in particular, what is more important than a count of comment lines is a measure of code obscurity. I do produce docs when I write any code that's going to be roughly more than 50 lines of code, or is quite complex. I have also gotten into the habit of putting a comment line just above where I process longer bits of data, spelling out its structure. There is no replacement for code clarity. So I would rather go through nice code without comments than decipher obfuscated code and its comments.
Re: Extracting Comments (asked and answered)
by particle (Vicar) on Jun 20, 2003 at 12:53 UTC
      perhaps he's hoping that someone will just write it for him to take....as he's not participating in the posts at all.

        that may well be the case. however, as i'm sure you're aware, the chances of that happening are minuscule.

        ~Particle *accelerates*

        Of course I'm not expecting someone to give me the code!!! Are you crazy!?? I just want an opinion!!

        And I did not reply because I could not see the replies!! I thank you for all your help, and I must make one thing clear..

        This program is not meant to punish programmers or anything like that..!! My company produces lots of code, and upgrades lots of it.. These upgrades are made by many different people.. so the comments should work as a way of easily understanding the code, by explaining what each method does, why it was built or rebuilt, by whom and when!! This is almost strictly for maintenance of the code!!!!!

        Another thing: where can I find examples that use the Parse::RecDescent module?

        Thank you all for your participation and help!

        Nuno

Re: Extracting Comments
by castaway (Parson) on Jun 20, 2003 at 12:10 UTC
    I can see the need for development standards and comments.. In fact we have standards that require a comment block at the top of each file, and before each function, explaining what it is, which project it belongs to, who wrote it, etc. Nothing wrong with that at all. Tho having said that, I don't see much point in extracting comments without the code they're supposed to annotate.

    Whatever, you'll probably need a code parser to extract *just* the comments, which is pretty difficult: in perl the quote operators can use almost anything as delimiters, so you'd actually have to check for keywords etc. Parse::RecDescent maybe? (Sounds crazy though) - Or just update Regexp::Common::comment with your own regexes?

    C.

Re: Extracting Comments
by zentara (Archbishop) on Jun 19, 2003 at 15:43 UTC
    perltidy has an extensive set of options for comment control, but it may only help you with the perl scripts.
Re: Extracting Comments
by Lachesis (Friar) on Jun 20, 2003 at 09:38 UTC
    Well written code can be clear enough to need no comments, or at least only a minimal amount of commenting.
    Human code review is a far better means of checking adherence to any standards. And good man management is the best way of getting your team to keep to the standards.
    If you rely on a script to check for comments you could find that your programmers will find tricks to bypass it, eg slotting arbitrary comments into the code. Commented code is not necessarily readable code, it's just code with comments in it.
    If you do want to carry on with analyzing the source you might want to look at a grammar-based parser such as Parse::RecDescent.
Re: Extracting Comments
by graff (Chancellor) on Jun 21, 2003 at 02:47 UTC
    Make sure you don't down-grade someone's Perl code because they happen to use pod format for documentation (which is not only sexy, but extremely useful) instead of "# comments" (which are often not as good as pod).
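If POD is to count as documentation, the checker has to recognise it separately from `#` comments. The sketch below assumes the usual POD convention: a block starts at a line beginning with `=` followed by a letter and runs through the next `=cut` line. The name `extract_pod_blocks` is hypothetical, and the sketch does not guard against POD-like lines inside heredocs.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the POD blocks in Perl source, one string
# per block, each including its opening directive and closing =cut line.
sub extract_pod_blocks {
    my ($source) = @_;
    my (@blocks, $current);
    for my $line (split /^/, $source) {     # split into lines, newlines kept
        if (defined $current) {
            $current .= $line;
            if ($line =~ /^=cut\b/) {       # end of the current POD block
                push @blocks, $current;
                undef $current;
            }
        }
        elsif ($line =~ /^=[a-zA-Z]/ && $line !~ /^=cut\b/) {
            $current = $line;               # a directive opens a new block
        }
    }
    push @blocks, $current if defined $current;   # POD running to end of file
    return @blocks;
}
```

The results could be added to whatever the `#` comment extractor finds, so that a module documented purely in POD is not scored as uncommented.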
Re: Extracting Comments
by bdimych (Monk) on Jun 23, 2009 at 16:03 UTC
Re: Extracting Comments
by ScooterQ (Pilgrim) on Jun 18, 2003 at 18:01 UTC
    Have you considered Regexp::Comment?
    CPAN - there's something there for everyone!

    Doh! I really should slow down and read more carefully! I suggested that he try something he's already tried.... my bad. Sorry.
      He states in his post that he has, and it does not work exactly how he wants it to. I would say to expand that module to the point that it works for your needs and then send a diff of the changes you make back to the author, so as to let them decide if any of your mods warrant inclusion in the module.

      -Waswas

Node Type: perlquestion [id://267481]
Approved by mce