regex 2 match C function

by jai (Sexton)
on Aug 13, 2003 at 08:23 UTC

jai has asked for the wisdom of the Perl Monks concerning the following question:

Hi, can anyone give me a regex to match C-style functions? Given a C source file, the regex should match all the C functions and add a header before each function. The functions may be of the form
int function(int a, char *b) { ... return 0; }
the regex should match:
int function(int a, char *b)
and print:
------------------------------------------------------
/*
   Function : function()
   Input: int a :
          char *b:
   Output: int :
   Author :
   Date :
*/
int function(int a, char *b)
-----------------------------------------------------------
thanks -jai

Replies are listed 'Best First'.
Re: regex 2 match C function
by liz (Monsignor) on Aug 13, 2003 at 08:30 UTC
    I don't think a single regular expression would be the way to go, as that would imply having to read entire source-files into memory.

    Given the "cleanness" of the C-code, I would do this with a while loop, iterating one line at a time, and build some logic on top of that.

    But maybe you would like to have a look at Doxygen instead...
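The line-at-a-time loop suggested above could look roughly like the sketch below. The helper names `definition_name` and `add_headers` are made up for illustration. The sketch assumes that each definition starts at column 0 and that the whole signature, including the opening brace, sits on one line; it does not filter out keywords.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the function name if $line looks like the start
# of a definition such as "int function(int a, char *b) {", otherwise undef.
# Assumes the signature and its opening brace share one line.
sub definition_name {
    my ($line) = @_;
    return $line =~ /^\w[\w\s\*]*?\b(\w+)\s*\([^;{}()]*\)\s*\{/ ? $1 : undef;
}

# Copy lines from $in to $out, printing a stub header before each definition.
sub add_headers {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        my $name = definition_name($line);
        print {$out} "/*\n * Function : $name()\n * Author   :\n * Date     :\n */\n"
            if defined $name;
        print {$out} $line;
    }
}

# Demo on an in-memory "file" so the sketch runs without any input files.
my $source = "int helper;\nint function(int a, char *b) {\n    return 0;\n}\n";
open my $in, '<', \$source or die "Cannot open in-memory input: $!";
my $result = '';
open my $out, '>', \$result or die "Cannot open in-memory output: $!";
add_headers($in, $out);
close $out;
print $result;
```

Only one line is held in memory at a time, which is the point of this approach. The price is that a return type on its own line, or a brace on the next line, is missed.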


Re: regex 2 match C function
by Skeeve (Parson) on Aug 13, 2003 at 08:31 UTC
    Did you already try something? Then please show us the efforts you made.

    Or do you want us to do your homework? ;-)

Re: regex 2 match C function
by CombatSquirrel (Hermit) on Aug 13, 2003 at 11:27 UTC
    Try this:
        #!perl -w
        use strict;

        undef $/;
        my ($header, $type, $arguments);
        my $file = <DATA>;
        $file =~ s/((\w+)\s+(\w+)\s*\(([^)]*)\)\s*\{)/"\/* Function: $3()\n"
            . "   Input:\n"
            . do { $header = $1; $type = $2; $arguments = $4 . ','; '' }
            . (join "\n", map { sprintf "      %-5s %8s : ", split } $arguments =~ m|\s*(\w+\**\s+\**\w+)\s*,|g)
            . "\n\n"
            . "   Output:\n"
            . (sprintf "      %-5s : \n", $type)
            . "\n   Author :\n   Date :\n*\/"
            . "\n\n$header"/ge;
        print $file;
        __DATA__
        int function(int a, char *b) { ... return 0; }

    I am just a little afraid that it will also catch if-blocks and such.
    Hope this helped.
      Hi, thanks for the code. Actually, I came up with the following.
          foreach (@ARGV) {
              # Read file
              open(FF,"<$_") || die ("Cant open $_ : $!\n");
              my $t=<FF>;
              close FF;
              $t=~s/((\**\s*\w+\s*)+\s*)(\((\s*\**\s*\w+,?)*\)?\s*{)/${\&do($1,$3)}/g;
              # Write file
              open (FF, ">$_.j") || die ("Cant open $_.j : $!\n");
              print FF $t;
              close FF;
          }

          sub do($$) {
              my ($a,$b)=@_;
              my ($t1, $t2, @t3);
              $t1=$a,$t2=$b;
              $a=~s/\n/ /gm;
              $b=~s/\n//g;
              $b=~s/\s+/ /gm;
              $b=~s/[(\{)]/ /g;
              @t3=split /,/,$b;
              #@t3=map {"\n *$_\t : " } @t3;
              @t3=map{sprintf "\n *%-13s : ", $_} @t3;
              my $t=<<E;
          /*
           * Function : $a
           *
           * Inputs : \n *@t3
           *
           * Author :
           * Date :
           */
          $t1$t2
          E
          }
      I know yours is far better. As you said, it will match try and catch blocks, and also constructs of the type
      #define add(x,y) {\
      I'd like the monks to suggest optimizations.
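One way to cut down the false matches mentioned in this sub-thread (control statements and `#define` lines) is to filter candidates after matching instead of packing everything into one regex. The following is a minimal sketch, not a complete C parser: `function_names` and the keyword list are illustrative assumptions, and it only looks at signatures that fit on one line.

```perl
use strict;
use warnings;

# Control-flow words that can precede "(...) {" without naming a function.
my %keyword = map { $_ => 1 } qw(if else for while do switch return sizeof try catch);

# Hypothetical helper: return the names of probable function definitions in
# $text, ignoring preprocessor lines and control statements.
sub function_names {
    my ($text) = @_;
    my @names;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*#/;               # skip #define and friends
        next unless $line =~ /\b(\w+)\s+\**(\w+)\s*\([^()]*\)\s*\{/;
        next if $keyword{$1} || $keyword{$2};   # "else if (...) {" and similar
        push @names, $2;
    }
    return @names;
}

my $c = <<'END_C';
#define add(x,y) {\
    (x) + (y); }
int function(int a, char *b) {
    if (a) {
    } else if (b) {
    }
    try {
    } catch (e) {
    }
    return 0;
}
END_C

print join(", ", function_names($c)), "\n";
```

For the sample above this prints only `function`. Continuation lines of a multi-line macro are not tracked, so a macro body that itself looks like a definition would still slip through.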
Re: regex 2 match C function
by tbone1 (Monsignor) on Aug 13, 2003 at 13:24 UTC
    I know this is doable, so do it until you get it. (HINT: Think curly braces.) A friend wrote something like this when we worked together at Ameritech. However, that was years ago, and I've lost touch with him. I am not about to give away his code without permission. (I'm kinda funny that way.)

    Ain't enough 'O's in 'stoopid' to describe that guy.
    - Dave "the King" Wilson

Re: regex 2 match C function (parser)
by kelan (Deacon) on Aug 13, 2003 at 13:50 UTC

    You're either going to want to slurp the entire file so you can catch when the return type is on a different line, or just write a small parser, or both. I've written small parsers before using the /\G.../gc method explained in perlop and it isn't too difficult. Be careful that you distinguish function definitions and function calls or you could run into trouble.


    Perl6 Grammar Student
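A small scanner in the `/\G.../gc` style described in this reply might look like the sketch below. It tells definitions apart from calls and prototypes by tracking brace depth and requiring a `{` after the closing parenthesis. The function name `defined_functions` is an assumption for illustration, and character literals such as `'{'` are not handled.

```perl
use strict;
use warnings;

# Hypothetical scanner: walks the text with \G.../gc and returns the names of
# functions that are defined (not merely called or declared) at brace depth 0.
sub defined_functions {
    my ($src) = @_;
    my ($depth, @found) = (0);
    pos($src) = 0;
    while (pos($src) < length $src) {
        if    ($src =~ /\G\s+/gc)                 { }   # whitespace
        elsif ($src =~ /\G#(?:\\\n|[^\n])*/gc)    { }   # preprocessor line, with continuations
        elsif ($src =~ m{\G/\*.*?\*/}gcs)         { }   # block comment
        elsif ($src =~ m{\G//[^\n]*}gc)           { }   # line comment
        elsif ($src =~ /\G"(?:\\.|[^"\\])*"/gcs)  { }   # string literal
        elsif ($src =~ /\G\{/gc)                  { $depth++ }
        elsif ($src =~ /\G\}/gc)                  { $depth-- }
        elsif ($src =~ /\G(\w+)\s*\(([^()]*)\)\s*(?=\{)/gc) {
            push @found, $1 if $depth == 0;       # "name(args) {" outside any body
        }
        elsif ($src =~ /\G\w+/gc)                 { }   # any other identifier or number
        else  { $src =~ /\G./gcs }                      # any other single character
    }
    return @found;
}

my $c = <<'END_C';
#define add(x,y) {\
    (x) + (y); }
int helper(int a);            /* prototype, not a definition */
static int
function(int a, char *b)
{
    if (a) { helper(a); }
    return 0;
}
END_C

print join(", ", defined_functions($c)), "\n";
```

Because the text is slurped into one string, a return type or opening brace on a separate line is not a problem here. The `if (a) {` inside the body matches the same pattern but is ignored because the depth is no longer zero.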

Re: regex 2 match C function
by CombatSquirrel (Hermit) on Aug 13, 2003 at 15:47 UTC
    If you don't want the whole file to be slurped in at once
    (only if it is not HW)
    Hope this helped (and I hope your question was not HW, but for work, documentation or something similar as I suppose it to be).
      Hi, the above code will treat static as part of the return type. It would be nice to have an algorithm for extracting a return type with any number of words, for example "static inline unsigned int".
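One possible approach to the multi-word return type: capture everything in front of the function name as a run of words and stars, then peel off the leading storage-class and function specifiers. This is a sketch under assumptions. `split_signature` and the specifier list (`static`, `extern`, `inline`) are illustrative choices, the signature must fit on one line, and keywords are not filtered out.

```perl
use strict;
use warnings;

# Specifiers that precede the return type but are not part of it.
my %specifier = map { $_ => 1 } qw(static extern inline);

# Hypothetical helper: split "static inline unsigned int f(void) {" into
# (specifiers, return type, name, argument text). Returns an empty list when
# the line does not look like a definition.
sub split_signature {
    my ($line) = @_;
    my ($prefix, $name, $args) =
        $line =~ /^\s*((?:\w+[\s\*]+)+)(\w+)\s*\(([^()]*)\)\s*\{/
        or return;
    my @words = $prefix =~ /(\w+|\*)/g;          # words and pointer stars, in order
    my @spec;
    push @spec, shift @words while @words && $specifier{ $words[0] };
    return ("@spec", "@words", $name, $args);
}

my ($spec, $type, $name, $args) =
    split_signature("static inline unsigned int function(int a, char *b) {");
print "specifiers: $spec\nreturn type: $type\nname: $name\nargs: $args\n";
```

For the example line this reports `static inline` as the specifiers and `unsigned int` as the return type. Qualifiers such as `const` are deliberately left in the type, since they belong to it.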
