PerlMonks  

Re: empty out C function bodies

by jakobi (Pilgrim)
on Oct 24, 2009 at 12:51 UTC ( [id://803033]=note )


in reply to empty out C function bodies

If we have permission to mangle the layout of the source with, say, GNU indent (so that we have a canonical format and don't really need to understand C syntax), then try a more elaborate version of the following quick hack:

perl -e 'undef $/; $ENV{f}=$ARGV[0]; $_=`cat -- "\$f"`; s/^(\w+[^\n;]*?\([^\n;]*\n\{\n)[\s\S]+?\n(\}\n)/$1$2/mg; print' FILE.c > FILE_MODIFIED.c

This one-liner slurps the whole file into $_ using a mostly useless cat (or maybe type), then matches a line that starts with a word and also contains an opening parenthesis, followed by a line with a sole { in column 1, and then eats non-greedily up to a line with a sole } in column 1, in both cases without blanks. Using ^ and /mg instead of a leading \n is required in case multiple function definitions occur without empty lines in between.
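To see what the substitution does, here is a minimal sketch that applies the same regex to a made-up sample. The sub name empty_bodies and the two-function C snippet are assumptions for illustration only; they are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution from the one-liner.
sub empty_bodies {
    my ($src) = @_;
    # header line with "(", then a sole "{" line, lazy body, sole "}" line
    $src =~ s/^(\w+[^\n;]*?\([^\n;]*\n\{\n)[\s\S]+?\n(\}\n)/$1$2/mg;
    return $src;
}

# Made-up sample in the canonical layout the regex expects:
# braces alone in column 1, no blank line between the functions.
my $sample = <<'END_C';
int add(int a, int b)
{
    return a + b;
}
static void noop(void)
{
    int unused = 0;
}
END_C

print empty_bodies($sample);
```

Both bodies collapse to an empty brace pair, and the second function is only found because ^ with /m can match right after the first function's closing line. One caveat of the quick hack: [\s\S]+? needs at least one body character, so a function whose body is already empty can make the match run on into the next function.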

You can also push the selection of files into Perl (-> glob), as well as reading and writing the modified files (explicitly, or implicitly with -> perl -i.bak).
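Pushing file selection and rewriting into Perl could look roughly like the following sketch. The helper name strip_files is hypothetical, the .bak suffix simply mirrors what perl -i.bak would leave behind, and the regex is the simple one from the one-liner; patterns are taken from the command line and expanded with bsd_glob from File::Glob.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper: rewrite each file in place and keep FILE.bak,
# similar in spirit to perl -i.bak.
sub strip_files {
    my @files = @_;
    for my $file (@files) {
        open my $in, '<', $file or die "cannot read $file: $!";
        my $src = do { local $/; <$in> };    # slurp the whole file
        close $in;

        (my $new = $src) =~ s/^(\w+[^\n;]*?\([^\n;]*\n\{\n)[\s\S]+?\n(\}\n)/$1$2/mg;

        rename $file, "$file.bak" or die "cannot back up $file: $!";
        open my $out, '>', $file or die "cannot write $file: $!";
        print {$out} $new;
        close $out or die "cannot close $file: $!";
    }
    return;
}

# Expand patterns such as '*.c' inside Perl; with no arguments nothing happens.
strip_files(map { bsd_glob($_) } @ARGV);
```

Saved under a name of your choice, it would be called with a quoted pattern such as '*.c' so that Perl, not the shell, does the expansion. As with the one-liner, run it on a copy first.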

And first try it on a copy of your files, at least until both you and the compiler are happy with the output again :)

cu & HTH, Peter -- hints may be untested unless stated otherwise; use with caution & understanding.

Updates: biocc's missing cases:

That was the reason for asking about indent (not that indent wouldn't normally line-break long signatures, but we might hold out and hope for a parse-friendly combination of options...) and for my insistence on column 1 for { and }.

It might be easier to smash the source to make it conform to my assumption than to 'harden' the regex. And you probably should stop short of reimplementing the C parser anyway.

s/[\t ]*$//mg; to ensure no (ASCII) whitespace at EOL

s/^(\w+[^\n;]*?\([^\n;]*(?:,\n[\t ]+[^\n;]+)*\)[\t\n ]*\{\n)[\s\S]+?\n(\}\n)/$1$2/mg; should do the trick for the two cases you mentioned; it requires a comma at EOL and whitespace at SOL for continuation lines in multiline signatures.

But this regex begins to be overly cute, so you should probably rewrite it using the /x modifier (-> add comments and whitespace), and maybe split the patterns into multiple separate variables (see perlre).
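One possible shape of that /x rewrite is sketched below. The variable names ($head, $cont, $open, $body, $close), the sub name and the sample source are all assumptions for illustration; the pieces are meant to be equivalent to the extended regex above, with the trailing-whitespace cleanup applied first.

```perl
use strict;
use warnings;

# Named pieces of the pattern (all names are hypothetical).
my $head  = qr/ \w+ [^\n;]*? \( [^\n;]* /x;       # word at SOL, "(" on the same line
my $cont  = qr/ (?: , \n [\t ]+ [^\n;]+ )* /x;    # continuation: comma at EOL, blanks at SOL
my $open  = qr/ \) [\t\n ]* \{ \n /x;             # ")" then "{" ending its line
my $body  = qr/ [\s\S]+? \n /x;                   # lazily eat the body
my $close = qr/ \} \n /x;                         # sole "}" in column 1

sub empty_bodies {
    my ($src) = @_;
    $src =~ s/[\t ]+$//mg;                        # no blanks or tabs at EOL
    $src =~ s/ ^ ( $head $cont $open ) $body ( $close ) /$1$2/xmg;
    return $src;
}

# Made-up sample: a multiline signature with "{" on the signature line,
# followed directly by a function in the column-1 brace layout.
my $sample = <<'END_C';
static void f(int a,
       int b) {
    g(a);
}
int h(void)
{
    return 1;
}
END_C

print empty_bodies($sample);
```

Note that /x does not strip whitespace inside bracketed character classes, so [\t ] still matches a literal space. Splitting the pattern this way makes each assumption about the layout visible and easier to relax one at a time.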

Extend the regex repeatedly like this, and you've found an indicator that you should have chosen some CPAN module or a proper C parsing grammar :). Let me rephrase that in better words than mine:

Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems. (monkquips)

Replies are listed 'Best First'.
Re^2: empty out C function bodies
by jwkrahn (Abbot) on Oct 24, 2009 at 13:37 UTC
    perl -e 'undef $/; $ENV{f}=$ARGV[0]; $_=`cat -- "\$f"`;

    Backquotes in scalar context already do what you want regardless of the contents of $/ and you can use $ARGV[0] directly instead of copying it to the environment, so that then becomes:

    perl -e ' $_=`cat -- "$ARGV[0]"`;

    You could also simplify that whole thing with the use of the -0 switch and the -p switch:

    perl -0777pe's/^(\w+[^\n;]*?\([^\n;]*\n\{\n)[\s\S]+?\n(\}\n)/$1$2/mg' FILE.c > FILE_MODIFIED.c
      1. Wrong. There's a very important reason for this idiom: you've forgotten the shell's interpolation (might be Unix specific, but it's nonetheless a DEADLY & EASILY EXPLOITABLE TRAP).
        !!!!Please do not do insecure shell invocations like $_=`cat "$ARGV[0]"` ever!!!!
        (unless you control each and every tenth of each bit of each filename character and shell word individually; in case you missed it, this is indeed a major pet peeve of mine. Why, you ask? Consider rm -rf /* ./* and the reinstalls and restores all over the place in huge LANs. I don't intend to try "overnight" bare-metal recovery on that order of magnitude, and neither should you.)
      2. switches: indeed. But even if we start playing golf, I'm still rather partial to my personal one and only space between -e and the Perl scrap: I greatly fear that you'll win by default :).
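The shell-interpolation trap in point 1 can also be sidestepped by never handing the filename to a shell at all. The following is a minimal sketch under that assumption; slurp and slurp_via_cat are hypothetical helper names, and the second one assumes a Unix-like system with cat on the PATH.

```perl
use strict;
use warnings;

# Hypothetical helper: read a file without any shell involved.
# With three-argument open the name is used as-is, never parsed.
sub slurp {
    my ($file) = @_;
    open my $fh, '<', $file or die "cannot open $file: $!";
    local $/;                         # slurp mode
    my $content = <$fh>;
    close $fh;
    return $content;
}

# If an external command really is wanted, the list form of a pipe open
# hands each argument straight to the program, bypassing the shell.
sub slurp_via_cat {
    my ($file) = @_;
    open my $fh, '-|', 'cat', '--', $file or die "cannot run cat: $!";
    local $/;
    my $content = <$fh>;
    close $fh or die "cat failed for $file";
    return $content;
}
```

In both helpers a filename containing quotes, $ or ; stays plain data, which is the same goal the environment-variable idiom serves when a backtick command is used.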
Re^2: empty out C function bodies
by biocc (Initiate) on Oct 24, 2009 at 13:11 UTC
    Hi Peter!

    Thanks for the answer!

    Layout is not important. Your regex works in most cases. But it misses:

    static void xmlLinkDeallocator(xmlListPtr l, xmlLinkPtr lk) { (lk->prev)->next = lk->next; (lk->next)->prev = lk->prev; if(l->linkDeallocator) l->linkDeallocator(lk); xmlFree(lk); }

    and

    static void xmlLinkDeallocator(xmlListPtr l, xmlLinkPtr lk) { (lk->prev)->next = lk->next; (lk->next)->prev = lk->prev; if(l->linkDeallocator) l->linkDeallocator(lk); xmlFree(lk); }

    Would it be possible to produce the hack for these two cases?
