PerlMonks  

Re: Finding C++ single inheritance occurrances

by John M. Dlugosz (Monsignor)
on May 18, 2011 at 19:39 UTC


in reply to Finding C++ single inheritance occurrances

It's been pointed out that the grep usage doesn't make any sense, and it doesn't return the class names it would find.

But the expression itself is odd, too. As you can see from toolic's explanation, it grabs everything after the class keyword up to the next : character. It doesn't make use of \w or other identifier-forming knowledge, even though \w is used later.

After the ':', it finds word characters, whitespace, and angle brackets. Why angle brackets and whitespace? To handle templates, so that something like public map <int, char> can be parsed as a base class. Oh, wait! The comma is not allowed, so this won't parse! The comma is left out of the set because multiple inheritance looks like public name1, private name2, and the idea, I guess, was that if it contains a comma before finding the '{' then it's not single inheritance (SI).

But that's not the case for templates with more than one parameter.
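
To make the comma problem concrete, here is a small sketch. It is written in Python only so it can be run as-is; the regex syntax is the same in Perl. The pattern is my approximation of the expression as described in this thread, not the original poster's exact regex, and the sample class names are made up.

```python
import re

# Assumption: an approximation of the pattern described above, i.e.
# anything after 'class' up to ':', then only word characters,
# whitespace and angle brackets, then '{'.
described = re.compile(r"class[^:]*:[\w\s<>]+\{")

# Made-up sample declarations.
samples = {
    "plain_si": "class Derived : public Base {",
    "template_one_arg": "class Derived : public vector<int> {",
    "template_two_args": "class Derived : public map<int, char> {",
    "multiple": "class Derived : public A, private B {",
}

# True where the pattern finds a match, False where it does not.
results = {name: bool(described.search(text)) for name, text in samples.items()}

for name, matched in results.items():
    print(f"{name}: {'match' if matched else 'no match'}")
```

The two-argument template is rejected for the same reason as the real multiple-inheritance case, even though it has only one base.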

It would also be confused by comments and any non-standard keywords and modifiers that are used on that line. It requires the whole thing to be on one line. It won't work at all if preprocessor macros are involved.

It requires the stuff to be followed by a '{', which will match actual definitions, but not (just) declarations.

As for the criteria you mention: You already saw that you could use struct as well as class. You can have other visibility keywords than public or none at all. Matching just what you showed doesn't tell you that there's no ", base2" following it!

Requiring two strings separated by a :: isn't useful. Class names might be qualified with exactly one ::, with none, or with any number of them. Are you looking at the last :: to find out the base name (as str2) and the qualifications used (as str3)? Then the front part needs to be optional, for the case where no :: is mentioned at all.
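
One way to make the front part optional, as a sketch in Python (the pattern works the same in Perl). The group names and the helper split_qualified are hypothetical, and a leading global :: is deliberately not handled.

```python
import re

# Hypothetical pattern: an optional qualifier made of one or more
# '::'-separated names, followed by the final, unqualified name.
qualified_name = re.compile(
    r"(?:(?P<qualifier>\w+(?:::\w+)*)::)?(?P<name>\w+)"
)

def split_qualified(text):
    """Return (qualifier, name), or None if text is not a plain name."""
    m = qualified_name.fullmatch(text)
    if m is None:
        return None
    # qualifier is None when the name contains no '::' at all.
    return m.group("qualifier"), m.group("name")

for candidate in ("Base", "ns::Base", "outer::inner::Base"):
    print(candidate, "->", split_qualified(candidate))
```

Splitting at the last :: falls out of the backtracking: the qualifier takes as much as it can while still leaving one name for the final group.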

The devil is in what you allow in each <str>, since allowing anything will let through all kinds of junk.

In general, it cannot be done this way, since it requires a parse of the grammar and not a simple pattern. But it could be made to work well enough for the actual cases you have.

To allow it to be more fault-tolerant, I suggest you program your tool to report on all occurrences of class/struct it finds, along with the determination of "yes" (it is SI), "clearly no", and "can't really understand it". That way you can review the results and make sure it's not missing something.
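
A rough sketch of such a three-way report, in Python (the patterns port to Perl directly). The names classify, report, HEAD and the exact rules are my own assumptions for illustration. In particular it gives up ("unclear") on anything containing angle brackets, and it counts a class with no base clause as "no".

```python
import re

# Hypothetical building blocks; adjust them to the code base at hand.
ACCESS = r"(?:(?:public|protected|private|virtual)\s+)*"
NAME = r"(?:::)?\w+(?:::\w+)*"
ONE_BASE = re.compile(r"\s*" + ACCESS + NAME + r"\s*")

# A class/struct head with an optional base clause, up to the '{'.
HEAD = re.compile(
    r"\b(?P<kind>class|struct)\s+(?P<name>\w+)\s*"
    r"(?::(?!:)(?P<bases>[^{;]*))?\{"
)

def classify(base_clause):
    """Classify the text between ':' and '{' as 'yes', 'no' or 'unclear'."""
    if "<" in base_clause or ">" in base_clause:
        return "unclear"   # templates: a comma here proves nothing
    bases = base_clause.split(",")
    if len(bases) > 1 and all(ONE_BASE.fullmatch(b) for b in bases):
        return "no"        # several recognisable bases
    if len(bases) == 1 and ONE_BASE.fullmatch(bases[0]):
        return "yes"       # exactly one recognisable base
    return "unclear"       # macros, odd keywords, anything unexpected

def report(source):
    """List (kind, name, verdict) for every class/struct definition found."""
    rows = []
    for m in HEAD.finditer(source):
        bases = m.group("bases")
        verdict = "no" if bases is None else classify(bases)
        rows.append((m.group("kind"), m.group("name"), verdict))
    return rows
```

Reviewing the "unclear" rows by hand is the point: they show where the pattern needs work, rather than silently dropping those classes.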

If I were to do this without elaborate parsing, I'd start by removing comments and funny extended keywords. Then replace <…> template arguments with a simple token, and then use a pattern similar to those shown already. It also needs to handle declarations that span lines, and since the declarations have not been found yet at that stage, the pre-conditioning must not introduce any false positives or other artifacts that would mess up the next step. So slurp the whole file as one string, pre-condition it, and search for all matches, treating line breaks as whitespace.
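
The steps above could be sketched like this, in Python (a Perl version would use the same substitutions). Everything here is an assumption for illustration: the names precondition, SINGLE and single_inheritance, the _TMPL_ token, and __declspec(...) as the one example of an extended keyword. The comment stripping is naive and ignores string literals.

```python
import re

def precondition(source):
    """Strip comments and one extended keyword, collapse template args."""
    # Remove /* ... */ and // ... comments (naive: ignores string literals).
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.S)
    source = re.sub(r"//[^\n]*", " ", source)
    # Example of an extended keyword to drop; extend the list as needed.
    source = re.sub(r"__declspec\s*\([^)]*\)", " ", source)
    # Collapse template argument lists, innermost first, into one token.
    # ';', '{' and '}' are excluded so a stray '<' cannot swallow code.
    previous = None
    while previous != source:
        previous = source
        source = re.sub(r"\s*<[^<>;{}]*>", "_TMPL_", source)
    return source

# Exactly one base, optionally qualified; \s also matches line breaks.
SINGLE = re.compile(
    r"\b(?:class|struct)\s+(\w+)\s*:(?!:)\s*"
    r"(?:(?:public|protected|private|virtual)\s+)*"
    r"((?:::)?\w+(?:::\w+)*)\s*\{"
)

def single_inheritance(source):
    """Return (class name, base name) pairs found in the whole file text."""
    return SINGLE.findall(precondition(source))
```

Call it on the slurped file contents, for example single_inheritance(open(path).read()) with path being whatever file you are scanning. Because the template arguments are gone before matching, a base such as map<int, char> no longer looks like multiple inheritance.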
