PerlMonks  

Re: A Regexp Assembler/Compiler

by Abigail-II (Bishop)
on Jun 19, 2002 at 15:18 UTC ( [id://175719] )


in reply to A Regexp Assembler/Compiler

Given a set of regexps and given a wanted logical concatenation (let's start with AND, OR) of these regexps, is there any mechanism to create a smaller, more efficient set of regexps that will have the same effect?

Very unlikely, given the power of Perl regular expressions. One of the questions that you would need to answer is whether two grammars (I use grammar here instead of regular expression to avoid confusion later on) are equivalent; that is, whether they match the same language (where a language is the set of all strings matched by a grammar). For regular expressions (not Perl regular expressions, but traditional ones) this question is decidable. But for more powerful grammars on the Chomsky hierarchy, context free grammars and up, this question is undecidable.
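For the traditional case, the decision procedure is easy to sketch. What follows is only an illustration, assuming both expressions have already been turned by hand into complete DFAs over the same alphabet; the hash layout and the name dfa_equivalent are made up for this example. It walks the product automaton and looks for a reachable pair of states where only one side accepts.

```perl
use strict;
use warnings;

# Illustrative DFA layout (an assumption of this sketch):
#   start  => name of the start state
#   accept => { state => 1, ... }
#   delta  => { state => { char => next_state, ... }, ... }  (must be total)
sub dfa_equivalent {
    my ($x, $y, @alphabet) = @_;
    my @queue = ([ $x->{start}, $y->{start} ]);
    my %seen  = (join(',', $x->{start}, $y->{start}) => 1);

    while (my $pair = shift @queue) {
        my ($p, $q) = @$pair;

        # A reachable pair where exactly one side accepts means some
        # string is in one language but not in the other.
        return 0 if ($x->{accept}{$p} xor $y->{accept}{$q});

        for my $c (@alphabet) {
            my $p2 = $x->{delta}{$p}{$c};
            my $q2 = $y->{delta}{$q}{$c};
            next if $seen{ join ',', $p2, $q2 }++;
            push @queue, [ $p2, $q2 ];
        }
    }
    return 1;    # no distinguishing pair is reachable
}

# Even number of a's, as in /^(?:aa)*$/
my $even = { start => 0, accept => { 0 => 1 },
             delta => { 0 => { a => 1 }, 1 => { a => 0 } } };

# The same language, written as "count mod 4 is 0 or 2"
my $mod4 = { start => 0, accept => { 0 => 1, 2 => 1 },
             delta => { 0 => { a => 1 }, 1 => { a => 2 },
                        2 => { a => 3 }, 3 => { a => 0 } } };

# Multiples of four only, as in /^(?:aaaa)*$/
my $mult4 = { start => 0, accept => { 0 => 1 }, delta => $mod4->{delta} };

print "even vs mod4:  ", (dfa_equivalent($even, $mod4,  'a') ? "same" : "different"), "\n";
print "even vs mult4: ", (dfa_equivalent($even, $mult4, 'a') ? "same" : "different"), "\n";
```

The loop always terminates because there are only finitely many state pairs. That finiteness is exactly what is lost once the grammars become more powerful than traditional regular expressions.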

However, Perl regular expressions are hard to place on the Chomsky hierarchy. Without (?{ }) or (??{ }), Perl regular expressions cannot be used to create all context free grammars (take the language of strings with balanced parentheses, for instance). On the other hand, they can be used to create grammars that are context sensitive on the Chomsky hierarchy: with a backreference, /^(.+)\1$/ matches a string repeated twice, which no context free grammar can do. (With (?{ }) it could very well be that Perl regular expressions are at least as powerful as context free grammars, but I don't see an obvious proof.)
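Both halves of that claim can be tried out. The sketch below is an illustration only: the variable names are made up, and the recursive pattern follows the lines of the (??{ }) example in perlre. It shows a backreference matching a language that is not context free, and (??{ }) matching balanced parentheses, which is context free but not regular.

```perl
use strict;
use warnings;

# Backreference: the "copy language" of some string repeated twice.
my $copy = qr/\A(.+)\1\z/s;

# Recursion through (??{ }): one balanced, parenthesised group.
my $balanced;
$balanced = qr{
    \(
    (?:
        (?> [^()]+ )        # a run of non-parentheses, without backtracking
      |
        (??{ $balanced })   # a nested balanced group
    )*
    \)
}x;

for my $s ('abcabc', 'abcabd') {
    printf "%-8s %s\n", $s, ($s =~ $copy ? 'is ww' : 'is not ww');
}
for my $s ('(a(b)c)', '((())') {
    printf "%-8s %s\n", $s, ($s =~ /\A$balanced\z/ ? 'balanced' : 'not balanced');
}
```

Neither pattern corresponds to a finite automaton, so the decision procedure that works for traditional regular expressions has nothing to operate on here.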

My guts say that such a mechanism does not exist, that the problem is unsolvable. If the problem is solvable, it's going to be incredibly hard.

Abigail

Replies are listed 'Best First'.
Re(2): A Regexp Assembler/Compiler
by gumby (Scribe) on Jun 19, 2002 at 15:29 UTC
    I don't think that there is anything wrong with posing this question in the language of boolean logic and set theory. Any solution set of the above subroutine will be the union of the complement of the solution set of the first regexp and the solution sets of the other regexps. Finding a regexp that has exactly this solution set would be non-trivial, though, as Abigail-II points out.
      On further reflection, it's actually quite likely that algorithms have been developed for similar problems (i.e. Boolean algebras, elimination theory, etc.).
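The set operations themselves map onto regexp constructs fairly directly. A rough sketch, assuming the inputs are qr// objects without backreferences; the helper names any_of, all_of and none_of are hypothetical. Union is alternation, intersection is a row of lookaheads, and complement is a negative lookahead.

```perl
use strict;
use warnings;

# Hypothetical helpers: each takes qr// objects and returns one qr//.
# Caveat: capture groups are renumbered when patterns are interpolated,
# so sub-patterns that rely on \1-style backreferences will break.

# Union: plain alternation.
sub any_of {
    my $alt = join '|', map "(?:$_)", @_;
    return qr/$alt/;
}

# Intersection: one lookahead per pattern, all tried from the start.
sub all_of {
    my $looks = join '', map "(?=.*?(?:$_))", @_;
    return qr/\A$looks/s;
}

# Complement: none of the patterns matches anywhere in the string.
sub none_of {
    my $alt = any_of(@_);
    return qr/\A(?!.*?$alt)/s;
}

# "Complement of the first, united with the others"
my $combo = any_of(none_of(qr/foo/), qr/bar/);

for my $s ('baz', 'foo', 'foo bar') {
    printf "%-8s %s\n", $s, ($s =~ $combo ? 'in the set' : 'not in the set');
}
```

This only glues the pieces together. It does nothing to make the result smaller or more efficient, which is the hard part of the original question.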
Re: Re: A Regexp Assembler/Compiler
by PetaMem (Priest) on Jun 19, 2002 at 20:53 UTC
    My guts say that such a mechanism does not exist, that the problem is unsolvable. If the problem is solvable, it's going to be incredibly hard.

    I agree with Abigail's intuition here, but only for the language space that Perl regexps span in general. Considering the examples I've given above, one sees that there are regexps (hard ones) and regexps (next to trivial).

    So my guts tell me that for a certain subset of regexps this problem must have a more or less achievable solution; the more expressions you'd like to handle, the more "incredibly hard" it will get. But hey, then again it'll be just incredibly possible, if you allow me the s///ing of a well-known quotation. :-)

    Bye
     PetaMem
