Re: Being Forced to Fork with Nested Regular Expressions

I know that it really doesn't address your topic, but I think the reason that you're running into the problem is that it's the wrong approach. An impressive (ab)use of regular expressions, though.

What I've seen of predicate systems leads me to a database approach rather than regular expressions. Given the right data structure, you need to write two parsers: one to convert $facts into rows in tables, and one to convert your queries into SQL. With the right data structure, this is straightforward.

Most of this can be done with three tables: One to contain people, things and attributes (Merlyn, Ovid, Kudra, books, gold, valuable), one to contain verbs ( owns, is ), and one to contain tuples of person/thing/attribute, verb, person/thing/attribute. This leads to something like:

SELECT actor 
  FROM relations 
  WHERE verb = 'owns' 
    AND acted_on IN ( 
      SELECT actor 
        FROM relations 
        WHERE verb = 'is' 
          AND acted_on = 'valuable' 
      );
[download]

This query, with proper data, can answer the question of 'who owns something valuable' in a very reasonable amount of time. You still do code generation, but now it's SQL instead of RE.

HTH,
Paul

--
Spring: Forces, Coiled Again!

Comment on Re: Being Forced to Fork with Nested Regular Expressions Download Code

Replies are listed 'Best First'.
Re: Re: Being Forced to Fork with Nested Regular Expressions by Ovid (Cardinal) on Jan 02, 2004 at 18:11 UTC
With DBD::SQLite, this might be a viable option if I can figure out how to structure this. I've seen this discussed before and, if it works, I can push the hard part of matching down to the C level and get a huge performance boost. I'll have to see how robust the SQLite SQL implementation is. As I recall, it's pretty good (and here's an instance where I think a typeless database like SQLite is fine). Cheers, Ovid New address of my CGI Course.	[reply]
Re: Re: Re: Being Forced to Fork with Nested Regular Expressions by paulbort (Hermit) on Jan 02, 2004 at 20:43 UTC
Yeah, typeless is no problem in this case. I don't know SQLite at all, but as long as it can do sub-selects ( the IN ( SELECT ...) part ), it should be fine. I'm not sure when you say 'structure this' if you mean the database, or the Perl code. The Perl code probably needs to be set up as a shell, where you can make assertions (which will be written to the database) ask questions (which will query the database), and do housekeeping (save db, restore db, wipe db, quit, etc.) Like a PROLOG interpreter. The only interesting thing I'd worry about with the database to start with would be using serial numbers instead of names when describing relationships, so that you can rename things if necessary. (Without this, renaming breaks all relationships.) Next might include quantities ( book worth 5 shekels ). Note that you don't need to add classes to the database, you can simply add the 'is a' relationship, and ( 1984 is a book ) works. -- Spring: Forces, Coiled Again!	[reply]
Re: Re: Being Forced to Fork with Nested Regular Expressions by Ovid (Cardinal) on Jan 02, 2004 at 21:41 UTC
A bit of thought tells me that there are some problems with this approach. Basically, you have a main table as follows: `+----------+ \| Facts \| +----------+ \| verb \| \| subject \| \| object \| +----------+` That limits me to simple things such as "owns Ovid gold". However, what if I want to express "gives Ovid books kudra"? In this case, there's an implied prepositional phrase "to kudra" that doesn't fit and this tremendously limits the utility of this approach. My other option would be to create a recursive tree structure with arbitrary depth. I've done that before and it's not fun, but I can see no other way to approach this. Cheers, Ovid New address of my CGI Course.	[reply]
Re: Re: Re: Being Forced to Fork with Nested Regular Expressions by paulbort (Hermit) on Jan 05, 2004 at 17:46 UTC
Your OP didn't mention prepositions, so I didn't even think about them. At that point you're really looking for something more like a natural language parser. A quick wander through CPAN finds Lingua::EN::Tagger which might help. The post giving an example of a Prolog interface from Perl is also worth looking at, as all the heavy lifting has already been done in Prolog. Something else that might help would be to look at the MUSH code, which had a rich object-oriented environment, and could at least store the kind of state information you're describing, and much more. The queries would be interesting. PennMUSH seems to be the current MUSH implementation of choice. -- Spring: Forces, Coiled Again!	[reply]


Keep It Simple, Stupid
	PerlMonks