PerlMonks  

how would you detect a math expression

by Anonymous Monk
on Feb 18, 2007 at 15:43 UTC ( [id://600703]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I was having a discussion with a friend about Google Calculator, and it soon turned to how Google could detect that the input was a math expression and not a query, e.g. 2+4/6*8
I think it should be possible using a regex, but after an hour of trying, I still can't seem to replicate it. Does anyone have any ideas?
Then the bigger question is how does it also detect the conversions:
12 * 12oz. in gallons
the speed of light in km/fortnight
Any ideas?

Replies are listed 'Best First'.
Re: how would you detect a math expression
by ambrus (Abbot) on Feb 18, 2007 at 17:23 UTC

    Google plays it safe: in most cases when you type an expression it does both the calculation and the search. It omits the search only in very clear cases. Note that an equals sign can force calculation in some cases: "two dozen dozen" is search only, while "two dozen dozen =" gives both search and calculation.

    Also note that the calculation is not done when it's dimensionally incompatible: "3 pounds to kilograms" or "3 pounds to dollars" are valid calculations, while "3 pounds to meters" isn't. On the other hand, just like some handheld calculators, Google doesn't care about some missing parentheses or multiplication signs: "3 + 3) (7 + 5" is a well-formed calculation.
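    The dimensional check described above can be sketched with a small lookup table. This is only an illustration, not how Google does it: the `%UNIT` table, its entries, and the `convert` function are assumptions made up for this example, and currencies (the "pounds to dollars" case) are left out.

```perl
use strict;
use warnings;

# Hypothetical unit table: unit name => [dimension, factor to the base unit].
# The entries are illustrative; Google's real tables are not public.
my %UNIT = (
    'pounds'    => [ 'mass',   0.45359237 ],   # base unit: kilogram
    'kilograms' => [ 'mass',   1 ],
    'meters'    => [ 'length', 1 ],            # base unit: meter
    'feet'      => [ 'length', 0.3048 ],
);

# Returns the converted value, or undef when the query is not a
# well-formed, dimensionally compatible conversion.
sub convert {
    my ($query) = @_;
    my ($value, $from, $to) =
        $query =~ /^\s*(\d+(?:\.\d+)?)\s+(\w+)\s+(?:to|in)\s+(\w+)\s*$/
        or return undef;
    my ($f, $t) = ($UNIT{$from}, $UNIT{$to});
    return undef unless $f && $t;              # unknown unit
    return undef unless $f->[0] eq $t->[0];    # incompatible dimensions
    return $value * $f->[1] / $t->[1];
}

for my $query ('3 pounds to kilograms', '3 pounds to meters') {
    my $result = convert($query);
    print "$query: ", (defined $result ? $result : 'not a calculation'), "\n";
}
```

    A query that fails the check simply falls through to the ordinary search.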

Re: how would you detect a math expression
by liverpole (Monsignor) on Feb 18, 2007 at 16:09 UTC
    I've never tried Google calculator, but it doesn't seem like it should be too difficult to detect at least fairly simple math expressions.

    Something like:

    # Assuming expression is in $string
    my $num = '-?(\.\d+|\d+(\.\d+)?)';
    my $ops = '[-+*/]';
    # One or more "number operator" pairs followed by a final number
    if ($string =~ /^(${num}${ops})+${num}$/) {
        # It's a simple math calculation
    } else {
        # It's something else
    }

    Now that's pretty basic, but it should parse expressions of the form number op number [op number ...] for numbers containing an optional leading "-", and at least one digit (with possible decimal point), and for any of the basic operations {+, -, *, /}.

    For anything more complicated than that, you'll have to ask someone with a more intimate knowledge of Google calculator than I.
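    A quick way to exercise a pattern of this shape is to wrap it in a function and feed it a few candidates. The sketch below is an assumption-laden variant, not the original snippet: the name `is_simple_math` is made up, and the groups are non-capturing. Like the description above, it allows no whitespace and no parentheses.

```perl
use strict;
use warnings;

my $num = '-?(?:\.\d+|\d+(?:\.\d+)?)';   # optional sign, at least one digit
my $ops = '[-+*/]';

# True for "number op number [op number ...]", false for anything else.
sub is_simple_math {
    my ($string) = @_;
    return $string =~ /^(?:${num}${ops})+${num}$/ ? 1 : 0;
}

for my $candidate ('2+4/6*8', '1.5*-2', 'perl regex', '5 a day', '2+') {
    printf "%-12s %s\n", $candidate,
        is_simple_math($candidate) ? 'math' : 'not math';
}
```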


    s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
Re: how would you detect a math expression
by gam3 (Curate) on Feb 18, 2007 at 16:11 UTC
    here is a very simple example of how this might be done.
    $text = 'speed of light in km/fortnight';
    # A conversion: split the query on the word "in"
    if ($text =~ /(.*)\s+in\s+(.*)/) {
        print "1: ($1) ($2)\n";
    }

    $text = '2+8/2*3';
    # First pass: reduce multiplications and divisions, left to right
    while (length $text) {
        if ($text =~ m|(\d+)\s*([/*])\s*(\d+)|) {
            print "($1) $2 ($3)\n";
            $x = eval " $1 $2 $3 ";
            $text =~ s|\d+\s*[/*]\s*\d+|$x|;
        } else {
            last;
        }
    }
    # Second pass: reduce additions and subtractions
    while (length $text) {
        if ($text =~ m|(\d+)\s*([+-])\s*(\d+)|) {
            print "($1) $2 ($3)\n";
            $x = eval " $1 $2 $3 ";
            $text =~ s|\d+\s*[+-]\s*\d+|$x|;
        } else {
            last;
        }
    }
    print "answer $text\n";
    This will not work for your example because it does not recognise floating point numbers (4/6 produces one as an intermediate result).
    -- gam3
    A picture is worth a thousand words, but takes 200K.
Re: how would you detect a math expression
by bart (Canon) on Feb 18, 2007 at 17:45 UTC
    1. Detect whether there are only acceptable characters
    2. Use a parser. I built a simple one, easily done (and customizable) without any modules, in Operator Precedence Parser.

    As an extra, you can use it to calculate the value of the expression, though, if it parses, you can most likely safely use eval — it depends on what you accept as input.
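    The two steps above can be sketched as follows. This is not the code from the Operator Precedence Parser node; it is a minimal precedence-climbing sketch written for this thread, and the names `evaluate`, `_expr`, `_atom` and `%PREC` are made up for the example. It computes the value itself, so no eval of user input is needed.

```perl
use strict;
use warnings;

# Binary operators with their binding strength (higher binds tighter).
my %PREC = ('+' => 1, '-' => 1, '*' => 2, '/' => 2);

# Step 1: reject input containing characters outside the accepted set.
# Step 2: parse it. Returns the value, or undef when the input is not a
# well-formed expression (or divides by zero).
sub evaluate {
    my ($text) = @_;
    return undef unless $text =~ m{\A[\d\s.+\-*/()]+\z};
    my @tokens = $text =~ m{\d+(?:\.\d+)?|\.\d+|[-+*/()]}g;
    # Whatever the tokenizer skipped must have been whitespace only.
    (my $stripped = $text) =~ s/\s+//g;
    return undef unless join('', @tokens) eq $stripped;
    return eval {
        my $value = _expr(\@tokens, 1);
        die "trailing input\n" if @tokens;
        $value;
    };
}

sub _expr {
    my ($tok, $min_prec) = @_;
    my $left = _atom($tok);
    while (@$tok && exists $PREC{ $tok->[0] } && $PREC{ $tok->[0] } >= $min_prec) {
        my $op    = shift @$tok;
        my $right = _expr($tok, $PREC{$op} + 1);   # +1 gives left associativity
        $left = $op eq '+' ? $left + $right
              : $op eq '-' ? $left - $right
              : $op eq '*' ? $left * $right
              :              $left / $right;       # dies on division by zero
    }
    return $left;
}

sub _atom {
    my ($tok) = @_;
    my $t = shift @$tok;
    die "unexpected end of input\n" unless defined $t;
    return 0 - _atom($tok) if $t eq '-';           # unary minus
    if ($t eq '(') {
        my $value = _expr($tok, 1);
        my $close = shift @$tok;
        die "missing ')'\n" unless defined $close && $close eq ')';
        return $value;
    }
    die "unexpected '$t'\n" unless $t =~ /^[\d.]/;
    return $t + 0;
}
```

    With this sketch, evaluate('2+4/6*8') yields a number, while evaluate('5 a day') and evaluate('2+') yield undef, so "did it parse?" doubles as the detection test. Unlike Google, it is strict about parentheses.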

Re: how would you detect a math expression
by planetscape (Chancellor) on Feb 19, 2007 at 08:11 UTC

    In Parse::RecDescent Tutorial, Jeffrey Goff builds a parser that can handle simple expressions like 3 + 5. The article may give you some ideas on how Google calculator works its magic.

    HTH,

    planetscape
Re: how would you detect a math expression
by shmem (Chancellor) on Feb 18, 2007 at 21:01 UTC
    Just eval it, and if you get some numeric output, it's likely to be a math expression :-)

    To detect conversions - define the associated units as constants, and interpolate a multiplication, e.g. s/h/ * 3600/ to normalize hours and calculate with seconds. A hash table unit => constant is handy for that...
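    A minimal sketch of the unit => constant hash idea, assuming a made-up table `%SI` and made-up helpers `unit_value` and `conversion_factor`. Instead of substituting text and calling eval, it multiplies and divides the constants directly, which also copes with compound units such as km/fortnight. It does no dimensional checking.

```perl
use strict;
use warnings;

# Hypothetical unit table: each unit maps to its value in SI base units.
my %SI = (
    's'         => 1,
    'min'       => 60,
    'h'         => 3600,
    'day'       => 86_400,
    'fortnight' => 14 * 86_400,
    'm'         => 1,
    'km'        => 1000,
    'c'         => 299_792_458,    # speed of light in m/s
);

# Value of a compound unit such as "km/fortnight" in SI base units.
sub unit_value {
    my ($unit) = @_;
    my $value = 1;
    my $op    = '*';
    # The capturing group makes split return the operators as well.
    for my $part (split m{\s*([*/])\s*}, $unit) {
        if ($part eq '*' || $part eq '/') { $op = $part; next }
        die "unknown unit '$part'\n" unless exists $SI{$part};
        $value = $op eq '*' ? $value * $SI{$part} : $value / $SI{$part};
    }
    return $value;
}

# How many $to units make up one $from unit.
sub conversion_factor {
    my ($from, $to) = @_;
    return unit_value($from) / unit_value($to);
}

printf "c in km/fortnight: %.7e\n", conversion_factor('c', 'km/fortnight');
```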

    (just a half-baked idea.)

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}

      Sounds more like a dangerous idea. Try Googling for system('rm -rf /') and see if it takes their entire system down.

      I know that's a very extreme example, and would require a very badly set up system, but bear in mind you've just allowed your random passer-by to do anything the user the webapp's running as could do.

      Running eval() on user input is a very dangerous hobby; it is much better to use Parse::RecDescent to parse it into a sane and safe mini-language.

        Sounds more like a dangerous idea..

        Uh-oh. Of course you're right; I forgot to mention that before that, a double fork and subsequent chroot /dev/null is required :-)

        Good point, Molt++

        --shmem

Re: how would you detect a math expression
by jbert (Priest) on Feb 19, 2007 at 15:59 UTC
    Google also gets it a bit wrong (that's the problem with heuristics, and in-band signalling in general). There's a big health initiative in the UK to get people to eat more portions of fruit and vegetables.

    Here's a search to find out more: 5 a day

      Intriguing. Reminds me of that old US game show that I've never seen for real, but only through frequent references in other TV shows and movies: Jeopardy

      If

      5.78703704 × 10^-5 hertz

      is the answer, what was the question?


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        5 a day
      I dunno; I think Google got it precisely right. You gave it a frequency in non-standard units, and it responded by displaying the frequency in Hertz.

      Google++.

      And it did offer to execute the phrase as a search.
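      The arithmetic behind that answer is short: "5 a day" read as a frequency is 5 events per 86400 seconds. A tiny sketch (variable names are just for illustration):

```perl
use strict;
use warnings;

my $seconds_per_day = 24 * 60 * 60;            # 86400
my $hertz           = 5 / $seconds_per_day;    # events per second
printf "%.8e hertz\n", $hertz;
```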

Re: how would you detect a math expression
by toma (Vicar) on Feb 20, 2007 at 07:25 UTC
    Take a look at the code for the venerable unix command 'units'. Here are some samples:
    You have: c
    You want: km/fortnight
            * 3.6262896e+11
    
    You have: (1/454) pound
    You want: gram
            * 0.99910214
    
    The Google algorithm looks like it could well have started from the 'units' code base.

    I used to have a CGI form where you could solve an algebra problem. It used http://maxima.sourceforge.net behind the scenes, with lots of input scrubbing. I took it down after someone kept trying to factor very high order polynomials, which had the effect of a denial of service attack. I could have fixed this by limiting the CPU time for any problem.
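    One way such a limit might look in Perl is sketched below. It is an assumption-heavy illustration, not the code from that CGI form: `with_time_limit` is a made-up helper, it relies on alarm (so Unix-like systems), and it bounds wall-clock time rather than CPU time. A true CPU limit would need something like setrlimit, and a long-running external process such as Maxima would also have to be killed explicitly.

```perl
use strict;
use warnings;

# Runs $code, giving up after $seconds of wall-clock time.
# Returns the result, or undef if the limit was hit or $code died.
sub with_time_limit {
    my ($seconds, $code) = @_;
    my $result = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        my $r = $code->();
        alarm 0;
        $r;
    };
    alarm 0;    # make sure no alarm is left pending if $code died
    return $result;
}

my $quick = with_time_limit(5, sub { 6 * 7 });
my $slow  = with_time_limit(1, sub { my $n = 0; $n++ while 1; $n });
print "quick: ", (defined $quick ? $quick : 'gave up'), "\n";
print "slow: ",  (defined $slow  ? $slow  : 'gave up'), "\n";
```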

    It should work perfectly the first time! - toma
Re: how would you detect a math expression
by stonecolddevin (Parson) on Feb 19, 2007 at 20:10 UTC

    GOOGLE KNOWS ALL.

    ++ to this post btw.
    meh.

Node Type: perlquestion [id://600703]
Approved by Joost
Front-paged by Limbic~Region