The background here is this story, which was on slashdot as well as NPR and major wire services yesterday. For those without the time to read it: a professor caught up to 120 students (out of around 500) cheating on term assignments by comparing their electronically submitted essays for phrases of six or more words that were repeated in multiple papers. Those found cheating despite the school's honor code will either be denied their diplomas or, if they have already graduated, have their diplomas revoked.

While the story is somewhat chilling, I also wonder exactly how the professor approached the programming part of this. I very much doubt he used perl... :-)

Given two English text strings, $a and $b, and two integers $m and $n, 0 < $m <= $n. Both $a and $b have been stripped of punctuation and converted to lower case, leaving all characters as either ('a'..'z') or the space ' '.

Find the perl golf solution (fewest characters of code) that returns a list of the phrases of at least $m but no more than $n words that appear in both $a and $b.

Update: changed "$m < $n" to "$m <= $n"; this shouldn't affect the golf solution, but it makes sense if you want to find repeated phrases of only one size. E.g., if $m=$n=1, you would find all single words common to both strings.
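
For anyone who wants something to test golfed entries against, here is an ungolfed reference sketch. It is not part of the challenge and not a golf entry. It assumes that a "phrase" means a run of consecutive words, that each common phrase should be reported once, and that the order of the returned list does not matter (it is sorted here only to make results easy to compare). The name common_phrases is made up for illustration, and the arguments are called $x and $y rather than $a and $b because Perl's sort uses the package variables $a and $b.

```perl
use strict;
use warnings;

# Hypothetical reference helper, not from the original post.
# Returns the sorted list of distinct phrases of $m..$n consecutive
# words that occur in both $x and $y.
sub common_phrases {
    my ($x, $y, $m, $n) = @_;

    # Build a hash whose keys are all runs of $m..$n words in a string.
    my $phrases = sub {
        my @w = split ' ', $_[0];
        my %seen;
        for my $len ($m .. $n) {
            # The range is empty when the string has fewer than $len words.
            for my $i (0 .. @w - $len) {
                $seen{ join ' ', @w[$i .. $i + $len - 1] } = 1;
            }
        }
        return \%seen;
    };

    my ($in_x, $in_y) = ($phrases->($x), $phrases->($y));
    my @common = sort grep { exists $in_y->{$_} } keys %$in_x;
    return @common;
}

print "$_\n"
    for common_phrases('the quick brown fox jumps',
                       'a quick brown fox sleeps', 2, 3);
# prints:
#   brown fox
#   quick brown
#   quick brown fox
```

With $m=$n=1 the same routine returns the single words the two strings share, which matches the case described in the update.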

Dr. Michael K. Neylon - || "You've left the lens cap of your mind on again, Pinky" - The Brain

2001-05-10 Edit by Corion: Fixed title

In reply to Golf (Inspired): Repeated Phrases by Masem
