Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
Great problem, it could also be used to figure out what portions of a web page change so an bot could rip out stories from news sites. I suggest thinking of this along the lines of a diff. First determine you record delimiter. In a diff the delimiter is a new line. However with regular expressions you might go with white space (or this could be a command line option). Look up diff and use a similar algorithm. Once you find the components that are different look at the differences. Would they both fit in the same character class. Maybe just go down the following list to see which describes both first.
/^[0-9]$/ /^[0-9]+$/ /^[0-9]*$/ /^[0-9A-Za-z]$/ /^[0-9A-Za-z]+$/ /^[0-9A-Za-z]*$/ /^[0-9A-Za-z.,]$/ /^[0-9A-Za-z.,]+$/ /^[0-9A-Za-z.,]*$/ /^.$/ /^.+$/ #Giving up /^.*$/
This might be good for a first pass.
----
I always wanted to be somebody... I guess I should have been more specific.

In reply to Re: generating regexes? by atlantageek
in thread generating regexes? by mortis

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others perusing the Monastery: (7)
As of 2024-04-19 11:27 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found