PerlMonks  

If you have control over the database, I'd first restructure it to get rid of the REGEXP $uri part of your SQL statement. Either add a new column called host to tURIMapping, or, to stay completely within the relational mindset, add a new table that stores all your hosts plus a new column in tURIMapping that references the matching row in that table. The second approach is cleaner in terms of pure relational design and normalisation, but the first is much easier to implement.

Searching for the "best" (== longest) match is easy if you sort the list by the length of the entries in descending order. Many databases let you sort on an expression directly, along the lines of ORDER BY LENGTH(uri) DESC (the function name varies by dialect, so check your database's documentation). If yours doesn't, you could introduce a second new column that stores the length of each URL, and change your select statement to SELECT ftpID,uri,path FROM tURIMapping WHERE host=? ORDER BY length DESC.

That way, the database does most of the work for you, and you only need to walk the results until you find the first string that (partly) matches your search string. Because the database has already ordered the results by length, that first hit is guaranteed to be the longest possible match.
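The walk described above could look roughly like the sketch below. It needs no database: the @rows list is a hypothetical stand-in for the rows the SELECT would return (the ftpID values and paths are made up), the sort emulates ORDER BY length DESC, and longest_match is a name invented here. It also assumes that "matches" means "the stored uri is a prefix of the requested path", which may not be exactly what your application needs.

```perl
use strict;
use warnings;

# Hypothetical rows standing in for the result of the SELECT;
# the ftpID values and paths are invented for illustration.
my @rows = (
    { ftpID => 1, uri => '/docs',          path => '/srv/docs' },
    { ftpID => 2, uri => '/docs/perl/faq', path => '/srv/faq'  },
    { ftpID => 3, uri => '/docs/perl',     path => '/srv/perl' },
);

# Emulate ORDER BY length DESC: longest stored uri first.
my @sorted = sort { length($b->{uri}) <=> length($a->{uri}) } @rows;

# Return the first row whose uri is a prefix of $request.
# Assumption: "matches" means "is a prefix of". Because the rows are
# sorted by length, the first hit is also the longest one.
sub longest_match {
    my ($request, @candidates) = @_;
    for my $row (@candidates) {
        return $row if index($request, $row->{uri}) == 0;
    }
    return;    # no stored uri matched
}

my $hit = longest_match('/docs/perl/faq/q1.html', @sorted);
print $hit ? "$hit->{ftpID} $hit->{path}\n" : "no match\n";
```

With a real database handle you would skip the sort and simply fetch rows in the order the query returns them, stopping at the first hit.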

Note that, if you decide to split off the hostname from the rest of the uri, you will have to slightly modify the way you construct your return values and the values you put into the query.

For the URI parsing part, I recommend taking a look at the URI::URL module, which nicely splits up a lot of obscure URIs.
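As a rough sketch of the split, here is one way to pull the host and path apart. It uses the URI module (URI::URL ships in the same CPAN distribution as a backwards-compatibility interface), so it assumes that distribution is installed. The split_url name and the example URL are made up, and the code assumes an absolute http URL; a string without a scheme has no host method and would die here.

```perl
use strict;
use warnings;
use URI;    # from CPAN; not part of the Perl core

# Split an absolute URL into the host (for the WHERE host=? placeholder)
# and the path (for the longest-match walk). Hypothetical helper.
sub split_url {
    my ($url) = @_;
    my $u = URI->new($url);
    # Lowercase the host so lookups do not depend on how it was typed.
    return (lc($u->host), $u->path);
}

my ($host, $path) = split_url('http://www.Example.com/docs/perl/faq.html?x=1');
print "$host $path\n";
```

The host would then go into the query placeholder, and the path would be the string you compare against the stored uri values.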

perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
$d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
($c = $d->accept())->get_request(); $c->send_response( new #in the
HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web

In reply to Re: Longest Matching URL by Corion
in thread Longest Matching URL by marceus
