PerlMonks  

I'm designing a portal that lets a user configure a URL in their user account, along with some metadata associated with it, such as depth, maximum colors, whether to follow offsite links, and so on.

This portal stores URLs with this metadata in several tables. One of these contains "keywords" for each URL in the system (roughly 886 URLs thus far). When a user enters a new URL that is not yet in the system, they can add keywords to that record so that others searching can find it. The keywords are entered as a comma-separated list in the URL entry form.
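
For reference, here is a minimal sketch of turning that comma-separated field into a list before any duplicate checking happens. The helper name parse_keywords and the sample input are made up for illustration; the real form handling may look quite different.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the portal code): split a comma-separated
# form value into keywords, trimming whitespace and dropping empty entries.
sub parse_keywords {
    my ($raw) = @_;
    my @words;
    for my $word (split /,/, $raw) {
        $word =~ s/^\s+//;    # strip leading whitespace
        $word =~ s/\s+$//;    # strip trailing whitespace
        push @words, $word if length $word;
    }
    return @words;
}

# Sample input as it might arrive from the form (an assumption).
my @parsed = parse_keywords(' this, that ,other,, foo ,THIS,bar ');
print join('|', @parsed), "\n";
```

Trimming first matters because 'this' and ' this' would otherwise count as different keywords.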

My question is, how do I determine that a list of keywords contains duplicates, and remove them, or prompt the user to remove them? For example:

my @keywords = ('this','that','other','foo','this', 'bar');   # the second 'this' is a dupe

Obviously this contains a dupe 'this', which should be noted and removed, or pointed out to the user.

Another example:

my @keywords = ('this','that','other','foo','THIS', 'bar');   # 'THIS' is an uppercase dupe of 'this'

Also a dupe, 'THIS', though uppercase.
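
If the goal is to point the duplicates out to the user rather than silently drop them, one possible sketch is to count each keyword by its lowercased form and report anything seen more than once. The variable names and the message text here are only illustrative.

```perl
use strict;
use warnings;

my @keywords = ('this', 'that', 'other', 'foo', 'THIS', 'bar');

# Count each keyword by its lowercased form, so 'this' and 'THIS'
# land in the same bucket.
my %count;
$count{ lc $_ }++ for @keywords;

# Anything counted more than once is a duplicate worth reporting.
my @dupes = grep { $count{$_} > 1 } sort keys %count;

print "Duplicate keyword(s): @dupes\n" if @dupes;
```

The list in @dupes could then be shown next to the form field so the user can fix the entry themselves.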

I initially thought of lowercasing each word parsed from the submitted list, then walking the list and comparing each word to the one before it, but I don't think that will work for longer lists of keywords, since it only catches duplicates that sit right next to each other.
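
For what it's worth, comparing each word to the one before it can be made to work if the list is sorted first, because sorting puts equal words side by side. This is a rough sketch of that variant, not a recommendation over a hash-based approach; note that it loses the original order and casing.

```perl
use strict;
use warnings;

my @keywords = ('this', 'that', 'other', 'foo', 'THIS', 'bar');

# Lowercase and sort so that equal keywords end up adjacent.
my @sorted = sort map { lc $_ } @keywords;

# Keep a word only if it differs from the one kept just before it.
my @unique;
for my $word (@sorted) {
    push @unique, $word unless @unique && $unique[-1] eq $word;
}

print "@unique\n";
```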

Has anyone done this before? Any insight as to how this should be designed?

Update:

This code now appears to do what I want, thanks to all who have helped arrive at a solution.

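use strict;
my (@keywords, %keywords);
@keywords = ('this', 'that', 'other', 'foo', 'THIS', 'bar');
# Hash slice on %keywords: every lowercased keyword becomes a key,
# so duplicates (including case variants) collapse into one entry.
@keywords{ map lc, @keywords } = ();
# Pull the unique, lowercased keywords back out in sorted order.
@keywords = sort keys %keywords;
foreach my $word (@keywords) {
    print $word . "\n";
}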

Also, thanks to castaway on the CB for pointing out that this is covered in 'perldoc -q duplicate', under "How can I remove duplicate elements from a list or array?"
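
One idiom from that FAQ entry is a %seen hash combined with grep. The sketch below adapts it with lc so the comparison is case-insensitive (the lc is my addition, not part of the FAQ version). Unlike the sorted version above, it keeps the original order and the first spelling of each keyword.

```perl
use strict;
use warnings;

my @keywords = ('this', 'that', 'other', 'foo', 'THIS', 'bar');

# Keep a keyword only the first time its lowercased form is seen.
my %seen;
my @unique = grep { !$seen{ lc $_ }++ } @keywords;

print "@unique\n";
```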


In reply to Detecting duplicate keywords passed in a form by hacker
