hacker has asked for the wisdom of the Perl Monks concerning the following question:

I'm designing a portal that allows a user to configure a URL and some metadata associated with it in their user account, such as depth, maximum colors, follow-offsite links, and so on.

This portal stores URLs with this metadata in several tables. One of these contains "keywords" for each URL in the system (roughly 886 URLs thus far). When a user enters a URL that is not yet in the system, they can add keywords to its record so that others can find it when searching. The keywords are entered as a comma-separated list in the URL entry form.
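To make the input side concrete, here is a minimal sketch of turning the comma-separated form value into a list. The sub name parse_keywords and the sample string are made up for illustration, and it assumes keywords never contain commas themselves:

```perl
use strict;
use warnings;

# Hypothetical helper: split a comma-separated form value into keywords.
# Assumes commas are only ever separators, never part of a keyword.
sub parse_keywords {
    my ($input) = @_;
    my @keywords;
    for my $piece (split /,/, $input) {
        $piece =~ s/^\s+//;    # trim leading whitespace
        $piece =~ s/\s+$//;    # trim trailing whitespace
        push @keywords, $piece if length $piece;    # skip empty entries
    }
    return @keywords;
}

# Sample value only; the real form field is not shown here.
my @keywords = parse_keywords('this, that ,other,,foo, THIS,bar');
print join('|', @keywords), "\n";
```

Trimming and dropping empty entries before checking for dupes keeps ' this' and 'this' from being treated as different keywords.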

My question is, how do I determine that a list of keywords contains duplicates, and remove them, or prompt the user to remove them? For example:

my @keywords = ('this','that','other','foo','this', 'bar');  # the second 'this' is the dupe

Obviously this contains a dupe 'this', which should be noted and removed, or pointed out to the user.

Another example:

my @keywords = ('this','that','other','foo','THIS', 'bar');  # 'THIS' is an uppercase dupe of 'this'

Also a dupe, 'THIS', though uppercase.

I initially thought of lowercasing each word parsed from the list, then walking the list and comparing each word to the one before it, but I don't think that will work for longer lists of keywords.
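For what it's worth, the compare-to-the-previous-word idea can be made to work if the lowercased list is sorted first, since sorting puts equal words next to each other. A rough sketch, with adjacent_dupes as a made-up name:

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase, sort, then compare each word to its neighbour.
# Sorting is what makes the adjacent comparison sufficient.
sub adjacent_dupes {
    my @sorted = sort { $a cmp $b } map { lc($_) } @_;
    my @dupes;
    for my $i (1 .. $#sorted) {
        # Equal neighbours in a sorted list mean the word occurs more than once.
        push @dupes, $sorted[$i] if $sorted[$i] eq $sorted[ $i - 1 ];
    }
    return @dupes;
}

print join(', ', adjacent_dupes('this', 'that', 'other', 'foo', 'THIS', 'bar')), "\n";
```

The catch is that the original order and case of the keywords are lost along the way, and a word entered three times is reported twice.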

Has anyone done this before? Any insight as to how this should be designed?


This code now appears to do what I want, thanks to all who have helped arrive at a solution.

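use strict;

my (@keywords, %keywords);
@keywords = ('this', 'that', 'other', 'foo', 'THIS', 'bar');

# Hash slice on %keywords: each lowercased word becomes a key, so dupes collapse.
@keywords{ map lc, @keywords } = ();

# The unique (lowercased) keywords, in sorted order.
@keywords = sort keys %keywords;

foreach my $word (@keywords) {
    print $word . "\n";
}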

Also, thanks to castaway on CB for pointing out that this is covered in 'perldoc -q duplicate', under "How can I remove duplicate elements from a list or array?"
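The code above silently drops the dupes and returns the keywords lowercased and sorted. If the dupes should instead be pointed out to the user, a variation in the spirit of the %seen hash idiom might look like the sketch below. The name split_dupes is made up, and the example keeps the original order and the case of the first occurrence:

```perl
use strict;
use warnings;

# Hypothetical helper: returns (\@unique, \@dupes), comparing case-insensitively.
sub split_dupes {
    my @words = @_;
    my (%seen, @unique, @dupes);
    for my $word (@words) {
        if ($seen{ lc $word }++) {
            push @dupes, $word;     # seen before, in some case
        }
        else {
            push @unique, $word;    # first occurrence, original case kept
        }
    }
    return (\@unique, \@dupes);
}

my ($unique, $dupes) = split_dupes('this', 'that', 'other', 'foo', 'THIS', 'bar');
print "Unique: @$unique\n";
print "Duplicates: @$dupes\n" if @$dupes;
```

The @dupes list can then be shown back to the user in the entry form, or simply ignored if silent removal is enough.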