perlquestion
hacker
I'm designing a portal that allows a user to configure a URL and some metadata associated with it in their user account, such as depth, maximum colors, whether to follow offsite links, and so on.
<p>This portal stores URLs with this metadata in several tables. One of these contains "keywords" for each URL in the system (roughly 886 URLs thus far). When a user enters a URL that is not yet in the system, they can add keywords to that record so that others searching can find it. The list of keywords is comma-separated in the URL entry form.
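<p>For reference, turning the comma-separated form field into a list could look roughly like this. This is only a sketch: $raw stands in for whatever the form actually submits, and the sample value is made up.
<code>
use strict;
use warnings;

# Hypothetical form value; the real field name and contents will differ.
my $raw = ' this, that ,other,, foo ,THIS,bar ';

my @keywords;
for my $word (split /,/, $raw) {
    $word =~ s/^\s+//;                        # trim leading whitespace
    $word =~ s/\s+$//;                        # trim trailing whitespace
    push @keywords, $word if length $word;    # skip empty entries
}

print join('|', @keywords), "\n";
</code>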
<p>My question is, how do I determine that a list of keywords contains duplicates, and remove them, or prompt the user to remove them? For example:
<code>
my @keywords = ('this','that','other','foo','this',   # <- dupe
                'bar');
</code>
<p>Obviously this contains a dupe 'this', which should be noted and removed, or pointed out to the user. <p>Another example:
<code>
my @keywords = ('this','that','other','foo','THIS',   # <- uc(dupe)
                'bar');
</code>
<p>Also a dupe, 'THIS', though uppercase.
<p>I initially thought of lowercasing each word parsed from the submitted list, then walking the list and comparing each word to the one before it, but I don't think that will work for longer lists of keywords, since duplicates are not necessarily next to each other.
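<p>As a rough sketch of that idea: comparing each word to the one before it does seem to catch every duplicate if the lowercased list is sorted first, so that equal keywords end up next to each other. The names @lower, @sorted, @unique and @dupes are just illustrative.
<code>
use strict;
use warnings;

my @keywords = ('this', 'that', 'other', 'foo', 'THIS', 'bar');

# Lowercase, then sort so identical keywords become neighbours.
my @lower  = map { lc($_) } @keywords;
my @sorted = sort @lower;

my (@unique, @dupes);
for my $word (@sorted) {
    if (@unique && $unique[-1] eq $word) {
        push @dupes, $word;     # same as the word just before it
    }
    else {
        push @unique, $word;    # first time this keyword shows up
    }
}

print "unique: @unique\n";
print "dupes:  @dupes\n";
</code>
<p>The catch is that sorting throws away the order the user typed the keywords in.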
<p>Has anyone done this before? Any insight as to how this should be designed?
<p><b>Update</b>:
<p>This code now appears to do what I want, thanks to all who have helped arrive at a solution.<code>
use strict;
my (@keywords, %keywords);
@keywords = ('this', 'that', 'other',
'foo', 'THIS', 'bar');
@keywords{map lc,@keywords}=();
@keywords = sort keys %keywords;
foreach my $word (@keywords) {
print $word . "\n";
}</code>
<p>Thanks also to [castaway] on CB for pointing out that this is covered in 'perldoc -q duplicate', <i>[http://www.perldoc.com/perl5.6/pod/perlfaq4.html#How-can-I-remove-duplicate-elements-from-a-list-or-array-|How can I remove duplicate elements from a list or array?]</i>
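<p>If the original order of the keywords matters, or the duplicates need to be pointed out to the user instead of silently dropped, a %seen hash might do the job. This is only a minimal sketch; %seen, @unique and @dupes are illustrative names, not part of the portal code.
<code>
use strict;
use warnings;

my @keywords = ('this', 'that', 'other', 'foo', 'THIS', 'bar');

my (%seen, @unique, @dupes);
for my $word (@keywords) {
    if ($seen{ lc $word }++) {
        push @dupes, $word;        # already seen, ignoring case
    }
    else {
        push @unique, lc $word;    # first occurrence, kept in input order
    }
}

print "keeping:    @unique\n";
print "duplicates: @dupes\n" if @dupes;
</code>
<p>Here @dupes holds the words as the user typed them, so they can be echoed back in a "please remove these" message.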