in reply to one liner to print out sorted list of word
perl -ne '$_=lc;s/\W+/ /g;@w{split /\s+/}=();END{$,="\n";print sort keys %w}'
That expanded becomes:
while (<>) {            # for each input line
    $_=lc;              # lowercase
    s/\W+/ /g;          # maps all (seqs of) non-word chars to space
    @wl=split /\s+/;    # take the words of this line
    @w{@wl}=();         # put them as keys into a hash (undef values)
}
$,="\n";                # separate 'print' args with a newline
print sort keys %w;     # print the sorted keys
Since hash keys are unique, it does what you need.
On the command line, without using Perl, you can do it this way:
tr -cs '[:alnum:]' '\n' < textfile | tr '[:upper:]' '[:lower:]' | sort | uniq
The two tr invocations perform the "cleaning" and "lowercasing" in a locale-dependent fashion. To get the same behaviour from the Perl one-liner above, add -Mlocale before the -ne.
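To see what the tr pipeline does, here is a small sketch that wraps it in a shell function and runs it on two sample lines. The function name unique_words and the sample text are made up for illustration. Pinning LC_ALL=C is my own addition, only to make the output reproducible; drop it to get the locale-dependent behaviour described above.

```shell
# Hypothetical helper: reads text on stdin, prints each distinct
# lowercased word once, in sorted order.
unique_words() {
    # LC_ALL=C is an assumption for reproducibility, not part of the
    # original pipeline; it restricts the classes to ASCII.
    LC_ALL=C tr -cs '[:alnum:]' '\n' |        # non-alphanumerics -> one newline
        LC_ALL=C tr '[:upper:]' '[:lower:]' | # lowercase
        LC_ALL=C sort |                       # sort so duplicates are adjacent
        uniq                                  # drop adjacent duplicates
}

printf 'The cat, the Dog.\nA dog; a CAT!\n' | unique_words
# prints:
# a
# cat
# dog
# the
```

uniq only removes adjacent duplicates, which is why the sort has to come before it.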
-- dakkar - Mobilis in mobile