I have this code, which reads words from an input file. It should then remove all stop words and count the words that aren't stop words. Instead, the output shows a strange result like this:
hello 1
welcome 1
world 1
hello 1
our 1
page 1
to 1
welcome 2
world 1
What am I doing wrong here, and how am I supposed to change this code so that it works properly? My code is below.
#!/usr/bin/perl
use strict;
use warnings;
use Lingua::StopWords qw(getStopWords);
print "Enter the name of your input file: ";
chomp( my $file = <STDIN> );
my %found;
open my $fh, '>', 'output2.csv' or die "Can't open this file: $!";
open my $fh2, '<', $file or die "Can't open this file: $!";
my $stopwords = getStopWords('en');
while (my $line = <$fh2>) {
    my @words_all = split /\s+/, $line;
    $found{$_}++ foreach split /\s+?/, $line;
    my @words_nostop = grep { !$stopwords->{$_} } @words_all;
    #print {$fh} join( ' ', @words_nostop ), "\n";
    print $fh $_, "\t\t", $found{$_}, $/ foreach sort keys %found;
}
close $fh2 or die "Can't close file: $!";
close $fh or die "Can't close file: $!";
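For comparison, here is a minimal sketch of one way the counting could be restructured. It is not a definitive fix. Three things in the code above look like the cause of the output shown: the counts are taken from the unfiltered split rather than from @words_nostop, the print sits inside the while loop so the whole running tally is written again after every line, and /\s+?/ matches a single whitespace character, which can yield empty strings when words are separated by more than one space. The helper name count_words and the two sample lines are assumptions made for illustration. The sample is read from an in-memory filehandle so that the sketch runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Lingua::StopWords qw(getStopWords);

# Hypothetical helper: read every line from $in and return a hash ref
# of word => count, skipping any word found in the $stopwords hash ref.
sub count_words {
    my ($in, $stopwords) = @_;
    my %found;
    while (my $line = <$in>) {
        # split ' ' drops leading whitespace and splits on whitespace runs,
        # so no empty strings are counted; lc makes "Hello" match "hello".
        for my $word (split ' ', lc $line) {
            next if $stopwords->{$word};    # filter before counting
            $found{$word}++;
        }
    }
    return \%found;
}

# Assumed sample input, held in memory so the sketch runs on its own.
my $sample = "hello world welcome\nwelcome to our page\n";
open my $in, '<', \$sample or die "Can't open sample input: $!";
my $found = count_words($in, getStopWords('en'));
close $in or die "Can't close sample input: $!";

# Print once, after all lines have been read.
print "$_\t\t$found->{$_}\n" for sort keys %$found;
```

To use it with real files, open $file for reading and output2.csv for writing as in the code above, pass the input handle to count_words, and print to $fh instead of STDOUT.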