PerlMonks  

Finding problem with finding keywords in arrays / files

by akirostar (Novice)
on Sep 17, 2006 at 10:23 UTC ( [id://573392]=perlquestion )

akirostar has asked for the wisdom of the Perl Monks concerning the following question:

Hi all lords of perl,

I am new to Perl and would like some help with a problem. As the code below shows, I have 2 files (one of data and the other of keywords). I have created 2 arrays from these files, and I want to use the keyword array to find matching words in my data array. But I can't seem to get a match using this statement: if ($element =~ m/$tempwd/i). Can any wise advisors guide me on this? Thanks.
#!/usr/local/bin/perl -w
use strict;

open(INPUT, "data.txt") || die "Error opening 'keywd.txt'\n";
my @data = <INPUT>;
open(FH, '<', "keywd.txt") || die "Error opening 'data.txt'\n";
my @keywd = <FH>;
my $keywdnum = $#keywd;

while (my $element = pop @data) {
    my $i = 0 ;
    chomp($element);
    while ($i <= $keywdnum-1) {
        my $tempwd = "";
        $tempwd = $keywd[$i];
        if ($element =~ m/$tempwd/i) {
            print $i;
            $i++;
        } else {
            $i++;
        }
    }
}
close INPUT;
close FH;

data.txt:
analysis
data
power
seek
true

keyword.txt:
analysis
analyzing
applications
co-principal
completed
computer
computing
consulting
data

Replies are listed 'Best First'.
Re: Finding problem with finding keywords in arrays / files
by GrandFather (Saint) on Sep 17, 2006 at 11:09 UTC

    The statement matches fine. I've reworked your code a little to be a little more Perlish and compact (and to avoid files for test code), but the regex is still there as in your code and it does work. Consider:

    #!/usr/local/bin/perl -w
    use strict;

    my @data = qw(analysis data power seek true);
    my @keywd = qw(analysis analyzing applications co-principal completed
                   computer computing consulting data );

    for my $element (@data) {
        for my $i (0 .. $#keywd) {
            if ($element =~ m/$keywd[$i]/i) {
                print "$i\n";
            }
        }
    }

    Prints:

    0
    8

    However, if you are not interested in the element index, but just in testing for the match, I'd be more inclined to use a hash:

    ...
    my %keywdHash;
    @keywdHash{@keywd}= (); # Use a hash slice to initialise the hash

    for my $element (@data) {
        if (exists $keywdHash{lc $element}) {
            print "$element\n";
        }
    }

    Prints:

    analysis
    data
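The hash lookup above lowercases the element being tested but not the hash keys, so it relies on the keyword list already being lowercase. A minimal sketch that normalises both sides follows; the sub name `find_keywords` and the mixed-case sample words are made up for illustration and are not from the thread.

```perl
use strict;
use warnings;

# Return the elements of @$data that appear in @$keywd, ignoring case.
# find_keywords is a hypothetical helper name, not from the thread.
sub find_keywords {
    my ($data, $keywd) = @_;
    my %seen = map { lc($_) => 1 } @$keywd;   # lowercase the keys once
    return grep { $seen{ lc $_ } } @$data;    # lowercase each probe too
}

# Sample lists (assumed data, mixed case on purpose)
my @data  = qw(Analysis data power seek true);
my @keywd = qw(analysis analyzing applications DATA);

print "$_\n" for find_keywords(\@data, \@keywd);
```

With these sample lists the sketch prints Analysis and data, one per line.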

    DWIM is Perl's answer to Gödel
      Thanks to all the great advisors. I am looking into the code now. Thanks for the help!
Re: Finding problem with finding keywords in arrays / files
by shmem (Chancellor) on Sep 17, 2006 at 10:57 UTC
    Are you trying to confuse yourself?
    open(INPUT, "data.txt") || die "Error opening 'keywd.txt'\n";
                 ^^^^----------- wtf? -------------^^^^^

    Same here. You use 3 argument open here, you should do the same in the previous open, just to be consistent:

    open(FH, '<', "keywd.txt") || die "Error opening 'data.txt'\n";
    but you'd better write it with $!, like this:
    open(FH, '<', "keyword.txt") || die "Error opening 'keyword.txt': $!\n";
                           ^^-- sic! error message in here -----------^^

    so that the reason for the failure (No such file, Permission denied, ...) is reported along with it.

    You can chomp an entire array

    chomp(my @data = <INPUT>);
    and avoid the chomp($element) further down.

    And you do a chomp on $element, but not on the contents of the @keywd array. So, your $element never matches e.g. analysis\n. <update> You require there be a newline in $element - but there isn't one, 'twas chomped off. </update>The other way round there would have been a match.
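A minimal sketch of the newline problem described above, using in-memory lists in place of the two files. The regex is the one from the original post; the sub name `count_matches` and the sample lists are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Count how many (element, keyword) pairs match, using the regex from the
# original post. count_matches is a hypothetical helper for this sketch.
sub count_matches {
    my ($data, $keywd) = @_;
    my $count = 0;
    for my $element (@$data) {
        for my $keyword (@$keywd) {
            $count++ if $element =~ m/$keyword/i;
        }
    }
    return $count;
}

my @data  = ("analysis", "data", "power");           # already chomped
my @raw   = ("analysis\n", "computer\n", "data\n");  # as read from a file
my @clean = @raw;
chomp @clean;                                        # strip the newlines

print "unchomped keywords: ", count_matches(\@data, \@raw),   "\n";
print "chomped keywords:   ", count_matches(\@data, \@clean), "\n";
```

With these lists the first count is 0, because every pattern still ends in a newline that the chomped elements no longer have, and the second count is 2.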

    BTW, your code formatting is pretty messy..

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      Thanks shmem. I am still new to Perl and still trying to grasp the idea. I extracted this part of the problem from the bigger program I am writing. Do you know of any way to form vectors using Perl? I am trying to write a program that reads from various files and forms a vector from each file after it is compared with a master keyword vector. Thanks once again.

        I do... but that really depends on what you mean by Vector ;-)

        Since you are working with words, I guess you are trying to put up a search engine. This article might be a good starter. You might want to have a look at Plucene as well.
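What "vector" means is left open in the thread. One common reading, and purely an assumption here, is a term-count vector: one slot per word in the master keyword list, holding how often that word occurs in a document. A rough sketch under that assumption; the helper name `keyword_vector` and the sample text are made up for illustration.

```perl
use strict;
use warnings;

# Build a count vector for one document: slot i holds how often
# $master->[i] occurs in $text (case-insensitive, whitespace-separated).
# keyword_vector is a hypothetical helper, not something from the thread.
sub keyword_vector {
    my ($master, $text) = @_;
    my %count;
    $count{ lc $_ }++ for split ' ', $text;             # tally every word
    return [ map { $count{ lc $_ } || 0 } @$master ];   # order follows @$master
}

my @master = qw(analysis computer data);
my $vector = keyword_vector(\@master, "data analysis of more data");
print "@$vector\n";
```

For the sample text this prints 1 0 2. Vectors built against the same master list have the same length and slot order, so they can be compared position by position.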

        --shmem

        _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                      /\_¯/(q    /
        ----------------------------  \__(m.====·.(_("always off the crowd"))."·
        ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: Finding problem with finding keywords in arrays / files
by coreolyn (Parson) on Sep 17, 2006 at 10:49 UTC

    This probably isn't THE most efficient way to do this but it does what you are trying to achieve.

    #!/usr/local/bin/perl -w
    use strict;

    open(DATA, "data.txt") or die "Error opening 'data.txt' $!\n";
    my @data = <DATA>;
    open(KEYWORDS, '<', "keyword.txt") or die "Error opening 'keyword.txt' $!\n";
    my @keywd = <KEYWORDS>;
    close DATA;
    close KEYWORDS;

    foreach my $data ( @data ) {
        chomp($data);
        foreach my $keyword (@keywd) {
            chomp($keyword);
            if ( $data =~ /$keyword/i ) {
                print "Match \$data = $data and \$keyword = $keyword\n";
            }
        }
    }
      chomp(my @data = do {
          open my $f, "data.txt" or die "Error opening ``data.txt'': $!";
          <$f>
      });
      Hi core, I tried what you wrote. The program enters the if block no matter what. Is it possible to enter it only when there is an actual match? Thanks.
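Two things can make a plain /$keyword/i succeed more often than expected. A blank line in the keyword file gives an empty pattern, which Perl treats either as matching everything or as a repeat of the last pattern that matched. A short keyword also matches inside longer words, such as data in database. Whether either applies to the poster's files is a guess, since the real files are not shown. A sketch that guards against both; `whole_word_hits` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Return the keywords that occur in $line as whole words, ignoring case.
# whole_word_hits is a hypothetical helper for this sketch.
sub whole_word_hits {
    my ($line, @keywords) = @_;
    my @hits;
    for my $keyword (@keywords) {
        next unless length $keyword;             # skip blank keyword lines
        push @hits, $keyword
            if $line =~ /\b\Q$keyword\E\b/i;     # literal text, whole word only
    }
    return @hits;
}

print "$_\n" for whole_word_hits("database analysis", "", "data", "analysis");
```

Here only analysis is printed: the empty keyword is skipped and data is rejected because it is followed by more word characters. The \Q...\E pair also keeps characters such as the hyphen in co-principal from being read as regex syntax.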
Re: Finding problem with finding keywords in arrays / files
by davido (Cardinal) on Sep 17, 2006 at 16:26 UTC

    Here is how I would probably do it, assuming your keyword matches are literal (not patterns).

    use strict;
    use warnings;

    open my $keyword_fh, '<', 'keywords.txt' or die $!;
    my %keywords;
    while ( <$keyword_fh> ) {
        chomp;
        $keywords{ $_ } = '';
    }
    close $keyword_fh;

    open my $data_fh, '<', 'data.txt' or die $!;
    while( my $line = <$data_fh> ) {
        chomp $line ;
        my @found = grep { exists $keywords{ $_ } } $line =~ m/([^\W\d_]+)/g;
        print "Line $.: @found\n";
    }
    close $data_fh;

    This assumes 'words' are entirely alphabetical (no embedded characters such as '-' or " ' " (apostrophe)). It could be modified to deal with those too.
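One possible modification along those lines, offered as a sketch and not as the author's version: keep the letters-only character class, but allow a single hyphen or apostrophe between runs of letters. The helper name `words_in` and the sample sentence are assumptions.

```perl
use strict;
use warnings;

# Split a line into words made of letters, allowing a single hyphen or
# apostrophe between letter runs. words_in is a hypothetical helper.
sub words_in {
    my ($line) = @_;
    return $line =~ m/([^\W\d_]+(?:['-][^\W\d_]+)*)/g;   # list of captures
}

print join('|', words_in("the co-principal didn't use data2")), "\n";
```

For the sample sentence this prints the|co-principal|didn't|use|data. Digits still end a word, as in the original pattern, and a leading or trailing hyphen or apostrophe is not captured.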


    Dave

      This is nice. Thanks sensei, I learned something again. Do you know how to use Perl to turn vectors into XML?
Re: Finding problem with finding keywords in arrays / files
by Anonymous Monk on Sep 17, 2006 at 10:53 UTC
    #! /usr/bin/env perl
    use strict;
    use warnings;
    use Tie::File;
    use List::Compare;

    tie my @data, 'Tie::File', "data.txt" or die $!;
    tie my @analysis, 'Tie::File', "analysis.txt" or die $!;
    print "$_\n" for List::Compare->new(\@data,\@analysis)->get_intersection;
Re: Finding problem with finding keywords in arrays / files
by Persib (Acolyte) on Sep 17, 2006 at 17:00 UTC
    #!/usr/local/bin/perl -w
    use strict;

    open(my $input, '<', 'data.txt') or die $!;
    my @data = <$input>;
    open(my $fh, '<', 'keyword.txt') or die $!;
    my $keyword = do { local $/; <$fh> };
    my @match = grep $keyword =~ /$_/, @data;
    print for @match;

    What do you think ?
