PerlMonks  

My array element won't print outside a loop

by lampros21_7 (Scribe)
on Aug 09, 2005 at 10:31 UTC

lampros21_7 has asked for the wisdom of the Perl Monks concerning the following question:

Hi fellow monks. My problem is that I am stripping all the HTML content from a website and saving it in the first element of an array. When I use print to check whether it works, it behaves strangely. When I leave the print statement inside the loop, run the script, and input a website, the stripped content is printed. If I move the print statement outside the loop, it won't print anything.
    #Create an instance of the webcrawler
    my $webcrawler = WWW::Mechanize->new();

    my $url_name = <STDIN>; # The user inputs the URL to be searched
    my $uri = URI->new($url_name); # Process the URL and make it a URI

    #Grab the contents of the URL given by the user
    $webcrawler->get($uri);

    #Use the HTML::TokeParser module to extract the contents from the website
    my @stripped_html;
    my $x = 0;
    my $content = $webcrawler->content;
    my $parser = HTML::TokeParser->new(\$content);
    while($parser->get_tag){
        $stripped_html[0] = $parser->get_trimmed_text(),"\n";
        print $stripped_html[0];
    }
    exit;

Here I have left the print $stripped_html[0]; inside the loop and it works. If I move that statement outside the loop, it won't print anything. Any ideas? Thanks in advance.

Replies are listed 'Best First'.
Re: My array element won't print outside a loop
by broquaint (Abbot) on Aug 09, 2005 at 10:58 UTC
    You need to append to your array element instead of assigning to it anew in every loop iteration, e.g.
        while($parser->get_tag){
            ## Note the . before = and the use of . instead of ,
            $stripped_html[0] .= $parser->get_trimmed_text() . "\n";
        }
        print $stripped_html[0];
    See perlop for more info on the .= operator.
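The difference between `,` and `.` in that assignment can be seen in isolation. The following is a minimal sketch, not part of the original reply; `$text` is a hypothetical stand-in for the value returned by `$parser->get_trimmed_text()`. Because `=` binds more tightly than `,`, the comma version assigns only the text and throws the `"\n"` away.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $parser->get_trimmed_text()
my $text = "some text";

my ($with_comma, $with_dot);
{
    # The discarded "\n" would otherwise produce a "void context" warning
    no warnings 'void';
    # Parsed as: ($with_comma = $text), "\n";  the newline is never stored
    $with_comma = $text, "\n";
}
# Concatenation: the newline becomes part of the stored string
$with_dot = $text . "\n";

printf "comma: %d chars, dot: %d chars\n",
    length($with_comma), length($with_dot);
```

With `use warnings` enabled and no `no warnings 'void'` block, Perl flags the comma version at compile time, which is a quick way to spot this kind of slip.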
    HTH

    _________
    broquaint

Re: My array element won't print outside a loop
by GrandFather (Saint) on Aug 09, 2005 at 11:04 UTC

    When I run a version of your code that prints inside and outside the loop I get the following result (last few lines only):

    Inside >Wonderful Web Servers and Bandwidth Generously Provided by <
    Inside >pair Networks <
    Inside > <
    Inside > <
    Inside > <
    Inside > <
    Inside > <
    Inside > <
    Outside > <

    Do you really mean to overwrite element 0 of the array with each line in succession in the loop so that only the last line is retained?
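The effect can be reproduced without fetching a page. This is a rough sketch under stated assumptions: `@chunks` is hypothetical data standing in for the successive results of `get_trimmed_text()`, modeled on the output shown above, where the final tags of the page are followed by no text.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for successive get_trimmed_text() results;
# the trailing empty strings mimic closing tags with no text after them
my @chunks = (
    "Wonderful Web Servers and Bandwidth Generously Provided by",
    "pair Networks",
    "",
    "",
);

my @stripped_html;
for my $chunk (@chunks) {
    $stripped_html[0] = $chunk;            # plain assignment overwrites element 0
    print "Inside >$stripped_html[0]<\n";
}

# Only the last chunk survives, and here it is empty
print "Outside >$stripped_html[0]<\n";
```

Inside the loop every chunk is printed before it is overwritten, which is why the print appears to work there and prints nothing visible afterwards.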

    Probably what you want is to replace the loop and print with this:

        push @stripped_html, $parser->get_trimmed_text()."\n" while($parser->get_tag);
        print join "", @stripped_html;
    Update: provide a solution
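As an illustration of what the push approach stores, here is a small sketch that uses a hypothetical `@chunks` list in place of the parser (an assumption made so the example runs without a network connection or HTML input). Each chunk ends up in its own array element, and `join` glues them back together for printing.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for successive get_trimmed_text() results
my @chunks = ("first line", "second line", "third line");

my @stripped_html;
# One array element per chunk, each terminated with a newline
push @stripped_html, $_ . "\n" for @chunks;

print scalar(@stripped_html), " elements\n";
print join "", @stripped_html;
```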

    Perl is Huffman encoded by design.
      Basically, I want to write the content of the first link into the first element of the array (my array starts from 0), then check another URL and write that URL's contents into the next element, which would be $stripped_html[1]. I'm not quite sure what you get when you run the program with the print statement inside and outside the loop, but I think you get what you want when it's inside and nothing is printed when it's outside. I don't want to have an element of the array for every word. Hope this helps.

        broquaint's reply is what you want then. You could write it as:

        $stripped_html[$urlNum] .= $parser->get_trimmed_text() . "\n" while $parser->get_tag;

        where $urlNum is incremented for each URL.

        Note that the code that I gave in my first reply puts one line in each element - not one word in each element.
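A possible shape for the per-URL version, sketched with made-up data rather than real pages: `@pages` is a hypothetical list in which each inner array stands in for the text chunks that `get_trimmed_text()` would return for one URL. In a real script the inner loop would be the `while $parser->get_tag` loop, with a fresh parser for each URL.

```perl
use strict;
use warnings;

# Hypothetical data: one inner array of text chunks per URL
my @pages = (
    [ "Page one title", "Page one body" ],
    [ "Page two title", "Page two body" ],
);

my @stripped_html;
my $urlNum = 0;
for my $chunks (@pages) {
    # Append every chunk of this page to the element for this URL
    $stripped_html[$urlNum] .= $_ . "\n" for @$chunks;
    $urlNum++;                             # next URL goes into the next element
}

printf "%d elements\n", scalar @stripped_html;
print "URL $_:\n$stripped_html[$_]" for 0 .. $#stripped_html;
```

Appending with `.=` to an element that does not exist yet is fine: Perl creates it and does not warn about the undefined starting value.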


        Perl is Huffman encoded by design.

Node Type: perlquestion [id://482163]
Approved by Nevtlathiel
Front-paged by planetscape