Re^4: joining words

Re^5: joining words by bliako (Monsignor) on Dec 19, 2020 at 23:09 UTC
"i removed span and div, because i never wanted them"

And what about this? `<table> <tr> <td><span>Omonoia 1948</span></td> <td><span>Apoel</span></td> <td><span>3-0</span></td> </tr> </table>` Your regex removing spans will remove all content from the above table.

My 2nd point is that there are 2 tables in the example URL you posted. Why are you not specific about which table you want to process, first or second? That's sloppy. Really sloppy. Bad. You have to type the code.

My 3rd point is that you are trying to parse HTML fetched from a website. Your code fails for some reason. But the site is hit and delivers the HTML. Then you make a modification to your code. Then you try again ... by asking the site again to give you the same HTML (same format, not content, because in the meantime the score may be 4-0) so that you can try your new regex or whatever. This can be done 15 times per minute. The same URL hundreds of times until you finally get your table parsing correct, fingers crossed. But the website administrators get angry. Everyone gets angry. "Just find the correct hole goddamit" -- keyhole that is. They are asked by management to install new measures to stop your "attack". You created a lot of hassle. We don't want that.

So, why not download the webpage once, save it to a file and then keep trying your regex-tricks or whatever on that local HTML file? No need to hit the site again and again and again.

BTW, are you the one hitting MY site all day long???????

over and out

bw, bliako
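A minimal sketch of the "download once, then work on the local file" advice. It assumes the core module HTTP::Tiny (swapped in here for LWP::Simple so that nothing beyond a stock Perl is needed); the sub name `cached_html`, the URL and the filename are hypothetical and not from the thread.

```perl
use strict;
use warnings;
use HTTP::Tiny;    # core since Perl 5.14; an assumption, the thread uses LWP::Simple

# Return the page's HTML, hitting the website only if no local copy exists yet.
# cached_html() is a hypothetical helper name chosen for this illustration.
sub cached_html {
    my ($url, $file) = @_;

    unless (-e $file) {
        # mirror() saves the response body to $file
        my $res = HTTP::Tiny->new->mirror($url, $file);
        die "Fetch of $url failed: $res->{status} $res->{reason}\n"
            unless $res->{success};
    }

    open my $fh, '<', $file or die "Cannot read $file: $!\n";
    local $/;                  # slurp mode: read the whole file in one go
    my $html = <$fh>;
    close $fh;
    return $html;
}

# Usage (hypothetical URL and filename):
#   my $html = cached_html('http://example.com', 'page.html');
#   ... try regex after regex on $html without touching the site again ...
```

Delete the local file when you want a fresh copy; until then every run of the script reads from disk.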
Re^5: joining words by Bod (Parson) on Dec 19, 2020 at 22:14 UTC
I cannot get my head around what you are trying to say...but...there is no need to remove the `<span>` and `<div>` tags, any more than you need to remove any of the rest of the page's HTML.

`use strict; use LWP::Simple; my $html = get("http://example.com") or die "Could not fetch page\n"; while ($html =~ /<td>(.+?)<\/td>/gc) { print $1."\n"; } # untested as written on my mobile`

This will fetch a webpage and extract and print the content of every `<td>` tag. No need to strip anything out first or to make more than one request to the webserver.
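To make the points in the two replies above concrete, here is a rough regex-only sketch that picks one specific table, pulls out each `<td>`, and drops tags such as `<span>` while keeping the text inside them. The sub name `table_cells` is hypothetical, and the code assumes tables are not nested; a real HTML parser would be more robust than regexes for anything beyond simple pages.

```perl
use strict;
use warnings;

# Return the rows of the Nth <table> (1-based) as a list of array refs,
# one array ref of cell texts per <tr>. table_cells() is a hypothetical name.
sub table_cells {
    my ($html, $which) = @_;

    # Naive assumption: tables are not nested inside each other.
    my @tables = $html =~ m{<table\b[^>]*>(.*?)</table>}gis;
    my $table  = $tables[$which - 1];
    return unless defined $table;

    my @rows;
    for my $tr ($table =~ m{<tr\b[^>]*>(.*?)</tr>}gis) {
        my @cells = $tr =~ m{<td\b[^>]*>(.*?)</td>}gis;
        for (@cells) {
            s/<[^>]+>//g;       # remove the tags only, keep the text they wrap
            s/^\s+|\s+$//g;     # trim surrounding whitespace
        }
        push @rows, \@cells;
    }
    return @rows;
}

my $html = '<table> <tr> <td><span>Omonoia 1948</span></td> '
         . '<td><span>Apoel</span></td> <td><span>3-0</span></td> </tr> </table>';

print join(' | ', @$_), "\n" for table_cells($html, 1);
# prints: Omonoia 1948 | Apoel | 3-0
```

Passing `2` instead of `1` would select the second table on a page that has two, which addresses the question of which table is meant.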
Re^6: joining words by afoken (Chancellor) on Dec 19, 2020 at 22:35 UTC
"I cannot get my head around what you are trying to say"

Don't waste your time.

Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

