Re^10: Looking for a module that strips an HTML tag and its associated 'TEXT'

Almost, not quite. Need this:

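use Mojo::DOM;

# Strips every element matching $tag (and its content) from an HTML string
sub eliminate_tags {
  my ($page, $tag) = @_;

  my $dom = Mojo::DOM->new;
  # $node rather than $b, since $b is reserved for sort blocks
  foreach my $node ($dom->parse($page)->find($tag)->each) {
    $node->remove;
  }
  return $dom;    # a Mojo::DOM object; it stringifies to the remaining HTML
}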

So my beef is that: 1) I'd have to be familiar enough with Mojo::DOM to figure out it could do this (I'm not), so I needed to come to PerlMonks to find someone like you to help, and 2) I have to spend 20 min. wading through monstrous documentation to figure out how to use it for something simple.

So why isn't a specific tool better than a general purpose tool? You're saying a specific tool is inferior because it has more dependencies?

$PM = "Perl Monk's";
$MCF = "Most Clueless ~~Friar~~ ~~Abbot~~ ~~Bishop~~ ~~Pontiff~~ ~~Deacon~~ ~~Curate~~ ~~Priest~~ Vicar";
$nysus = $PM . ' ' . $MCF;
Click here if you love Perl Monks


Replies are listed 'Best First'.
Re^11: Looking for a module that strips an HTML tag and its associated 'TEXT' by marto (Cardinal) on Jul 29, 2020 at 15:31 UTC
Sometimes you may have to write some code yourself... I posted a general example, given your example data, simply to show you how it can be done. This place is for helping people learn; people aren't always going to do it all for you. If you are writing code, reading the documentation for the tools you are going to use is literally the bare minimum you can do. 20 minutes to learn how to use something as portable and powerful as this seems insignificant compared to the productivity gains, and to the alternative of writing all of the code required to do this properly yourself. If you think this is "something simple", you don't understand the scope of the problem at all. Look what you have achieved with this in 20 minutes!

What is "monstrous" about the Mojo::DOM docs? They're littered with practical examples of how to use each method, with before and after data displayed.

"So why isn't a specific tool better than a general purpose tool? You're saying a specific tool is inferior because it has more dependencies?"

You tried two, neither of which did what you imagined it would, and you seemingly didn't take their dependencies into account. Having spent a long time working with Perl and data, solving problems and writing code, I suggested what I've found to be a tried and tested method that takes the pain out of what you're trying to do, in this instance. I didn't say anything about any other specific tools or problems.
Re^12: Looking for a module that strips an HTML tag and its associated 'TEXT' by nysus (Parson) on Jul 29, 2020 at 15:42 UTC
Monstrous meaning "large and unwieldy". Yes, Mojo::DOM is well documented and no doubt a high-quality module, but there are so many method calls that it's difficult to sort through them. This slows you down. I don't know if I said this was "something simple", but it is a problem that seems common enough to have been solved by another module, especially in Perl, which grew its legs during the dawn of the www. Pulling up a web page in a browser is an extraordinarily difficult task when you look under the hood, but it should be simple to perform. I would have expected a basic task like this to be readily available in an off-the-shelf module. Indeed, there are existing modules, but they were either a) buggy or b) didn't meet my specific needs.
Re^13: Looking for a module that strips an HTML tag and its associated 'TEXT' by marto (Cardinal) on Jul 29, 2020 at 16:02 UTC
"Something simple" was a direct quote, but I think we are perhaps talking at cross purposes. I now believe (and correct me if I'm wrong) you mean that this is something seemingly so fundamental that it should exist as something anyone could call, without having to write the code you quickly put together to achieve this, as opposed to my reading that the task itself was simple.

When I first encountered Mojo::DOM I recall just reading the documentation from start to finish, and it really didn't take that long, and I don't consider myself a fast reader by any means. From time to time I find myself looking it up, either to refresh my memory or to link to something for someone else. Yes, there are many methods; that's part of the power of this, and I'd say I've taken advantage of most, if not all, of them. Sure, at first I accept "this slows you down", but I maintain that this shouldn't be seen as a negative thing. After my short pointer and your 20 minutes you now have the code you need and a basic understanding of how Mojo::DOM works and what it can do, and you are considering publishing something to make this easier on yourself and others in future. Seems like a win-win to me.

Perhaps sound out the Mojo team as to whether a PR adding a new method to Mojo::DOM for your desired functionality would be welcome.
Re^14: Looking for a module that strips an HTML tag and its associated 'TEXT' by nysus (Parson) on Jul 29, 2020 at 16:37 UTC
Re^11: Looking for a module that strips an HTML tag and its associated 'TEXT' by marto (Cardinal) on Jul 29, 2020 at 16:19 UTC
If you want to reduce the number of lines a little: `my $dom = Mojo::DOM->new( $html ); $dom->find( $tag )->each( sub { $_->remove } ); return $dom;`
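
For anyone landing on this thread later, here is a minimal, self-contained sketch of that shorter form wrapped back into the `eliminate_tags` sub from the parent post. The sample HTML and the `span.ad` selector are made up for illustration, and returning a plain string via `to_string` (rather than the Mojo::DOM object) is an assumption about what the caller wants, not something settled in the discussion.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Mojo::DOM;

# Remove every element matching $selector, together with its text and
# children, and return what is left as a plain string.
sub eliminate_tags {
    my ($html, $selector) = @_;

    my $dom = Mojo::DOM->new($html);

    # find() returns a Mojo::Collection; each() runs the callback once per
    # matched element, with the element available in $_.
    $dom->find($selector)->each(sub { $_->remove });

    return $dom->to_string;
}

# Hypothetical sample input, not taken from the thread.
my $page = '<div><p>Keep me</p><span class="ad">Drop me</span></div>';
print eliminate_tags($page, 'span.ad'), "\n";
```

Because `find` takes a CSS selector, the same sub can drop all elements of one type (`'script'`) or only a subset (`'span.ad'`) without any extra code.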

