PerlMonks  

Re: Removing nested div Tag from HTML

by metaperl (Curate)
on Aug 18, 2011 at 20:06 UTC ( [id://921062]=note )


in reply to Removing nested div Tag from HTML

Are you trying to extract it or delete it? Either way, HTML::Tree is far simpler than what you are doing. The look_down method is all you need. Here's an article on scanning HTML: HTML::Tree::Scanning.

So basically, you just need to:

  1. create an instance
  2. parse the HTML
  3. my $element = $tree->look_down( ... criteria ... )
  4. $element->delete

The specifics depend on exactly what you want to keep versus delete, but it's a piece of cake.
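
The four steps above can be sketched roughly as follows. This is only an illustration, not the original poster's code: the sample markup and the id value "ad" are made up, and HTML::TreeBuilder (part of the HTML::Tree distribution) is assumed to be installed. It covers both cases: as_HTML gives you the whole element with all the tags inside it if you want to extract it, and delete removes it from the tree.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical input: the markup and the id "ad" are invented for this sketch.
my $html = <<'HTML';
<html><body>
<div id="content"><p>Keep me</p>
  <div id="ad"><p>Remove <b>me</b></p></div>
</div>
</body></html>
HTML

# Steps 1 and 2: create an instance and parse the HTML.
my $tree = HTML::TreeBuilder->new_from_content($html);

# Step 3: find the first div whose id attribute is "ad".
my $element = $tree->look_down( _tag => 'div', id => 'ad' );

my $extracted = '';
if ($element) {
    # To extract: serialize the element together with everything nested in it.
    $extracted = $element->as_HTML;

    # Step 4: to delete, remove the element and its children from the tree.
    $element->delete;
}

# Serialize what is left, then free the tree.
my $remaining = $tree->as_HTML;
$tree->delete;

print "Extracted: $extracted\n";
print "Remaining: $remaining\n";
```

In list context look_down returns every matching element, so the same pattern in a loop should handle several nested divs that share an attribute.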



The mantra of every experienced web application developer is the same: thou shalt separate business logic from display. Ironically, almost all template engines allow violation of this separation principle, which is the very impetus for HTML template engine development.

-- Terence Parr, "Enforcing Strict Model View Separation in Template Engines"

Replies are listed 'Best First'.
Re^2: Removing nested div Tag from HTML
by mr_p (Scribe) on Aug 18, 2011 at 20:13 UTC
    Wouldn't HTML::Tree take too much time loading? The material I am working with is time critical.

      Wouldn't HTML::Tree take too much time loading?

      T.I.T.S.

      The material I am working with is time critical.

      Can't be that critical, if you're using perl

        I cannot get cpan to install HTML::Tree::Scanning. Also, my HTML::Tree is up to date. It says:

        Try the command i /HTML::Tree::Scanning/ to find objects with matching identifiers.

        I tried HTML::Tree, but it only gets me the content of the tag. I need to pull out the whole tag with all the tags inside it, and remove a specific tag based on an attribute.

        Is it possible to do this with HTML::Tree?

        Thanks

        I got HTML::Tree to work and replace the tags that I don't want.

        Thanks

        I would still like to know what was wrong with my code and find out why I can't do this with HTML::Parser.

        Do you know how I can extract the tag? I was able to remove the tag from the HTML but not extract it.
