http://qs321.pair.com?node_id=521954


in reply to multi line regex

This is wrong in so many ways. First of all, you're parsing HTML with a regex. Don't do that. Use HTML::Parser instead.

Otherwise, there are just too many ways to get tripped up: tags with extra whitespace, tags containing newlines, quotes missing or present in unexpected places, escaping of one thing or another, JavaScript code fooling you into thinking you're inside another tag when you really aren't, and so on.

Second, you're trying to extract data from an HTML table using a regex. Don't do that. Use HTML::TableExtract instead. It will save you a LOT of hair-pulling.

Re^2: multi line regex
by metalfan (Novice) on Jan 18, 2006 at 17:51 UTC
    looks good, sorry for this question: but how can I use this to
    get the word in the first column?

    1.column | 2.column
    english word | german word
    ....

    thx for help
      Read the manual page for HTML::TableExtract. Once it has parsed the table, each row comes back as an array reference, and the first column is the first element of that row array.
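
To make that concrete, here is a minimal sketch of pulling the first column out with HTML::TableExtract. The sample table, the header names 'English' and 'German', and the helper name first_column are all assumptions made up for illustration; adjust the headers to whatever your real table uses. When headers are given, the module looks for a table whose header row matches them and returns only the data rows, with columns in the order the headers were listed.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical sample data; the real page will look different.
my $html = <<'HTML';
<table>
  <tr><th>English</th><th>German</th></tr>
  <tr><td>house</td><td>Haus</td></tr>
  <tr><td>tree</td><td>Baum</td></tr>
</table>
HTML

# Hypothetical helper: return the first-column cell of every data row.
sub first_column {
    my ($html) = @_;

    # Header names are an assumption; they are matched against the
    # table's header cells to pick the right table and columns.
    my $te = HTML::TableExtract->new( headers => [ 'English', 'German' ] );
    $te->parse($html);
    $te->eof;

    my @words;
    for my $table ( $te->tables ) {
        for my $row ( $table->rows ) {
            # Each row is an array reference; index 0 is the first column.
            push @words, $row->[0];
        }
    }
    return @words;
}

print "$_\n" for first_column($html);
```

The German word in the same row would be $row->[1]. If the real cells contain surrounding whitespace or nested markup, you may need to trim the values yourself.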