Re^4: Extracting appropriate language text from HTML data


PerlMonks

by UnderMine (Friar)

on May 28, 2006 at 21:55 UTC

in reply to Re^3: Extracting appropriate language text from HTML data
in thread Extracting appropriate language text from HTML data

Thanks for that.

I am currently treating each paragraph separately, using panic_languages to fall back where no direct translation is available.
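A minimal sketch of that per-paragraph selection, written in Python purely for illustration. The `FALLBACKS` table and `LAST_RESORT` list are hypothetical stand-ins for whatever panic_languages would supply, and `candidate_languages` and `pick_text` are made-up names, not part of any library.

```python
# Hypothetical fallback table: for each requested language, the languages
# to try next. In the real code this ordering would come from panic_languages.
FALLBACKS = {
    "pt": ["es", "fr", "it"],
    "es": ["pt", "fr", "it"],
}

# Assumed language of last resort when nothing closer is available.
LAST_RESORT = ["en"]


def candidate_languages(preferred):
    """Return the preferred languages, then their fallbacks, then the
    last resort, with duplicates removed and order preserved."""
    ordered = list(preferred)
    for lang in preferred:
        ordered.extend(FALLBACKS.get(lang, []))
    ordered.extend(LAST_RESORT)

    seen = set()
    result = []
    for lang in ordered:
        if lang not in seen:
            seen.add(lang)
            result.append(lang)
    return result


def pick_text(paragraph, preferred):
    """paragraph maps a language tag to that paragraph's text.
    Return (lang, text) for the first candidate language present,
    or None if no candidate is available for this paragraph."""
    for lang in candidate_languages(preferred):
        if lang in paragraph:
            return lang, paragraph[lang]
    return None


# A paragraph with no Portuguese version falls back to Spanish.
choice = pick_text({"es": "hola", "en": "hello"}, ["pt"])
```

Because each paragraph is resolved on its own, two adjacent paragraphs can end up in different languages, which is exactly the readability concern discussed next.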

You have raised an interesting point: should there be some overall scheme that balances paragraph readability against document readability? To do that, though, there has to be a known relationship between the alternate parts of the text.

The current markup does not show how alternate parts relate, only what language each chunk is in. A better markup would mark chunks as alternates of one another and group them together.
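As a rough illustration of what such grouping might look like once parsed (Python, for illustration only): the `group` key is a hypothetical attribute that the current markup does not have, and `group_alternates` is a made-up helper name.

```python
# Each chunk carries its language (as the current markup does) plus a
# hypothetical "group" id tying it to its alternates in other languages.
chunks = [
    {"group": "intro", "lang": "en", "text": "Welcome"},
    {"group": "intro", "lang": "fr", "text": "Bienvenue"},
    {"group": "body", "lang": "en", "text": "Details"},
]


def group_alternates(chunks):
    """Collect chunks into {group_id: {lang: text}}, keeping groups in
    the order they first appear in the document."""
    groups = {}
    for chunk in chunks:
        groups.setdefault(chunk["group"], {})[chunk["lang"]] = chunk["text"]
    return groups


grouped = group_alternates(chunks)

# With the alternates grouped, it becomes possible to ask document-level
# questions, such as which languages cover every group.
common = set.intersection(*(set(alts) for alts in grouped.values()))
```

With only a language tag per chunk, there is no way to tell that "Welcome" and "Bienvenue" are the same paragraph; the extra grouping key is what makes a document-wide choice possible.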

Thanks
UnderMine

Re^5: Extracting appropriate language text from HTML data
by john_oshea (Priest) on May 29, 2006 at 12:15 UTC

Given your database constraints, I'm not sure that you're going to come up with a 'better' solution. Given that not every chunk is available in all languages, you're (effectively) going to have to decide at each chunk what's going to be the 'best' piece of text to return at that point, and I can't at the moment see a more elegant way of doing that...
