http://qs321.pair.com?node_id=941850


in reply to conversion from doc to html

Here's a stupid suggestion - why not use Microsoft Word to export as HTML? Pretty sure it can do that.

Re^2: conversion from doc to html
by ww (Archbishop) on Dec 05, 2011 at 14:25 UTC
    1) MS Word's conversion to .html has been, and still is, badly borked (TTBOMK; I haven't checked the latest version): it bloats the output enormously with non-standard MS tags. Don't use it unless you don't care about the resulting markup.

    2) Why not learn a little HTML -- an hour or so with a decent tut (w3schools comes to mind) -- and you can do the conversion yourself... by saving as text and adding the necessary tags. Generally, unless the Word .doc is extraordinarily complex, that's a quick and painless operation.
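
    As a rough illustration of the "save as text and add the tags" approach, here is a minimal Python sketch. It assumes the saved text separates paragraphs with blank lines; the function name text_to_html and the title parameter are made up for this example. Tables and images would still need hand-written markup.

```python
import html
import re

def text_to_html(text, title="Untitled"):
    """Wrap blank-line-separated paragraphs of plain text in <p> tags."""
    # Normalize Windows line endings, then split on blank lines.
    text = text.replace("\r\n", "\n")
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    body = "\n".join(
        # Collapse internal whitespace and escape &, <, > and quotes.
        "<p>{}</p>".format(html.escape(" ".join(p.split())))
        for p in paragraphs
    )
    return (
        "<!DOCTYPE html>\n<html>\n<head><title>{}</title></head>\n"
        "<body>\n{}\n</body>\n</html>\n"
    ).format(html.escape(title), body)
```

    Calling text_to_html(open("report.txt").read(), title="Report") on a hypothetical report.txt would return the page as a single string, ready to write to a file.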

      Some older versions of HTML Tidy had a handy word-2000 option:

      word-2000

      Type: Boolean
      Default: no
      Example: y/n, yes/no, t/f, true/false, 1/0

      This option specifies if Tidy should go to great pains to strip out all the surplus stuff Microsoft Word 2000 inserts when you save Word documents as "Web pages". Doesn’t handle embedded images or VML. You should consider using Word’s "Save As: Web Page, Filtered".

      I used it some years ago, with sufficient (but not very pretty) results.
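
      For anyone who wants to script this, a minimal sketch of calling Tidy with that option from Python follows. It assumes a tidy binary is on the PATH and that your version still accepts word-2000; the helper names tidy_command and clean_word_html are invented for this example.

```python
import subprocess

def tidy_command(src, dest, tidy_bin="tidy"):
    """Build the argument list that runs Tidy with word-2000 enabled."""
    return [tidy_bin, "-quiet", "--word-2000", "yes", "-output", dest, src]

def clean_word_html(src, dest):
    """Run Tidy on a Word-exported HTML file and return its exit status."""
    # Tidy exits with 0 on success, 1 if there were warnings, 2 on errors.
    # No check=True here, because a warning-only run is still usable output.
    result = subprocess.run(tidy_command(src, dest))
    return result.returncode
```

      Something like clean_word_html("exported.htm", "clean.html") would then write the cleaned page. As the option description says, embedded images and VML are not handled.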

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re^2: conversion from doc to html
by srocks (Initiate) on Dec 07, 2011 at 05:20 UTC

    It is not working. My Word doc has a table and an image, which produce garbage values during conversion.