PerlMonks  

Re: (Zigster) MSWORD TO TEXT

by zigster (Hermit)
on Apr 12, 2001 at 19:11 UTC


in reply to MSWORD TO TEXT

I use the UNIX command 'strings'; it works fine and dandy with most Word docs I have come across. The output is a little rough, but in most cases I can read the document. It all depends on how clean you want the output.
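
For anyone curious what that amounts to, here is a rough sketch in Python of the same idea: scan the raw bytes and keep runs of printable characters. It is not how 'strings' itself is implemented, just an approximation. The function name extract_strings and the min_len parameter are made up for illustration; the default of 4 mirrors the usual 'strings' minimum length. The sketch also looks for 16-bit little-endian runs, on the assumption that some Word files store their text that way (GNU strings can do the same with -e l).

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return printable runs found in data, in order of appearance.

    Hypothetical helper: looks for runs of printable ASCII stored either
    one byte per character or as UTF-16LE (character followed by a NUL).
    """
    n = str(min_len).encode()
    # Runs of at least min_len printable 8-bit characters.
    narrow = re.compile(rb"[\x20-\x7e]{" + n + rb",}")
    # Runs of at least min_len printable characters, each followed by \x00.
    wide = re.compile(rb"(?:[\x20-\x7e]\x00){" + n + rb",}")

    found = [(m.start(), m.group().decode("ascii")) for m in narrow.finditer(data)]
    found += [(m.start(), m.group().decode("utf-16-le")) for m in wide.finditer(data)]
    # Sort by byte offset so the text comes out in document order.
    return [text for _, text in sorted(found)]


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as fh:
        for line in extract_strings(fh.read()):
            print(line)
```

Run it as a script with a file name as the only argument and it prints one run per line. Like 'strings', it will also print any binary junk that happens to look like text, so expect the same rough edges.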
--

Zigster

Replies are listed 'Best First'.
Re: Re: (Zigster) MSWORD TO TEXT
by Hero Zzyzzx (Curate) on Apr 12, 2001 at 22:50 UTC

    Zigster,
    All I can say about strings is WOW! That works perfectly on Word 2k, WordPerfect 8, and Excel 2k files. Combined with pdftotext you have a nearly complete solution for extracting text from common user docs, which I'm doing for a search engine for a web-based document management site. Just goes to show that if there's something you want to do on Unix/Linux, chances are the tool is already sitting on your hard drive.

      Glad to know it worked for you. I would be very interested in seeing the result when you have completed it. As a full-on UNIX head working in an MS world, a complete toolset for converting MS docs to ASCII would be of great interest to me. Please msg me when/if you complete the tools.

      Cheers
      --

      Zigster
