PerlMonks
Re: Extracting text from MS Word files on a Linux box
by aitap (Curate) on Jun 21, 2018 at 13:31 UTC (#1217117)
If text is all you need (no formatting), you may have success piping the output of antiword (or docx2txt for later versions of the format) into your script. LibreOffice Writer used to support the older .DOC format better than the newer .DOCX; the situation may have changed since, but in the general case you should assume that you are going to lose some formatting information.
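
Here is a minimal sketch of the piping approach, not a tested recipe. It assumes both tools are installed and on your PATH, that the converter is installed under the name `docx2txt` (some systems ship it as `docx2txt.pl`), and that it accepts `-` as the output argument to write to standard output. The helper names `extractor_for` and `extract_text` are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pick an external converter based on the file extension.
# Returns the command as a list so that it can be run without a shell.
sub extractor_for {
    my ($path) = @_;
    return ('antiword', $path)      if $path =~ /\.doc\z/i;
    # Assumption: "docx2txt FILE -" writes the text to standard output.
    return ('docx2txt', $path, '-') if $path =~ /\.docx\z/i;
    die "Don't know how to extract text from '$path'\n";
}

# Hypothetical helper: run the converter and return everything it printed.
sub extract_text {
    my ($path) = @_;
    my @cmd = extractor_for($path);

    # The list form of a pipe open bypasses the shell, so file names
    # containing spaces or quotes need no escaping.
    open(my $fh, '-|', @cmd) or die "Cannot run $cmd[0]: $!\n";
    local $/;                       # slurp mode: read all output at once
    my $text = <$fh>;

    # close() on a pipe returns false if the child exited with an error.
    close($fh) or die "$cmd[0] failed (exit status " . ($? >> 8) . ")\n";
    return $text;
}

# Usage: perl extract.pl some_file.doc
print extract_text($ARGV[0]) if @ARGV;
```

Both tools emit bytes, so if you need characters you may have to decode the result yourself (for example with Encode); which encoding you get depends on the tool's settings and your locale.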