in reply to acessing the data from word(.doc) file in linux environment
There seem to be a lot of different options of very different complexity, depending on which Word format you are dealing with.
If it's a plain old Word 2000-2003 file, you already know what your tables look like, and you only need some data from a few cells, you could simply do:
$> abiword --to=rtf myworddocument.doc
and then:
$> perl extract-table-cells.pl myworddocument.rtf
In the latter (extract-table-cells.pl), you would simply search for:
[pseudo]
...
# table content part already extracted to $tablecontent
@cells = $tablecontent =~ / \} ([^}]*) \}\\cell\{ /xgs;
...
which might give you the cells in @cells.
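To make that a bit more concrete, here is a minimal sketch of what extract-table-cells.pl could look like. The sample string is made up to fit the pattern and is not real AbiWord output, and extract_cells is just a name I picked for illustration; you would have to look at your own .rtf file and adjust the regex accordingly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the text captured between a closing
# brace and the "}\cell{" marker, one entry per table cell.
sub extract_cells {
    my ($tablecontent) = @_;
    return $tablecontent =~ / \} ([^}]*) \}\\cell\{ /xgs;
}

# Made-up RTF fragment (an assumption, not real AbiWord output).
# For a real file you would slurp myworddocument.rtf instead.
my $sample = '{\intbl{\f0}Name}\cell{\intbl{\f0}Alice}\cell{\intbl{\f0}42}\cell{\row}';

my @cells = extract_cells($sample);
print "$_\n" for @cells;
```

With this sample, @cells should end up holding Name, Alice and 42. If your cells contain nested groups or formatting control words, the capture will pick those up too, so expect some cleanup afterwards.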
But it depends on your problem. What is the scale and purpose of what you are trying to do?
Regards
mwa