http://qs321.pair.com?node_id=334720

nicholas has asked for the wisdom of the Perl Monks concerning the following question:

Hello All,

How can I get the page and line count of an MS Word document? I am using the Win32::OLE package.

A code example would be highly appreciated. I hope I will get an answer as soon as possible.

while (($object = $enum->Next)) {
    $para_temp++;
    $paragraph = $object->Range->{Text};
    $allText .= $paragraph;
    chop $paragraph;
    unless ($paragraph eq '') {
        $paragraph_count++;
    }
    chomp $paragraph;
    @words = split /\s+/, $paragraph;
    $nwords += @words;
    @chars = split //, $paragraph;
    $nchars += @chars;
}

Thanks Nicholas

Edited by Chady -- added code tags.

Re: How to get MS-Word Page count and line count?
by Zero_Flop (Pilgrim) on Mar 08, 2004 at 07:17 UTC
    Hi Nicholas,
    Actually, you should post the code showing where you are. It is hard to judge how much information you need based on your request.

    What you probably need is pagenumbers.count. To find this, all I did was open Word and hit Alt-F11, which opens the VBA editor. From there you can look in the object browser or in the help. You may also want to look up the perldoc for Win32::OLE to get an example of how to make the connection.

    If you post the code you have I'm sure you will find that everyone can get you to the point you want.
      Try this code. It's untested, and I can never remember whether you need the {} around Count or not.

      my $selection = $word->Application->Selection;
      # Selection can be the entire doc or refined to a sub-set of text
      # It is easier to use the functions that exist in word to do the work.
      $num_paragraphs = $selection->Paragraphs->{Count};
      $num_words      = $selection->Words->{Count};
      $num_chars      = $selection->Characters->{Count};
      $num_pages      = $selection->pagenumbers->{Count};

      As for the line count, you're going to have a problem here. Word probably does not mark new lines as you would expect. It automatically wraps text based on the available space. So instead of looking for a line count you could use Sentences->Count.
        My bad!

        As you probably know, TIMTOWTDI. When I had time to test what I had posted earlier, I discovered that it generates raw statistics, counting EOL and everything. I am assuming that you want the statistics that are available from the properties of the document. The code below will do what you want.

        I used the functions available within the application to do the work. This will usually work better for you when you are communicating between applications.

        And thanks for posting the code!
        use strict;
        use Win32::OLE qw(in with);
        use Win32::OLE::Const 'Microsoft Word';
        use Win32::OLE::Enum;

        # Here you have to call the file with "Name.pl C:\dir\file.doc"
        # This will allow you to point the script anywhere rather than
        # having to put the script into the dir with the file.
        die "Usage: perl doc_print.pl test.doc" unless @ARGV == 1;

        my $Word = Win32::OLE->new( 'Word.Application', 'Quit' )
            || die "Couldn't run Word";
        my $Doc = $Word->Documents->Open($ARGV[0])
            || die "File does not exist or can not be opened";

        # Selection can be the entire doc or refined to a sub-set of text
        # It is easier to use the functions that exist in word to do the work.
        my $num_pages      = $Doc->ComputeStatistics(wdStatisticPages);
        my $num_paragraphs = $Doc->ComputeStatistics(wdStatisticParagraphs);
        my $num_lines      = $Doc->ComputeStatistics(wdStatisticLines);
        my $num_words      = $Doc->ComputeStatistics(wdStatisticWords);
        my $num_chars      = $Doc->ComputeStatistics(wdStatisticCharacters);
        my $num_charWs     = $Doc->ComputeStatistics(wdStatisticCharactersWithSpaces);

        printf "Page Count %d\n", $num_pages;
        printf "Character (paragraphs) %d\n", $num_paragraphs;
        printf "Line Count %d\n", $num_lines;
        printf "Character (words) %d\n", $num_words;
        printf "Character (with spaces) %d\n", $num_charWs;
        printf "Character (wout spaces) %d\n", $num_chars;

        $Doc->Close;

        I tried it, but $num_paragraphs, $num_words and $num_chars all return 1, and $num_pages ($selection->pagenumbers->{Count};) raises an error.

        Is no one here able to give me a solution for this? I don't think it is a tough task for PerlMonks?
      This is the full code:
      use strict;
      use Text::Wrap;
      use Win32::OLE qw(in with);
      use Win32::OLE::Const 'Microsoft Word';
      use Win32::OLE::Enum;

      die "Usage: perl doc_print.pl test.doc" unless @ARGV == 1;

      my $File = $ARGV[0];
      $File = Win32::GetCwd() . "/$File" if $File !~ /^(\w:)?[\/\\]/;
      die "File $ARGV[0] does not exist" unless -f $File;

      my $Word = Win32::OLE->new('Word.Application', 'Quit')
          or die "Couldn't run Word";
      my $Doc = $Word->Documents->Open($File);

      my (@words, $nwords, @chars, $nchars, $len, $object, $paragraph,
          $paragraph_count, $enum, $allText, $para_temp);
      my $lc = 0;
      my $pc = 0;
      my (@char, $char);

      $enum = Win32::OLE::Enum->new($Doc->Paragraphs);
      $allText = '';
      $para_temp = 0;
      while (($object = $enum->Next)) {
          $para_temp++;
          $paragraph = $object->Range->{Text};
          $allText .= $paragraph;
          chop $paragraph;
          unless ($paragraph eq '') {
              $paragraph_count++;
          }
          chomp $paragraph;
          @words = split /\s+/, $paragraph;
          $nwords += @words;
          @chars = split //, $paragraph;
          $nchars += @chars;
      }
      $allText =~ tr/\t //d;
      $len = length($allText) - $para_temp;

      printf "Character (no spaces) %d\n", $nchars;
      printf "Character (wout spaces) %d\n", $len;
      printf "Character (words) %d\n", $nwords;
      printf "Character (paragraphs) %d\n", $paragraph_count;
      printf "Line Count %d\n", $lc;
      printf "Page Count %d\n", $pc;

      $Doc->Close;

      Just let me know how I can count $lc (line count) and $pc (page count).

      Thanks Nicholas

      Edited by Chady -- added code tags.

Re: How to get MS-Word Page count and line count?
by jmcnamara (Monsignor) on Mar 08, 2004 at 22:38 UTC

    A Word document can compute its own page and line count as can be seen on the File->Properties->Statistics page. These values can also be calculated via the Word object model as follows:

    #!/usr/bin/perl -l

    use strict;
    use Win32::OLE;
    use Win32::OLE::Const 'Microsoft Word';

    my $word = Win32::OLE->new('Word.Application');
    my $doc  = $word->Documents->Open('c:\temp\test2.doc');

    die "Unable to open document ", Win32::OLE->LastError() unless $doc;

    my $pages = $doc->ComputeStatistics(wdStatisticPages);
    my $lines = $doc->ComputeStatistics(wdStatisticLines);
    my $words = $doc->ComputeStatistics(wdStatisticWords);

    print "Pages:\t", $pages;
    print "Lines:\t", $lines;
    print "Words:\t", $words;

    $doc->Close;

    __END__

    I couldn't get the Word Const module to work under the -w switch without generating warnings (although it works fine under use warnings). So here are the required constants just in case:

    wdStatisticWords                = 0
    wdStatisticLines                = 1
    wdStatisticPages                = 2
    wdStatisticCharacters           = 3
    wdStatisticParagraphs           = 4
    wdStatisticCharactersWithSpaces = 5
    wdStatisticFarEastCharacters    = 6
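
If loading the constants from the type library is a problem, one option is to define the values listed above by hand. The following is only a sketch: document_stats is a hypothetical helper name, not part of Win32::OLE, and it assumes it is handed an object that provides ComputeStatistics, such as the document returned by $word->Documents->Open(...).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Numeric values of the Word statistic constants, copied from the list above.
use constant {
    wdStatisticWords                => 0,
    wdStatisticLines                => 1,
    wdStatisticPages                => 2,
    wdStatisticCharacters           => 3,
    wdStatisticParagraphs           => 4,
    wdStatisticCharactersWithSpaces => 5,
};

# Hypothetical helper (an assumption for illustration): collects the
# statistics from any object that has a ComputeStatistics method and
# returns them as a hash reference.
sub document_stats {
    my ($doc) = @_;
    return {
        pages      => $doc->ComputeStatistics(wdStatisticPages),
        lines      => $doc->ComputeStatistics(wdStatisticLines),
        words      => $doc->ComputeStatistics(wdStatisticWords),
        paragraphs => $doc->ComputeStatistics(wdStatisticParagraphs),
    };
}
```

With a document opened through Win32::OLE as in the script above, the call would be along the lines of my $stats = document_stats($doc); followed by print "Pages:\t", $stats->{pages};. Because the constants are defined locally, the use Win32::OLE::Const line is then not needed.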

    --
    John.