http://qs321.pair.com?node_id=452105


in reply to Extract Lotus Notes Mail to HTML

You should have the Domino server do the Domino->HTML conversion for you. The method you use here throws away all the formatting. At the very least, change your title so it is obvious that this is a brain-damaged implementation.

Use ->GetFirstDocument/->GetNextDocument instead of ->GetNthDocument, always.

You are using particularly inefficient loop code. Instead of ->GetNthDocument(), use ->GetFirstDocument/->GetNextDocument. The former actually performs an internal loop starting from 1 every time to reach the nth document, so walking a collection of N documents with it costs roughly N*N/2 steps instead of N. It is the most horribly inefficient method you could choose. So knock it off. The author of this function has publicly apologised for ever inflicting it upon the world. You'd only know this if you were reading the Lotus Notes development forums.

    my $doc = $AllDocuments->GetFirstDocument;
    while ( $doc ) {
        # Fetch the next document prior to running code to guard against
        # someone deciding to delete $doc.
        my $NextDoc = $AllDocuments->GetNextDocument( $doc );

        ...

        $doc = $NextDoc;
    }

A sample of what ->GetNthDocument does inside the nnotes.dll C code, written out here as Perl

    sub GetNthDocument {
        my ( $Collection, $n ) = @_;

        return undef if $n < 1;

        my $doc = $Collection->GetFirstDocument;
        my $ix  = 1;
        while ( $ix < $n and $doc ) {
            $doc = $Collection->GetNextDocument( $doc );
            ++ $ix;
        }

        return $doc;
    }
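To put a number on the difference, here is a minimal sketch that counts how many times the collection gets stepped through under each approach. MockCollection, steps_by_index and steps_by_iteration are made-up names for illustration only; nothing here talks to Notes or Win32::OLE, and the GetNthDocument in it is just the walk shown above, not the real nnotes.dll code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a NotesDocumentCollection. Documents are just
# the integers 1..count, and the object counts every step through the list.
package MockCollection;

sub new {
    my ( $class, $count ) = @_;
    return bless { count => $count, steps => 0 }, $class;
}

sub GetFirstDocument {
    my ( $self ) = @_;
    return undef unless $self->{count} > 0;
    ++ $self->{steps};
    return 1;
}

sub GetNextDocument {
    my ( $self, $doc ) = @_;
    return undef if $doc >= $self->{count};
    ++ $self->{steps};
    return $doc + 1;
}

# Same walk as the GetNthDocument sample above: restart from 1 every time.
sub GetNthDocument {
    my ( $self, $n ) = @_;
    return undef if $n < 1;
    my $doc = $self->GetFirstDocument;
    my $ix  = 1;
    while ( $ix < $n and $doc ) {
        $doc = $self->GetNextDocument( $doc );
        ++ $ix;
    }
    return $doc;
}

package main;

# Visit every document by index, the way the original code does.
sub steps_by_index {
    my ( $count ) = @_;
    my $c = MockCollection->new( $count );
    $c->GetNthDocument( $_ ) for 1 .. $count;
    return $c->{steps};
}

# Visit every document with GetFirstDocument/GetNextDocument.
sub steps_by_iteration {
    my ( $count ) = @_;
    my $c   = MockCollection->new( $count );
    my $doc = $c->GetFirstDocument;
    while ( $doc ) {
        $doc = $c->GetNextDocument( $doc );
    }
    return $c->{steps};
}

printf "%d documents: %d steps by index, %d steps by iteration\n",
    1000, steps_by_index( 1000 ), steps_by_iteration( 1000 );
```

For 1000 documents this prints 500500 steps by index against 1000 steps by iteration. The real cost per step inside Notes will differ, but the shape of the growth is the point.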

Try not to lose formatting by smashing RichText to plain text

Since you have the cooperation of a Domino server, you could have fetched the document and its formatting from the server using LWP or such. Consider this an outline for a future attempt at fetching this. I'm doing some of these property calls from memory so there's a good chance I'm slightly off here.

    use LWP::Simple ();

    sub Doc2HTML {
        my ( $doc ) = @_;

        # A NotesDocument exposes its database as ParentDatabase.
        my $db       = $doc->{ParentDatabase};
        my $filepath = $db->{FilePath};
        my $server   = $db->{Server};

        # FilePath may come back with backslashes on a Windows server;
        # the URL needs forward slashes.
        $filepath =~ tr{\\}{/};

        # This is a bit of magic. I'm requesting the '0' view which will be
        # whatever view in the db was designated "default". Also, all views
        # can be used to fetch any document if you already know the
        # document's UNID.
        my $view = '0';
        my $unid = $doc->{UniversalID};

        return LWP::Simple::get( "http://$server/$filepath/$view/$unid" );
    }
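The URL in the Doc2HTML outline is the piece most likely to need adjusting, so here is a rough sketch of building it on its own. domino_doc_url is a hypothetical helper, not anything Notes or LWP provides. It assumes you pass in the web host name yourself, since the Server property may hold a Notes server name rather than a DNS name, and that FilePath can arrive with backslashes. It also spells out ?OpenDocument, the Domino URL command for opening a document, which the outline leaves implicit. The host, path and UNID in the example call are invented.

```perl
use strict;
use warnings;

# Hypothetical helper: build the URL for one document.
# $host is a DNS name you supply; $filepath and $unid would come from the
# database and document objects. $view defaults to '0', the default view.
sub domino_doc_url {
    my ( $host, $filepath, $unid, $view ) = @_;
    $view = '0' unless defined $view;

    # Turn a path like mail\jdoe.nsf into mail/jdoe.nsf.
    ( my $path = $filepath ) =~ tr{\\}{/};

    # Percent-encode each path segment. This only handles single-byte
    # characters; anything wider would need a real URI module.
    $path = join '/', map {
        my $seg = $_;
        $seg =~ s/([^A-Za-z0-9._~-])/sprintf '%%%02X', ord $1/ge;
        $seg;
    } split m{/}, $path;

    return "http://$host/$path/$view/$unid?OpenDocument";
}

print domino_doc_url( 'domino.example.com', 'mail\\jdoe.nsf',
    '0123456789ABCDEF0123456789ABCDEF' ), "\n";
```

The result can be handed straight to LWP::Simple::get. If the server requires a login you would need LWP::UserAgent with credentials instead, which is outside what this sketch covers.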

$doc->{Body} or fetching of other RichText items may yield truncated data

In Domino, ordinary text values aren't allowed to get longer than 64K. I recall there are some issues with text being stored in a double-byte encoding, so you get an effective limit of somewhere around 32K. RichText items, by contrast, are allowed to hold up to 4 GB. For versions 4 and 5, there is no way to work around this via Win32::OLE. Use the C or C++ API to extract non-truncated data from RichText values in that case. I once wrote a C++ program for dumping non-truncated text from RichText fields. I will post the source for this later.

I will post the R6-ish way of doing this later.