PerlMonks  


You should be having the Domino server do the Domino->HTML conversion for you. You're going to throw away all the formatting when you use this method. At least change your title so it is obvious this is a brain-damaged implementation.

Use ->GetFirstDocument/->GetNextDocument instead of ->GetNthDocument, always.

You are using particularly inefficient loop code. Instead of ->GetNthDocument(), use ->GetFirstDocument/->GetNextDocument. Every call to ->GetNthDocument( n ) performs an internal loop starting from document 1 to reach the nth document, so walking a whole collection by index costs time quadratic in its size. It is the most horribly inefficient method you could choose. So knock it off. The author of this function has publicly apologised for ever inflicting it upon the world. You'd only know this if you were reading the Lotus Notes development forums.

    my $doc = $AllDocuments->GetFirstDocument;
    while ( $doc ) {
        # Fetch the next document prior to running code to guard against
        # someone deciding to delete $doc.
        my $NextDoc = $AllDocuments->GetNextDocument( $doc );
        ...
        $doc = $NextDoc;
    }
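The loop above can be wrapped in a small helper so the fetch-the-next-one-first ordering is not forgotten. This is only a sketch: for_each_document is a hypothetical name, not something Win32::OLE or Notes provides, and it assumes nothing about the collection beyond the GetFirstDocument and GetNextDocument methods already used above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Win32::OLE or Notes): walk a collection
# with GetFirstDocument/GetNextDocument and call $callback on each document.
# Returns the number of documents visited.
sub for_each_document {
    my ( $collection, $callback ) = @_;
    my $count = 0;
    my $doc   = $collection->GetFirstDocument;
    while ( $doc ) {
        # Grab the successor before the callback runs, in case the
        # callback deletes $doc.
        my $next = $collection->GetNextDocument( $doc );
        $callback->( $doc );
        ++$count;
        $doc = $next;
    }
    return $count;
}
```

It would be called as for_each_document( $AllDocuments, sub { my ( $doc ) = @_; ... } ), with the body of the original loop moved into the callback.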

A sample, written as Perl, of what ->GetNthDocument effectively does inside the nnotes.dll C code

    sub GetNthDocument {
        my ( $Collection, $n ) = @_;
        return undef if $n < 1;

        my $doc = $Collection->GetFirstDocument;
        my $ix  = 1;
        while ( $ix < $n and $doc ) {
            $doc = $Collection->GetNextDocument( $doc );
            ++ $ix;
        }

        return $doc;
    }
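To make the cost concrete, the sketch below counts GetNextDocument calls against a stand-in collection. CountingCollection is a hypothetical in-memory mock, not a Notes class, and get_nth_document simply mirrors the Perl rendering of GetNthDocument above, so the numbers illustrate the shape of the problem rather than measure a real server.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for a document collection that counts
# how often GetNextDocument is called. Its "documents" are 1 .. $size.
package CountingCollection;

sub new {
    my ( $class, $size ) = @_;
    return bless { size => $size, next_calls => 0 }, $class;
}

sub GetFirstDocument {
    my ( $self ) = @_;
    return $self->{size} >= 1 ? 1 : undef;
}

sub GetNextDocument {
    my ( $self, $doc ) = @_;
    ++$self->{next_calls};
    return $doc < $self->{size} ? $doc + 1 : undef;
}

package main;

# Same logic as the GetNthDocument rendering above.
sub get_nth_document {
    my ( $collection, $n ) = @_;
    return undef if $n < 1;
    my $doc = $collection->GetFirstDocument;
    my $ix  = 1;
    while ( $ix < $n and $doc ) {
        $doc = $collection->GetNextDocument( $doc );
        ++$ix;
    }
    return $doc;
}

# Visit every document by index; return the GetNextDocument call count.
sub next_calls_by_index {
    my ( $size ) = @_;
    my $collection = CountingCollection->new( $size );
    get_nth_document( $collection, $_ ) for 1 .. $size;
    return $collection->{next_calls};
}

# Visit every document with First/Next; return the GetNextDocument call count.
sub next_calls_by_iteration {
    my ( $size ) = @_;
    my $collection = CountingCollection->new( $size );
    my $doc = $collection->GetFirstDocument;
    $doc = $collection->GetNextDocument( $doc ) while $doc;
    return $collection->{next_calls};
}

printf "%d documents: %d calls by index, %d by iteration\n",
    $_, next_calls_by_index( $_ ), next_calls_by_iteration( $_ )
    for 10, 100, 1000;
```

For n documents the indexed walk makes n(n-1)/2 calls in this model while the First/Next walk makes n, which is why the difference only becomes painful on large collections.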

Try not to lose formatting by smashing RichText to plain text

Since you have the cooperation of a Domino server, you could have fetched the document and its formatting from the server using LWP or such. Consider this an outline for a future attempt at fetching this. I'm doing some of these property calls from memory so there's a good chance I'm slightly off here.

    use LWP::Simple ();

    sub Doc2HTML {
        my ( $doc ) = @_;

        # A NotesDocument exposes its database as ParentDatabase.
        my $db       = $doc->{ParentDatabase};
        my $filepath = $db->{FilePath};
        my $server   = $db->{Server};

        # This is a bit of magic. I'm requesting the '0' view which will be
        # whatever view in the db was designated "default". Also, all views
        # can be used to fetch any document if you already know the
        # document's UNID.
        my $view = '0';
        my $unid = $doc->{UniversalID};

        return LWP::Simple::get( "http://$server/$filepath/$view/$unid" );
    }
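One detail the outline above glosses over is turning the database properties into a well-formed URL. The sketch below is a guess at what that might look like, not a tested recipe: domino_doc_url is a hypothetical helper, the host is passed in explicitly on the assumption that $db->{Server} may hold a Notes server name rather than a DNS host name, and the path handling assumes FilePath can contain backslashes and spaces. ?OpenDocument is the usual Domino URL command for opening a document.

```perl
use strict;
use warnings;

# Hypothetical helper: build a Domino document URL from its parts.
# $host is assumed to be a host name the Domino HTTP task answers on.
sub domino_doc_url {
    my ( $host, $filepath, $unid, $view ) = @_;
    $view = '0' unless defined $view;    # '0' = the database's default view

    # Assumption: FilePath may come back with backslashes on a Windows
    # server, while URLs need forward slashes.
    ( my $path = $filepath ) =~ tr{\\}{/};

    # Percent-encode spaces, which can appear in Notes file names.
    $path =~ s/ /%20/g;

    return "http://$host/$path/$view/$unid?OpenDocument";
}
```

The result can be handed to LWP::Simple::get in place of the interpolated string used in Doc2HTML.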

$doc->{Body} or fetching of other RichText items may yield truncated data

In Domino, ordinary text values aren't allowed to get longer than 64K. I recall there are some issues with text being stored in a double-byte encoding, so you get an effective limit of somewhere around 32K. RichText items, by contrast, are allowed to hold up to 4 GB. For versions 4 and 5, there is no way to work around this via Win32::OLE. Use the C or C++ API to extract non-truncated data from RichText values in that case. I once wrote a C++ program for dumping non-truncated text from RichText fields. I will post the source for this later.

I will post the R6-ish way of doing this later.


In reply to Re: Extract Lotus Notes Mail to HTML by diotalevi
in thread Extract Lotus Notes Mail to HTML by Corion
