SaveAs in IE

by Ad Aspera (Acolyte)
on Jul 22, 2005 at 21:50 UTC ( [id://477361] : perlquestion )

Ad Aspera has asked for the wisdom of the Perl Monks concerning the following question:

Perl Wizards,

I want to perform an unprompted SaveAs using OLE to save a web page.

In particular,

I want to save some files as
=> HTML - such as you would get with a "View Source" from IE
=> MHT - saving some files as a web archive such as you would get if doing a - File - Save As from IE's menu

The script below brings up the Save As dialog box. The save is still "prompted", and the dialog only offers an HTML or Text option.

This demonstrates what happens with => ExecWB

I get comparable results with
=> execCommand('SaveAs' ....

I apparently don't understand the proper magic.

Please comment on what I've missed.

# =============================================
use Win32::OLE 'EVENTS';
$Win32::OLE::Warn = 3;
#
# command line argument => URL to get
# usage: perl
# (substitute your URL here to test)
$URL = $ARGV[0];

my $IE = Win32::OLE->new("InternetExplorer.Application")
    || die "Could not start Internet Explorer\n";

$IE->{Addressbar} = 1;
$IE->{Menubar}    = 1;
$IE->{Toolbar}    = 1;
$IE->{Statusbar}  = 1;
$IE->{Width}      = 800;
$IE->{Height}     = 700;
$IE->{Top}        = 0;
$IE->{Left}       = 0;
$IE->{Resizable}  = 1;
$IE->{visible}    = 1;

# Navigate to desired URL
$IE->Navigate($URL);
while ($IE->{Busy}) {
    Win32::Sleep 500;
    Win32::OLE->SpinMessageLoop();
}

$Target = "g:\\zbuzz\\LeeMarlin.htm";
#
# remove any prior file
#
unlink ( $Target );
#
# Save the HTML file using ExecWB
#
my $OLECMDID_SAVEAS = '4';
my $OLECMDEXECOPT_DONTPROMPTUSER = '2';
#
# ExecWB does pull up the Save As HTML dialog box, however, Don't Prompt User does not work
#
$IE->ExecWB($OLECMDID_SAVEAS, $OLECMDEXECOPT_DONTPROMPTUSER, $Target);

# Quit IE
$IE->Quit();
exit;
# ===============================================================
# Note: I've tried this alternative too, with similar results
#
# $IE = Navigate ($URL);
# $IEDocument = $IE->{Document};
# $IEDocument->execCommand("SaveAs", "2", $Target );
#
# P.S.,
# Want to save some files as
# HTML - such as you would get with a "View Source" from IE
# MHT - saving some files as a web archive such as you would get if doing a
#       File - Save As from IE's menu
#

Replies are listed 'Best First'.
Re: SaveAs in IE
by CountZero (Bishop) on Jul 23, 2005 at 20:23 UTC
    Rather than using IE I would use LWP and its friends to get the HTML-code of the page you want to catch.

    Then you walk this HTML-file (with some suitable HTML-parser) and download (again with LWP) all files which are linked to your page, at the same time adapting these links so they now point to the local copies of the files you just downloaded.
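    As a rough illustration of the fetch-and-relink idea described above, here is a minimal sketch. Everything in it is an assumption made for the example, not code from this thread: the helper names local_name, rewrite_links and mirror_page are hypothetical, the flat file-naming scheme is arbitrary, and the regex over src/href attributes is a shortcut standing in for a real parser such as HTML::Parser or HTML::TreeBuilder. LWP::UserAgent and URI come from CPAN and are only loaded when a page is actually mirrored.

```perl
use strict;
use warnings;

# Turn a URL into a flat local file name (hypothetical naming scheme).
sub local_name {
    my ($url) = @_;
    (my $name = $url) =~ s{^\w+://}{};     # drop the scheme
    $name =~ s{[^A-Za-z0-9._-]+}{_}g;      # keep only filesystem-safe characters
    return $name;
}

# Pass every quoted src="..." / href="..." value through $map->($link).
# A regex is a shortcut here; a real HTML parser copes with more markup.
sub rewrite_links {
    my ($html, $map) = @_;
    $html =~ s{\b(src|href)\s*=\s*(["'])(.*?)\2}{
        my ($attr, $quote, $link) = ($1, $2, $3);
        "$attr=$quote" . $map->($link) . $quote
    }gie;
    return $html;
}

# Fetch $url, download everything it links to into $dir, and save a copy
# of the page whose links point at the local files.
sub mirror_page {
    my ($url, $dir) = @_;
    require LWP::UserAgent;                # CPAN modules, loaded lazily
    require URI;

    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get($url);
    die "GET $url failed: ", $res->status_line, "\n" unless $res->is_success;

    my $html = rewrite_links($res->content, sub {
        my ($link) = @_;
        return $link if $link =~ /^(?:#|mailto:|javascript:)/i;  # nothing to fetch
        my $abs  = URI->new_abs($link, $res->base)->as_string;   # resolve relative links
        my $file = local_name($abs);
        $ua->mirror($abs, "$dir/$file");   # download the linked file
        return $file;
    });

    open my $out, '>:raw', "$dir/index.html"
        or die "Cannot write $dir/index.html: $!\n";
    print {$out} $html;
    close $out;
}

# Usage: perl script.pl URL DIRECTORY (the directory must already exist)
mirror_page(@ARGV) if @ARGV == 2;
```

    Note that this follows every href as well as every src, so it also pulls down the pages the document links to, one level deep. Restricting the mapper to images, stylesheets and scripts would give something closer to what IE's own "Save As" produces.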


    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: SaveAs in IE
by silent11 (Vicar) on Jul 24, 2005 at 06:10 UTC
Re: SaveAs in IE
by puploki (Hermit) on Jul 24, 2005 at 20:43 UTC
    Again, it's not really answering the question directly (I've never got on with Win32::OLE, I have to admit), but I think my favouritest module is HTTP::Recorder - it works as a sort of web proxy that you point IE at, and then it creates some scripts you can play back later. Ideal for:
    • Web/URL monitoring applications (time of response/availability etc)
    • Mass configuration of networking equipment that only has a web interface (ADSL routers etc.)
    • Crazy data input via a web form
    • Crawling web sites
    There's a fabby tutorial up on
Re: SaveAs in IE
by clscott (Friar) on Jul 25, 2005 at 02:26 UTC

    I also can't answer your question, but I have one to pose:

    Have you heard of SAMIE?

    From the site

    Simple Automation Module For Internet Explorer
    Samie allows you to write perl scripts in order to drive Internet Explorer all over the web while you help your wife do the dishes.

    Simply put, samie lets you write scripts to test exactly how Internet Explorer displays your company information to the world. He will click on links, buttons, menus, check and listboxes. He can fill in edit boxes with information from a database and verify the accuracy of what the web server returns. He can log all results to a database or a flat text file. He can post those results to a company web page.