unable to open input file

grashoper has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to generate a TOC file for help files. Unfortunately, I am getting the error "unable to open input file". I am hoping to get some help as to where the problem lies.

#!/usr/bin/perl -w
#
# To index an entire directory use: 
#     perl toc.pl *.html
#

use strict;
# holds the name of each file
# as it is being processed.
my($file);       

# holds the text of the heading
# (from the anchor tag).
my($heading);   
                
# holds the last heading level
# for comparison.
my($oldLevel);   
                
# holds each line of the file 
# as it is being processed.
my($line);      
                
# used as temporary variables 
# to shorten script line widths
my($match);     
my($href);      

# holds the name of the heading 
# from the anchor tag.
my($name);      
                
# holds the level of the current heading.
my($newLevel);  

# First, I open an output file and print the 
# beginning of the HTML that is needed.
#
my $outputFile = "fulltoc.htm";
open(OUT, ">",$outputFile) or die "couldn't open outputfile";
print OUT ("<HTML><HEAD><TITLE>");
print OUT ("Detailed Table of Contents\n");
print OUT ("</TITLE></HEAD><BODY>\n");

# Now, loop through every file in the command 
# line looking for Header tags. When found, Look 
# for an Anchor tag so that the NAME attribute can 
# be used. The NAME attribute might be different
# from the actual heading.
#
foreach $file (sort(@ARGV)) {
    next if $file =~ m/^\.htm$/i;
    print("$file\n");
    open(INP,"+>", $file) or die "couldn't open input file";
    print OUT ("<UL>\n");
    $oldLevel = 1;
    while (<INP>) {
        if (m!(<H\d>.+?</H\d>)!i) {
            # remove anchors from header.
            $line = $1;
            $match = '<A NAME="(.+?)">(.+?)</A>';
            if ($line =~ m!$match!i) {
                $name = $1;
                $heading = $2;
            }
            else {
                $match = '<H\d>(.+?)</H\d>';
                $line =~ m!$match!i;
                $name = $1;
                $heading = $1;
            }
            m!<H(\d)>!i;
            $newLevel = $1;
            if ($oldLevel > $newLevel) {
                print OUT ("</UL>\n");
            }
            if ($oldLevel < $newLevel) {
                print OUT ("<UL>\n");
            }
            $oldLevel = $newLevel;
            my($href) = "\"$file#$name\"";
            print OUT ("<LI>");
            print OUT ("<A HREF=$href>");
            print OUT ("$heading</A>\n");
        }
    }
    while ($oldLevel--) {
        print OUT ("</UL>\n");
    }
    close(INP);
}

# End the HTML document and close the output file.
#
print OUT ("</BODY></HTML>");
close(OUT);

Replies are listed 'Best First'.
Re: unable to open input file by Fletch (Bishop) on Jan 12, 2009 at 15:02 UTC
Perhaps if you added $! to your error message and printed what the actual error is, it might be enlightening. Tangentially: you're possibly flirting with disaster by using regexes rather than an HTML parser, but if this is a relatively static set of source documents you might get away with it.
Re: unable to open input file by toolic (Bishop) on Jan 12, 2009 at 15:03 UTC
Check to see if you are getting an error message by printing $!: `open(INP,"+>", $file) or die "couldn't open input file: $!";` The usual suspects are that you do not have permission to read a file, that you are not looking in the directory you think you are looking in, or that you made a typo in the file name on your command line.
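To make toolic's point concrete, here is a minimal sketch of how `$!` turns a bare "couldn't open" into a message that names the cause. The helper name `describe_open` and the file name are hypothetical, and the lexical filehandle is a stylistic assumption rather than something the original script uses.

```perl
use strict;
use warnings;

# Hypothetical helper: try a read-only open and describe the outcome.
sub describe_open {
    my ($path) = @_;
    # $! holds the operating system's reason when open() fails.
    open(my $fh, '<', $path)
        or return "couldn't open input file '$path': $!";
    close($fh);
    return "opened '$path'";
}

# A name that is assumed not to exist in the current directory.
print describe_open('no-such-file.html'), "\n";
```

The exact wording after the colon comes from the operating system, so it differs for a missing file, a permissions problem, and so on.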
Re: unable to open input file by meraxes (Friar) on Jan 12, 2009 at 15:06 UTC
From open: "You can put a '+' in front of the '>' or '<' to indicate that you want both read and write access to the file; thus '+<' is almost always preferred for read/write updates--the '+>' mode would clobber the file first. You can't usually use either read-write mode for updating textfiles, since they have variable length records." I'm assuming `INP` is a text file, so perhaps just `'<'` is a better file mode. Update: added link to perlfunc, and I suggest you follow the advice of toolic and Fletch. -- meraxes
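The difference between the two modes is easy to check on a throwaway file. The sketch below assumes a scratch file created with the core File::Temp module and some made-up heading content; it is only meant to illustrate the behaviour quoted from perlfunc, not to reproduce the original script.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file (hypothetical content) so no real help file is at risk.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "<H1>Intro</H1>\n";
close($tmp) or die "couldn't close '$path': $!";

# '<' is read-only: the heading is still there afterwards.
open(my $in, '<', $path) or die "couldn't open '$path' for reading: $!";
my $first = <$in>;
close($in);
my $intact = (-s $path) ? 1 : 0;

# '+>' truncates on open, so there is nothing left to read.
open(my $rw, '+>', $path) or die "couldn't open '$path' with '+>': $!";
my $after = <$rw>;
close($rw);
my $emptied = (-z $path) ? 1 : 0;

print "after '<':  ", ($intact  ? "contents intact" : "file empty"), "\n";
print "after '+>': ", ($emptied ? "file emptied"    : "contents intact"), "\n";
```

Note that with `'+>'` the open itself succeeds; the damage is that every input file passed on the command line is emptied before the `while` loop can read a single heading.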
Re: unable to open input file by hbm (Hermit) on Jan 12, 2009 at 15:24 UTC
I always find repeated "print OUT" a bit clunky. Two alternatives:

my $outputFile = "fulltoc.htm";
open(OUT, ">",$outputFile) or die "couldn't open outputfile: $!";
select OUT;
print ("<HTML><HEAD><TITLE>");
print ("Detailed Table of Contents\n");
print ("</TITLE></HEAD><BODY>\n");

Or:

my $outputFile = "fulltoc.htm";
open(OUT, ">",$outputFile) or die "couldn't open outputfile: $!";
print OUT "<HTML><HEAD><TITLE>"
  . "Detailed Table of Contents\n"
  . "</TITLE></HEAD><BODY>\n";

Note that with the first approach, you'd have to explicitly print to STDOUT when you want to print to the screen:

print STDOUT "$file\n";

I was also surprised by this line:

next if $file =~ m/^\.htm$/i;
next unless $file =~ m/\.htm$/i; # not this?
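The two filename tests behave quite differently, which a quick comparison on some made-up names shows. The third pattern, `\.html?$`, is an assumption added here: the usage comment in the original script passes `*.html`, and a pattern ending in `\.htm$` would skip those files.

```perl
use strict;
use warnings;

# Hypothetical file names, as they might arrive in @ARGV.
my @names = ('index.html', 'intro.htm', '.htm', 'notes.txt', 'FULLTOC.HTM');

# Original test: skips only a file literally named ".htm".
my @kept_original = grep { $_ !~ m/^\.htm$/i } @names;

# hbm's suggestion: keep names ending in ".htm" (which excludes ".html").
my @kept_htm = grep { m/\.htm$/i } @names;

# Assumed variant: an optional "l" accepts both extensions.
my @kept_both = grep { m/\.html?$/i } @names;

print 'original test keeps: ', "@kept_original\n";
print 'ends in .htm keeps:  ', "@kept_htm\n";
print 'ends in .htm(l):     ', "@kept_both\n";
```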
Re^2: unable to open input file by Perlbotics (Archbishop) on Jan 12, 2009 at 22:34 UTC
Right ... and a third one. There is a bit of debate whether heredocs (see quote-like operators: <<EOF) are evil for this purpose, but it might be an option here for short scripts. However, if the project grows bigger, separation of content and format will become necessary ... (e.g. by using templates).

open(OUT, ">",$outputFile) or die "couldn't open outputfile: $!";
print OUT <<'END_HEADER';   # '...' no interpolation
<HTML>
<HEAD>
<TITLE>Detailed Table of Contents</TITLE>
</HEAD>
END_HEADER
print OUT <<"END_BODY";     # "..." with interpolation
<BODY>
<H1>File: $outputFile</H1>
</BODY>
</HTML>
END_BODY
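Taken together, the replies in this thread point toward a read-only open, `$!` in the error message, and less repetition. The following is a minimal sketch of the heading-extraction step under those suggestions; the helper name `toc_entries`, the lexical filehandles, and the sample content are all assumptions for illustration, and it still relies on the regexes Fletch warns about rather than a real HTML parser.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return [level, name, heading] for each heading line.
sub toc_entries {
    my ($file) = @_;
    # '<' opens read-only, so the input file is never truncated.
    open(my $in, '<', $file) or die "couldn't open input file '$file': $!";
    my @entries;
    while (my $line = <$in>) {
        next unless $line =~ m!<H(\d)>(.+?)</H\d>!i;
        my ($level, $inner) = ($1, $2);
        my ($name, $heading) = ($inner, $inner);
        # Prefer the anchor's NAME attribute when the heading has one.
        if ($inner =~ m!<A NAME="(.+?)">(.+?)</A>!i) {
            ($name, $heading) = ($1, $2);
        }
        push @entries, [$level, $name, $heading];
    }
    close($in);
    return @entries;
}

# Demo on a scratch file with made-up content.
my ($tmp, $path) = tempfile(SUFFIX => '.html', UNLINK => 1);
print {$tmp} qq{<H1><A NAME="top">Intro</A></H1>\n<p>text</p>\n<h2>Details</h2>\n};
close($tmp) or die "couldn't close '$path': $!";

my @entries = toc_entries($path);
for my $entry (@entries) {
    my ($level, $name, $heading) = @$entry;
    print "level $level: $heading (#$name)\n";
}
```

Returning the entries as data, instead of printing inside the loop, keeps the file handling separate from the HTML output, which is one small step toward the separation of content and format mentioned above.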

