PerlMonks  

Re: Usage of File Handles

by harangzsolt33 (Chaplain)
on Feb 07, 2019 at 22:13 UTC ( [id://1229570]=note )


in reply to Usage of File Handles

What do you think about this? I like to enclose my file I/O functions in subs, so I don't even have to deal with opening files and stuff like that. Call me lazy. LOL

#!/usr/bin/perl -w

use strict;
use warnings;

########################################

my $INPUT_FILE_NAME;
my $OUTPUT_FILE_NAME;

Init();

my @LINES = ReadTextFile($INPUT_FILE_NAME);

# DO SOMETHING

CreateFile($OUTPUT_FILE_NAME, join("\n", @LINES))
  or die("Can't write to file...\n\n");

print "SUCCESS!\n\n";
exit;

########################################

sub Init
{
  my $SELF = GetFileName($0);
  $INPUT_FILE_NAME = (@ARGV) ? $ARGV[0] : '';
  $INPUT_FILE_NAME or die("\nUsage: $SELF <filename>\n\n");
  $OUTPUT_FILE_NAME = $INPUT_FILE_NAME . '.out';
}

# Usage: STRING = Trim(STRING) - Removes whitespace, newline characters and other special characters before and after STRING. Returns a new string.
sub Trim { @_ or return ''; my $T = $_[0]; defined $T or return ''; my $N = length($T); my $X = 0; my $Y = 0; while ($N--) { if (vec($T, $N, 8) > 32) { $X = $N; $Y or $Y = $N + 1; } } return substr($T, $X, $Y - $X); }

# Usage: FILE_NAME_ONLY = GetFileName(FULL_NAME) - Returns only the name portion of a full file name.
sub GetFileName { @_ or return ''; my $W = shift; defined $W or return ''; length($W) or return ''; $W =~ tr|\\|/|; return substr($W, rindex($W, '/') + 1, length($W)); }

# Usage: STRING = _FileName(\@_) - Removes the first argument from @_ just like shift() does and returns a file name. This function does not check syntax, but it does remove some illegal characters (<>|*?) from the name that obviously should not occur in a file name. If the file name doesn't contain any valid characters, then returns an empty string.
sub _FileName { @_ or return ''; my $N = shift; $N = shift(@$N); defined $N or return ''; length($N) or return ''; my $c; my $j = 0; my $V = 0; for (my $i = 0; $i < length($N); $i++) { $c = vec($N, $i, 8); next if ($c == 63 || $c == 42 || $c < 32); last if ($c == 60 || $c == 62 || $c == 124); if ($c > 32) { $V = $j + 1; } if ($V) { $i == $j or vec($N, $j, 8) = $c; $j++; } } return substr($N, 0, $V); }

# Usage: ARRAY = ReadTextFile(FILE_NAME, [LIMIT]) - Reads the contents of a text file and returns the lines in an array. If a second argument is provided, then only the first few lines will be processed. Each line is trimmed before it is stored.
sub ReadTextFile { my @A; my $F = _FileName(\@_); length($F) or return @A; my $M = @_ ? shift : 99999999; defined $M or return @A; $M or return @A; -f $F or return @A; -s $F or return @A; my $H; my $B; my $i = 0; open $H, "<$F" or return @A; while (my $L = <$H>) { $A[$i++] = Trim($L); $i < $M or last; } close $H; return @A; }

# Usage: STATUS = CreateFile(FILE_NAME, STRING) - Creates and overwrites a file. Returns 1 on success or 0 if something went wrong.
sub CreateFile { my $F = _FileName(\@_); length($F) or return 0; my $S = (@_) ? shift : ''; return 0 unless defined $S; open(my $H, ">$F") or return 0; if (length($S)) { print $H $S or return 0; } close $H or return 0; return 1; }

Replies are listed 'Best First'.
Re^2: Usage of File Handles
by haukex (Archbishop) on Feb 08, 2019 at 08:53 UTC
    What do you think about this?

    You're of course free to write your code in any style you like, but I do have to say it's not something I would recommend for a beginner.

    it does remove some illegal characters (<>|*?) from the name that obviously should not occur in a file name

    Those are all perfectly valid characters in many *NIX OSes, see e.g. this. I also don't understand why some of those characters are simply removed and others cause the string to be cut off at that point.

    my $M = @_ ? shift : 99999999;

    This causes a somewhat arbitrary silent cutoff at this many lines. In general, in that code there are lots of errors that are silently swallowed.

    In general, your use of vec for string operations is not a good idea for Unicode strings (in fact, it will become a fatal error in Perl 5.32). If you need to treat a string as a sequence of characters, you could either split //, $str or use substr, although normally regular expressions can handle many of the cases where one would need to do so in other languages.
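To illustrate the point about vec, here is a minimal sketch of the two alternatives mentioned: walking a string character by character with split //, and letting a regular expression do the character test. The sub names char_codes and has_special_chars are made up for this example and do not appear in the code under discussion; the character class simply mirrors the set that _FileName skips or stops at.

```perl
use strict;
use warnings;

# Hypothetical helper: return the code point of every character in a string.
# split // yields characters (not bytes), so this also works for code
# points above 0xFF, where vec() is not appropriate.
sub char_codes {
    my ($str) = @_;
    return map { ord($_) } split //, $str;
}

# Hypothetical helper: true (1) if the name contains any of < > | * ? or a
# control character, false (0) otherwise. One regex replaces the vec loop.
sub has_special_chars {
    my ($name) = @_;
    return $name =~ /[<>|*?\x00-\x1F]/ ? 1 : 0;
}
```

With a check like has_special_chars, the caller can decide to reject a name outright rather than rewrite it.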

    Plus, there are lots of other stylistic choices that I would not recommend to a newcomer: Reinvented wheels (GetFileName instead of File::Basename or File::Spec, GetFileName($0) instead of $FindBin::Script, Trim instead of e.g. s/^[\0-\x20]+|[\0-\x20]+$//g), two-argument instead of three-argument open, uppercase variable names for non-constant variables, obfuscation by using single-letter variable names and packing function bodies on one line, unused variables...
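As a rough sketch of what replacing those reinvented wheels could look like: File::Basename and FindBin are core modules, and the trim regex is the one quoted above. The sub names trim and file_name_only are hypothetical, chosen only for this example. Note that basename() is not an exact drop-in for GetFileName: it follows the path rules of the current OS, so a backslash is only treated as a separator on Windows-like systems.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use FindBin;

# Hypothetical helper: strip characters with code points 0..32 from both
# ends of the string, the same range the hand-written Trim treats as
# whitespace.
sub trim {
    my ($str) = @_;
    return '' unless defined $str;
    $str =~ s/^[\0-\x20]+|[\0-\x20]+$//g;
    return $str;
}

# Hypothetical helper: name portion of a path via File::Basename.
sub file_name_only {
    my ($path) = @_;
    return '' unless defined $path && length $path;
    return basename($path);
}

# Name of the running script, in place of GetFileName($0).
my $script_name = $FindBin::Script;
```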

    Sorry for the long critique, but as I said this is in the context of giving code to an apparent beginner.

      it does remove some illegal characters (<>|*?) from the name that obviously should not occur in a file name

      Those are all perfectly valid characters in many *NIX OSes, see e.g. this. I also don't understand why some of those characters are simply removed and others cause the string to be cut off at that point.

      Okay, I was told earlier that when we open a file such as open FILEHANDLE, "< $FILE_NAME" then it's a good idea to make sure that $FILE_NAME does not contain any special characters such as | > < because it's a potential vulnerability, especially if you get your file name from some other place like arguments. Your script could be hacked, and it may end up doing something you didn't want. That's why I check the file name.

      Also, there is no point in doing this: open FILEHANDLE, "< *.*" so again those special characters should not appear in that space. It's perfectly okay to include them when you do a search, but not when you're trying to open a file for reading.

        it's a good idea to make sure that $FILE_NAME does not contain any special characters such as | > < because it's a potential vulnerability, especially if you get your file name from some other place like arguments

        Yes, this is true - if you're using the two-argument instead of three-argument open. You said you're using Perl 5.8, where the latter is available. This is another reason that the more modern three-argument open and lexical filehandles are recommended. Also, I think that silently deleting characters or chopping off the filename at these characters, which will result in attempting to open a completely different file, is unexpected behavior - IMO it's much better to simply throw an error and refuse to open such a file and let the user figure it out, instead of taking some action that isn't what the user asked for.
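A minimal sketch of the three-argument style with lexical filehandles, assuming the caller wants failures reported rather than swallowed. The sub names read_text_file and write_text_file are hypothetical stand-ins for ReadTextFile and CreateFile, not a drop-in replacement: lines are only chomped, not trimmed, and an undefined limit means "read everything" instead of an arbitrary cap.

```perl
use strict;
use warnings;

# Hypothetical stand-in for ReadTextFile. The mode ('<') is a separate
# argument, so characters such as | < > in $path are taken literally.
sub read_text_file {
    my ($path, $limit) = @_;    # $limit undefined means no limit
    open(my $fh, '<', $path)
        or die "Cannot open '$path' for reading: $!\n";
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;
        push @lines, $line;
        last if defined $limit && @lines >= $limit;
    }
    close($fh) or die "Cannot close '$path': $!\n";
    return @lines;
}

# Hypothetical stand-in for CreateFile: creates or overwrites the file and
# dies with the OS error message if anything goes wrong.
sub write_text_file {
    my ($path, $text) = @_;
    open(my $fh, '>', $path)
        or die "Cannot open '$path' for writing: $!\n";
    print {$fh} $text or die "Cannot write to '$path': $!\n";
    close($fh) or die "Cannot close '$path': $!\n";
    return 1;
}
```

A caller that prefers a status value over an exception can wrap the call in eval { ... } and inspect $@. Either way the file name is never altered, so the script opens exactly the file it was asked to open or fails with a message.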

        open FILEHANDLE, "< *.*" so again those special characters should not appear in that space

        No, as I said, '*.*' is a valid filename - strange and unusual, but valid. And again, why silently try to open a file named '.' instead?

        The potential vulnerability only happens if you do not use the three-argument version of open. Maybe you should upgrade your Perl knowledge a bit.

Re^2: Usage of File Handles
by Laurent_R (Canon) on Feb 07, 2019 at 22:53 UTC
    Call me lazy.
    It appears that your personal flavor of laziness requires quite a lot of work.
      Yes...LOL!!!!
