glob() and dot files

by perlancar (Hermit)
on Apr 13, 2020 at 03:47 UTC ( #11115413=perlquestion )

perlancar has asked for the wisdom of the Perl Monks concerning the following question:

A couple of questions on handling dotfiles with perl's glob(). Seems like bsd_glob() doesn't offer any special handling of dotfiles.

1. Is there an equivalent for "shopt -s dotglob"? Reading File::Glob indicates there isn't.

2. What would be the easiest way to accomplish "listing all files/subdirectories including dotfiles/dotsubdirs but without the . and .." and "listing all dotfiles/dotsubdirs only, without the . and .."? I've given up on glob() for this and simply do something along the lines of:

@all_files_including_dot = do { opendir my $dh, "."; grep { $_ ne '.' && $_ ne '..' } readdir $dh };
@all_dotfiles = do { opendir my $dh, "."; grep { $_ ne '.' && $_ ne '..' && /\A\./ } readdir $dh };

Replies are listed 'Best First'.
Re: glob() and dot files
by Fletch (Bishop) on Apr 13, 2020 at 04:29 UTC

    Path::Tiny has a children method which omits those, but personally I don't see anything wrong with what you've got offhand.

    The cake is a lie.

Re: glob() and dot files (updated)
by haukex (Archbishop) on Apr 13, 2020 at 08:19 UTC

    You might be interested in my node To glob or not to glob for the caveats of glob.

    What would be the easiest to accomplish "listing all files/subdirectories including dotfiles/dotsubdirs but without the . and .." and "listing all dotfiles/dotsubdirs only, without the . and .."?

    If you mean core-only, then:

    use File::Spec::Functions qw/ no_upwards catfile catdir /;
    opendir my $dh, $path or die "$path: $!";
    my @files = map { -d catdir($path,$_) ? catdir($path,$_) : catfile($path,$_) } sort +no_upwards readdir $dh;
    closedir $dh;

    Note: I think that on some OSes (VMS?), there's a difference between catfile and catdir, that would require you to use the -d test, but I believe the above should work fine on any other OS. (Or, you can omit the catfile entirely if bare filenames are ok.) <update> Confirmed the difference between catfile and catdir with File::Spec::VMS and File::Spec::Mac, so I updated the above example with the -d test accordingly. </update> <update2> I don't have a VMS or Classic Mac to test on, but I realized that my update had a bug in that I wasn't doing the -d test on the full filename. So I hope that this updated version would really be correct on those platforms. </update2>

    If you need absolute pathnames, you probably want to add a $path = rel2abs($path); (also from File::Spec). Otherwise, if CPAN is fine, then I really like Path::Class; its children method includes everything except . and .. by default:

    use Path::Class;
    my @files = dir($path)->children();
    # - or -
    my @files = dir($path)->absolute->children();

      Hi haukex,

      Thanks for pointing out your glob() post. I think I read it in the past. If wildcards are problematic, this gives me the idea of creating a glob-like function that takes a regex instead: re_glob('.*') or re_glob(qr/\.foo/). It would not skip dotfiles by default.
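A minimal sketch of what such a function might look like, using only core Perl. The name re_glob comes from the post above, but the signature (a pattern plus an optional directory) and the behavior of always dropping . and .. are assumptions made for illustration, not an existing module's API.

```perl
use strict;
use warnings;

# Hypothetical re_glob(): returns the entries of $dir (default ".") whose
# names match $re. Unlike glob(), dotfiles are not skipped; only "." and
# ".." are always excluded. Bare names are returned, not full paths.
sub re_glob {
    my ($re, $dir) = @_;
    $dir = "." unless defined $dir;
    $re  = qr/$re/ unless ref($re) eq 'Regexp';   # accept a plain string too
    opendir my $dh, $dir or die "$dir: $!";
    my @names = sort grep { $_ ne '.' && $_ ne '..' && $_ =~ $re } readdir $dh;
    closedir $dh;
    return @names;
}

my @everything = re_glob('.*');        # every entry, dotfiles included
my @foo        = re_glob(qr/\.foo/);   # names containing ".foo"
```

Because the pattern is an ordinary regex, it is unanchored unless the caller anchors it, so qr/\.foo/ also matches a name like "a.foo.bak"; use qr/\.foo\z/ to match only a trailing ".foo".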

      By the way, most of the time for practical reasons I don't bother with File::Spec at all, because why would I sacrifice myself using catfile() and no_upwards when I will not be using path separator other than "/", and parent directory other than ".." (probably for the rest of my life).

        By the way, most of the time for practical reasons I don't bother with File::Spec at all, because why would I sacrifice myself using catfile() and no_upwards when I will not be using path separator other than "/", and parent directory other than ".." (probably for the rest of my life).

        Well, if you know your scripts are only ever going to be run on *NIX, then sure. But what you're sacrificing is portability. For example, even nowadays, there are some Windows programs that can't handle / path separators and require \. Personally, although I've written code like "$path/$file" myself, I usually like my code to be as portable as possible, and if you're considering writing a re_glob(qr/\.foo/) function, you might want to release it as a module*, and then portability becomes important, IMHO.

        * use Path::Tiny; my @files = path($path)->children(qr/\.foo/);
        But sadly, Path::Tiny "does not try to work for anything except Unix-like and Win32 platforms." Alternative:
        use Path::Class; my @files = grep {$_->basename=~/\.foo/} dir(".")->children;

Re: glob() and dot files
by Marshall (Canon) on Apr 13, 2020 at 05:05 UTC
    This glob thing can be a problem. A long time ago I got tripped up with the 3 versions of glob that were in use at that time in the ActiveState version of Perl that I was using. I changed my code to use readdir() and that solved the problem.

    Nowadays, Perl glob is a lot more uniform and well behaved. This prints all simple files, but skips directories.

    my @files = glob ('*.*');
    print "",join("\n",@files),"\n"
    For what you want, I would consider File::Find.
    Consider this code also.
    use strict;
    use warnings;
    opendir (my $dir, ".") or die "unable to open current directory: $!";
    my @files;
    my @directories;
    foreach my $file (grep{ ($_ ne '.') and ($_ ne '..')} readdir $dir)
    {
        if (-f $file )    {push @files, $file;}
        elsif (-d $file)  {push @directories, $file;}
        else              { die "weird beast found! $file"}
    }
    print "@files\n";
    print "@directories\n";
    I think in Unix there can be special things that are not simple files or directories. I would use a file test to see what this name actually means.
    Note that if this is not the current directory, you need to spec the full path name for file tests.

    Update: File operations like "open file" or "open directory" are "expensive" in terms of performance. I would expect my code to run faster than the OP's code, but I did not benchmark this in any serious way. If the directories are small and this is not done that often, I don't think that will make any difference at all. Also be aware that there is a special filehandle for repeated file tests, "_", as in elsif (-d _) { do something }. That tests the stat structure returned by the previous file test operation for a different flag, without asking the file system again.
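A short sketch of the "_" technique combined with the full-path note above. The function name classify_dir is made up for illustration, and "$dir/$name" assumes a "/" path separator (File::Spec's catfile would be the portable choice).

```perl
use strict;
use warnings;

# Classify the entries of $dir, doing one stat per entry.
# classify_dir is a hypothetical name, not part of any module.
sub classify_dir {
    my ($dir) = @_;
    opendir my $dh, $dir or die "unable to open $dir: $!";
    my (@files, @directories, @other);
    for my $name (grep { $_ ne '.' && $_ ne '..' } readdir $dh) {
        my $full = "$dir/$name";            # file tests need the full path
        if    (-f $full) { push @files,       $name }  # performs the stat
        elsif (-d _)     { push @directories, $name }  # reuses that stat
        else             { push @other,       $name }  # e.g. FIFO, socket
    }
    closedir $dh;
    return (\@files, \@directories, \@other);
}

my ($files, $directories, $other) = classify_dir(".");
print "@$files\n";
print "@$directories\n";
```

Collecting the unusual entries in a third list rather than dying is a design choice of this sketch; the original code treats them as fatal.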

    Overall, unless there is a performance or other problem (special kinds of files), I see no problem with the OP's code.

      This prints all simple files, but skips directories.

      my @files = glob ('*.*');

      No, it prints any file or directory names that have a dot in them. It's a very old DOS convention that files had extensions and directories didn't, but nowadays that's not true anymore. You'd need grep {!-d} glob('*') to exclude directories.
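A small sketch that shows the difference in a scratch directory. The file and directory names are made up, and the temporary path is assumed to contain no whitespace, since core glob() splits its pattern on whitespace.

```perl
use strict;
use warnings;
use File::Temp qw/ tempdir /;

# Scratch directory holding a file with a dot, a file without one, and a
# directory whose name contains a dot. All names are illustrative.
my $tmp = tempdir(CLEANUP => 1);
mkdir "$tmp/dir.d" or die "mkdir: $!";
for my $name (qw/ notes.txt README /) {
    open my $fh, '>', "$tmp/$name" or die "$name: $!";
    close $fh;
}

my @with_dot = glob("$tmp/*.*");              # any name containing a dot
my @non_dirs = grep { !-d } glob("$tmp/*");   # everything but directories

print "*.*      : @with_dot\n";
print "non-dirs : @non_dirs\n";
```

Here '*.*' picks up dir.d and misses README, while the grep {!-d} version returns exactly the two plain files.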

        You are correct.
        my @files = glob('*'); # current directory
        print "",join("\n",@files), "\n";
        Note that these file names do not say whether or not they are directories; a file test is needed. I demo'ed this at Re: Getting a list of directories matching a pattern.

        Update: I experimented with the Windows 10 command line and found that I could indeed create a directory with a "dot suffix". That surprised me. Having said that, I have never seen such a thing in "real life". By convention, that is just not "the way that this is done". A long time ago, I was forced to use readdir and grep to get file names because of incompatible globs. For production code, I still use readdir and grep because it will always work. For quick hacks, I am fine with glob().
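For the quick-hack case, here is a sketch of glob()-based equivalents of the OP's two lists for the current directory. It relies on core glob() splitting its argument on whitespace into separate patterns, and it filters . and .. afterwards rather than trying to exclude them in the pattern.

```perl
use strict;
use warnings;

# ".*" matches names starting with a dot (which can include "." and ".."),
# "*" matches the rest; the grep drops "." and ".." from the result.
my @all_including_dot = grep { !/\A\.\.?\z/ } glob('.* *');
my @dotfiles_only     = grep { !/\A\.\.?\z/ } glob('.*');

print "$_\n" for @all_including_dot;
```

The usual glob caveats still apply, for example directory names containing whitespace or wildcard characters if a path is interpolated into the pattern.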
