PerlMonks  

Easy way to check if a file is open needed.

by Apt_Addiction (Novice)
on Apr 08, 2019 at 17:48 UTC ( [id://1232306]=perlquestion )

Apt_Addiction has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing a script to run on macOS that will rename all files with illegal names in a tree, so they can go to OneDrive and Dropbox without jamming the sync processes.

I'm new to Perl and only a beginner hobby coder, dipping my toes into Perl, but I've learned a lot and asked a lot and have come up with this, which does everything but two things: check if a file is already open before renaming it, and avoid dot files.

I'm aware of lsof but I'm wondering if there's an easier and quicker way to check if any process in the system has a handle on a file open?

    use warnings;
    use File::Find;
    use 5.010;
    use POSIX;

    find(\&dirRecurs, '/users/my-dir/tree-to-work-on');

    sub dirRecurs{
        if ((my $txt = $_) =~ s/^\ | (?=\.)|[\/!#%^:<>?*&()\\]| $//g && -f != m/^\./) {
            my $filename = '/users/my-dir/fix_names_report.txt';
            open(my $fh, '>>', $filename);
            if (! -e $txt){
                rename($_, $txt);
                say $fh (strftime "%Y-%m-%d %H:%M:%S", localtime time)," \: ", $_, " => ", $txt;
            }
            else {
                say $fh ("!!! ", strftime "%Y-%m-%d %H:%M:%S", localtime time)," \: ", $_, " => NOT RENAMED =>", $txt, " already exists.";
            }
            close $fh;
        }
    }

Any advice or suggestions?

Replies are listed 'Best First'.
Re: Easy way to check if a file is open needed.
by haukex (Archbishop) on Apr 08, 2019 at 19:09 UTC

    Welcome to Perl and the Monastery, Apt_Addiction!

    AFAIK, you should be able to rename files even if they're open on Mac OS X, although if you're going to do a mass rename in a directory tree, you probably don't want to be doing any other operations on that tree in the meantime, and therefore exit any programs that might be using it. What problems are you worried about here?

    As for the code, a couple of notes: You don't check the return values of rename or open for errors, as in rename($_, $txt) or die "rename('$_','$txt'): $!"; and open(my $fh, '>>', $filename) or die "$filename: $!";. And if you're worried that other processes might be writing to the directory while you're doing your rename, note your code has a race condition: It's possible the file with the name $txt might be created in between the -e $txt and rename($_, $txt) (hence my above recommendation to try and ensure that your script is the only one working on the directory).

    As for the dot files, I assume that's what you're trying to do with -f != m/^\./? Note that -f only checks if $_ is a regular file (or is a symbolic link that points to a regular file) and returns a true/false value, so doing an != on its return value doesn't really make sense. I guess you might have meant $_ !~ m/^\./, which evaluates to true if $_ doesn't match the regex - that's written more briefly as !/^\./.

    Another improvement might be to not open your log file on every rename, but to open it only once at the start of the program, which would be more efficient.

    In general, you should Use strict and warnings, and you might be interested in Corion's Text::CleanFragment, which does a thorough cleanup of strings, although it might do too much for your purposes.
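A minimal sketch of how the notes above might fit together, assuming the same character rules as the regex in the question. The helper names clean_name and fix_name are hypothetical, not from the original script. It checks the return value of rename, skips dot files with /^\./, and writes to a log handle that the caller opens once. The -e test followed by rename still has the race condition described above, so it is only safe if nothing else is writing to the tree.

```perl
use strict;
use warnings;
use File::Find;
use POSIX qw(strftime);

# Strip what the original regex targets: a leading space, a space before
# a dot, a trailing space, and the listed punctuation characters.
sub clean_name {
    my ($name) = @_;
    (my $clean = $name) =~ s/^ | (?=\.)|[\/!#%^:<>?*&()\\]| $//g;
    return $clean;
}

# Rename one entry in the current directory and log to the open handle $log.
# Returns the new name on a successful rename, nothing otherwise.
sub fix_name {
    my ($old, $log) = @_;
    return if $old =~ /^\./;     # skip dot files (and . / ..)
    return unless -f $old;       # regular files only
    my $new = clean_name($old);
    return if $new eq $old;      # nothing to fix
    my $stamp = strftime '%Y-%m-%d %H:%M:%S', localtime;
    if (-e $new) {
        print {$log} "!!! $stamp : $old => NOT RENAMED => $new already exists.\n";
        return;
    }
    rename($old, $new) or die "rename('$old','$new'): $!";
    print {$log} "$stamp : $old => $new\n";
    return $new;
}

# Usage sketch, with the paths from the question; the log is opened once:
# open(my $log, '>>', '/users/my-dir/fix_names_report.txt') or die "log: $!";
# find(sub { fix_name($_, $log) }, '/users/my-dir/tree-to-work-on');
# close $log or die "close: $!";
```

Keeping the name cleanup in its own function makes it easy to try the regex on sample names before letting it loose on a real tree.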

Re: Easy way to check if a file is open needed.
by choroba (Cardinal) on Apr 08, 2019 at 19:07 UTC
    Why do you need it? Most processes don't care about the name of a file they've already opened. The filehandle stays connected to it even if the file name changes:
    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    open my $OUT, '>', '1' or die $!;

    my $child = fork;
    if ($child) {
        sleep 1;
        my %new_name = map +($_ => $_ + 1), 1 .. 8;
        for my $old (1 .. 8) {
            rename $old, $new_name{$old} or die $!;
            say "$old renamed to $new_name{$old}.";
            sleep 1;
        }
        exit
    }
    defined $child or die "Can't fork";
    for (1 .. 10) {
        say {$OUT} $_;
        say "Printed: $_.";
        sleep 1;
    }

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      Hello choroba,

      when I read

      > The filehandle stays connected to it even if the file name changes

      I said: oh, what good news! But when I tried it, it appears to behave differently on MSWin32: I suddenly get Permission denied at writetorenamed.pl line 13. after it has printed 1 or 2, and the file does not get renamed ;(

      perl writetorenamed.pl
      Printed: 1.
      Printed: 2.
      Permission denied at writetorenamed.pl line 13.
      Printed: 3.
      Printed: 4.
      Printed: 5.
      Printed: 6.
      Printed: 7.
      Printed: 8.
      Printed: 9.
      Printed: 10.

      cat 1
      1
      2
      3
      4
      5
      6
      7
      8
      9
      10

      L*

      There are no rules, there are no thumbs..
      Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
        Yes, my comment is valid only for *nix and macOS.
        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

      Thank you for your reply. It's taken me a while to reply to this because I can't work out what a fair bit of it's doing!

      I've not come across map before, so I'll have a look at this today and learn.

      Why do I need it? It's just so that a file that's open in MS Word, for example, doesn't get renamed.

Re: Easy way to check if a file is open needed.
by Apt_Addiction (Novice) on Apr 08, 2019 at 20:01 UTC

    Thanks for your replies.

    Sorry I wasn't clear about the purpose. It's just a hobby and I'm trying to learn and solve the puzzle. There's probably all sorts of bad in this, but here goes...

    My wife has a USB stick with her work on, and she gets files from co-workers. They need backing up to the cloud in case of loss or forgetting the stick. So, I have rsync running on a crontab that copies her USB stick contents into OneDrive. All works well until someone puts an illegal filename into the mix. I don't get to use her computer much 'cos she's nearly always using it. So I want to run a script as a crontab to clean up any issues and stop things from jamming up.

    If she's got a file open in MS Word, I don't want it renamed, like it does at present. So I'm wanting to check for file handles in the tree whilst it's being used. When she closes the file, the script will run again at some point and take care of the issues.

    Thanks again, and I hope that makes it a bit clearer.

      > I have rsync running on a crontab that copies her USB stick contents into OneDrive.

      By this, do you mean that rsync is basically just doing a local copy from the USB stick to whatever local folder OneDrive is syncing to the cloud? If you don't want to do the rename directly on the USB stick: I'm not sure whether OneDrive would tolerate this, but perhaps you could do the rename directly inside the OneDrive directory immediately after the rsync? Another option might be an intermediate staging area, i.e. USB stick --rsync--> staging area, do rename here --rsync--> OneDrive folder; although that's of course less efficient, if it's not much data, it may be fine.

      If you do want to do the rename on the USB stick: At the moment I'm not aware of any better method of checking if specific files are open than lsof (or similar tools such as fuser), which is probably not particularly efficient.
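A rough sketch of the lsof approach for a single file, relying on the exit status lsof documents: zero when it could list something for the given file, non-zero otherwise. The helper name file_is_open and its optional checker argument are hypothetical and exist only so the logic can be tried without lsof. Note that a non-zero status also covers lsof errors (for example a missing file), so a false result means "not known to be open" rather than "definitely closed".

```perl
use strict;
use warnings;

# Hypothetical helper: true if the checker command exits 0 for $path.
# By default it runs `lsof -- $path`.
sub file_is_open {
    my ($path, @checker) = @_;
    @checker = ('lsof', '--') unless @checker;
    # List-form pipe open bypasses the shell, so odd file names are safe.
    open(my $out, '-|', @checker, $path)
        or die "Cannot run $checker[0]: $!";
    my @listing = <$out>;    # drain the output so the child can finish
    close $out;              # sets $? to the child's exit status
    return $? == 0;
}

# Usage sketch inside a File::Find callback:
# return if file_is_open($File::Find::name);
```

Calling lsof once per file will be slow if each run takes seconds, so this is mainly useful when only a handful of files need renaming.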

        Thank you for that - really helpful. Yes, rsync just updates OneDrive from USB every 10 minutes or so. I'd got so wrapped up in testing this outside of OneDrive, that I hadn't thought of some of the things you mention. Looking at my OneDrive folder with lsof, OD has handles on files that aren't open anywhere else. So, your idea of renaming on the USB stick is best. Plus, she has quite a lot of data.

        As for timing: I have 1,200 files in my OD, and this script found and renamed about 40 in 0.12 seconds, which seems OK. The time hog is lsof, which takes about 13 seconds to run. This seems ridiculous compared to Linux, but apparently it's a Mac thing.

        Anyway, I'll create a temp file with the lsof output for the tree, then grep for the current filename before proceeding further. I understand grep will return 0 if it finds something.
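The plan above (run lsof once for the tree, then look each file up) could also be sketched without a temp file or grep. This assumes lsof's -F n field output, where every line starts with a one-letter tag and lines tagged n carry the file name, and the +D option to scan a directory recursively. The function name open_paths_from_lsof and the variable $tree are hypothetical, and paths with unusual characters may be escaped by lsof, which this sketch does not handle.

```perl
use strict;
use warnings;

# Build a lookup of open paths from `lsof -F n` output lines.
# Lines tagged 'p' (process ID) and 'f' (file descriptor) are ignored.
sub open_paths_from_lsof {
    my (@lines) = @_;
    my %open;
    for my $line (@lines) {
        chomp $line;
        $open{$1} = 1 if $line =~ /^n(.+)$/;
    }
    return \%open;
}

# Usage sketch: one lsof run for the whole tree, then cheap hash lookups.
# open(my $lsof, '-|', 'lsof', '-F', 'n', '+D', $tree) or die "lsof: $!";
# my $open = open_paths_from_lsof(<$lsof>);
# close $lsof;
# ...and in the File::Find callback:
# return if $open->{$File::Find::name};
```

The lookup only matches if the path passed to find is written the same way lsof reports it, so an absolute path for $tree is the safer choice.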
