
Re: Batch file renaming - on identical name, keep only most recent file, based on dates

by Lotus1 (Vicar)
on Nov 06, 2019 at 04:55 UTC ( [id://11108347]=note )

in reply to Batch file renaming - on identical name, keep only most recent file, based on dates

jcb beat me to the point about collisions from different users. (Edit: it's late here and I got things reversed in my head. No reverse is needed for the sort.) Here is a simple solution that works, with one caveat: when files from different users collide, the file from the user with the highest number overwrites the others. The solution sorts the filenames and then uses File::Copy::move to rename them, so that the newest colliding file is the last one to be renamed. If you specify what to do about user collisions, this could be modified to watch for the user number.

Perl has a built-in rename function, but its documentation suggests using File::Copy::move since it is more portable across operating systems.

use warnings;
use strict;
use File::Copy;
use File::Glob ':glob';

mkdir 'test' unless -d 'test';

#foreach( reverse sort glob( "*_Table_*.pdf" ) ){
foreach( sort glob( "*_Table_*.pdf" ) ){
    print "$_\n";
    my $newname = $_;
    $newname =~ s/^[0-9_]+//;
    print "--$newname\n";

    ## using copy for testing.
    copy $_, "./test/$newname" or print "Error copying <$_> $!\n";
    #move $_, "$newname" or print "Error renaming <$_> $!\n";
}

__DATA__
test files:
8_2007_5_22_15_34_23_Table_-_2007522_XYZ_W3.pdf
8_2007_5_22_22_34_12_Table_-_2007522_XYZ_W3.pdf
7_2007_5_22_16_35_23_Table_-_2007522_XYZ_W3.pdf
7_2007_5_22_23_36_12_Table_-_2007522_XYZ_W3.pdf

output file:
Table_-_2007522_XYZ_W3.pdf

Edit: I forgot to add the program output. Also, I had an extra file in my test files.

8_2007_5_22_22_34_12_Table_-_2007522_XYZ_W3.pdf
--Table_-_2007522_XYZ_W3.pdf
8_2007_5_22_15_34_23_Table_-_2007522_XYZ_W3.pdf
--Table_-_2007522_XYZ_W3.pdf
7_2007_5_22_22_34_12_Table_-_2007522_XYZ_W3.pdf
--Table_-_2007522_XYZ_W3.pdf
7_2007_5_22_15_34_23_Table_-_2007522_XYZ_W3.pdf
--Table_-_2007522_XYZ_W3.pdf
7_2007_12_22_15_34_23_Table_-_20071222_XYZ_W3.pdf
--Table_-_20071222_XYZ_W3.pdf

Edit: updated output without reverse sort.

7_2007_12_22_15_34_23_Table_-_20071222_XYZ_W3.pdf
--Table_-_20071222_XYZ_W3.pdf
7_2007_5_22_15_34_23_Table_-_2007522_XYZ_W3.pdf
--Table_-_2007522_XYZ_W3.pdf
7_2007_5_22_22_34_12_Table_-_2007522_XYZ_W3.pdf
--Table_-_2007522_XYZ_W3.pdf
8_2007_5_22_15_34_23_Table_-_2007522_XYZ_W3.pdf
--Table_-_2007522_XYZ_W3.pdf
8_2007_5_22_22_34_12_Table_-_2007522_XYZ_W3.pdf
--Table_-_2007522_XYZ_W3.pdf

Replies are listed 'Best First'.
Re^2: Batch file renaming - on identical name, keep only most recent file, based on dates
by haukex (Archbishop) on Nov 06, 2019 at 08:13 UTC
    sort glob( "*_Table_*.pdf" )

    Unfortunately that doesn't work because it'll incorrectly sort a later datetime of e.g. 2007_10_22_15_34_23 before 2007_9_22_15_34_23.
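To make the ordering problem concrete, here is a minimal sketch that compares the default string sort with a sort on a zero-padded key built from the datetime fields. It assumes every name follows the user_YYYY_M_D_H_m_s_ prefix seen in the sample files; stamp_key is a hypothetical helper name, not something from the posted code.

```perl
use strict;
use warnings;

# Hypothetical helper: build a fixed-width key such as "20071022153423"
# from the six datetime fields that follow the leading user number.
# Returns nothing if the name does not fit the assumed pattern.
sub stamp_key {
    my ($name) = @_;
    my @f = $name =~ /^\d+_(\d{4})_(\d{1,2})_(\d{1,2})_(\d{1,2})_(\d{1,2})_(\d{1,2})_/
        or return;
    return sprintf '%04d%02d%02d%02d%02d%02d', @f;
}

my @files = qw(
    8_2007_10_22_15_34_23_Table_-_20071022_XYZ_W3.pdf
    8_2007_9_22_15_34_23_Table_-_2007922_XYZ_W3.pdf
);

# Default sort compares character by character, so "10" lands before "9".
my @plain = sort @files;

# Schwartzian transform: sort on the padded key, then on the full name.
my @by_stamp = map  { $_->[1] }
               sort { $a->[0] cmp $b->[0] or $a->[1] cmp $b->[1] }
               map  { [ stamp_key($_) // '', $_ ] } @files;

print "plain:    $_\n" for @plain;
print "by stamp: $_\n" for @by_stamp;
```

With the two names above, the plain sort puts the October file first, while the keyed sort puts the September file first. Names that do not match the pattern get an empty key and sort to the front, which may or may not be what you want.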

      I thought of that and wish now I had mentioned it in my writeup. There is no chance of a name collision in that example since the year, month and day are repeated in the new name. For example:

      8_2007_10_22_15_34_23_Table_-_20071022_XYZ_W3.pdf 8_2007_9_22_15_34_23_Table_-_2007922_XYZ_W3.pdf

      The new filenames for these two will be:

      Table_-_20071022_XYZ_W3.pdf Table_-_2007922_XYZ_W3.pdf

      Having the sorting correct only matters among the filenames that start with the same YYYY?MDD dates. The HH_mm_ss format seems to be consistent and therefore sortable with the default sort. I'm assuming that from the limited example data we were given.

        The HH_mm_ss format seems to be consistent

        It seems not, given one of the datetimes shown in the OP is '2019-6-12T22:34:8'.
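A small sketch of why the unpadded form matters, and one way to normalise it before comparing. It assumes the datetime looks like the '2019-6-12T22:34:8' value quoted above; pad_datetime is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: zero-pad a loosely formatted datetime so that
# plain string comparison agrees with chronological order.
sub pad_datetime {
    my ($dt) = @_;
    my @f = $dt =~ /^(\d{4})-(\d{1,2})-(\d{1,2})T(\d{1,2}):(\d{1,2}):(\d{1,2})$/
        or return;
    return sprintf '%04d-%02d-%02dT%02d:%02d:%02d', @f;
}

my $early = '2019-6-12T22:34:8';
my $late  = '2019-6-12T22:34:12';

# Unpadded, the earlier time compares as greater because '8' gt '1'.
print "raw:    ", ( $early lt $late ? "ok" : "wrong order" ), "\n";

# Padded, string order matches time order.
print "padded: ",
    ( pad_datetime($early) lt pad_datetime($late) ? "ok" : "wrong order" ), "\n";
```

This only fixes the width of each field. It does not validate that the values form a real date.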

        There is no chance of a name collision in that example since the year, month and day are repeated in the new name.

        IMHO that's also a very dangerous assumption to make.
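One way to avoid relying on rename order at all is to decide the winner for each target name up front. The following is only a sketch under the same assumed user_YYYY_M_D_H_m_s_ prefix; split_name and plan_renames are hypothetical names, and the code builds a plan without touching any files.

```perl
use strict;
use warnings;

# Hypothetical helper: return a sortable datetime key and the new name
# left after stripping the leading digits and underscores.
sub split_name {
    my ($name) = @_;
    my @f = $name =~ /^\d+_(\d{4})_(\d{1,2})_(\d{1,2})_(\d{1,2})_(\d{1,2})_(\d{1,2})_/
        or return;
    ( my $new = $name ) =~ s/^[0-9_]+//;
    return ( sprintf( '%04d%02d%02d%02d%02d%02d', @f ), $new );
}

# For each new name keep only the source file with the latest datetime.
# On an exact tie the first file seen wins; names that do not match the
# pattern are skipped.
sub plan_renames {
    my %newest;
    for my $file (@_) {
        my ( $key, $new ) = split_name($file) or next;
        my $seen = $newest{$new};
        $newest{$new} = { key => $key, file => $file }
            if !$seen or $key gt $seen->{key};
    }
    return map { ( $_ => $newest{$_}{file} ) } keys %newest;
}

my %plan = plan_renames(qw(
    8_2007_5_22_15_34_23_Table_-_2007522_XYZ_W3.pdf
    8_2007_5_22_22_34_12_Table_-_2007522_XYZ_W3.pdf
    7_2007_5_22_16_35_23_Table_-_2007522_XYZ_W3.pdf
    7_2007_5_22_23_36_12_Table_-_2007522_XYZ_W3.pdf
));

print "$plan{$_} -> $_\n" for sort keys %plan;
```

Here the newest file overall wins regardless of user number. If collisions between users should be handled differently, the user number could be captured in split_name and made part of the hash key.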
