http://qs321.pair.com?node_id=11108382


in reply to Re^2: Batch file renaming - on identical name, keep only most recent file, based on dates
in thread Batch file renaming - on identical name, keep only most recent file, based on dates

I thought of that and wish now I had mentioned it in my writeup. There is no chance of a name collision in that example since the year, month and day are repeated in the new name. For example:

8_2007_10_22_15_34_23_Table_-_20071022_XYZ_W3.pdf
8_2007_9_22_15_34_23_Table_-_2007922_XYZ_W3.pdf

The new filenames for these two will be:

Table_-_20071022_XYZ_W3.pdf
Table_-_2007922_XYZ_W3.pdf

Having the sorting correct only matters among the filenames that start with the same YYYY?MDD dates. The HH_mm_ss format seems to be consistent and therefore sortable with the default sort. I'm assuming that from the limited example data we were given.
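To illustrate why the consistency of the time fields matters for the default sort, here is a minimal sketch. The three timestamps are hypothetical (the second has an unpadded hour) and are not taken from the OP's data.

```perl
use strict;
use warnings;

# Hypothetical HH_mm_ss stamps; '9_05_00' has an unpadded hour.
my @stamps = ('15_34_23', '9_05_00', '22_34_12');

# Default sort compares strings character by character.
my @by_string = sort @stamps;

# Comparing each field numerically does not depend on padding.
my @by_number = sort {
    my @x = split /_/, $a;
    my @y = split /_/, $b;
    $x[0] <=> $y[0] || $x[1] <=> $y[1] || $x[2] <=> $y[2]
} @stamps;

print "string:  @by_string\n";
print "numeric: @by_number\n";
```

With these inputs the string sort puts 9_05_00 last, while the numeric comparison puts it first. If every field is always two digits, both orders agree.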

Re^4: Batch file renaming - on identical name, keep only most recent file, based on dates
by haukex (Bishop) on Nov 06, 2019 at 16:39 UTC
    "The HH_mm_ss format seems to be consistent"

    It seems not, given one of the datetimes shown in the OP is '2019-6-12T22:34:8'.

    "There is no chance of a name collision in that example since the year, month and day are repeated in the new name."

    IMHO that's also a very dangerous assumption to make.

      I didn't notice that detail. The dates in the example code look like something the OP typed in by hand, whereas the filenames look pasted in, so I paid more attention to the filenames. My comment about HH_mm_ss referred to the actual filenames. If the OP were to provide an example showing a format such as ?H_?m_?s, I would add a quick fix-it function to normalize those into a sane format and then sort. I still think my approach is a good one, especially for new Perl programmers. Some of my work needs to be supported by folks with limited programming experience or limited Perl knowledge.

      "IMHO that's also a very dangerous assumption to make."
      I'm just doing this for fun and it's the OP's responsibility to check his work. The OP didn't provide much to go by. In my own work I do a lot of validation with logging so that files that don't meet my expectations are skipped. I start with a solution that works for a small data set and then run it on the full set of files and inspect the log.
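As a rough illustration of the skip-and-log habit described above, here is a minimal sketch. The pattern in `$expected` and the helper name `triage` are assumptions made for this example, not part of any real workflow shown in the thread.

```perl
use strict;
use warnings;

# Hypothetical pattern for the names shown in this thread; adjust to real data.
my $expected = qr/^\d+_\d{4}(?:_\d{1,2}){5}_Table/;

# Split candidate names into accepted and skipped, logging every skip.
sub triage {
    my @names = @_;
    my (@ok, @skipped);
    for my $name (@names) {
        if ($name =~ $expected) {
            push @ok, $name;
        }
        else {
            warn "SKIP: unexpected name <$name>\n";
            push @skipped, $name;
        }
    }
    return (\@ok, \@skipped);
}

my ($ok, $skipped) = triage(
    '8_2007_5_22_15_34_23_Table_-_2007522_XYZ_W3.pdf',
    'Test file names',
    '8_3007',
);
printf "%d accepted, %d skipped\n", scalar @$ok, scalar @$skipped;
```

The warnings go to STDERR, so they can be redirected to a log file and inspected after a run on the full set of files.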

      Update: Here is a version that pads the time stamps with zeros so they sort correctly.

      use warnings;
      use strict;
      #use File::Copy;

      # Keep only lines that look like the expected file names, zero-pad
      # single-digit fields, then rely on the default string sort.
      # The substitution runs twice because adjacent single-digit fields
      # share an underscore (e.g. _5_2_), so one /g pass cannot pad both.
      foreach (
          sort
          map  { chomp; s/_(\d)_/_0$1_/g; s/_(\d)_/_0$1_/g; $_ }
          grep { /^\d+_\d\d\d\d_\d+_\d+_\d+_\d+_\d+_Table/ } <DATA>
      ) {
          my $newname = $_;
          $newname =~ s/^[0-9_]+//;    # strip the leading id and timestamp
          printf "%38s%38s\n", $_, $newname;
          ## rename here.
          ## Note: $_ now holds the zero-padded name, not the name on disk.
          #move $_, $newname or print "Error renaming <$_> $!\n";
      }
      __DATA__
      Test file names
      8_2007_5_22_15_34_23_Table_-_2007522_XYZ_W3.pdf
      8_2007_5_22_22_34_12_Table_-_2007522_XYZ_W3.pdf
      8_2007_5_2_15_34_23_Table_-_200752_XYZ_W3.pdf
      8_2007_5_2_22_34_12_Table_-_200752_XYZ_W3.pdf
      8_2007_5_2_5_34_23_Table_-_200752_XYZ_W3.pdf
      8_2007_5_2_2_34_12_Table_-_200752_XYZ_W3.pdf
      7_2007_5_22_16_35_23_Table_-_2007522_XYZ_W3.pdf
      7_2007_5_22_23_36_12_Table_-_2007522_XYZ_W3.pdf
      blanks and other things
      0
      8_3007
      ⌡○{}╚

      The output looks like this:

      7_2007_05_22_16_35_23_Table_-_2007522_XYZ_W3.pdf            Table_-_2007522_XYZ_W3.pdf
      7_2007_05_22_23_36_12_Table_-_2007522_XYZ_W3.pdf            Table_-_2007522_XYZ_W3.pdf
      8_2007_05_02_02_34_12_Table_-_200752_XYZ_W3.pdf             Table_-_200752_XYZ_W3.pdf
      8_2007_05_02_05_34_23_Table_-_200752_XYZ_W3.pdf             Table_-_200752_XYZ_W3.pdf
      8_2007_05_02_15_34_23_Table_-_200752_XYZ_W3.pdf             Table_-_200752_XYZ_W3.pdf
      8_2007_05_02_22_34_12_Table_-_200752_XYZ_W3.pdf             Table_-_200752_XYZ_W3.pdf
      8_2007_05_22_15_34_23_Table_-_2007522_XYZ_W3.pdf            Table_-_2007522_XYZ_W3.pdf
      8_2007_05_22_22_34_12_Table_-_2007522_XYZ_W3.pdf            Table_-_2007522_XYZ_W3.pdf

        I'll just say what I've said several times before to others: Instead of making assumptions (including the assumption that the wisdom seeker knows to rigorously test code) and throwing code out there that will (silently!) break when the assumptions aren't met, there are several alternatives, IMHO better ones:

        • Clearly documenting the assumptions.
        • Coding defensively, i.e. the code dies if the assumptions aren't met.
        • Applying Postel's law and accepting a wider range of inputs. (E.g. accept single digits in every field of the datetime.)
        • Asking the wisdom seeker for clarification.
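A minimal sketch of how the second and third points might look for the file names in this thread. The names `parse_name` and `pick_latest` are hypothetical, and the pattern assumes a numeric prefix, a four-digit year, and one or two digits in every other date and time field, followed by `Table`. Anything else makes the code die rather than silently mis-sort.

```perl
use strict;
use warnings;

# Parse one name into (sortable timestamp key, new name).
# Accepts one or two digits in each date/time field; dies on anything else.
sub parse_name {
    my ($name) = @_;
    my ($y, $mo, $d, $h, $mi, $s, $new) = $name =~
        /^\d+_(\d{4})_(\d{1,2})_(\d{1,2})_(\d{1,2})_(\d{1,2})_(\d{1,2})_(Table.*)\z/s
        or die "Unexpected file name <$name>\n";
    my $key = sprintf '%04d%02d%02d%02d%02d%02d', $y, $mo, $d, $h, $mi, $s;
    return ($key, $new);
}

# For each new name, remember the original with the latest timestamp.
sub pick_latest {
    my %latest;    # new name => [ key, original name ]
    for my $name (@_) {
        my ($key, $new) = parse_name($name);
        my $seen = $latest{$new};
        die "Same timestamp twice for <$new>\n" if $seen && $seen->[0] eq $key;
        $latest{$new} = [ $key, $name ] if !$seen || $key gt $seen->[0];
    }
    return map { ($_ => $latest{$_}[1]) } keys %latest;
}

my %keep = pick_latest(
    '8_2007_5_2_5_34_23_Table_-_200752_XYZ_W3.pdf',
    '8_2007_5_2_22_34_12_Table_-_200752_XYZ_W3.pdf',
    '8_2007_5_22_15_34_23_Table_-_2007522_XYZ_W3.pdf',
);
print "$keep{$_} -> $_\n" for sort keys %keep;
```

Because the key is built with fixed-width `sprintf` fields, a plain string comparison (`gt`) orders the timestamps correctly regardless of how the fields were padded in the original name.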