Re: Sorting files by 3 numbers in the name
by tobyink (Canon) on May 26, 2017 at 14:09 UTC
# These constants make the code below more readable.
#
use constant {
IX_FILENAME => 0,
IX_RUN => 1,
IX_DISTRICT => 2,
IX_COPY => 3,
IX_TOTAL => 4,
};
# Read this bit from bottom to top:
#
my @sorted =
# Now we've sorted our arrayrefs by the fields we're interested in
# we loop through them again, pulling out just the filename and
# discarding the other parts.
map {
$_->[IX_FILENAME]
}
# Sort by the fields we're interested in. Note that if the two
# values for RUN are different, this will sort by them, and everything
# following the first 'or' is ignored. If they're the same, that
# comparison returns 0, so the stuff after 'or' isn't ignored,
# and we compare by DISTRICT, then COPY, then TOTAL.
sort {
$a->[IX_RUN] <=> $b->[IX_RUN] or
$a->[IX_DISTRICT] <=> $b->[IX_DISTRICT] or
$a->[IX_COPY] <=> $b->[IX_COPY] or
$a->[IX_TOTAL] <=> $b->[IX_TOTAL]
}
# For each filename, split it into an arrayref, so that the first
# element in the arrayref is the filename itself, and the rest are
# the fields we're interested in.
map {
[ $_, m/\A[A-Z0-9]+_([0-9]+)_ETSTexas_.*_Candidate_RRD_([0-9]+)_([0-9]{2})_([0-9]{2})/i ]
}
# Take our list of filenames…
@files;
# Check it works. (It does.) Dumper() comes from Data::Dumper, so the
# script needs "use Data::Dumper;" near the top.
#
print Dumper(\@sorted);
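For anyone who wants to run the idea end to end, here is a minimal sketch of the same map/sort/map pipeline. Plain array indices stand in for the IX_ constants, and the three filenames are copied from the list posted elsewhere in this thread, so treat it as an illustration rather than a drop-in replacement.

```perl
use strict;
use warnings;

# Three of the filenames posted elsewhere in this thread.
my @files = qw(
    ASR0004520_8960_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_04_04_Spr17_Initial_201705040952_41045.zip
    ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_02_02_Spr17_Initial_201705040952_41044.zip
    ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_01_02_Spr17_Initial_201705040952_41044.zip
);

my @sorted =
    # 3. Keep only the filename (element 0 of each arrayref).
    map { $_->[0] }
    # 2. Compare run, district, copy and total, in that order.
    sort {
        $a->[1] <=> $b->[1] or
        $a->[2] <=> $b->[2] or
        $a->[3] <=> $b->[3] or
        $a->[4] <=> $b->[4]
    }
    # 1. Pair each filename with the four numbers captured from it.
    map {
        [ $_, m/\A[A-Z0-9]+_([0-9]+)_ETSTexas_.*_Candidate_RRD_([0-9]+)_([0-9]{2})_([0-9]{2})/i ]
    }
    @files;

print "$_\n" for @sorted;
```

The regex runs once per filename instead of once per comparison, which is the whole point of wrapping the sort between two maps.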
Yeah, but I think they were added in 5.10, and when possible I try to give examples using 5.8 features. (Something like say is excusable, because it's so easy to write a shim for it.)
sub say { local $\ = "\n"; print(@_ ? @_ : $_) }
sub IO::Handle::say { my $h = shift; local $\ = "\n"; $h->print(@_ ? @_ : $_) }
I also quite like this way:
use constant {
IX_FILENAME => 0,
IX_RUN => 2,
IX_DISTRICT => 8,
IX_COPY => 9,
IX_TOTAL => 10,
};
print Dumper
map {
$_->[IX_FILENAME]
}
sort {
$a->[IX_RUN] <=> $b->[IX_RUN] or
$a->[IX_DISTRICT] <=> $b->[IX_DISTRICT] or
$a->[IX_COPY] <=> $b->[IX_COPY] or
$a->[IX_TOTAL] <=> $b->[IX_TOTAL]
}
map {
[ $_, split /_/ ]
}
@files;
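The index constants in the split version look arbitrary until you see what the arrayref holds. A small sketch, assuming the filename layout used throughout this thread, prints each slot next to its index:

```perl
use strict;
use warnings;

my $name = 'ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_01_02_Spr17_Initial_201705040952_41044.zip';

# Same shape as the arrayref built in the map above: slot 0 is the
# filename, slots 1 and up are the underscore-separated pieces.
my @row = ( $name, split /_/, $name );

printf "%2d  %s\n", $_, $row[$_] for 1 .. $#row;
```

Slots 2, 8, 9 and 10 come out as the run, district, copy and total numbers, which is where the constant values come from.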
Re: Sorting files by 3 numbers in the name
by tybalt89 (Monsignor) on May 26, 2017 at 14:46 UTC
Since perl's sort is now stable,
I offer this in loving memory and tribute to IBM card sorters :)
#!/usr/bin/perl
# http://perlmonks.org/?node_id=1191282
use strict;
use warnings;
use Data::Dumper;
my @files = qw(
ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_01_02_Spr17_Initial_201705040952_41044.zip
ASR0004520_8960_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_04_04_Spr17_Initial_201705040952_41045.zip
ASR0004994_8958_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_02_04_Spr17_Initial_201705040951_41043.zip
ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_02_02_Spr17_Initial_201705040952_41044.zip
ASR0005154_8957_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_01_04_Spr17_Initial_201705040951_41042.zip
ASR0005336_8959_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_03_04_Spr17_Initial_201705040952_41044.zip
ASR0005336_8972_ETSTexas_EOC052017P_0517_Candidate_RRD_178902_01_01_Spr17_Initial_201705040952_41044.zip
);
# sort by pseudo-column with stable sort -- IBM card sorters forever !!!
my @returnfiles =
sort { (split /_/, $a)[1] <=> (split /_/, $b)[1] }
sort { (split /_/, $a)[7] <=> (split /_/, $b)[7] }
sort { (split /_/, $a)[8] <=> (split /_/, $b)[8] }
sort { (split /_/, $a)[9] <=> (split /_/, $b)[9] }
@files;
print Dumper \@returnfiles;
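The card-sorter trick only works because each pass leaves ties in the order the previous pass produced. As a rough check, assuming a stable sort (which, as far as I know, is what the merge sort perl has used by default since 5.8 gives you), the chain of single-key sorts should match one multi-key sort. The helper name_fields is made up for this sketch.

```perl
use strict;
use warnings;

my @files = qw(
    ASR0004520_8960_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_04_04_Spr17_Initial_201705040952_41045.zip
    ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_02_02_Spr17_Initial_201705040952_41044.zip
    ASR0005154_8957_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_01_04_Spr17_Initial_201705040951_41042.zip
    ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_01_02_Spr17_Initial_201705040952_41044.zip
);

# Hypothetical helper: run, district, copy and total of one filename.
sub name_fields { return ( split /_/, $_[0] )[ 1, 7, 8, 9 ] }

# One pass per key, least significant key first. The innermost sort
# runs first, so read the chain from the bottom up.
my @cascade =
    sort { ( name_fields($a) )[0] <=> ( name_fields($b) )[0] }
    sort { ( name_fields($a) )[1] <=> ( name_fields($b) )[1] }
    sort { ( name_fields($a) )[2] <=> ( name_fields($b) )[2] }
    sort { ( name_fields($a) )[3] <=> ( name_fields($b) )[3] }
    @files;

# The same ordering expressed as a single multi-key comparison.
my @single = sort {
    my @x = name_fields($a);
    my @y = name_fields($b);
    $x[0] <=> $y[0] or $x[1] <=> $y[1] or $x[2] <=> $y[2] or $x[3] <=> $y[3]
} @files;

print "@cascade" eq "@single" ? "same order\n" : "orders differ\n";
```

The cascade does four full sorts where one would do, so it is a tribute more than a recommendation for large lists.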
Re: Sorting files by 3 numbers in the name
by BrowserUk (Patriarch) on May 26, 2017 at 13:58 UTC
#! perl -slw
use strict;
my @files = qw(
ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_01_02_Spr17_Initial_201705040952_41044.zip
ASR0004520_8960_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_04_04_Spr17_Initial_201705040952_41045.zip
ASR0004994_8958_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_02_04_Spr17_Initial_201705040951_41043.zip
ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_02_02_Spr17_Initial_201705040952_41044.zip
ASR0005154_8957_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_01_04_Spr17_Initial_201705040951_41042.zip
ASR0005336_8959_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_03_04_Spr17_Initial_201705040952_41044.zip
ASR0005336_8972_ETSTexas_EOC052017P_0517_Candidate_RRD_178902_01_01_Spr17_Initial_201705040952_41044.zip
);
print for map unpack( 'x16 a*', $_ ), sort map pack( 'NNNNa*', (m[_(\d+)]g)[0,2,3,4], $_ ), @files;
__END__
[14:56:37.78] C:\test>junk39
ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_01_02_Spr17_Initial_201705040952_41044.zip
ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_02_02_Spr17_Initial_201705040952_41044.zip
ASR0005154_8957_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_01_04_Spr17_Initial_201705040951_41042.zip
ASR0004994_8958_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_02_04_Spr17_Initial_201705040951_41043.zip
ASR0005336_8959_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_03_04_Spr17_Initial_201705040952_41044.zip
ASR0004520_8960_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_04_04_Spr17_Initial_201705040952_41045.zip
ASR0005336_8972_ETSTexas_EOC052017P_0517_Candidate_RRD_178902_01_01_Spr17_Initial_201705040952_41044.zip
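The one-liner above leans on a property of pack's 'N' format: each number becomes four bytes, most significant byte first, so a plain string sort of the packed keys agrees with numeric order. A minimal sketch with made-up numbers (not taken from the filenames) shows the difference:

```perl
use strict;
use warnings;

my @numbers = ( 9, 10, 100 );

# Sorted as text, "10" and "100" land before "9".
my @as_text = sort @numbers;

# Packed as 4-byte big-endian integers, a default (string) sort
# compares byte by byte, which matches numeric order.
my @packed        = map { pack( 'N', $_ ) } @numbers;
my @sorted_packed = sort @packed;
my @as_numbers    = map { unpack( 'N', $_ ) } @sorted_packed;

print "text sort:   @as_text\n";
print "packed sort: @as_numbers\n";
```

In the one-liner, four such keys (16 bytes) are glued to the front of each filename, the strings are sorted with no comparison block at all, and 'x16 a*' skips the keys again afterwards. Note that 'N' is a 32-bit unsigned format, so this assumes every field fits below 2**32.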
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
In the absence of evidence, opinion is indistinguishable from prejudice.
Suck that fhit
Re: Sorting files by 3 numbers in the name
by hippo (Bishop) on May 26, 2017 at 13:47 UTC
# fc() needs "use feature 'fc';" (Perl 5.16 or later).
#
# inefficiently sort by descending numeric compare using
# the first integer after the first = sign, or the
# whole record case-insensitively otherwise
my @new = sort {
($b =~ /=(\d+)/)[0] <=> ($a =~ /=(\d+)/)[0]
||
fc($a) cmp fc($b)
} @old;
# same thing, but much more efficiently;
# we'll build auxiliary indices instead
# for speed
my (@nums, @caps);
for (@old) {
push @nums, ( /=(\d+)/ ? $1 : undef );
push @caps, fc($_);
}
my @new = @old[ sort {
$nums[$b] <=> $nums[$a]
||
$caps[$a] cmp $caps[$b]
} 0..$#old
];
# same thing, but without any temps
my @new = map { $_->[0] }
sort { $b->[1] <=> $a->[1]
||
$a->[2] cmp $b->[2]
} map { [$_, /=(\d+)/, fc($_)] } @old;
Re: Sorting files by 3 numbers in the name
by BillKSmith (Monsignor) on May 26, 2017 at 18:57 UTC
I have no idea how fast this module is, but you cannot beat it for convenience.
use strict;
use warnings;
use List::UtilsBy qw(sort_by);
my $x;
my @files = qw(
ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_01_02_Spr17_Initial_201705040952_41044.zip
ASR0004520_8960_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_04_04_Spr17_Initial_201705040952_41045.zip
ASR0004994_8958_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_02_04_Spr17_Initial_201705040951_41043.zip
ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_02_02_Spr17_Initial_201705040952_41044.zip
ASR0005154_8957_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_01_04_Spr17_Initial_201705040951_41042.zip
ASR0005336_8959_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_03_04_Spr17_Initial_201705040952_41044.zip
ASR0005336_8972_ETSTexas_EOC052017P_0517_Candidate_RRD_178902_01_01_Spr17_Initial_201705040952_41044.zip
);
my @sorted_files = sort_by { join( '', (split /_/, $_)[1,7,8,9]) } @files;
$, = "\n";
print @sorted_files;
We are comparing fixed length strings of digits. The result is the same whether we compare them lexically or numerically. I chose the lexical sort because the strings do not seem to have any numerical significance.
Thanks for supplying the link to the module documentation.
UPDATE:
Oops! My comment about no numerical significance is wrong. My comment on the subject in level 6 below applies here as well (as long as all fields are of fixed length). I still prefer the lexical sort, but it is harder to justify.
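The fixed-length condition matters more than it might seem. Here is a small sketch with invented (run, copy) pairs of differing widths, not the real filenames, comparing a naive concatenated key against one zero-padded with sprintf:

```perl
use strict;
use warnings;

# Hypothetical (run, copy) pairs whose fields are NOT fixed width.
my @rows = ( [ 950, 1 ], [ 8950, 12 ], [ 8950, 2 ] );

# Naive concatenation gives the keys "9501", "895012" and "89502".
my @naive = sort { join( '', @$a ) cmp join( '', @$b ) } @rows;

# Zero-padding to fixed widths gives "0095001", "0895012" and "0895002".
my @padded = sort {
    sprintf( '%05d%02d', @$a ) cmp sprintf( '%05d%02d', @$b )
} @rows;

my $naive_order  = join ' ', map { "$_->[0]/$_->[1]" } @naive;
my $padded_order = join ' ', map { "$_->[0]/$_->[1]" } @padded;

print "naive:  $naive_order\n";
print "padded: $padded_order\n";
```

Only the padded keys reproduce the numeric order, so the lexical approach is safe for these filenames just as long as every field keeps its width.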
Re: Sorting files by 3 numbers in the name
by thanos1983 (Parson) on May 26, 2017 at 14:07 UTC
my @entries;
...
# Loop over your files and extract data
push @entries, {
'array_position' => $array_position,
'run' => $run,
'copy' => $copy,
'district' => $district,
'total' => $total
};
...
my @sorted_files = sort {
$a->{'run'} <=> $b->{'run'} || # use '<=>' for numbers
$a->{'copy'} <=> $b->{'copy'} ||
$a->{'district'} <=> $b->{'district'} ||
$a->{'total'} <=> $b->{'total'}
} @entries;
This is only a sample, but you get the picture: extract the values from each filename, then sort on those values.
Update: Adding array_position.
Hope this helps.
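To fill in the elided parts, here is one way the extraction loop might look. It is a sketch under assumptions: the split positions (1, 7, 8, 9) and their mapping to run, district, copy and total follow the other replies in this thread, and the three filenames are copied from them.

```perl
use strict;
use warnings;

my @files = qw(
    ASR0004520_8960_ETSTexas_EOC052017P_0517_Candidate_RRD_178901_04_04_Spr17_Initial_201705040952_41045.zip
    ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_02_02_Spr17_Initial_201705040952_41044.zip
    ASR0005336_8950_ETSTexas_EOC052017P_0517_Candidate_RRD_178904_01_02_Spr17_Initial_201705040952_41044.zip
);

my @entries;
for my $i ( 0 .. $#files ) {
    # Assumed layout: underscore-separated fields 1, 7, 8 and 9.
    my ( $run, $district, $copy, $total ) = ( split /_/, $files[$i] )[ 1, 7, 8, 9 ];
    push @entries, {
        array_position => $i,
        run            => $run,
        district       => $district,
        copy           => $copy,
        total          => $total,
    };
}

# Same key order as the sort above: run, copy, district, total.
my @sorted = sort {
    $a->{run}      <=> $b->{run}      ||
    $a->{copy}     <=> $b->{copy}     ||
    $a->{district} <=> $b->{district} ||
    $a->{total}    <=> $b->{total}
} @entries;

# array_position maps each sorted record back to its filename.
print "$files[ $_->{array_position} ]\n" for @sorted;
```

Named hash keys cost a little more memory than arrayrefs, but the comparison block reads almost like a sentence.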
Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: Sorting files by 3 numbers in the name
by crusty_collins (Friar) on May 26, 2017 at 14:26 UTC
Thank you all so much! Now it makes sense to me.
"We can't all be happy, we can't all be rich, we can't all be lucky – and it would be so much less fun if we were. There must be the dark background to show up the bright colours."
Jean Rhys (1890-1979)