cherio has asked for the wisdom of the Perl Monks concerning the following question:
I have file path that has one or more directories and like this "/abc/def/ghi". I need a regex that can break it into an array as follows
( /abc/def/ghi, /abc/def, /abc )
alternatively a reversed array will do as well
( /abc, /abc/def, /abc/def/ghi )
The following regex breaks path into up to 3 segments
my $path = "/abc/def/ghi";
my @segments = ($path =~ m'^(/[^/]+ (/[^/]+ (/[^/]+)? )? )'x);
This regex is rigid. I can make it long enough to be able to break path into N segments but if the actual path has N+1 segments the regex won't work.
I want a regex that can break path into variable number of segments and not hardcode maximum path length into it.
Re: Breaking file path into segments
by choroba (Cardinal) on Apr 12, 2019 at 20:23 UTC
|
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
use Path::Tiny;
my $path = '/abc/def/ghi';
my $climb = path($path);
my @paths = $climb;
push @paths, $climb until ($climb = $climb->parent) eq '/';
say for @paths;
File::Spec is a core module that can handle the path as well:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
use File::Spec;
my $path = '/abc/def/ghi';
my @paths = 'File::Spec'->splitdir($path);
for my $i (reverse 0 .. $#paths) {
$paths[$i] = join '/', @paths[0 .. $i];
}
shift @paths; # Remove the empty path.
say for @paths;
But a regex with split will work similarly:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
use Path::Tiny;
my $path = '/abc/def/ghi';
my @paths = split m{(?<=.)/}, $path;
for my $i (reverse 0 .. $#paths) {
$paths[$i] = join '/', @paths[0 .. $i];
}
say for @paths;
I used a lookbehind assertion to skip the empty path before the first slash.
map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
| [reply] [d/l] [select] |
|
File::Spec is a standard module and comes with Perl. For a list of standard modules, see perldoc perlmodlib
If your code is going to be run on a different machine than your development one, using standard modules means you don't have to distribute and maintain additional code (the modules).
| [reply] [d/l] [select] |
|
| [reply] [d/l] |
Re: Breaking file path into segments
by holli (Abbot) on Apr 12, 2019 at 20:29 UTC
|
use Path::Class;
print join "#", dir("abc/def/ghi")->components;
holli
You can lead your users to water, but alas, you cannot drown them.
| [reply] [d/l] |
Re: Breaking file path into segments (updated x2)
by AnomalousMonk (Archbishop) on Apr 12, 2019 at 21:38 UTC
|
I agree with others that a path-processing module is the way to go here, but if you just gotta have a regex, here's another:
c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
"print qq{perl version: $]};
;;
my $path = '/abc/def/ghi';
;;
local our @dirs;
$path =~ m{ (\A (?: / [^/]+ \b)+) (?{ push @dirs, $^N }) (?!) }xms;
;;
dd \@dirs;
"
perl version: 5.008009
["/abc/def/ghi", "/abc/def", "/abc"]
With Perl version 5.10+, the odd-looking (?!) can become (*FAIL) from Special Backtracking Control Verbs in perlre (but the compiler will optimize it to (*FAIL) anyway). With version 5.18+ (IIRC), the package-global array @dirs can become a my array; that bug was fixed.
Update 1: On second thought, the \b anchor is a bit too alnum-specific: make it (?! [^/]) instead:
$path =~ m{ (\A (?: / [^/]+ (?! [^/]))+) (?{ push @dirs, $^N }) (?!) }xms;
Update 2: And if you want to reverse the order of the pieces, use a lazy quantifier:
c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
"print qq{perl version: $]};
;;
my $path = '/=abc/def/ghi=/---';
;;
local our @dirs;
$path =~ m{ (\A (?: / [^/]+ (?! [^/]))+?) (?{ push @dirs, $^N }) (?!)
+ }xms;
;;
dd \@dirs;
"
perl version: 5.008009
["/=abc", "/=abc/def", "/=abc/def/ghi=", "/=abc/def/ghi=/---"]
Give a man a fish: <%-{-{-{-<
| [reply] [d/l] [select] |
Re: Breaking file path into segments
by johngg (Canon) on Apr 13, 2019 at 11:01 UTC
|
An alternative to doing the whole thing with a regex would be to split into individual path elements then push joined elements onto the array, popping elements off the end until there's nothing left.
use 5.026;
use warnings;
use Data::Dumper;
my @paths = qw{
/abc/def/ghi
/wxy/z
/usr/local/lib/x86_64-linux-gnu/perl/5.26.1
bin/fred
somefile
};
foreach my $path ( @paths )
{
my @elems = split m{/}, $path;
my @arr;
while ( @elems )
{
last if @elems == 1 && ! $elems[ 0 ]; # Ignore empty first ele
+ment
# if path starts with a
+/
push @arr, join q{/}, @elems;
pop @elems;
}
say $path;
print Data::Dumper->Dumpxs( [ \ @arr ], [ qw{ *arr } ] );
say q{-} x 30;
}
The output.
/abc/def/ghi
@arr = (
'/abc/def/ghi',
'/abc/def',
'/abc'
);
------------------------------
/wxy/z
@arr = (
'/wxy/z',
'/wxy'
);
------------------------------
/usr/local/lib/x86_64-linux-gnu/perl/5.26.1
@arr = (
'/usr/local/lib/x86_64-linux-gnu/perl/5.26.1',
'/usr/local/lib/x86_64-linux-gnu/perl',
'/usr/local/lib/x86_64-linux-gnu',
'/usr/local/lib',
'/usr/local',
'/usr'
);
------------------------------
bin/fred
@arr = (
'bin/fred',
'bin'
);
------------------------------
somefile
@arr = (
'somefile'
);
------------------------------
A little more long-winded but possibly simpler to understand. I hope this is helpful.
| [reply] [d/l] [select] |
Re: Breaking file path into segments
by Marshall (Canon) on Apr 15, 2019 at 04:29 UTC
|
Another idea for you:
I am not sure if these paths are absolute or relative paths.
Update: Modified code to preserve initial "root path". Handles 3 cases now.
Modify the code below accordingly.
#!/usr/bin/perl
use strict;
use warnings;
my @paths = qw{
/abc/def/ghi
/wxy/z
/usr/local/lib/x86_64-linux-gnu/perl/5.26.1
bin/fred
somefile
./another_file
};
foreach my $path (@paths)
{
my $root_path = '';
$root_path = $1 if ($path =~ s/^([.\/]*)(.*)/$2/);
my @components = split '/',$path;
while (@components)
{
print $root_path,join('/',@components), "\n";
pop @components;
}
}
__END__
/abc/def/ghi
/abc/def
/abc
/wxy/z
/wxy
/usr/local/lib/x86_64-linux-gnu/perl/5.26.1
/usr/local/lib/x86_64-linux-gnu/perl
/usr/local/lib/x86_64-linux-gnu
/usr/local/lib
/usr/local
/usr
bin/fred
bin
somefile
./another_file
| [reply] [d/l] |
|
|