Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Breaking file path into segments

by cherio (Novice)
on Apr 12, 2019 at 20:11 UTC ( [id://1232511]=perlquestion: print w/replies, xml ) Need Help??

cherio has asked for the wisdom of the Perl Monks concerning the following question:

I have file path that has one or more directories and like this "/abc/def/ghi". I need a regex that can break it into an array as follows
( /abc/def/ghi, /abc/def, /abc )
alternatively a reversed array will do as well
( /abc, /abc/def, /abc/def/ghi )
The following regex breaks path into up to 3 segments
my $path = "/abc/def/ghi"; my @segments = ($path =~ m'^(/[^/]+ (/[^/]+ (/[^/]+)? )? )'x);
This regex is rigid. I can make it long enough to be able to break path into N segments but if the actual path has N+1 segments the regex won't work.

I want a regex that can break path into variable number of segments and not hardcode maximum path length into it.

Replies are listed 'Best First'.
Re: Breaking file path into segments
by choroba (Cardinal) on Apr 12, 2019 at 20:23 UTC
    I prefer Path::Tiny to handle paths for me:
    #!/usr/bin/perl use warnings; use strict; use feature qw{ say }; use Path::Tiny; my $path = '/abc/def/ghi'; my $climb = path($path); my @paths = $climb; push @paths, $climb until ($climb = $climb->parent) eq '/'; say for @paths;

    File::Spec is a core module that can handle the path as well:

    #!/usr/bin/perl use warnings; use strict; use feature qw{ say }; use File::Spec; my $path = '/abc/def/ghi'; my @paths = 'File::Spec'->splitdir($path); for my $i (reverse 0 .. $#paths) { $paths[$i] = join '/', @paths[0 .. $i]; } shift @paths; # Remove the empty path. say for @paths;

    But a regex with split will work similarly:

    #!/usr/bin/perl use warnings; use strict; use feature qw{ say }; use Path::Tiny; my $path = '/abc/def/ghi'; my @paths = split m{(?<=.)/}, $path; for my $i (reverse 0 .. $#paths) { $paths[$i] = join '/', @paths[0 .. $i]; } say for @paths;

    I used a lookbehind assertion to skip the empty path before the first slash.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

      File::Spec is a standard module and comes with Perl. For a list of standard modules, see perldoc perlmodlib

      If your code is going to be run on a different machine than your development one, using standard modules means you don't have to distribute and maintain additional code (the modules).

        > you don't have to distribute and maintain additional code

        Unless you work on a RedHat based Linux distribution where you need to install the perl-PathTools package. Installing perl-Path-Tiny is comparably complex.

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: Breaking file path into segments
by holli (Abbot) on Apr 12, 2019 at 20:29 UTC
    use Path::Class; print join "#", dir("abc/def/ghi")->components;


    holli

    You can lead your users to water, but alas, you cannot drown them.
Re: Breaking file path into segments (updated x2)
by AnomalousMonk (Archbishop) on Apr 12, 2019 at 21:38 UTC

    I agree with others that a path-processing module is the way to go here, but if you just gotta have a regex, here's another:

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le "print qq{perl version: $]}; ;; my $path = '/abc/def/ghi'; ;; local our @dirs; $path =~ m{ (\A (?: / [^/]+ \b)+) (?{ push @dirs, $^N }) (?!) }xms; ;; dd \@dirs; " perl version: 5.008009 ["/abc/def/ghi", "/abc/def", "/abc"]
    With Perl version 5.10+, the odd-looking  (?!) can become  (*FAIL) from Special Backtracking Control Verbs in perlre (but the compiler will optimize it to (*FAIL) anyway). With version 5.18+ (IIRC), the package-global array  @dirs can become a my array; that bug was fixed.

    Update 1: On second thought, the  \b anchor is a bit too alnum-specific: make it  (?! [^/]) instead:
        $path =~ m{ (\A (?: / [^/]+ (?! [^/]))+) (?{ push @dirs, $^N }) (?!) }xms;

    Update 2: And if you want to reverse the order of the pieces, use a lazy quantifier:

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le "print qq{perl version: $]}; ;; my $path = '/=abc/def/ghi=/---'; ;; local our @dirs; $path =~ m{ (\A (?: / [^/]+ (?! [^/]))+?) (?{ push @dirs, $^N }) (?!) + }xms; ;; dd \@dirs; " perl version: 5.008009 ["/=abc", "/=abc/def", "/=abc/def/ghi=", "/=abc/def/ghi=/---"]


    Give a man a fish:  <%-{-{-{-<

Re: Breaking file path into segments
by johngg (Canon) on Apr 13, 2019 at 11:01 UTC

    An alternative to doing the whole thing with a regex would be to split into individual path elements then push joined elements onto the array, popping elements off the end until there's nothing left.

    use 5.026; use warnings; use Data::Dumper; my @paths = qw{ /abc/def/ghi /wxy/z /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 bin/fred somefile }; foreach my $path ( @paths ) { my @elems = split m{/}, $path; my @arr; while ( @elems ) { last if @elems == 1 && ! $elems[ 0 ]; # Ignore empty first ele +ment # if path starts with a +/ push @arr, join q{/}, @elems; pop @elems; } say $path; print Data::Dumper->Dumpxs( [ \ @arr ], [ qw{ *arr } ] ); say q{-} x 30; }

    The output.

    /abc/def/ghi @arr = ( '/abc/def/ghi', '/abc/def', '/abc' ); ------------------------------ /wxy/z @arr = ( '/wxy/z', '/wxy' ); ------------------------------ /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 @arr = ( '/usr/local/lib/x86_64-linux-gnu/perl/5.26.1', '/usr/local/lib/x86_64-linux-gnu/perl', '/usr/local/lib/x86_64-linux-gnu', '/usr/local/lib', '/usr/local', '/usr' ); ------------------------------ bin/fred @arr = ( 'bin/fred', 'bin' ); ------------------------------ somefile @arr = ( 'somefile' ); ------------------------------

    A little more long-winded but possibly simpler to understand. I hope this is helpful.

    Cheers,

    JohnGG

Re: Breaking file path into segments
by Marshall (Canon) on Apr 15, 2019 at 04:29 UTC
    Another idea for you:
    I am not sure if these paths are absolute or relative paths.
    Update: Modified code to preserve initial "root path". Handles 3 cases now.
    Modify the code below accordingly.
    #!/usr/bin/perl use strict; use warnings; my @paths = qw{ /abc/def/ghi /wxy/z /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 bin/fred somefile ./another_file }; foreach my $path (@paths) { my $root_path = ''; $root_path = $1 if ($path =~ s/^([.\/]*)(.*)/$2/); my @components = split '/',$path; while (@components) { print $root_path,join('/',@components), "\n"; pop @components; } } __END__ /abc/def/ghi /abc/def /abc /wxy/z /wxy /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 /usr/local/lib/x86_64-linux-gnu/perl /usr/local/lib/x86_64-linux-gnu /usr/local/lib /usr/local /usr bin/fred bin somefile ./another_file

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1232511]
Approved by dorko
Front-paged by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others avoiding work at the Monastery: (4)
As of 2024-04-24 06:37 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found