PerlMonks  

AWTDI: Renaming files using regexp

by nimdokk (Vicar)
on Apr 11, 2006 at 13:54 UTC ( [id://542537]=perlquestion )

nimdokk has asked for the wisdom of the Perl Monks concerning the following question:

I needed to rename files that have periods in the filename so that they use an underscore instead of a period, while leaving the period that separates the file extension alone. I have a perfectly good solution that does exactly what I need it to do. However, I'm wondering whether there is a simpler alternative to my three lines (some sort of one-line solution). The question is more academic than anything since I have an answer, just looking to expand my knowledge some :-)

The code sample below is very very stripped down to just the core elements:

use File::Basename;
foreach (<DATA>) {
    chomp $_;
    ($file,$base,$ext)=fileparse($_,qr/.[^.]*/);
    $file =~ tr/\./\_/;
    print "Change: $_ to ${file}${ext}\n";
}
__DATA__
test0.file0.new_20060411.zip
test1.file1.new_20060411.zip
test2.file2.new_20060411.zip
test3.file3.new_20060411.zip

Any insights would be appreciated.

Replies are listed 'Best First'.
Re: AWTDI: Renaming files using regexp
by duff (Parson) on Apr 11, 2006 at 15:03 UTC

    Just because I haven't seen anyone suggest it yet, you should be able to use substr and rindex to good effect here:

    for (@files) {
        my $copy = $_;
        substr($_,0,rindex($_,".")) =~ tr/./_/;
        print "Change: $copy to $_\n";
    }
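    A minimal sketch of the same substr/rindex idea wrapped in a subroutine, with a guard for names that contain no dot at all (rindex returns -1 in that case). The subroutine name is hypothetical, and the guard is an assumption about the desired behaviour rather than part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every dot except the last one with an underscore.
sub underscore_all_but_last_dot {
    my ($name) = @_;
    my $last = rindex($name, '.');   # -1 when the name has no dot
    return $name if $last < 1;       # no dot, or only a leading dot: nothing to change
    # substr is an lvalue here, so tr/// edits only the part before the last dot
    substr($name, 0, $last) =~ tr/./_/;
    return $name;
}

print underscore_all_but_last_dot($_), "\n"
    for qw(test0.file0.new_20060411.zip test.zip README);
```

    Working on a copy inside the subroutine leaves the caller's string untouched, so the old name stays available for the "Change: ... to ..." message.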
Re: AWTDI: Renaming files using regexp
by jdporter (Paladin) on Apr 11, 2006 at 14:37 UTC
    # convert the next dot to an underscore, as long as there's at least two dots
    s/\./_/ while /\..*\./;
    We're building the house of the future together.
Re: AWTDI: Renaming files using regexp
by BrowserUk (Patriarch) on Apr 11, 2006 at 14:41 UTC

    Similar to jdporter's solution, but lets the regex engine do the looping.

    s[\.(?=.*\.)][_]g, print while <DATA>;

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: AWTDI: Renaming files using regexp
by johngg (Canon) on Apr 11, 2006 at 14:45 UTC
    You can achieve this with a global substitution using a regular expression look-ahead, like this:

    use strict;
    use warnings;

    while(<DATA>) {
        chomp;
        my $old = $_;
        s/\.(?=.*?\.)/_/g;
        print "Change: $old to $_\n";
    }
    __END__
    test0.file0.new_20060411.zip
    test1.file1.new_20060411.zip
    test2.file2.new_20060411.zip
    test3.file3.new_20060411.zip

    produces

    Change: test0.file0.new_20060411.zip to test0_file0_new_20060411.zip
    Change: test1.file1.new_20060411.zip to test1_file1_new_20060411.zip
    Change: test2.file2.new_20060411.zip to test2_file2_new_20060411.zip
    Change: test3.file3.new_20060411.zip to test3_file3_new_20060411.zip

    The substitution globally replaces every dot with an underscore as long as that dot is followed somewhere later in the line by another dot. Thus, it will not replace the dot that precedes the extension as it is the last one.
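    As a sketch of how that look-ahead behaves on edge cases, the substitution can be wrapped in a small subroutine and tried on names with several dots, one dot, and no dot. The subroutine name and the extra test names are assumptions added for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: a dot is replaced only if another dot follows later in the name.
sub dots_to_underscores {
    my ($name) = @_;
    (my $new = $name) =~ s/\.(?=.*\.)/_/g;   # the look-ahead consumes nothing, so /g visits every dot
    return $new;
}

for my $name (qw(test0.file0.new_20060411.zip test.zip README)) {
    printf "%-30s -> %s\n", $name, dots_to_underscores($name);
}
```

    Because the last dot has no dot after it, the look-ahead fails there and the extension separator is left alone. Names with one dot or none pass through unchanged.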

    I hope this helps.

    Cheers,

    JohnGG

Re: AWTDI: Renaming files using regexp
by davidrw (Prior) on Apr 11, 2006 at 14:48 UTC
    Here's a solution using (positive & negative) look-aheads in a regex substitution.
    First, note the extra tests I added to make sure it doesn't modify part of the path or a single-dot filename. Also note that the second regex assumes that the directory separator is /.
    while(<DATA>){
        # s/\.(?=.*?\.)/_/g;          # can use this if you know you have only the filename
        s#\.(?!.*?/)(?=.*?\.)#_#g;    # This one accounts for paths, and modifies only the filename
        print;
    }
    __DATA__
    test.zip
    blah.blah.ext/stuff.bar/test0.file0.new_20060411.zip
    test1.file1.new_20060411.zip
    test2.file2.new_20060411.zip
    test3.file3.new_20060411.zip
    So, as a one-liner to generate the mv commands to paste into a shell:
    ls | perl -ne 'chomp; $f=$_; s#\.(?!.*?/)(?=.*?\.)#_#g; print "mv $f $_\n";'
    Update: See ikegami's warning about '. ext'
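    Instead of pasting generated mv commands into a shell, the rename could also be done from Perl itself, which sidesteps shell quoting for names that contain spaces. The following is only a sketch under assumptions: the helper name is hypothetical, the directory separator is assumed to be /, and the demo runs in a throwaway directory created with File::Temp so no real files are touched.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper wrapping the path-aware substitution shown above.
sub underscored_path {
    my ($path) = @_;
    (my $new = $path) =~ s#\.(?!.*?/)(?=.*?\.)#_#g;   # dots in directory parts are skipped
    return $new;
}

# Create an empty sample file in a temporary directory (removed at exit).
my $dir = tempdir(CLEANUP => 1);
my $old = "$dir/test0.file0.new_20060411.zip";
open my $fh, '>', $old or die "Cannot create $old: $!";
close $fh;

my $new = underscored_path($old);
if ($new ne $old && !-e $new) {      # skip no-ops and never overwrite an existing file
    rename $old, $new or die "Cannot rename $old to $new: $!";
}
print "Renamed to $new\n";
```

    The check for an existing target is a design choice added here, since two different names can map to the same result once their dots become underscores.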
Re: AWTDI: Renaming files using regexp
by rinceWind (Monsignor) on Apr 11, 2006 at 14:53 UTC

    For more complicated problems, including traversing whole directory trees, you might be interested in File::Wildcard, which can construct the destination filename for you out of regexp captures, $1, $2, etc.

    --

    Oh Lord, won’t you burn me a Knoppix CD ?
    My friends all rate Windows, I must disagree.
    Your powers of persuasion will set them all free,
    So oh Lord, won’t you burn me a Knoppix CD ?
    (Misquoting Janis Joplin)

Re: AWTDI: Renaming files using regexp
by Not_a_Number (Prior) on Apr 11, 2006 at 15:43 UTC

    ...and in the following line you escaped your dot unnecessarily:

    my $str = '1.2.3.4.5';
    $str =~ tr/./_/;    # No need to escape '.'
    print $str;
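    The reason is that tr/// takes a list of literal characters, not a regular expression, whereas in s/// an unescaped dot is a wildcard. A short sketch of the difference (the variable names are chosen here for illustration):

```perl
use strict;
use warnings;

my $str = '1.2.3.4.5';

(my $with_tr   = $str) =~ tr/./_/;    # tr///: the dot is a literal character
(my $with_s    = $str) =~ s/\./_/g;   # s///: the dot must be escaped to be literal
(my $unescaped = $str) =~ s/./_/g;    # s/// with a bare dot matches every character

print "$with_tr\n$with_s\n$unescaped\n";
```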
Re: AWTDI: Renaming files using regexp
by ikegami (Patriarch) on Apr 11, 2006 at 15:34 UTC

    You forgot to escape your dot. It will match any character, not just a dot.

    Furthermore, some files have dots, but no extensions. For example, consider "Foo. Bar". ". Bar" is not an extension because it has a space in it. You can verify this by checking the properties of "Foo. Bar" and "Foo.Bar". For the former, the file type is "File", while for the latter, the file type is "BAR File". You (and everyone in this thread) mistakenly identify ". Bar" as an extension and don't convert "Foo. Bar" to "Foo_ Bar".

    The correct usage would be:
    fileparse($_, qr/\.[^. ]*/)
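    A sketch of how that corrected pattern could slot into the original approach. The subroutine name is hypothetical, and it assumes bare file names, since any directory part returned by fileparse is discarded here.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: split off the extension with the stricter pattern,
# then convert the remaining dots in the base name.
sub rename_keeping_extension {
    my ($name) = @_;
    # fileparse returns (base name, directory, matched suffix)
    my ($base, $dir, $ext) = fileparse($name, qr/\.[^. ]*/);
    $base =~ tr/./_/;
    return $base . $ext;
}

for my $name ('test0.file0.new_20060411.zip', 'Foo.Bar', 'Foo. Bar', 'README') {
    print "Change: $name to ", rename_keeping_extension($name), "\n";
}
```

    With the escaped dot, a name such as README no longer has its whole base name swallowed as a suffix, and ". Bar" is not treated as an extension because of the space.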

      Did you look (I mean actually look) at his sample data?

      If you are coding a generic tool for dealing with filenames of unknown formats, then such criticisms are valid.

      But if the guy knows what his data looks like; he has a working solution for his problem, as opposed to the arbitrarily extended problem you have imposed upon his question; and he states:

      The question is more academic than anything since I have an answer, just looking to expand my knowledge some :-)

      then your admonishment of the OP (for his working solution) and of everyone in this thread is ... ...!

      I'll let you fill in the blanks



        One of the ideas of this site is to instruct not only the OP, but also the people who might be searching through the site at a later time. The next person to use this site may have different data.

        And yes, I did expand knowledge (and started a discussion which will expand it further).

      Can you point to a definitive definition of "filename extension" that says that spaces aren't allowed? Are there any other characters that are not valid in the "extension"?

      You suggest a method of verification that assumes a particular operating system, I think, as the OS that I typically use says "unknown" for the file type of both files named Foo.Bar and Foo. Bar.

      When someone talks about "filename extensions", are they implicitly referring to a particular operating system? I don't think so as the term, while it originated on systems that were hobbled by their choice of filesystem implementation, is used today when referring to filenames that were constructed on any operating system/file system.

        I admit, I was using the traditional definition, which doesn't apply to *ix. So let's discuss this modern definition, one where file extensions can be specific to any of file systems, operating systems and applications.

        Applications need a way of knowing whether an extension was supplied. For example, an application needs to know that to decide whether it should add the default extension. I can think of two ways:

        • Check a database of registered extensions.
        • Check if the filename looks like it has an extension.

        The first solution has two problems: 1) It's slow, and 2) it requires that all extensions be registered.

        The second solution might be wrong on rare occasions, but it doesn't have the problems of the first solution. The catch is that it must restrict the characters that can be present in extensions. Most file names with non-extension dots have spaces following their dots, so the simplest restriction is to forbid spaces in extensions. This prevents applications from thinking file "This is private. Don't read" has an extension.

        Therefore, my recommendation is to disallow spaces in extensions, even if the underlying OS or file system doesn't impose such a restriction.
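        The second approach could be sketched roughly as follows. Both subroutine names are hypothetical, and requiring at least one character after the final dot is an extra assumption made here, not something stated above.

```perl
use strict;
use warnings;

# Hypothetical heuristic: the name "looks like" it has an extension when the
# final dot is followed by one or more characters that are neither dots nor spaces.
sub has_extension {
    my ($name) = @_;
    return $name =~ /\.[^. ]+\z/ ? 1 : 0;
}

# Hypothetical use: append a default extension only when none was supplied.
sub with_default_extension {
    my ($name, $default) = @_;
    return has_extension($name) ? $name : "$name.$default";
}

for my $name ('report.txt', 'README', "This is private. Don't read") {
    print with_default_extension($name, 'txt'), "\n";
}
```

        No lookup table is needed, at the cost of occasionally misjudging a name whose last dot is followed by a single word.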

      That is a good point and I'll take it into consideration. However, I will admit that the fileparse line was copied straight out of the Camel book. And yes, I know, I should not copy things without fully understanding them :-)
