http://qs321.pair.com?node_id=338447

TriggerDunpoe has asked for the wisdom of the Perl Monks concerning the following question:

I have
$a = "dfowusgfiwu42394353****hsdfsd";
print "$a\n";
$a =~ s/!([a-zA-Z0-9])//g;
print "$a\n";
What's not right? It doesn't do anything to the string. I know it looks useless, but it's a test for something else. What I want is to delete any and all characters from a string that don't fit into [a-zA-Z0-9]... unless things like "$(@$&%*" are in there...

Replies are listed 'Best First'.
Re: leaving only [a-zA-Z0-9]
by dvergin (Monsignor) on Mar 21, 2004 at 14:44 UTC
    The other responses here give good solutions. But here's what went wrong with your approach.

    You said /!([a-zA-Z0-9])/ in the regex part of your substitution, by which you meant, "match anything that is not a letter or number". But what you wrote actually means "match an exclamation point followed by any letter or number". Clearly not what you wanted.

    The closest way to say what you wanted is /([^a-zA-Z0-9])/. A caret as the first character in a character class negates the class, which is what you wanted the exclamation point to do. The parentheses do nothing for you here, so you can leave them out.
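
    A minimal sketch of the difference, using the string from the question. The variable names ($str, $unchanged, $cleaned) are only for illustration and do not come from the original post:

    ```perl
    use strict;
    use warnings;

    my $str = 'dfowusgfiwu42394353****hsdfsd';

    # Original attempt: matches a literal "!" followed by one alphanumeric.
    # The string contains no "!", so nothing is removed.
    (my $unchanged = $str) =~ s/!([a-zA-Z0-9])//g;

    # Negated character class: matches any single character that is
    # not an ASCII letter or digit, and replaces it with nothing.
    (my $cleaned = $str) =~ s/[^a-zA-Z0-9]//g;

    print "$unchanged\n";   # dfowusgfiwu42394353****hsdfsd
    print "$cleaned\n";     # dfowusgfiwu42394353hsdfsd
    ```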

    The tr/// example was recommended to you because, for tasks like this, it is faster, conceptually cleaner, and more idiomatic.
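
    If you want to check the speed claim on your own machine, something like the following rough sketch with the core Benchmark module should do. The repeated test string and the labels "regex" and "tr" are assumptions made for the example, and the numbers will vary by Perl version and hardware:

    ```perl
    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    # A longer input so each call does a measurable amount of work.
    my $str = 'dfowusgfiwu42394353****hsdfsd' x 10;

    my %candidates = (
        # Each sub works on a fresh copy so every run sees the same input.
        regex => sub { (my $s = $str) =~ s/[^a-zA-Z0-9]//g; $s },
        tr    => sub { (my $s = $str) =~ tr/A-Za-z0-9//cd;  $s },
    );

    # Run each candidate for at least one CPU second and print a comparison table.
    cmpthese(-1, \%candidates);
    ```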

    ------------------------------------------------------------
    "Perl is a mess and that's good because the
    problem space is also a mess." - Larry Wall

Re: leaving only [a-zA-Z0-9]
by BrowserUk (Patriarch) on Mar 21, 2004 at 13:51 UTC

    See perlop "Quote-like operators".

    $a =~ tr[A-Za-z0-9][]cd;
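
    For readers unfamiliar with the modifiers, here is a small sketch of what the line above does, written with slashes instead of brackets as delimiters. The names $str and $removed are illustrative only:

    ```perl
    use strict;
    use warnings;

    my $str = 'dfowusgfiwu42394353****hsdfsd';

    # c: complement the search list, i.e. everything NOT in A-Za-z0-9
    # d: delete the matched characters, since the replacement list is empty
    # tr returns the number of characters it matched.
    my $removed = ($str =~ tr/A-Za-z0-9//cd);

    print "$str\n";       # dfowusgfiwu42394353hsdfsd
    print "$removed\n";   # 4
    ```

    Unlike s///, tr/// does not use regular expressions; it works on a fixed list of characters, which is why ranges are written without square brackets.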

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
Re: leaving only [a-zA-Z0-9]
by neniro (Priest) on Mar 21, 2004 at 13:58 UTC
    you can also use $a =~ s/\W//g;

    If you use locale, \w covers special characters too.

      you can also use $a =~ s/\W//g;

      make that   s/[\W_]//g;   to take away the   _   too.

      And be careful with   locale!   Under it, the substitution will not get rid of umlauts, accented word characters and such. It will still get rid of other special characters, though.
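
      A short sketch of both points. It uses a Unicode string rather than   use locale   (that is an assumption of this example, chosen so the result does not depend on the system's locale settings), and the /a modifier needs Perl 5.14 or later:

      ```perl
      use v5.14;     # turns on strict, unicode_strings and the /a modifier
      use warnings;
      use utf8;      # this source contains a UTF-8 literal
      binmode STDOUT, ':encoding(UTF-8)';

      my $str = 'Sören_42!';

      (my $w     = $str) =~ s/\W//g;       # Sören_42  (underscore is a word character)
      (my $wu    = $str) =~ s/[\W_]//g;    # Sören42   (ö still counts as a word character)
      (my $ascii = $str) =~ s/[\W_]//ag;   # Sren42    (/a limits \w to ASCII letters, digits, _)

      print "$_\n" for $w, $wu, $ascii;
      ```

      If the goal is strictly [a-zA-Z0-9], spelling the class out (or using tr) avoids the question altogether.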

      Sören

        "That will not get rid of Umlauts and word characters with accents and such." - those are the special chars I meant.
        Sorry, I should have been more detailed in my explanation.