http://qs321.pair.com?node_id=154409

vivekvp has asked for the wisdom of the Perl Monks concerning the following question:

Help! I am being accused of loading data with control characters! I need a program that will check a file for any control characters, and one that will strip them. Any help?

He who laughs last, doesn't get the joke.

Re: VVP:Character code Accusation!
by Corion (Patriarch) on Mar 26, 2002 at 15:47 UTC

    Depending on what you consider a "control character", the snippets below may or may not do what you need:

    perl -pe "s/[\000-\037]//g"

    This deletes every character with a character code below 32 (space) from the input. Note that tab, carriage return and linefeed fall in this range too, so the line breaks are removed as well.

    And here's the checker to see if your file contains bad characters.

    perl -ne "die qq(Bad character on line $.\n) if /[\000-\037]/"

    Because linefeed is itself in that range, this check fires on any line that ends in a newline. If you want to exempt certain characters (for example CR and LF), you must split the range into subranges around the characters you want to keep.

    Update: Here are the REs modified to allow CR and LF in your file:

    perl -pe "s/[\000-\011\013\014\016-\037]//g"
    perl -ne "die qq(Bad character on line $.\n) if /[\000-\011\013\014\016-\037]/"

    Note that the "wrong" characters may also be an artifact of how the file was transferred: for example, sending it via ftp in ASCII mode instead of binary mode (or binary mode instead of ASCII mode), or moving it from an EBCDIC system to an ASCII system (or vice versa) without the proper conversion.

    perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
    $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
    ($c = $d->accept())->get_request(); $c->send_response( new #in the
    HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
Re: VVP:Character code Accusation!
by ChOas (Curate) on Mar 26, 2002 at 15:56 UTC
    <HUMOR>
    Since you said you needed 2 programs: 1 to check the file
    for control characters, AND one for stripping them, I would
    advise the diff command for the first one.

    1st run one of the programs already mentioned by my fellow
    monks to strip the characters.

    2nd: run the 'diff' program on the 2 files (your original,
    and the file resulting from the stripping of the characters).

    If there really were control characters in
    your file, diff will have output...

    </HUMOR>
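Joke or not, the strip-then-compare idea can be sketched in a few lines. Here cmp -s stands in for diff, since only the exit status matters; the file names and sample data are made up, and the stripping regex is the CR/LF-exempting one from Corion's reply.

```shell
# Hypothetical original file with one BEL (\007) on line 2.
orig=$(mktemp)
stripped=$(mktemp)
printf 'ok\nbad\007byte\n' > "$orig"

# Step 1: write a stripped copy of the file.
perl -pe 's/[\000-\011\013\014\016-\037]//g' "$orig" > "$stripped"

# Step 2: if the copies differ, the original held control characters.
if cmp -s "$orig" "$stripped"; then verdict=clean; else verdict=dirty; fi
echo "$verdict"

rm -f "$orig" "$stripped"
```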

    GreetZ!,
      ChOas

    print "profeth still\n" if /bird|devil/;
Re: VVP:Character code Accusation!
by Fletch (Bishop) on Mar 26, 2002 at 15:39 UTC
    man od
    man tr

    perldoc perlop, search for tr///
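For what it is worth, a minimal sketch of how od and tr could be put to work on this. The sample file is invented, and keeping tab, LF and CR is an assumption about what counts as acceptable; adjust the ranges to taste.

```shell
# Hypothetical sample: a tab, a BEL (\007) and a CR LF line ending.
sample=$(mktemp)
printf 'tab\there\nbell\007here\r\n' > "$sample"

# od -c dumps the file byte by byte, so control characters become visible.
od -c "$sample"

# tr -d deletes every byte in the listed octal ranges.
# \011 (tab), \012 (LF) and \015 (CR) are deliberately left out of the set.
cleaned=$(LC_ALL=C tr -d '\000-\010\013\014\016-\037' < "$sample")
printf '%s\n' "$cleaned" | od -c

rm -f "$sample"
```

The Perl equivalent of the tr step would be tr/\000-\010\013\014\016-\037//d, as described in perlop.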