PerlMonks  

Detecting ASCII Characters embedded within text string

by onegative (Scribe)
on Jan 28, 2011 at 16:25 UTC ( [id://884871]=perlquestion )

onegative has asked for the wisdom of the Perl Monks concerning the following question:

Good day Monks,

I have a problem and am not sure how to handle it. It seems I am finding ASCII characters embedded within text strings, most likely introduced during cut/paste into application fields during configuration. They are unseen within the application fields, typically show up as a space, and are not easily identified visually.

The problem is that the data, once retrieved from the configurations, is passed into my code, and when I build the XML (even with CDATA) it craters the XML parsers that later process the file.

My question is: how would I globally detect that an ASCII character is embedded in the string, and either remove or convert it accordingly? Not knowing what may be there is the challenge for me, as is how to address it through some function that eliminates or converts it.

Any ideas or best practice would be GREATLY appreciated.

Thanks,
Danny

Re: Detecting ASCII Characters embedded within text string
by Anonyrnous Monk (Hermit) on Jan 28, 2011 at 16:32 UTC

    You probably don't really mean ASCII when you say ASCII, but rather "control characters", or some such.

    As for replacing certain characters, or ranges of characters, see tr or s (update: here are perhaps more useful links, since the rather large section "Quote and Quote-like Operators" that the respective perlfunc entries refer you to might be somewhat distracting for the uninitiated: tr, s).
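To make the tr and s suggestion concrete, here is a minimal sketch. The sample string and the variable names `$deleted` and `$marked` are illustrative assumptions, and the printable range `\x20-\x7e` is the same one used for detection in this reply.

```perl
use strict;
use warnings;

# Hypothetical sample input with two embedded control characters.
my $s = "foo \x03 bar \x05 baz";

# tr with the c (complement) and d (delete) modifiers removes every
# character that is NOT in the printable ASCII range \x20-\x7e.
(my $deleted = $s) =~ tr/\x20-\x7e//cd;

# s///ge can instead replace each such character with a visible marker,
# which helps when you want to see what was there before deciding.
(my $marked = $s) =~ s/([^\x20-\x7e])/sprintf('<0x%02x>', ord $1)/ge;

print "$deleted\n";
print "$marked\n";
```

Note that this range also treats tabs, newlines, and any non-ASCII characters (accented letters, for instance) as unwanted, so the set would need widening if the data legitimately contains them.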

    For detecting them, maybe something like this (the set [^\x20-\x7e] denotes characters not in the range hex 20-7e, decimal 32-126):

    my $s = "foo \x03 bar \x05 baz";

    printf "detected strange char: 0x%x\n", ord for $s =~ /[^\x20-\x7e]/g;

    __END__
    detected strange char: 0x3
    detected strange char: 0x5
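Since the original problem is XML parsers choking on the data, a narrower filter could target only the characters that XML 1.0 does not permit, rather than everything outside printable ASCII. The following is a sketch under assumptions: `xml_clean` is a hypothetical helper name, the input is assumed to be an already decoded character string, and the allowed set follows the Char production of the XML 1.0 specification.

```perl
use strict;
use warnings;

# Hypothetical helper: strip characters that XML 1.0 does not allow.
# Allowed: tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
sub xml_clean {
    my ($text) = @_;
    $text =~ s/[^\x09\x0A\x0D\x20-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]//g;
    return $text;
}

# Illustrative input: a vertical tab (0x0B) and a NUL (0x00) are not
# valid in XML 1.0, while the trailing newline is.
my $raw   = "name:\x0B value\x00 line\n";
my $clean = xml_clean($raw);

print $clean;
```

Unlike the printable-ASCII filter, this keeps tabs, line breaks, and non-ASCII text intact and removes only what would make the document ill-formed.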
