PerlMonks  

Re^2: strip html tags and special characters in perl while inserting the text in to database.

by valavanp (Curate)
on Jun 10, 2007 at 17:12 UTC


in reply to Re: strip html tags and special characters in perl while inserting the text in to database.
in thread strip html tags and special characters in perl while inserting the text in to database.

Hi monks, first of all, thanks for the suggestions. I tried the following code, based on the links you provided:
use strict;
use warnings;
use Encode qw( _utf8_on );

my $resume = "”";
print $resume, "\n";
_utf8_on($resume);
print $resume, "\n";
When I run the above code, both print statements produce the same output. I want the corresponding special character for the $resume variable. Please correct me if I have made a mistake in the code above. Thanks.

Re^3: strip html tags and special characters in perl while inserting the text in to database.
by graff (Chancellor) on Jun 13, 2007 at 04:08 UTC
    I want the corresponding special character for the $resume variable.

    I don't understand what that means. Can you explain more carefully what you really want? Also, can you please try to be more clear about what is being assigned as the value of $resume?

    It looks as though you are actually assigning a three-byte value: "\xE2\x80\x9D". These bytes happen to be the UTF-8 encoding of the Unicode character U+201D, "RIGHT DOUBLE QUOTATION MARK". Do you want to replace this with the ASCII double-quote character?

    my $resume = "\x{201D}";
    print "$resume\n";
    $resume =~ s/\x{201d}/"/g;
    print "$resume\n";
    (updated to make sure the s/// applies to the value of $resume)
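
    To make the bytes-versus-characters point concrete, here is a minimal sketch. It assumes the three bytes in the parent post really are UTF-8 (i.e. the script file was saved as UTF-8), and it uses Encode's decode() rather than _utf8_on(), which the Encode documentation describes as an internal function. Since _utf8_on() only flips a flag on the same underlying bytes, that may be why both print statements in the parent post looked identical.

```perl
use strict;
use warnings;
use Encode qw( decode );

# The three bytes from the parent post (assumption: the source was UTF-8).
my $bytes = "\xE2\x80\x9D";

# decode() turns the UTF-8 byte sequence into a one-character Perl string.
my $resume = decode( 'UTF-8', $bytes );

printf "bytes: %d, characters: %d, code point: U+%04X\n",
    length($bytes), length($resume), ord($resume);

# On the decoded string, the substitution matches the single character.
$resume =~ s/\x{201D}/"/g;
print "$resume\n";
```

    Once the value is a decoded character string, the s/// shown earlier has a single U+201D character to match instead of three separate bytes.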

    To do that sort of replacement in a "general" sense (i.e. replace all "wide-character" versions of punctuation marks with ASCII versions of same wherever possible), you probably want Text::Unidecode:

    #!/usr/bin/perl
    use strict;
    use Text::Unidecode;

    my $resume = "\x{201d}";
    print unidecode( $resume ), "\n";
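
    If installing a CPAN module is not an option, a hand-rolled table can cover the handful of punctuation marks that tend to show up in pasted text. This is only a sketch of that idea, not what Text::Unidecode does internally; the %ascii_for table and the asciify_punctuation() helper are hypothetical names made up for illustration, and the table is far from complete.

```perl
use strict;
use warnings;

# Hypothetical lookup table: a few typographic marks and ASCII stand-ins.
my %ascii_for = (
    "\x{2018}" => "'",     # LEFT SINGLE QUOTATION MARK
    "\x{2019}" => "'",     # RIGHT SINGLE QUOTATION MARK
    "\x{201C}" => '"',     # LEFT DOUBLE QUOTATION MARK
    "\x{201D}" => '"',     # RIGHT DOUBLE QUOTATION MARK
    "\x{2013}" => '-',     # EN DASH
    "\x{2014}" => '--',    # EM DASH
);

# Build a character class from the table keys.
my $wide = join '', keys %ascii_for;

# Replace every character found in the table with its ASCII stand-in.
sub asciify_punctuation {
    my ($text) = @_;
    $text =~ s/([$wide])/$ascii_for{$1}/g;
    return $text;
}

my $resume = "\x{201C}Perl\x{201D} \x{2013} it\x{2019}s fine";
print asciify_punctuation($resume), "\n";
```

    This expects a decoded character string as input. Anything not listed in the table passes through unchanged, which is the main reason to prefer Text::Unidecode for the general case.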
