PerlMonks  

Re: character substitution in search

by lhoward (Vicar)
on Jun 19, 2000 at 19:45 UTC ( [id://18814]=note )


in reply to character substitution in search

I have a nasty-looking translation statement that "de-accents" Latin-1 characters. I wrote it for indexing and searching web pages in an accent-free way. The important thing to remember is that you must de-accent both the search term and the text being searched. Also note that this code only works for the Latin-1 character set.

Here's my tr statement (I'm not using the code tag on purpose, because otherwise the line would be too long. I've also inserted spaces to help the wrapping; remove them if you use this statement):

tr/\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD \xCE\xCF\xD0\xD1\xD2\xD3\xD4\xD5\xD6\xD8\xD9\xDA\xDB\xDC \xDD\xDF\xE0\xE1\xE2\xE3\xE4\xE5\xE6\xE7\xE8\xE9\xEA\xEB \xEC\xED\xEE\xEF\xF1\xF2\xF3\xF4\xF5\xF6\xF8\xF9\xFA\xFB \xFC\xFD\xFF/\x41\x41\x41\x41\x41\x41\x41\x43\x45\x45\x45 \x45\x49\x49\x49\x49\x44\x4E\x4F\x4F\x4F\x4F\x4F\x4F\x55 \x55\x55\x55\x59\x73\x61\x61\x61\x61\x61\x61\x61\x63\x65 \x65\x65\x65\x69\x69\x69\x69\x6E\x6F\x6F\x6F\x6F\x6F\x6F \x75\x75\x75\x75\x79\x79/;
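For anyone who wants something to paste, here is a rough sketch of how the statement above might be wrapped up and applied to both sides of a search. The sub name deaccent is made up for illustration, and I've condensed the search list into ranges and written the replacement list as literal letters; the mapping is meant to be the same 59 characters as the statement above. It assumes the input is Latin-1 text, not raw UTF-8 bytes.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is not from the original statement):
# returns a copy of the text with accented Latin-1 characters folded
# to unaccented ASCII letters.
sub deaccent {
    my ($text) = @_;
    # Both lists hold 59 characters and must line up one-for-one.
    # \xD7, \xDE, \xF0, \xF7 and \xFE are deliberately left alone,
    # exactly as in the long-form statement.
    $text =~ tr/\xC0-\xD6\xD8-\xDD\xDF\xE0-\xEF\xF1-\xF6\xF8-\xFD\xFF/AAAAAAACEEEEIIIIDNOOOOOOUUUUYsaaaaaaaceeeeiiiinoooooouuuuyy/;
    return $text;
}

# De-accent BOTH the text being searched and the search term.
my $page = "Un caf\xE9 cr\xE8me, s'il vous pla\xEEt";
my $term = "creme";
print "match\n" if index(deaccent($page), deaccent($term)) >= 0;
```

Since tr only changes accents and not case, you would still need lc (or a case-insensitive match) on both sides if you want the search to ignore case as well.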
