Re^3: Removing Foreign Characters

by g0n (Priest)
on Jan 27, 2005 at 17:17 UTC


in reply to Re^2: Removing Foreign Characters
in thread Removing Foreign Characters

The other solutions address character encoding issues: different byte sequences can represent the same character.

For example, e acute is one byte sequence in Latin-1 and a different byte sequence in UTF-8.
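To make that concrete, here is a small Python sketch (Python is used purely for illustration; the variable names are made up) showing the two byte sequences for e acute:

```python
# The same character, e acute (U+00E9), encoded two ways.
e_acute = "\u00e9"

latin1_bytes = e_acute.encode("latin-1")  # one byte
utf8_bytes = e_acute.encode("utf-8")      # two bytes

print(latin1_bytes)  # b'\xe9'
print(utf8_bytes)    # b'\xc3\xa9'

# Decoding each sequence with its own encoding gives back the same character.
same_character = latin1_bytes.decode("latin-1") == utf8_bytes.decode("utf-8")
print(same_character)  # True
```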

The problem with what you are trying to do is that it is not a translation between different representations of the same character (which is what people immediately think of). You want to translate one character (e acute) into a totally different one (e with no acute).
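One common way to do that kind of character-to-character translation, sketched here in Python as an illustration only, is to decompose each character into a base letter plus combining marks and then drop the marks. The function name `strip_accents` is hypothetical, and this approach assumes the text is already decoded to Unicode; it will not touch letters that have no decomposition, such as ø or ß.

```python
import unicodedata


def strip_accents(text):
    """Return text with combining accent marks removed."""
    # NFD splits "é" into "e" followed by U+0301 (combining acute accent).
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("caf\u00e9"))  # cafe
```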

I have some code to do this, but sadly not with me. I could post or mail it at the weekend.

c.

VGhpcyBtZXNzYWdlIGludGVudGlvbmFsbHkgcG9pbnRsZXNz

Replies are listed 'Best First'.
Re^4: Removing Foreign Characters
by existem (Sexton) on Jan 27, 2005 at 18:05 UTC

    Thanks for the help, guys. I think I've kind of hacked this one ;) Here's what I've done.

    This is actually PHP, I did it on the front end, rather than at the point of loading into the database.

    $trans = array(
        "À" => "À", "à" => "à", "Á" => "Á", "á" => "á", "Ã" => "Ã",
        "Ì" => "Ì", "ì" => "ì", "Í" => "Í", "í" => "í", "Î" => "Î", "î" => "î",
        "Ò" => "Ò", "ò" => "ò", "Ó" => "Ó", "ó" => "ó", "Ô" => "Ô", "ô" => "ô",
        "é" => "é", "è" => "è", "È" => "È",
        "Ù" => "Ù", "ù" => "ù", "Ú" => "Ú", "ú" => "ú", "Û" => "Û", "û" => "û",
        "¢" => "", "©" => ""
    );
    // Apply the table defined above to the name column.
    $name = strtr($row["name"], $trans);
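For comparison, the same table-driven substitution can be sketched in Python with `str.translate`. The mapping below assumes the intent is to replace each accented letter with its unaccented base letter and to drop ¢ and © entirely; the target values, the shortened table, and the name `unaccent` are illustrative assumptions, not taken from the post.

```python
# Map each accented letter to an assumed unaccented equivalent.
# A value of None deletes the character.
ACCENT_TABLE = str.maketrans({
    "À": "A", "à": "a", "Á": "A", "á": "a", "Ã": "A",
    "È": "E", "è": "e", "é": "e",
    "Ì": "I", "ì": "i", "Í": "I", "í": "i", "Î": "I", "î": "i",
    "Ò": "O", "ò": "o", "Ó": "O", "ó": "o", "Ô": "O", "ô": "o",
    "Ù": "U", "ù": "u", "Ú": "U", "ú": "u", "Û": "U", "û": "u",
    "¢": None, "©": None,
})


def unaccent(name):
    """Apply the substitution table to a single string."""
    return name.translate(ACCENT_TABLE)


print(unaccent("José"))  # Jose
```

A fixed table like this only handles the characters it lists, so anything outside it passes through unchanged.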

    I'm off to the pub now to think about it a bit more ;)

Re^4: Removing Foreign Characters
by g0n (Priest) on Jan 28, 2005 at 10:43 UTC
    Quick poll of opinion:

    I've had a requirement to do this a couple of times, and evidently existem has now too. Is it worth modularising this for different encoding schemes as Text::StripAccent or some such name?
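As a rough idea of what such an interface might look like, here is a Python sketch that takes bytes plus an encoding name, strips the accents, and returns bytes in the same encoding. Everything here, including the name `strip_accents_bytes` and its signature, is hypothetical; it is not the proposed Text::StripAccent module or anyone's existing code.

```python
import unicodedata


def strip_accents_bytes(data, encoding="utf-8"):
    """Decode data, remove combining accent marks, re-encode."""
    text = data.decode(encoding)
    # Decompose so each accent becomes a separate combining character.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return stripped.encode(encoding)


# The same word in two encodings gives the same unaccented result.
print(strip_accents_bytes(b"caf\xe9", "latin-1"))    # b'cafe'
print(strip_accents_bytes(b"caf\xc3\xa9", "utf-8"))  # b'cafe'
```

Decoding first means the accent-stripping logic is written once, and only the decode and encode steps depend on the encoding scheme.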

    VGhpcyBtZXNzYWdlIGludGVudGlvbmFsbHkgcG9pbnRsZXNz
