http://qs321.pair.com?node_id=425633


in reply to Re^2: Removing Foreign Characters
in thread Removing Foreign Characters

The other solutions address character encoding issues, where different byte sequences can represent the same character.

For example, e acute is one byte sequence in latin1 (0xE9) and a different byte sequence in UTF-8 (0xC3 0xA9).
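To see the difference for yourself, here is a minimal sketch that encodes the same character both ways and prints the resulting bytes as hex. It assumes the core Encode module; the variable names are just for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+00E9 is LATIN SMALL LETTER E WITH ACUTE.
my $e_acute = "\x{e9}";

# Encode the same character in two encodings and show the bytes as hex.
my $latin1_hex = unpack 'H*', encode('ISO-8859-1', $e_acute);
my $utf8_hex   = unpack 'H*', encode('UTF-8', $e_acute);

print "latin1: $latin1_hex\n";   # e9
print "UTF-8:  $utf8_hex\n";     # c3a9
```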

The problem with what you are trying to do is that it is not a translation between different representations of the same character (which is what people immediately think of). You want to translate one character (e acute) into a completely different one (e without the acute).
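One common way to approach that kind of translation, as a rough sketch only: decompose each character into its base letter plus combining marks, then throw the marks away. This assumes the core Unicode::Normalize module and input that is already a decoded character string; strip_accents is a made-up name.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Hypothetical helper: decompose, then drop the combining marks.
sub strip_accents {
    my ($text) = @_;
    my $decomposed = NFD($text);     # e acute becomes "e" + U+0301
    $decomposed =~ s/\p{Mn}+//g;     # remove nonspacing (combining) marks
    return $decomposed;
}

my $plain = strip_accents("caf\x{e9}");
print "$plain\n";   # cafe
```

Note that this only helps for characters that actually decompose. Letters such as o-slash or the German sharp s have no canonical decomposition, so they pass through unchanged and would need an explicit mapping.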

I have some code to do this, but sadly not with me. I could post or mail it at the weekend.

c.

VGhpcyBtZXNzYWdlIGludGVudGlvbmFsbHkgcG9pbnRsZXNz