| [reply] |
I cannot reproduce your problem. However, please check whether running without a locale set helps. Please also study perlunicode; someone else may have a better answer.
Works for me:
$ perl -le 'print "hello\253there"' | perl -pe 'tr/\253//d'
hellothere
| [reply] [d/l] [select] |
$_="Canciones\\251STAMPID\\253\\277De quien es la cancion \"STAND BY ME\"*4 the cause";
s/\\[0-9][0-9][0-9]//g;
print;
| [reply] [d/l] |
Thanks for the help!
Actually, the above doesn't work, because '\251' (and each of the other similarly structured codes) was being interpreted by Perl as a single character. Which is weird.
However, it turns out I've found a solution. The problem was that the file was not, in fact, encoded in UTF-8, but in Western (ISO-8859-1).
I used xemacs to translate the page into UTF-8, and my problems more or less disappeared -- well, perl finally, grudgingly decided to recognize all the odd characters and I was able to get some useful work done!
Thanks again for the help!
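For anyone who would rather do the conversion in Perl than in an editor, here is a minimal sketch using the core Encode module. It assumes the input really is ISO-8859-1, as turned out to be the case here; the helper name `latin1_to_utf8` is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: take raw ISO-8859-1 bytes and return UTF-8 bytes.
sub latin1_to_utf8 {
    my ($bytes) = @_;
    my $chars = decode('ISO-8859-1', $bytes);   # bytes -> Perl characters
    return encode('UTF-8', $chars);             # characters -> UTF-8 bytes
}

# Part of the sample line, with the octal codes written as real bytes.
my $line = "Canciones\251STAMPID\253\277De quien es la cancion";
my $utf8 = latin1_to_utf8($line);
printf "%d bytes in, %d bytes out\n", length($line), length($utf8);
```

Each of the three high bytes becomes a two-byte UTF-8 sequence, so the output is slightly longer than the input.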
| [reply] |
Could it be that my text document is DOS formatted? Perl does not seem to be recognizing the UTF codes at all. I cannot do anything to access them, and when I try to manipulate the line, most of the time I get a message like this:
Malformed UTF-8 character (overflow at 0xa0c75a60, byte 0x70, after start byte 0xbf) in uc at ./qNa.pl line 15, <IN> line 25.
joe | [reply] |
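A "Malformed UTF-8 character" warning like the one above is what you would expect if the data is being treated as UTF-8 but is actually in some other single-byte encoding. One way to check is to attempt a strict decode and see whether it fails. The following is only a sketch; `looks_like_utf8` is a hypothetical helper name, and it relies on the core Encode module's `FB_CROAK` check.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: true if $bytes decodes cleanly as strict UTF-8.
sub looks_like_utf8 {
    my ($bytes) = @_;    # work on a copy; decode may modify its argument
    return eval { decode('UTF-8', $bytes, FB_CROAK); 1 } ? 1 : 0;
}

# Bytes 0xAB 0xBF on their own are not a valid UTF-8 sequence.
print looks_like_utf8("STAMPID\253\277De") ? "utf8\n" : "not utf8\n";

# 0xC3 0xA9 is a valid two-byte UTF-8 sequence.
print looks_like_utf8("caf\xC3\xA9") ? "utf8\n" : "not utf8\n";
```

If the check fails on your file, decoding it as ISO-8859-1 (or whatever it really is) before doing anything else should make the warnings go away.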
$a=chr(0x74);
print utf8::is_utf8($a)?"yes":"no";
$a=chr(0x470);
print utf8::is_utf8($a)?"yes":"no";
The output is "noyes". If you can provide some more code, I can give you a more specific answer. | [reply] [d/l] |
Thanks! I also tried converting the file with dos2unix, but that did nothing to solve my problem :-(
| [reply] |
In the example line I gave before:
Canciones\251STAMPID\253\277De quien es la cancion "STAND BY ME"*4 the cause
I would simply like to delete the '\253' from the line (there is more that I will eventually want to do, but if I could complete this simple action, the rest ought to be a piece of cake). My first attempt at this was:
$_ = s/\\253//g;
This failed miserably. The problem is that the '\253' is being treated as a single character (i.e., if I try to highlight just one digit, it highlights the entire four-character string).
I'm trying to write a C++ program to convert the codes to ASCII.
| [reply] |
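Two things are probably going wrong in that first attempt. First, `s///` returns the number of substitutions made, so `$_ = s/...//g` overwrites `$_` with that count rather than leaving the edited text in it. Second, `\\253` in a pattern matches a literal backslash followed by the digits 2, 5, 3, whereas the editor behaviour described above suggests the file holds a single character with octal code 253. A minimal sketch of the difference, assuming that is the case (`strip_253` is a made-up name):

```perl
use strict;
use warnings;

# Hypothetical helper: delete the single character with octal code 253 (0xAB).
sub strip_253 {
    my ($text) = @_;
    $text =~ s/\253//g;    # \253 is an octal escape: one character, not four
    return $text;
}

my $line = "STAMPID\253\277De quien";

# s/// returns how many substitutions it made; the edited text stays in $copy.
my $copy  = $line;
my $count = ($copy =~ s/\253//g);
print "replaced $count character(s)\n";

# A pattern with a doubled backslash looks for four literal characters instead.
my $literal = 'STAMPID\253De';    # single quotes: backslash, 2, 5, 3
(my $cleaned = $literal) =~ s/\\253//g;
print "$cleaned\n";
```

Which of the two patterns you need depends on what is actually in the file, so it is worth dumping a line with something like `od -c` first.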