http://qs321.pair.com?node_id=112685


in reply to char2oem

The solution above recommending code pages and Unicode sounds robust, and I would explore something similar. However, just for grins, here is a poor man's version that might work for you. I created a mapping between the DOS extended characters and the Windows (ANSI) extended characters and use that mapping to replace the appropriate characters. It's not overly robust and I'm not 100% sure it's correct (I built the mapping by hand from DOS and ANSI character charts). Your mileage may vary.

Update: Note that there is no one-to-one mapping between the DOS and ANSI code pages. I mostly mapped the accented letters and a few symbols (such as the cent sign). For most text (including foreign languages written in the Latin alphabet) this mapping should work fairly well. However, it's not a very robust solution, nor very pretty code, so if you need to do a lot of this I would recommend one of the other solutions suggested in this thread.
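
For comparison, here is a rough sketch of what the code-page route might look like with the core Encode module. I am assuming the Windows side is cp1252 and the DOS side is cp437 (the hand-built table in this post appears to line up with cp437 for the accented letters); if your data uses other code pages, the names would need to change.

```perl
use strict;
use warnings;
use Encode qw(from_to);

# Assumption: Windows text is cp1252 and DOS text is cp437
my $string = pack("C3", 233, 241, 252);   # e-acute, n-tilde, u-umlaut in cp1252

# from_to() converts the byte string in place
from_to($string, 'cp1252', 'cp437');
printf "%vd\n", $string;                  # prints 130.164.129

# and back again
from_to($string, 'cp437', 'cp1252');
printf "%vd\n", $string;                  # prints 233.241.252
```

Characters that have no counterpart in the target code page are replaced with a substitution character by default, so a round trip is only lossless for characters that exist in both code pages.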

Update #2: In case it's not clear, the hash %asc2dos maps each ANSI (i.e. Windows) character code to the equivalent DOS character code. I then reverse the hash so that %dos2asc contains the mapping from DOS back to Windows. As an aside, does anyone have any suggestions for a better or more idiomatic way to reverse the hash (i.e. use the keys as values and vice versa) than what I did below?
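
For what it's worth, one idiom I have seen for this is calling reverse on the hash in list context. A minimal sketch, using a made-up three-entry excerpt of the table rather than the full mapping:

```perl
use strict;
use warnings;

# Hypothetical three-entry excerpt of the ANSI-to-DOS table
my %asc2dos = (233 => 130, 241 => 164, 252 => 129);

# In list context a hash flattens to (key, value, key, value, ...);
# reverse turns that into (value, key, ...), which builds the inverse hash
my %dos2asc = reverse %asc2dos;

print "$_ => $dos2asc{$_}\n" for sort { $a <=> $b } keys %dos2asc;
# prints:
# 129 => 252
# 130 => 233
# 164 => 241
```

This only gives a faithful inverse when the values are unique; if two keys share a value, one of them is silently dropped.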
#!/usr/local/bin/perl
use strict;
use warnings;

# mapping of ANSI (Windows) character codes to DOS character codes
my %asc2dos = (
    131 => 159, 149 => 250, 150 => 196, 161 => 173, 162 => 155, 163 => 156,
    165 => 157, 170 => 166, 171 => 174, 172 => 170, 176 => 248, 177 => 241,
    178 => 253, 183 => 249, 186 => 167, 187 => 175, 188 => 172, 189 => 171,
    191 => 168, 196 => 142, 197 => 143, 198 => 146, 199 => 128, 201 => 144,
    209 => 165, 214 => 153, 220 => 154, 223 => 225, 224 => 133, 225 => 160,
    226 => 131, 228 => 132, 229 => 134, 230 => 145, 231 => 135, 232 => 138,
    233 => 130, 234 => 136, 235 => 137, 236 => 141, 237 => 161, 238 => 140,
    239 => 139, 241 => 164, 242 => 149, 243 => 162, 244 => 147, 246 => 148,
    247 => 246, 249 => 151, 250 => 163, 251 => 150, 252 => 129, 255 => 152,
);

# create the reverse mapping (DOS to ANSI)
my %dos2asc;
foreach my $key (sort keys %asc2dos) {
    $dos2asc{$asc2dos{$key}} = $key;
}

# here's a test:
# create a string with some accented characters
my $string = pack("C10",223,224,225,232,231,236,237,241,243,244);
print "ASCII string = $string\n";
$string = asc2dos($string);
print "DOS string = $string\n";
$string = dos2asc($string);
print "ASCII string = $string\n";

# convert ANSI extended characters to DOS extended characters
sub asc2dos {
    my $str = shift;
    foreach my $i (0..length($str)-1) {
        my $val = ord substr($str,$i,1);
        # replace only characters that have a mapping; leave the rest alone
        # (chr binds tighter than ||, so "chr $asc2dos{$val} || $val"
        # would turn every unmapped character into chr(0))
        substr($str,$i,1) = chr $asc2dos{$val} if exists $asc2dos{$val};
    }
    return $str;
}

# convert DOS extended characters to ANSI extended characters
sub dos2asc {
    my $str = shift;
    foreach my $i (0..length($str)-1) {
        my $val = ord substr($str,$i,1);
        # same approach as asc2dos, using the reversed table
        substr($str,$i,1) = chr $dos2asc{$val} if exists $dos2asc{$val};
    }
    return $str;
}
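
The per-character loops in asc2dos and dos2asc could probably also be written as a single substitution. This is only a sketch: convert() is a hypothetical helper name, and %map is a made-up three-entry excerpt standing in for the full table.

```perl
use strict;
use warnings;

# Hypothetical three-entry excerpt of the ANSI-to-DOS table
my %map = (233 => 130, 241 => 164, 252 => 129);

# convert() is an illustrative helper: it takes a byte string and a
# reference to a code-to-code table, and returns the converted string
sub convert {
    my ($str, $table) = @_;
    # Visit each high-bit byte; swap it if the table has an entry,
    # otherwise put the original byte back unchanged
    $str =~ s/([\x80-\xFF])/exists $table->{ord($1)} ? chr($table->{ord($1)}) : $1/ge;
    return $str;
}

my $out = convert("caf\xE9", \%map);
printf "%vd\n", $out;   # prints 99.97.102.130
```

Passing the table by reference means the same sub can serve both directions, for example convert($string, \%asc2dos) and convert($string, \%dos2asc).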