Hello, I'm looking for a Win-API-like function
to convert a string of ANSI characters into
OEM characters.
If your conversion program is running on Windows, you can
call the AnsiToOem Win32 API function (exported from user32.dll
as CharToOemA) from Perl via the Win32::API module. However, a
more portable (and perhaps even simpler, considering the
complexity of using Win32::API) solution is to write a function
to convert CP1252 (WinLatin1) to CP437 (DOSLatinUS). To do this,
download the text files from the CP1252 and CP437 links at
Czyborra and write a program like this:
my ($from_name, $to_name,
$from_map, $to_map);
$from_name = "cp1252.txt";
$to_name = "cp437.txt";
# Load a codepage file into a Perl data structure
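sub load_cp
{
    my ($filename) = @_;
    my (@map);
    open(my $cp, '<', $filename) or die "load_cp: $filename: $!";
    while (<$cp>)
    {
        # Skip comments and any other line that is not "=XX<tab>U+XXXX"
        my ($byte, $unicode) = m/^=(..)\tU[+](....)/ or next;
        # Index by codepage byte; the value is the Unicode code point
        $map[hex $byte] = hex($unicode);
    }
    close($cp);
    return \@map;
}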
# Map characters in a file from one codepage to another
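sub map_cp
{
    # Argument order matches the call below: map_cp($from_map, $to_map, $text)
    my ($from, $to, $text) = @_;
    my $new_text = '';
    my %to_byte;
    # Invert the "$to" table (byte => Unicode) so it can be looked up by code point
    foreach my $byte (0 .. $#$to)
    {
        $to_byte{$to->[$byte]} = $byte if defined $to->[$byte];
    }
    foreach my $char (split //, $text)
    {
        # First map the codepage "$from" byte to a Unicode code point
        my $unicode = $from->[ord $char];
        if (!defined $unicode)
        {
            warn sprintf("no Unicode mapping for source byte 0x%02X\n", ord $char);
            $new_text .= '?';
            next;
        }
        # Now map the Unicode code point to a codepage "$to" byte
        if (!defined $to_byte{$unicode})
        {
            warn sprintf("no target codepage byte for U+%04X\n", $unicode);
            $new_text .= '?';
            next;
        }
        $new_text .= chr $to_byte{$unicode};
    }
    return $new_text;
}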
# Load to and from codepages
$from_map = load_cp($from_name);
$to_map = load_cp( $to_name);
# Replace \x80 with your text
print map_cp($from_map, $to_map, "\x80");
Note that CP1252 has characters that CP437 doesn't have,
and vice versa. You may want to replace all your ANSI
character codes with UTF-8 Unicode characters. It makes
things much easier, and then you could use the cp437.txt
from Czyborra to generate text capable of being viewed
in a DOS environment with the CP437 codepage.
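For what it's worth, the "go through Unicode" idea can also be sketched with the Encode module that ships with Perl 5.8 and later, without downloading any mapping files. This is only a minimal sketch: the names ansi_to_oem and has_oem_equivalent are made up for illustration, and it assumes your Perl's Encode knows the cp1252 and cp437 encodings.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: convert CP1252 ("ANSI") bytes to CP437 ("OEM")
# bytes by decoding to Unicode characters and encoding again.
# Characters with no CP437 equivalent come out as Encode's
# substitution character.
sub ansi_to_oem {
    my ($bytes) = @_;
    my $chars = decode('cp1252', $bytes);
    return encode('cp437', $chars);
}

# Hypothetical helper: true if a single CP1252 byte value has a CP437
# equivalent. With FB_QUIET, encode() stops at the first unmappable
# character and returns only what it converted so far, so an empty
# result means "no equivalent". FB_QUIET may modify its input, hence
# the lexical copy in $char.
sub has_oem_equivalent {
    my ($byte) = @_;
    my $char = decode('cp1252', chr $byte);
    my $oem  = encode('cp437', $char, Encode::FB_QUIET);
    return length($oem) > 0;
}

# e-acute and n-tilde: CP1252 0xE9 0xF1 become CP437 0x82 0xA4
printf "%vX\n", ansi_to_oem("\xE9\xF1");

# The euro sign (CP1252 0x80) does not exist in CP437
print has_oem_equivalent(0x80) ? "yes" : "no", "\n";
```

Checking with has_oem_equivalent before converting lets you decide yourself what to do with characters such as the euro sign, rather than silently getting a substitute.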
The solution above recommending code pages and unicode sounds robust and I would explore using a similar solution. However, just for grins, here is a poor man's version that might work for you. I just created a mapping of the DOS extended characters to the Windows ASCII extended characters and use that mapping to replace the appropriate characters. It's not overly robust and I'm not 100% sure it's correct (since I did the mapping myself by looking at DOS and ASCII character charts). Your mileage may vary.
Update: Note that there is no one-to-one mapping between the DOS and ANSI code pages. I mostly just mapped the accented letters and a couple of symbols (such as "cents"). For most text (including foreign languages written in the Latin alphabet) this mapping should work fairly well. However, it's not a very robust solution and not very pretty code, so if you need to do a lot of this I would recommend one of the other solutions suggested in this thread.
Update #2: In case it's not clear, the hash %asc2dos maps the ANSI (i.e. Windows) character value to the equivalent DOS character value. I then reverse the hash so %dos2asc contains the mapping from DOS back to Windows. As an aside, does anyone have any suggestions for a better or more idiomatic way to reverse the hash (i.e. use the keys as values and vice versa) than what I did below?
#!/usr/local/bin/perl
use strict;
use warnings;
#mapping of ASCII to DOS
my %asc2dos = (
    131 => 159, 149 => 250, 150 => 196, 161 => 173, 162 => 155, 163 => 156,
    165 => 157, 170 => 166, 171 => 174, 172 => 170, 176 => 248, 177 => 241,
    178 => 253, 183 => 249, 186 => 167, 187 => 175, 188 => 172, 189 => 171,
    191 => 168, 196 => 142, 197 => 143, 198 => 146, 199 => 128, 201 => 144,
    209 => 165, 214 => 153, 220 => 154, 223 => 225, 224 => 133, 225 => 160,
    226 => 131, 228 => 132, 229 => 134, 230 => 145, 231 => 135, 232 => 138,
    233 => 130, 234 => 136, 235 => 137, 236 => 141, 237 => 161, 238 => 140,
    239 => 139, 241 => 164, 242 => 149, 243 => 162, 244 => 147, 246 => 148,
    247 => 246, 249 => 151, 250 => 163, 251 => 150, 252 => 129, 255 => 152,
);
#create the reverse mapping (DOS to ASCII)
my %dos2asc;
foreach my $key (sort keys %asc2dos)
{
$dos2asc{$asc2dos{$key}} = $key;
}
#here's a test:
#create a string with some accented characters
my $string = pack("C10",223,224,225,232,231,236,237,241,243,244);
print "ASCII string = $string\n";
$string = asc2dos($string);
print "DOS string = $string\n";
$string = dos2asc($string);
print "ASCII string = $string\n";
#convert ASCII extended characters to DOS extended characters
sub asc2dos
{
my $str = shift;
foreach my $i (0..length($str)-1)
{
my $val = ord substr($str,$i,1);
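# Parentheses are required: "chr $x || $val" parses as "chr($x) || $val",
# which turns every unmapped character into "\0"
substr($str,$i,1) = chr($asc2dos{$val} || $val);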
}
return $str;
}
#convert DOS extended characters to ASCII characters
sub dos2asc
{
my $str = shift;
foreach my $i (0..length($str)-1)
{
my $val = ord substr($str,$i,1);
# Parentheses are required here too, so unmapped characters pass through
substr($str,$i,1) = chr($dos2asc{$val} || $val);
}
return $str;
}
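On the hash-reversal question: in list context, reverse flips the flattened key/value list, which inverts the hash in one statement. It is only safe when the values are unique, which appears to be the case for the table above. The sketch below uses a three-entry excerpt of the table, and remap is a made-up helper name showing how both directions could share one routine.

```perl
use strict;
use warnings;

# Small excerpt of the ANSI => DOS table, for illustration only
my %asc2dos = (233 => 130, 241 => 164, 252 => 129);

# reverse in list context swaps keys and values
# (later duplicates would silently overwrite earlier ones)
my %dos2asc = reverse %asc2dos;

# Hypothetical helper: replace every character whose value is a key
# in the given map, and leave all other characters untouched
sub remap {
    my ($str, $map) = @_;
    $str =~ s/(.)/exists $map->{ord $1} ? chr($map->{ord $1}) : $1/ges;
    return $str;
}

my $dos = remap("caf\xE9", \%asc2dos);   # e-acute: 233 becomes 130
printf "%vd\n", $dos;                    # prints 99.97.102.130
my $ansi = remap($dos, \%dos2asc);       # and back again
printf "%vd\n", $ansi;                   # prints 99.97.102.233
```

Testing with exists rather than || also keeps a mapping to character 0 from being mistaken for "no mapping", although the table above contains no such entry.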