Hello, I'm looking for a Win-API-like function
to convert a string (i.e. one with ANSI chars) into
OEM chars.
If your conversion program is running on Windows, you can
get at the AnsiToOem Win32 API function from Perl via the
Win32::API module. However, a more portable (and perhaps even
simpler, considering the complexity of using Win32::API)
solution is to write a function to convert CP1252 (WinLatin1)
to CP437 (DOSLatinUS). To do this, download the text files for
those two codepages from Czyborra and write a program like this:
my ($from_name, $to_name,
$from_map, $to_map);
$from_name = "cp1252.txt";
$to_name = "cp437.txt";
# Load a codepage file into a Perl data structure:
# $map->[byte] = Unicode code point
sub load_cp
{
    my ($filename) = @_;
    my (@map);
    open(my $cp, '<', $filename) || die "load_cp: $filename: $!";
    while (<$cp>)
    {
        # Skip comments and any other line that is not "=XX<tab>U+XXXX"
        my ($byte, $unicode) = m/^=(..)\tU[+](....)/ or next;
        $map[hex $byte] = hex($unicode);
    }
    close($cp);
    return \@map;
}
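# Map characters in a string from one codepage to another.
# Characters with no equivalent are replaced by "?" with a warning.
sub map_cp
{
    my ($from, $to, $text) = @_;
    my ($new_text, %to_byte) = ('');
    # Invert the "$to" map so that it goes from Unicode to byte
    foreach my $byte (0 .. $#$to)
    {
        $to_byte{$to->[$byte]} = $byte if defined $to->[$byte];
    }
    foreach my $char (split //, $text)
    {
        my $byte = ord $char;
        # Both codepages agree with ASCII below 0x80, so pass it through
        if ($byte < 0x80)
        {
            $new_text .= $char;
            next;
        }
        # First map the codepage "$from" byte to a Unicode character
        my $unicode = $from->[$byte];
        if (!defined $unicode)
        {
            warn sprintf("no from=%s char for byte 0x%02X\n", $from_name, $byte);
            $new_text .= '?';
            next;
        }
        # Now map the Unicode character to codepage "$to"
        if (!defined $to_byte{$unicode})
        {
            warn sprintf("no to=%s char for U+%04X\n", $to_name, $unicode);
            $new_text .= '?';
            next;
        }
        $new_text .= chr $to_byte{$unicode};
    }
    return $new_text;
}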
# Load to and from codepages
$from_map = load_cp($from_name);
$to_map = load_cp( $to_name);
# Replace \x80 with your text (\x80 is the euro sign in CP1252,
# which CP437 does not have)
print map_cp($from_map, $to_map, "\x80");
Note that CP1252 has characters that CP437 doesn't have,
and vice versa. You may want to convert all your ANSI
text to UTF-8 Unicode instead. It makes things much easier,
and then you could use the cp437.txt from Czyborra to
generate text that can be viewed in a DOS environment
with the CP437 codepage.
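
For what it's worth, if your Perl ships with the core Encode module, the same conversion can be done by going through Perl's internal Unicode strings, without downloading any mapping files. A minimal sketch, assuming Encode's "cp1252" and "cp437" encodings are available (they are part of the standard Encode distribution); ansi_to_oem is just a name made up for illustration, not an existing function:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Convert a byte string from CP1252 (Windows "ANSI") to CP437 (DOS "OEM").
# ansi_to_oem is a hypothetical helper name, not part of any module.
sub ansi_to_oem
{
    my ($ansi_bytes) = @_;
    my $chars = decode('cp1252', $ansi_bytes);   # CP1252 bytes -> Perl characters
    return encode('cp437', $chars);              # Perl characters -> CP437 bytes
}

# "cafe" with e-acute, then "uber" with u-umlaut, as CP1252 bytes
my $oem = ansi_to_oem("caf\xE9 \xFCber");

# Show the resulting CP437 bytes in hex
print join(' ', map { sprintf '%02X', ord } split //, $oem), "\n";
```

By default encode() replaces characters that CP437 lacks with a substitution character. If you would rather have the conversion die on such characters, pass Encode::FB_CROAK as the third argument to encode().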