comment on

I use the following module to parse COBOL copy structures and to create the appropriate decoders:

package PIC;
use strict;

use Encode qw(decode encode);
use POSIX qw(ceil);
use Carp qw(croak);
use vars qw(%parse_PIC);

=head2 C<decode_COMP3>

Decodes a BCD number with trailing sign.

=cut

sub decode_COMP3 {
    # 0C -> +
    # 0D -> -
    # 0F -> (unsigned)
    my $digits = unpack "H*", $_[0];
    my $sign = chop $digits;
    #print "$digits\n";
       if ($sign eq 'D' or $sign eq 'd') { $sign = '-' }
    elsif ($sign eq 'C' or $sign eq 'c') { $sign = '+' }
    elsif ($sign eq 'F' or $sign eq 'f') { $sign = ' ' }
    else { $digits .= $sign; $sign = '?' };
    "$sign$digits"
};

=head2 C<make_COMP3_decoder PRE, POST, SEP>

Returns a subroutine that parses the digits
into PRE places before and POST places after the
decimal delimiter
and returns a string with SEP as the decimal delimiter

=cut

sub make_COMP3_decoder {
    my ($pre,$post,$sep) = @_;
    $sep ||= ",";
    return sub {
        my $res = decode_COMP3($_[0]);
        substr($res,$pre+2,0) = $sep;
        $res
    };
};

=head2 C<decode_EBCDIC>

Decodes an EBCDIC string to a Perl internal string

Shorthand for

    decode( 'cp1047', $_[0] )

=cut

sub decode_EBCDIC { decode( 'cp1047', $_[0] )};

# The patterns for things I recognize in the COBOL copy structures I e
+ncounter
%parse_PIC = (
    qr/^PIC\s+S?9\s*\((\d+)\)$/ => sub { return 0+$1, \&decode_EBCDIC 
+},
    qr/^PIC\s+X\s*\((\d+)\)$/ => sub { return 0+$1, \&decode_EBCDIC },
    qr/^PIC\s+S?9\s*\((\d+)\)\s*COMP-3$/ => sub { return ceil(($1+1) /
+2), \&decode_COMP3 },
    qr/^PIC\s+S?9\s*\((\d+)\)V9\((\d+)\)\s*(?:USAGE\s+)?COMP-3$/ => su
+b { return ceil(($1+$2+1)/2), make_COMP3_decoder($1,$2,$_[0]) },
    qr/^PIC\s+S?9\s*\((\d+)\)V\s*COMP-3$/ => sub { return ceil(($1+1) 
+/2), \&decode_COMP3 },
    qr/^PIC\s+S?9\s*\((\d+)\)\s+OCCURS\s+(\d+)$/ => sub { return $1*$2
+, \&decode_EBCDIC },
);

=head2 C<decode_PIC>

Returns two values:

=over 4

=item * Length in bytes of the PIC expression

=item * Code reference that decodes the passed value according to the 
+copy structure

=back

=cut

sub decode_PIC {
    my ($pic,$sep) = @_;
    for my $re (keys %parse_PIC) {
         if ($pic =~ /$re/) {
             return ($parse_PIC{$re}->($sep));
         };
    };
    croak "Couldn't decode '$pic'";
};

=head2 C<PIC_length>

Returns the length of the expression in bytes

=cut

sub PIC_length {
    my ($pic) = @_;
    my ($len,$decoder) = decode_PIC($pic);
    $len
};

1;
[download]

In reply to Re: EBCDIC and COBOL records by Corion
in thread EBCDIC and COBOL records by plegall

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


go ahead... be a heretic
	PerlMonks