http://qs321.pair.com?node_id=696464


in reply to EBCDIC and COBOL records

I use the following module to parse COBOL copy structures and to create the appropriate decoders:

package PIC; use strict; use Encode qw(decode encode); use POSIX qw(ceil); use Carp qw(croak); use vars qw(%parse_PIC); =head2 C<decode_COMP3> Decodes a BCD number with trailing sign. =cut sub decode_COMP3 { # 0C -> + # 0D -> - # 0F -> (unsigned) my $digits = unpack "H*", $_[0]; my $sign = chop $digits; #print "$digits\n"; if ($sign eq 'D' or $sign eq 'd') { $sign = '-' } elsif ($sign eq 'C' or $sign eq 'c') { $sign = '+' } elsif ($sign eq 'F' or $sign eq 'f') { $sign = ' ' } else { $digits .= $sign; $sign = '?' }; "$sign$digits" }; =head2 C<make_COMP3_decoder PRE, POST, SEP> Returns a subroutine that parses the digits into PRE places before and POST places after the decimal delimiter and returns a string with SEP as the decimal delimiter =cut sub make_COMP3_decoder { my ($pre,$post,$sep) = @_; $sep ||= ","; return sub { my $res = decode_COMP3($_[0]); substr($res,$pre+2,0) = $sep; $res }; }; =head2 C<decode_EBCDIC> Decodes an EBCDIC string to a Perl internal string Shorthand for decode( 'cp1047', $_[0] ) =cut sub decode_EBCDIC { decode( 'cp1047', $_[0] )}; # The patterns for things I recognize in the COBOL copy structures I e +ncounter %parse_PIC = ( qr/^PIC\s+S?9\s*\((\d+)\)$/ => sub { return 0+$1, \&decode_EBCDIC +}, qr/^PIC\s+X\s*\((\d+)\)$/ => sub { return 0+$1, \&decode_EBCDIC }, qr/^PIC\s+S?9\s*\((\d+)\)\s*COMP-3$/ => sub { return ceil(($1+1) / +2), \&decode_COMP3 }, qr/^PIC\s+S?9\s*\((\d+)\)V9\((\d+)\)\s*(?:USAGE\s+)?COMP-3$/ => su +b { return ceil(($1+$2+1)/2), make_COMP3_decoder($1,$2,$_[0]) }, qr/^PIC\s+S?9\s*\((\d+)\)V\s*COMP-3$/ => sub { return ceil(($1+1) +/2), \&decode_COMP3 }, qr/^PIC\s+S?9\s*\((\d+)\)\s+OCCURS\s+(\d+)$/ => sub { return $1*$2 +, \&decode_EBCDIC }, ); =head2 C<decode_PIC> Returns two values: =over 4 =item * Length in bytes of the PIC expression =item * Code reference that decodes the passed value according to the +copy structure =back =cut sub decode_PIC { my ($pic,$sep) = @_; for my $re (keys %parse_PIC) { if ($pic =~ /$re/) { return ($parse_PIC{$re}->($sep)); }; }; croak "Couldn't decode '$pic'"; }; =head2 C<PIC_length> Returns the length of the expression in bytes =cut sub PIC_length { my ($pic) = @_; my ($len,$decoder) = decode_PIC($pic); $len }; 1;

Replies are listed 'Best First'.
Re^2: EBCDIC and COBOL records
by plegall (Initiate) on Jul 11, 2008 at 11:26 UTC
    Your package Pic seems to match one of my need, but can you give a example of how to use it ? I've tried to execute decode_PIC on each line of my COBOL definition or on the whole file, none work :-/

      Wow - more than 10 years and somehow I missed your reply.

      Here is a usage, roughly transcribed from what would happen in the Real World:

      my $struct = <<'DEF'; * Some comment 05 AMOUNT VALUE PIC S9(5)V99. 05 VALUE-DATE VALUE PIC X(8). DEF my @lines = split /\n/, $struct; my $buffer = ''; for my $line (@lines) { next if $line =~ /^ \*/; # comment $buffer .= $line; next unless $buffer =~ /\./; # the line did not end, we need more $buffer =~ /^\s+(\d+)\s+(.+)/; or croak "Weirdo input found: $buffer"; my( $level, $info) = ($1,$2); if( $level == 1 ) { # A record or group, ignore } elsif ($info !~ /^(\S+)\s*(REDEFINES\s+(\S+))?\s*(PIC\s+[9XS].*? +\.$/) { croak "Weirdo info found: $info"; }; my $name = $1; my $redefine = $2; my $pic = $3; my $size = 0; my $decoder = sub { $_[0] }; if( $pic ) { ($size, $decoder) = PIC::decode_pic($pic,$decimal_separator,$e +ncoding); }; # ... use $size and $decoder to read and decode a number of bytes