http://qs321.pair.com?node_id=11116442

dbarkho14 has asked for the wisdom of the Perl Monks concerning the following question:

I am converting IBM mainframe files and have had a lot of luck converting binary fixed-length files that have COMP-3 fields using the unpack("H*", $_) approach. However, I am having an issue with a file that has binary fields. Here is the input record layout:

01 EXTRACT-REC-IN.
   03 KEY-DATA.
      05 FIELD1 PIC 9(5) VALUE 0 BINARY.
      05 FIELD2 PIC 9(4) VALUE 0 BINARY.
      05 FIELD3 PIC 9(4) VALUE 0 BINARY.
      05 FIELD4 PIC 9(8) VALUE 0 BINARY.
      05 FIELD5 PIC 9(4) VALUE 0 BINARY.
   03 DATA.
      05 FIELD6 PIC S9(04) VALUE 0 COMP.
      05 FIELD7 PIC S9(15)V99 VALUE 0 COMP-3.
      05 FIELD8 PIC S9(15)V99 VALUE 0 COMP-3.

How do I unpack these BINARY fields in the KEY-DATA section? See my code below: I have tried to use I1 in the unpack template, but it gives me 2936078336 when I am expecting to see 425 for FIELD1. Any help is appreciated!

#! /usr/bin/perl -w

@ARGV == 1 or die "usage: $0 in_filename out_filename\n";
my $in_filename = shift;

#set infile to binary mode
open INFILE, '<:raw', $in_filename or die "can't open $in_filename: $!";
binmode(INFILE);
#open OUTFILE, '>', $out_filename or die "can't open $out_filename: $!";

# record length is 34
$/ = \34;

#map input file to process integers
while ( <INFILE> ) {
    my($f1) = unpack ("I1",$_);
    #format variables
    my $f1p = sprintf("%d", $f1);
    #write record to file
    print "$f1p\n";
}
close INFILE;

Re: Perl Unpack Cobol Binary File and Fields
by Corion (Patriarch) on May 04, 2020 at 17:29 UTC

    Unless you show us (a hexdump of) your input data, it will be hard to give you meaningful advice.

    I haven't had to deal with BINARY type numbers, but I would assume that they are basically just bytes and don't need any special treatment at all? ord should be enough to convert from a byte to the number?

      Thanks so much for the reply! Yes, this is my first time with binary fields. I did not think it would be difficult, since I had found great examples with COMP-3 fields on this site, but I haven't had much luck mapping these yet.

      Here is the hexdump of my input file, which has a fixed length of 34 bytes. Please do let me know if my hexdump is not what you need. I am relatively new to hexdump, so this might not be what you want.

      hexdump -c -n 34 sltywk_binary_new
      0000000 \0 \0 001 257 \a 344 \0 003 \0 \0 \0 s \0 6 002 267
      0000010 \0 \0 \0 \0 \0 \0 005 E 035 \0 \0 \0 \0 \0 \0 \0
      0000020 \0 \f
      0000022

        I would guess that your PIC 9(5) means four or five bytes; at least that matches up somewhat well with your data. If we assume the first four bytes, then unpack 'N', substr($str, 0, 4) gives 431:

        #!perl
        use strict;
        use warnings;

        # \o{0}6 rather than \06, which Perl would read as the single byte 0x06
        my $str = "\0\0\1\o{257}\a\o{344}\0\3\0\0\0s\o{0}6\o{002}\o{267}";
        my $data1 = unpack "N", substr($str,0,4);
        print $data1;

        I don't know how to get from the 431 I get to the 425 you get, but maybe the other numbers give a better clue here?
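For what it's worth, the 2936078336 from the original post is what the same four bytes yield when they are read in little-endian order, which is what the native I format does on an Intel machine with 32-bit ints. A minimal sketch, using V (explicit little-endian 32-bit) to stand in for I so the result does not depend on the host; the byte values are taken from the hexdump above:

```perl
use strict;
use warnings;

# First four bytes of the record, taken from the hexdump above.
my $field1_bytes = "\x00\x00\x01\xAF";

# 'N' reads an unsigned 32-bit integer in big-endian (network) order.
my $big_endian = unpack 'N', $field1_bytes;

# 'V' reads the same bytes in little-endian order, which is how a native
# 'I' behaves on a little-endian host with 32-bit ints.
my $little_endian = unpack 'V', $field1_bytes;

print "N: $big_endian\n";     # 431
print "V: $little_endian\n";  # 2936078336
```

This accounts for the large number, though not for the gap between 431 and the expected 425.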

        hexdump -c -n 34 sltywk_binary_new
        0000000 \0 \0 001 257 \a 344 \0 003 \0 \0 \0 s \0 6 002 267
        My hexdump-fu probably is a bit rusty, but how would hexdump -c give bytes like 257, 344 or 267?

        I prefer hexdump -C anyway...

        Or better: which variant of the hexdump-tool is this?
Re: Perl Unpack Cobol Binary File and Fields
by soonix (Canon) on May 04, 2020 at 18:45 UTC
    From what I remember, BINARY (and its synonym COMP-4) would correspond to a signed integer, and 9(4) (i.e. 9999) would fit in 16 bits, so I would vote for 's', or 's>' (explicitly big endian) or 's<' (explicitly little endian). Or maybe, try the 'S' (unsigned) or 'l' (signed long) variants.
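A small sketch of how the signed and unsigned 16-bit formats differ, assuming big-endian data and Perl 5.10 or later for the > modifier. The first pair of bytes is the sixth field as it appears in the hexdump (002 267 in octal); the second pair is a made-up value, not from the file, chosen only because its top bit is set:

```perl
use strict;
use warnings;

# Bytes 14-15 of the record in the hexdump: 0x02 0xB7.
my $field6_bytes = "\x02\xB7";
my $as_unsigned  = unpack 'n',  $field6_bytes;   # big-endian unsigned 16-bit
my $as_signed    = unpack 's>', $field6_bytes;   # big-endian signed 16-bit

# A made-up value (not from the file) with the high bit set.
my $neg_bytes    = "\xFF\xFE";
my $neg_unsigned = unpack 'n',  $neg_bytes;
my $neg_signed   = unpack 's>', $neg_bytes;

print "02B7: unsigned=$as_unsigned signed=$as_signed\n";     # 695 and 695
print "FFFE: unsigned=$neg_unsigned signed=$neg_signed\n";   # 65534 and -2
```

The two formats only disagree once the leftmost bit is set, so the choice matters mainly for fields that can actually go negative.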
Re: Perl Unpack Cobol Binary File and Fields
by Anonymous Monk on May 04, 2020 at 21:45 UTC

    See: https://www.ibm.com/support/knowledgecenter/en/SS6SG3_4.2.0/com.ibm.entcobol.doc_4.2/PGandLR/concepts/cpari09.htm

    PACKED-DECIMAL and COMP-3 are synonyms. Packed-decimal items occupy 1 byte of storage for every two decimal digits you code in the PICTURE description, except that the rightmost byte contains only one digit and the sign. This format is most efficient when you code an odd number of digits in the PICTURE description, so that the leftmost byte is fully used. Packed-decimal items are handled as fixed-point numbers for arithmetic purposes.

    The digits are stored left-to-right as decimal numbers, two digits per byte. The rightmost nybble is a sign indicator ... unless SIGN IS SEPARATE.

    https://www.ibm.com/support/knowledgecenter/en/SS6SG3_4.2.0/com.ibm.entcobol.doc_4.2/PGandLR/ref/rpari25.htm

    The sign nybble can be $C, $A, $E, or $F for positive and $D or $B for negative. So the numeric value -(0)123 might be stored as $00123D (or $00123B): two digits per byte, with the sign in the last nybble.
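The packed-decimal layout described in the quoted IBM text (two digits per byte, sign in the rightmost nybble) can be decoded roughly as follows. This is only a sketch: decode_comp3 is a hypothetical helper name, not part of any module, and it assumes the sign is not stored separately.

```perl
use strict;
use warnings;

# Decode a packed-decimal (COMP-3) field into a signed integer string.
# decode_comp3 is a hypothetical helper, named here for illustration only.
sub decode_comp3 {
    my ($bytes) = @_;
    my $hex  = uc(unpack('H*', $bytes));   # e.g. "00123D"
    my $sign = chop $hex;                  # rightmost nybble is the sign
    die "bad digits in '$hex'\n"    unless $hex  =~ /\A[0-9]+\z/;
    die "bad sign nybble '$sign'\n" unless $sign =~ /\A[A-F]\z/;
    $hex =~ s/\A0+(?=\d)//;                # drop leading zeros, keep one digit
    return ($sign eq 'D' || $sign eq 'B') ? "-$hex" : $hex;
}

my $minus_123 = decode_comp3("\x00\x12\x3D");
my $plus_123  = decode_comp3("\x12\x3C");
print "$minus_123 $plus_123\n";   # -123 123
```

Returning a string rather than a number is a deliberate choice in this sketch: an 18-digit field can exceed what a Perl floating-point value holds exactly.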

      so it turns out mike is a cobol expert. things make a little more sense now. :eyeroll:

      See: https://www.ibm.com/support/knowledgecenter/SS6SG3_5.2.0/com.ibm.cobol52.ent.doc/PGandLR/ref/rlddepic.html

      The V symbol used in a PICTURE clause indicates the position of an assumed decimal point. This is not present in the data. So, the value 123.45 with PIC S9(3)V99 COMP-3 might appear as $12345C. With a different picture, the identical data could be interpreted as 12345, 1.2345, 12.345 and so on.
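As a sketch of applying the implied decimal point, here are the nine bytes that the hexdump shows at offsets 16 to 24, which is where the first S9(15)V99 COMP-3 item should sit if the layout sizes are 4+2+2+4+2+2 bytes before it (an assumption based on the layout, not confirmed in the thread). The digits are split as a string, so no floating-point rounding is involved:

```perl
use strict;
use warnings;

# Bytes 16-24 of the record in the hexdump (octal 005, 'E', 035 => 05 45 1D).
my $field7_bytes = "\x00\x00\x00\x00\x00\x00\x05\x45\x1D";
my $scale        = 2;   # number of 9s after the V in the PICTURE clause

my $hex  = uc(unpack('H*', $field7_bytes));   # "00000000000005451D"
my $sign = chop $hex;                         # 'D' marks a negative value

# Split the digit string at the implied decimal point.
my ($int_part, $frac_part) = $hex =~ /\A(\d*)(\d{$scale})\z/
    or die "unexpected digits '$hex'\n";
$int_part =~ s/\A0+(?=\d)//;                  # trim leading zeros
$int_part = '0' if $int_part eq '';

my $value = ($sign =~ /[DB]/ ? '-' : '') . "$int_part.$frac_part";
print "$value\n";   # -54.51
```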

Re: Perl Unpack Cobol Binary File and Fields
by Anonymous Monk on May 04, 2020 at 21:49 UTC
    For BINARY data, IBM uses big-endian byte order: the most-significant byte is stored first, not last as with Intel microprocessors. The format is 2's complement.
      BINARY, COMP, and COMP-4 are synonyms. Binary-format numbers occupy 2, 4, or 8 bytes of storage. If the PICTURE clause specifies that an item is signed, the leftmost bit is used as the operational sign.

      A binary number with a PICTURE description of four or fewer decimal digits occupies 2 bytes; five to nine decimal digits, 4 bytes; and 10 to 18 decimal digits, 8 bytes. Binary items with nine or more digits require more handling by the compiler.
        Full disclosure: if the picture does not specify that the binary operand is signed, COBOL will automagically disregard two's complement. (In this layout only FIELD6 is declared signed; the KEY-DATA fields are unsigned.)
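Putting the size rule quoted above together with big-endian byte order, the whole 34-byte record can be split in one unpack call. This is a sketch under the assumption that the layout maps to 4+2+2+4+2+2+9+9 bytes, which does add up to the stated record length; the record bytes are transcribed from the hexdump earlier in the thread.

```perl
use strict;
use warnings;

# The 34-byte record from the hexdump, written out as hex.
my $record = pack 'H*',
    '000001AF' . '07E4' . '0003' . '00000073' . '0036' . '02B7'
  . '00000000000005451D' . '00000000000000000C';

# Sizes per the quoted rule: 1-4 digits => 2 bytes, 5-9 digits => 4 bytes.
# N / n = unsigned big-endian 32-bit / 16-bit (the PIC 9 BINARY items),
# s>    = signed big-endian 16-bit (the PIC S9(04) COMP item),
# H18   = 9 packed-decimal bytes returned as 18 hex digits, sign nybble last.
my @fields = unpack 'N n n N n s> H18 H18', $record;

# Print the fields in layout order, numbered from 1.
printf "field %d = %s\n", $_ + 1, $fields[$_] for 0 .. $#fields;
```

With this data the six binary fields come out as 431, 2020, 3, 115, 54 and 695, and the two packed fields as the hex strings 00000000000005451d and 00000000000000000c. The first field is still 431 rather than the 425 the original poster expected; nothing in the thread explains that difference.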