http://qs321.pair.com?node_id=11118673


in reply to Re: Reliably parsing an integer
in thread Reliably parsing an integer

$DIFF = vec($A, $i, 8) - vec($B, $i, 8);

As I warned you over a year ago, calling vec on strings that contain code points over 0xFF is now a fatal error: as of the newly released Perl 5.32, it dies with "Use of strings with code points over 0xFF as arguments to vec is forbidden". Simply documenting "Illegal characters can mess up the result" is not robust. Sorry, but I've commented on this often enough: while you're free to code as you like, I can no longer recommend that anyone use your "reinvented wheel" code.
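To make the failure mode concrete, here is a minimal sketch of guarding a vec call. The helper name `safe_byte_at` is hypothetical and not from this thread. It checks for code points over 0xFF up front, so the caller gets a controlled error instead of depending on how a particular Perl version treats such strings.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the byte at
# offset $i, or dies with a clear message if the string holds
# characters that vec() cannot treat as bytes.
sub safe_byte_at {
    my ($str, $i) = @_;
    die "string contains code points over 0xFF\n"
        if $str =~ /[^\x00-\xFF]/;
    return vec($str, $i, 8);
}

print safe_byte_at("123", 0), "\n";    # 49, the character code of "1"

# "\x{663}" is ARABIC-INDIC DIGIT THREE, a digit that is not a byte.
my $ok = eval { safe_byte_at("12\x{663}", 0); 1 };
print $ok ? "accepted\n" : "rejected: $@";
```

The same regex test could also be run once per input string, before any loop that calls vec repeatedly.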

Update: Added more links.

Update 2:

DISCLAIMER: ...

Please mark your updates as such.

Re^3: Reliably parsing an integer (updated)
by harangzsolt33 (Chaplain) on Jun 30, 2020 at 05:47 UTC
    Okay. I couldn't sleep until I corrected my error. This CMP sub works now! Run the test and see for yourself! By the way, using vec() is not a mistake. If someone is trying to run Unicode letters through this sub, then there's a serious error in their code, and it *should* fail. The programmer needs to test each string to make sure it contains nothing but plain digits before trying to compare the two. Maybe I should include a line that converts a Unicode string to a plain ASCII string, but I don't know how to do that magic... :D

    #!/usr/bin/perl -w

    use strict;
    use warnings;

    print CMP("0", $b);
    print CMP("", $b);
    print CMP("0", "");
    print CMP("", "000");
    print CMP("", "55");
    print CMP("111", "55");
    print CMP("8,000,021", "7,999,999");
    print CMP("003", "1");
    print CMP("001", "2");
    print CMP("003", "11");
    print CMP("54", "45");
    print CMP("123", "32");
    print CMP("5", "5");
    print CMP("1222225", "001222225");
    print CMP(" 15", "15");
    print CMP("0010", "100");
    print CMP("C97F", "C97E");
    print CMP("2E", "AE");
    print CMP("00101 ", "00101");
    exit;

    ##################################################
    # v2020.06.30
    # Compares two large positive integers.
    # The integers can be binary (ones and zeros),
    # octal, decimal, or hexadecimal.
    #
    # NOTE: Both numbers must be in the same base.
    # You shouldn't try to compare a binary number such
    # as "10001101" to a hex number like "C4"
    # as this will give a bad result.
    #
    # Returns: 0 if the numbers are equal
    #          1 if the first one is greater
    #          2 if the second one is greater
    #
    # Special cases:
    #  * When comparing an undefined value against
    #    an empty string or zero, they will be equal.
    #  * Minus signs are always ignored!
    #
    # Usage: INTEGER = CMP(STRING, STRING)
    #
    sub CMP {
      my $A = defined $_[0] ? uc($_[0]) : '';
      my $B = defined $_[1] ? uc($_[1]) : '';
      my $A2 = length($A);
      my $B2 = length($B);
      my ($A1, $B1, $CA, $CB, $DIFF) = (0, 0, 48, 48, 0);

      # SHOW WHAT'S HAPPENING:
      print "\n\nString1=|$A|\nString2=|$B| RET=";

      # Find the first significant digit or starting pointer for each string.
      # We will call this A1 and B1. In case the string starts with zeros,
      # spaces, tabs, new line characters, - and + signs, or other special
      # characters, we skip through those. We ignore them.
      while ($A1 < $A2 && vec($A, $A1, 8) < 49) { $A1++; }
      while ($B1 < $B2 && vec($B, $B1, 8) < 49) { $B1++; }

      # Find last significant digit or ending pointer for each string.
      # We will call this A2 and B2.
      while ($A2 > $A1 && vec($A, --$A2, 8) < 48) {}
      $A2++;
      while ($B2 > $B1 && vec($B, --$B2, 8) < 48) {}
      $B2++;

      # Calculate the number of digits in each number.
      my $AL = $A2 - $A1;
      my $BL = $B2 - $B1;

      # Are both numbers the same length?
      if ($AL == $BL) {
        # Compare from left to right, incrementing
        # pointers A1 and B1 as we walk through all the digits.
        while ($A1 < $A2) {
          $CA = vec($A, $A1++, 8); # Get digit from string A
          $CB = vec($B, $B1++, 8); # Get digit from string B
          $DIFF = $CA - $CB;
          if ($DIFF) { return $DIFF < 0 ? 2 : 1; }
        }
        return 0;
      }
      return 1 if ($AL > $BL);
      return 2 if ($AL < $BL);
      return 0;
    }
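The post above says the caller should verify that each string holds nothing but plain digits before comparing. A minimal sketch of that check, together with a comparison that avoids vec altogether, might look like the following. The name `cmp_digits` is hypothetical. It handles decimal digits only, and it is not the poster's CMP: it returns -1, 0 or 1 like `<=>` rather than 0, 1 or 2.

```perl
use strict;
use warnings;

# Hypothetical helper, decimal only: compares two non-negative integer
# strings of arbitrary length. Returns -1, 0 or 1 like <=>.
sub cmp_digits {
    my ($x, $y) = @_;
    for ($x, $y) {
        # [0-9] matches ASCII digits only; \d can also match other
        # Unicode digits unless the /a modifier is in effect.
        die "not a plain decimal integer\n"
            unless defined($_) && /\A[0-9]+\z/;
    }
    s/\A0+(?=[0-9])// for $x, $y;       # strip leading zeros, keep one digit
    return length($x) <=> length($y)    # more digits means a larger number
        || $x cmp $y;                   # same length: string order is numeric order
}

print cmp_digits("003", "11"), "\n";              # -1
print cmp_digits("1222225", "001222225"), "\n";   # 0
print cmp_digits("123", "32"), "\n";              # 1
```

Rejecting bad input up front is a design choice that differs from CMP, which silently skips leading characters below "1". With this sketch, a string such as "8,000,021" or " 15" is an error rather than something to be interpreted.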
Re^3: Reliably parsing an integer (updated)
by harangzsolt33 (Chaplain) on Jun 29, 2020 at 20:12 UTC
    Oops.. Sorry!