PerlMonks  

Validating ISBN numbers?

by gant (Initiate)
on Dec 17, 2002 at 17:34 UTC

gant has asked for the wisdom of the Perl Monks concerning the following question:

How's it goin, I'm doing a project where I have to use validation rules, which is fair enough, but I am having problems with validating an ISBN. This has to be:
The ten-digit number is divided into four parts of variable length, which must be separated clearly by hyphens or spaces:

ISBN 0 571 08989 5

or

ISBN 90-70002-34-5

Note: Experience suggests that hyphens are preferable to spaces.

The number of digits in the first three parts of the ISBN (group identifier, publisher prefix, title identifier) varies. The number of digits in the group number and in the publisher prefix is determined by the quantity of titles planned to be produced by the publisher or publisher group. Publishers or publisher groups with large title outputs are represented by fewer digits.

I would be grateful for any help.
cheers
gant

update (broquaint): title change (was CGI -Perl problem) and added formatting

Replies are listed 'Best First'.
Re: CGI -Perl problem
by Fletch (Bishop) on Dec 17, 2002 at 17:51 UTC

    Perhaps if you actually showed what code you've written so far . . .

    Or just use Business::ISBN that's already written (either directly or as inspiration).

Re: CGI -Perl problem
by Mr. Muskrat (Canon) on Dec 17, 2002 at 17:53 UTC

    Of the five ISBN modules that I could find on CPAN, the most relevant for your "CGI-Perl problem" is probably CGI::Untaint::isbn. It will untaint and validate an ISBN passed as a CGI parameter.

    Other (less) relevant modules are: Test::ISBN and Business::ISBN.

Re: CGI -Perl problem
by Aristotle (Chancellor) on Dec 17, 2002 at 17:48 UTC

    So exactly what is your problem?

    You can split /[- ]/, $isbn;, but where you go from there is a good question.

    I know that the last group with the single digit is a checksum in ISBN - you should be able to find information on how it's calculated and use that to validate the rest.

    Makeshifts last the longest.
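The check digit mentioned above can be tested with a weighted sum: multiply the ten characters by the weights 10 down to 1, count a final X as 10, and require the total to be divisible by 11. Here is a minimal sketch of that test. The function name isbn10_checksum_ok is made up for illustration, and it only checks the checksum, not the grouping.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the ISBN-10 check digit is consistent.
# A leading "ISBN", hyphens and spaces are ignored; group layout is NOT checked.
sub isbn10_checksum_ok {
    my ($isbn) = @_;
    (my $digits = uc $isbn) =~ s/^ISBN//;
    $digits =~ tr/- //d;                          # drop separators
    return 0 unless $digits =~ /\A[0-9]{9}[0-9X]\z/;

    my $sum    = 0;
    my $weight = 10;
    for my $c ( split //, $digits ) {
        # weights run 10, 9, ..., 1; 'X' is only possible in the last place
        $sum += $weight-- * ( $c eq 'X' ? 10 : $c );
    }
    return $sum % 11 == 0 ? 1 : 0;
}

for my $isbn ( 'ISBN 0 571 08989 5', 'ISBN 90-70002-34-5', 'ISBN 0-571-08989-6' ) {
    my $verdict = isbn10_checksum_ok($isbn) ? 'ok' : 'bad checksum';
    print "$isbn: $verdict\n";
}
```

Both examples from the question pass (their weighted sums are 231 and 176, both multiples of 11); the third has its check digit altered and fails.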

Re: CGI -Perl problem
by jdporter (Paladin) on Dec 17, 2002 at 17:52 UTC
    I think I'd recommend a "destructive test" on a copy of the string. Perhaps something like this:
        sub assert_valid_isbn {
            local($_) = @_;
            s/^ISBN\s*// or die "'$_' does not begin with 'ISBN'";
            y/- 0-9//cd and die "'$_[0]' contains invalid characters";
            /-/ && / / and die "'$_[0]' contains both hyphens and spaces";
            y/-/ /; # convert hyphens to spaces.
            my @g = split;
            @g == 4 or die "'$_[0]' is broken into the wrong number of groups";
            # compare the digit count, not the numeric value of the joined string
            length(join('',@g)) == 10 or die "'$_[0]' contains the wrong number of digits.";
        }
    This doesn't return anything, but throws an exception for invalid values. You could use it like this:
        for my $isbn ( @isbns ) {
            eval {
                assert_valid_isbn( $isbn );
                process_this_isbn( $isbn );
            };
            $@ and warn $@;
        }

    jdporter
    ...porque es dificil estar guapo y blanco.

      my @g = split;

      Careful there - you just allowed stuff like this invalid ISBN through - ISBN 0--06----096975-1. Perhaps you should use:

      my @g = split / /;

      btw, this is valid: ISBN 0-06-096975-X (note the 'X').

      -- Dan
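To see the difference Dan describes, here is a small sketch comparing the two forms of split on his example once the hyphens have been turned into spaces. The helper has_four_groups is hypothetical, not part of jdporter's code, and it also lets an X through as the check digit.

```perl
use strict;
use warnings;

my $body = '0--06----096975-1';
(my $spaced = $body) =~ tr/-/ /;

my @loose  = split ' ', $spaced;   # same as a bare split: runs of whitespace collapse
my @strict = split / /, $spaced;   # every single space is a field boundary

printf "loose:  %d groups\n", scalar @loose;    # 4, so the bad ISBN slips through
printf "strict: %d groups\n", scalar @strict;   # 8, empty fields give it away

# Hypothetical helper: exactly four non-empty groups, last one a digit or X.
sub has_four_groups {
    my ($isbn_body) = @_;
    (my $s = $isbn_body) =~ tr/-/ /;
    my @g = split / /, $s, -1;     # -1 keeps trailing empty fields too
    return 0 unless @g == 4;
    my $check = pop @g;
    return 0 unless $check =~ /\A[0-9X]\z/;
    return 0 if grep { !/\A[0-9]+\z/ } @g;
    return 1;
}
```

With this, has_four_groups('0-06-096975-X') is true and has_four_groups('0--06----096975-1') is false.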

Re: CGI -Perl problem
by thatguy (Parson) on Dec 17, 2002 at 18:08 UTC
    It depends on what your requirements for validation are. If you only want a ten digit number,
        my $data_in = "ISBN 90-70002-34-5";
        $data_in =~ s/^ISBN//;
        $data_in =~ s/[- ]//g;    # drop spaces and hyphens
        my $count = length $data_in;
        # test the stripped string for letters, not the count
        unless ( $count == 10 && $data_in !~ m/[a-z]/i ) {
            warn "ISBN should be 10 digits: got $count characters\n";
        }
    Now, if you are checking against a database to validate, it depends on the format of the ISBN that is stored. If it's space or hyphen delimited you can:
        # data from the database you are checking against
        my $data_in = "90 70002 34 5";
        # inputted data
        my $check_data = "ISBN 90-70002-34-5";
        $check_data =~ s/^ISBN//;
        $check_data =~ s/-/ /g;
        $check_data =~ s/^ //;
        unless ( $data_in eq $check_data && $check_data !~ m/[a-z]/i ) {
            die "failure: data in does not match db record\n";
        }
    Of course it all depends on how you want to validate. If you have access to a database of ISBNs and you can validate GroupID, Publisher prefix, and so on then validate against it. Otherwise, the best you can hope for is that it's just 10 digits.

    -phill

    Ahh the power of PerlMonks.. I stopped to grab some animal cookies and suddenly there are seven replies

Re: CGI -Perl problem
by thinker (Parson) on Dec 17, 2002 at 17:49 UTC
    Hi gant,

    I think the following should do it.
    /ISBN \d+[ -]+\d+[ -]+\d+[ -]\d/;

    Hope this helps

    thinker
    update I have realised that this is not a regex type question. I realise now that ISBN has its own validating rules. Sorry. :-)
      That will match a valid ISBN along with many invalid ones. For a start, you're using + where you should be using things like {4} or {4,5}.

      Makeshifts last the longest.
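Along the lines of the bounded quantifiers suggested above, a tighter layout check might look like the sketch below. The per-group limits (1-5, 1-7, 1-6) are assumptions picked for illustration, not taken from the thread, and the names $ISBN10_LAYOUT and looks_like_isbn10 are made up. The lookahead forces ten characters plus three separators, and the backreference forces the same separator throughout. This only checks the shape, not the check digit.

```perl
use strict;
use warnings;

# Assumed group-length limits; adjust if your source of ISBN rules says otherwise.
my $ISBN10_LAYOUT = qr{
    \A ISBN [ ]
    (?= [0-9X -]{13} \z )     # ten characters plus three separators
    [0-9]{1,5}                # group identifier
    ([ -])                    # separator, remembered for the backreferences
    [0-9]{1,7}                # publisher prefix
    \1
    [0-9]{1,6}                # title identifier
    \1
    [0-9X]                    # check digit, X stands for 10
    \z
}x;

# Hypothetical wrapper returning 1 or 0.
sub looks_like_isbn10 {
    my ($text) = @_;
    return $text =~ $ISBN10_LAYOUT ? 1 : 0;
}

for my $candidate ( 'ISBN 0 571 08989 5', 'ISBN 90-70002-34-5', 'ISBN 0--06----096975-1' ) {
    my $verdict = looks_like_isbn10($candidate) ? 'layout ok' : 'layout rejected';
    print "$candidate: $verdict\n";
}
```

The first two candidates are accepted and the third is rejected. A string that mixes hyphens and spaces, such as 'ISBN 0-571 08989-5', is rejected as well because of the backreference.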
