
US Library of Congress perl module

by eg (Friar)
on Dec 10, 2000 at 13:24 UTC ( #45931=sourcecode )
Category: Web Stuff
Author/Contact Info eg

A Perl module to access the Library of Congress' book database.

Unlike Amazon or Barnes and Noble, you can't just look up a book in the Library of Congress database with an ISBN; you first need to initialize a session with their web server. That's all this module does: it initializes a session for you and returns either the URL of a book's web page or a reference to a hash containing that book's data (author, title, etc.).

package LOC;

use strict;
use CGI;  
use IO::Socket;

sub z3950_data {
        my $isbn = shift();

        local $/ = undef;
        my $ua = IO::Socket::INET->new(
                Proto    => 'tcp',
                PeerAddr => '',
                PeerPort => 'http(80)', );

        print $ua "GET " . z3950_url($isbn) . "\x0d\x0a\x0d\x0a";

        my ($raw_data) = <$ua> =~ /<pre>(.*?)<\/pre>/is;
        return undef unless ( defined($raw_data) );

        my %data = ();
        my $last = 'UNKNOWN';
        foreach my $line ( split(/\n/, $raw_data ) ) {
                chomp( $line );
                $line =~ s/\s+/ /g;
                if ( my ($key, $value) = $line =~ /^([^:]+): (.*)/ ) {
                   $data{$key} .= $value;
                   $last = $key;
                } else {
                   $data{$last} .= $line;
                }
        }

        return \%data;
}

sub z3950_html {
        my $isbn = shift();

        my $data = z3950_data( $isbn );
        if ( !defined($data) ) {
                return "<font color='#990000'>z3950_html: Can't get LOC data</font>";
        }

        return "<pre>" . join("\n", map { "$_ -> $$data{$_}" } keys( %$data )) . "</pre>";
}

sub z3950_url {
        my $isbn = shift();
        my $sid = undef;

        local $/ = undef;
        my $ua = IO::Socket::INET->new(
                Proto    => 'tcp',
                PeerAddr => '',
                PeerPort => 'http(80)', );

        print $ua "GET /cgi-bin/zgate?ACTION=INIT\&FORM_HOST_PORT=".

        ($sid) = <$ua> =~ /NAME="SESSION_ID"\s+VALUE="(\d+)"/i;

        return undef unless ( defined($sid) );

                # "MAXRECORDS=20&".
                # "RECSYNTAX=1.2.840.10003.5.10&".
                # "REINIT=" . CGI::escape("/cgi-bin/zgate?ACTION=INIT&
+. "&" .



=head1 TITLE - an interface to the Library of Congress' book database


To redirect from a web page:

        use CGI;
        use LOC;

        my $cgi = new CGI;
        my $isbn = $cgi->param('isbn');
        print $cgi->redirect( LOC::z3950_url($isbn) );

To get the data for a certain book:

        use LOC;

        my $data = LOC::z3950_data( $isbn );
        foreach my $key ( keys(%$data) ) {
                print "$key: $$data{$key}\n";
        }


The Library of Congress' web-interface to their book database is screwy.
You can't just find the ISBN and plug it into a simple url.  No, you
need to initialize a session first, and then plug the ISBN into a simple
url.  Oh well.  So this module initializes a session and redirects you
to the right url.  Or, you can just grab the data from the LOC and present
it in whatever form you want.


=over 4

=item \%hash z3950_data( $isbn ) 

Given an ISBN, return a reference to a hash with the data downloaded from
the Library of Congress.  The keys to the hash are the data field names.

=item $html z3950_html( $isbn )

Dump out the data from z3950_data as HTML.  Sort of.  It's just plain
text with <pre> tags around it.

=item $url z3950_url( $isbn )

Return the url that will get the LOC page for this ISBN.

=back
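For readers who want to see the two pattern matches the module relies on without touching the network, here is a minimal standalone sketch. It assumes the record text inside the `<pre>` block is a series of `Field: value` lines with indented continuation lines, which is what the loop in z3950_data implies. The helper names `parse_record` and `extract_session_id`, the blank-line skip, and both sample strings are made up for illustration; they are not part of the module or of real LOC output.

```perl
use strict;
use warnings;

# Parse "Field: value" lines. A line without a "Field: " prefix is treated
# as a continuation of the previous field, as in z3950_data.
sub parse_record {
    my ($raw) = @_;
    my %data;
    my $last = 'UNKNOWN';
    foreach my $line ( split /\n/, $raw ) {
        $line =~ s/\s+/ /g;         # collapse runs of whitespace
        next if $line =~ /^ ?$/;    # skip blank lines (an addition, not in the module)
        if ( my ($key, $value) = $line =~ /^([^:]+): (.*)/ ) {
            $data{$key} .= $value;
            $last = $key;
        }
        else {
            $data{$last} .= $line;  # keeps the single leading space as a separator
        }
    }
    return \%data;
}

# Pull the session id out of the init page, using the same regex as z3950_url.
sub extract_session_id {
    my ($html) = @_;
    my ($sid) = $html =~ /NAME="SESSION_ID"\s+VALUE="(\d+)"/i;
    return $sid;                    # undef when the form field is absent
}

# Hypothetical sample input, invented for this sketch.
my $sample = join "\n",
    'Author:    Doe, Jane.',
    'Title:     An example title that wraps',
    '           onto a second line / Jane Doe.',
    'ISBN:      0000000000';
my $form = '<INPUT TYPE="HIDDEN" NAME="SESSION_ID" VALUE="12345">';

my $record = parse_record($sample);
print "$_ => $record->{$_}\n" for sort keys %$record;
print "session id: ", extract_session_id($form), "\n";
```

Because whitespace is collapsed before matching, a continuation line is appended with a single leading space, so wrapped values join up cleanly.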


Replies are listed 'Best First'.
Re: US Library of Congress perl module
by Anonymous Monk on Nov 10, 2010 at 20:15 UTC
    Thank you for this! Perl hasn't changed much in 10 years. This still works with two changes. First, the website "" is now "". Second, for the return in the html subroutine you need periods (.) instead of commas (,) to concatenate the join to the "pre" statements.
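    The comma-versus-period point can be checked in isolation. This is a small sketch, not the module's code: the two subs and the one-field record are hypothetical stand-ins. With commas the sub returns a three-element list, which only looks right when the caller uses it in list context (for example, passing it straight to print); assigned to a scalar, only the last element survives.

```perl
use strict;
use warnings;

my $data = { Title => 'Example' };    # hypothetical record

# Commas: returns a list of three strings.
sub html_with_commas {
    my ($d) = @_;
    return "<pre>", join("\n", map { "$_ -> $$d{$_}" } sort keys %$d), "</pre>";
}

# Periods: concatenates everything into one string.
sub html_with_periods {
    my ($d) = @_;
    return "<pre>" . join("\n", map { "$_ -> $$d{$_}" } sort keys %$d) . "</pre>";
}

my $commas  = html_with_commas($data);    # scalar context keeps only the last element
my $periods = html_with_periods($data);

print "commas:  $commas\n";     # prints "commas:  </pre>"
print "periods: $periods\n";    # prints "periods: <pre>Title -> Example</pre>"
```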
