For the other readers, here is a snippet of the index:
Anna Karenina, by Lev Nikolaevica Tolstoi 13214
[Language: Dutch]
Night Before Christmas & Other Popular Stories For Children, by Various 13213
The Wild Olive, by Basil King 13212
The Pearl, by Sophie Jewett 13211
El Comendador Mendoza, by Juan Valera 13210
[Subtitle: Obras Completas Tomo VII]
[Language: Spanish]
There is a good bit of structure here. Each new entry starts on a
new line. The title and author are separated by /, by/. The ID is
at the end of the first line of the entry. Combining these, a first
stab at a regexp would be:
# Capture order follows the entry layout: title, then author, then ID.
$line =~ /^(\w.*?), by (.*?)\s+(\d+)$/;
$title = $1;
$author = $2;
$id = $3;
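To see how that regexp behaves on the snippet, here is a minimal sketch, not a finished parser. It assumes the wrapped lines of the snippet are joined back together so each ID really is at the end of its entry's first line. The helper name parse_entry and the handling of the bracketed [Key: value] lines are my own assumptions for illustration; the real index may need more cases.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref for the first line of an entry,
# or undef if the line does not look like one.
sub parse_entry {
    my ($line) = @_;
    return unless $line =~ /^(\w.*?), by (.*?)\s+(\d+)$/;
    return { title => $1, author => $2, id => $3 };
}

# The snippet from above, with wrapped lines rejoined.
my @sample = (
    'Anna Karenina, by Lev Nikolaevica Tolstoi 13214',
    '[Language: Dutch]',
    'Night Before Christmas & Other Popular Stories For Children, by Various 13213',
    'The Wild Olive, by Basil King 13212',
    'The Pearl, by Sophie Jewett 13211',
    'El Comendador Mendoza, by Juan Valera 13210',
    '[Subtitle: Obras Completas Tomo VII]',
    '[Language: Spanish]',
);

my @entries;
for my $line (@sample) {
    if (my $entry = parse_entry($line)) {
        push @entries, $entry;
    }
    elsif (@entries && $line =~ /^\s*\[(\w+):\s*(.*?)\]\s*$/) {
        # Assumption: a bracketed line annotates the most recent entry.
        $entries[-1]{ lc $1 } = $2;
    }
}

printf "%s | %s | %s\n", @{$_}{qw(id title author)} for @entries;
```

Note that the non-greedy title stops at the first ", by ", so a title that itself contains ", by " would be split in the wrong place. That is one reason to treat this only as a first stab.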