Depending on how deeply you want to go into the programming, there are several Perl approaches; the following depend on having BioPerl installed.
If you want to go the BioPerl route, make sure you use a recent version, like 1.5.2 (never mind the 'developer version' comment on that page). For searching GenBank, just read the HOWTOs:
Bioperl Beginners HOWTO
Bioperl SeqIO HOWTO
An alternative may be BioMart (see especially the martview button). BioMart lets you click together your query interactively, which is already useful in itself. On top of that, you can get the same query in XML form by clicking the XML button and saving the query (not the data) in XML format. The saved query XML can then be re-submitted programmatically, possibly after tweaking it. This makes it pretty easy to get started submitting queries to this specialized database.
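To make the "save the query XML, tweak it, re-submit it" idea concrete, here is a minimal sketch. It is written in Python with only the standard library, purely for illustration, since the step itself does not depend on the language. The query text, the dataset, filter and attribute names, and the service URL are all assumptions standing in for whatever the XML button gives you and whatever endpoint your BioMart server documents; the sketch builds the request but does not send it.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical query, standing in for the XML saved via the XML button.
# Dataset, filter and attribute names are illustrative assumptions.
SAVED_QUERY = """<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="0" count="" datasetConfigVersion="0.6">
  <Dataset name="hsapiens_gene_ensembl" interface="default">
    <Filter name="chromosome_name" value="1"/>
    <Attribute name="ensembl_gene_id"/>
  </Dataset>
</Query>"""

# Placeholder URL (assumption); replace with your server's service endpoint.
MART_URL = "http://www.example.org/biomart/martservice"


def tweak_filter(query_xml, filter_name, new_value):
    """Return the query XML with the named filter set to new_value."""
    root = ET.fromstring(query_xml)
    for filt in root.iter("Filter"):
        if filt.get("name") == filter_name:
            filt.set("value", new_value)
    return ET.tostring(root, encoding="unicode")


def build_request(service_url, query_xml):
    """Wrap the query XML in a form-encoded POST request (not sent here)."""
    # Assumes the service expects the XML in a form field named "query".
    body = urllib.parse.urlencode({"query": query_xml}).encode("ascii")
    return urllib.request.Request(service_url, data=body)


tweaked = tweak_filter(SAVED_QUERY, "chromosome_name", "X")
request = build_request(MART_URL, tweaked)
# Submitting would be: urllib.request.urlopen(request).read()
# It is left out so the sketch runs without network access.
```

Parsing the saved XML rather than editing it as a string keeps the tweak safe when attribute order or whitespace in the saved file changes.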
You get the most control by using the Ensembl Perl API (see the Perl API installation page). This API lets you connect to the MySQL servers at EBI (or you can download the data and install it locally).
A tutorial that goes well beyond a first introduction is here: Ensembl Core Perl API tutorial.
Ensembl and BioMart have basically the same data (although there are some differences), with BioMart being a denormalized version of the master data in Ensembl.
Needless to say, there is a lot of annotation, cross-referencing and expertise at EBI, and this is built into these APIs.