How to convert the NCBI Gene ID to GenBank ID?

supriyoch_2008 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Perl Monks,

I am interested in converting an NCBI Gene ID to a GenBank ID. In the NCBI Gene database, when I enter 7157 as the Gene ID in the search box, the page opens with the heading "TP53 tumor protein p53 Homo sapiens (human)". Near the bottom of that page, under the sub-heading "mRNA and Protein(s)", the GenBank ID "NM_000546.5" appears (first entry) with a hyperlink. When clicked, the GenBank page opens and shows the details. This is a cumbersome process when one has to get the GenBank IDs of many genes. I searched the web for a Perl script that can convert a Gene ID to a GenBank ID directly over the internet, but I did not find one. The site http://biodb.jp/ can perform this conversion, but only through a very lengthy procedure. Then I tried to get the sequence for Gene ID 7157 using a script:

Here goes the script for sequence:

#!/usr/bin/perl
use warnings;
use strict; 

use Bio::DB::GenBank;
use Bio::SeqIO;  
use Text::Wrap; 

my $gb= new Bio::DB::GenBank; 

my $id='7157'; 

my $seq = $gb->get_Seq_by_gi($id); 

print "\n seq: $seq\n"; 

exit;

But instead of the sequence I got the wrong result in cmd, as follows:

 C:\Users\x>cd d*

C:\Users\x\Desktop>g2.pl

 seq: Bio::Seq::RichSeq=HASH(0x780b234)

C:\Users\x\Desktop>

I need suggestions from PerlMonks to solve this problem of ID conversion, so that for Gene IDs 7157 and 7422 I get results in cmd in the following format:

 
GenBank ID
NM_000546.5
NM_001025366.2


Replies are listed 'Best First'.
Re: How to convert the NCBI Gene ID to GenBank ID? by bliako (Monsignor) on Jun 22, 2018 at 15:37 UTC
As per Bio::DB::GenBank's documentation, the function you are calling, `get_Seq_by_gi()`, returns a Bio::Seq object, which is documented under Bio::Seq. That documentation shows how to print the sequence, for example with a method call like: `print $seq->seq()."\n";`

That said, be warned that in your script you hammer your '7157' ID into a remote service which expects GenBank's IDs (NM_000546.5). You do get a response, which unfortunately is garbage, as it relates to "Dictyostelium discoideum (Slime mold)" and not to "Homo sapiens". What's the difference, one can ask (sidenote: more evidence that the old GIGO effect is deep-rooted in the heart of the lite "sciences"). In order to get the response you want you must supply `get_Seq_by_id()` with this `$id='NM_000546.5'`. Then it remains to explore the documentation in order to extract what you want from that large (>100KB) dataset you just transferred from 4000km away across land and sea and possibly through aether.

If your main aim is to convert programmatically the ID 7157 to an ID understood by GenBank, e.g. NM_000546.5, then welcome to the club of gene ID conversions. Probably a tenth of the net's transfers are to sites claiming to convert between the numerous ID standards imposed by bio-narcissi and fund-whores and desperate users who somewhere got lost in these standards or found that the mapping is not 1-1. I cannot help you with that, although, in addition to BioPerl, R (Bioconductor) may offer you another lifeline.
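A minimal sketch of the accessor approach described above, assuming BioPerl is installed and NCBI is reachable. It uses `get_Seq_by_version()`, which the Bio::DB::GenBank synopsis lists for accession.version strings, in place of the `get_Seq_by_id()` call named in the reply; the script name in the comment and the choice of fields to print are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Bio::DB::GenBank;

# Accession.version strings come from the command line, e.g.
#   perl fetch_acc.pl NM_000546.5
# (fetch_acc.pl is a hypothetical file name)
my $gb = Bio::DB::GenBank->new;

for my $acc (@ARGV) {
    # get_Seq_by_version() takes an accession.version string;
    # get_Seq_by_gi(), used in the original script, takes a GI number.
    my $seq = eval { $gb->get_Seq_by_version($acc) };
    unless ($seq) {
        warn "Could not fetch $acc: ", ($@ || 'no record returned'), "\n";
        next;
    }

    # $seq is an object, so call its accessors instead of interpolating it.
    print "accession:   ", $seq->accession_number, "\n";
    print "description: ", ($seq->desc // ''), "\n";
    print "length:      ", $seq->length, "\n";
    print "first 60 nt: ", substr($seq->seq, 0, 60), "\n";
}
```

Printing `$seq` itself would still give `Bio::Seq::RichSeq=HASH(0x...)`; only the accessor calls return the data.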
Re^2: How to convert the NCBI Gene ID to GenBank ID? by Anonymous Monk on Jun 22, 2018 at 23:32 UTC
Thanks much, bliako, for mentioning the method to get the sequence from Bio::Seq. The thought that the module may already have a method (or several) to do that occurred to me much later; the advice about Data::Dumper may turn out to be of no use.
Re^3: How to convert the NCBI Gene ID to GenBank ID? by bliako (Monsignor) on Jun 23, 2018 at 09:44 UTC
Glad. If you hear of a method to do what you want, then let me know and I can assist you with the low-level details if need be. I would like to put all these converters in one place one day. By the way, the returned object is of type Bio::Seq::RichSeq and not, as I initially said, Bio::Seq. It is a superset of it, so to speak. I was citing the doc and overlooked the returned value you posted.
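One possible route for the conversion itself is NCBI's E-utilities: ELink from the `gene` database to `nuccore`, then EFetch with `rettype=acc` to turn the linked UIDs into accession.version strings. The following is only a sketch under those assumptions, not something proposed in this thread. The helper names (`parse_link_ids`, `pick_mrna_accessions`, `fetch`, `gene_to_refseq_mrna`) are made up for illustration, the XML is picked apart with a regex rather than a proper parser, and HTTP::Tiny needs IO::Socket::SSL and Net::SSLeay installed for https URLs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use HTTP::Tiny;

my $EUTILS = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils';

# Pull the linked nuccore UIDs out of an ELink XML reply.
# Only an <Id> directly inside <Link> is a target; the <Id> inside
# <IdList> is the Gene ID that was sent.
sub parse_link_ids {
    my ($xml) = @_;
    my @ids = $xml =~ m{<Link>\s*<Id>(\d+)</Id>}g;
    return @ids;
}

# Keep only mRNA RefSeq accessions (NM_ prefix) from EFetch "acc" output,
# which is plain text with one accession.version per line.
sub pick_mrna_accessions {
    my ($text) = @_;
    return grep { /^NM_\d+\.\d+$/ } split /\s+/, $text;
}

# GET a URL and return the body, or die with the HTTP status.
sub fetch {
    my ($url) = @_;
    my $res = HTTP::Tiny->new(timeout => 30)->get($url);
    die "Request failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    return $res->{content};
}

# Gene ID -> list of RefSeq mRNA accession.version strings.
sub gene_to_refseq_mrna {
    my ($gene_id) = @_;
    my $xml = fetch("$EUTILS/elink.fcgi?dbfrom=gene&db=nuccore"
                  . "&linkname=gene_nuccore_refseqrna&id=$gene_id");
    my @uids = parse_link_ids($xml);
    return unless @uids;
    my $text = fetch("$EUTILS/efetch.fcgi?db=nuccore&rettype=acc"
                   . "&retmode=text&id=" . join(',', @uids));
    return pick_mrna_accessions($text);
}

# Gene IDs come from the command line, e.g.  perl gene2acc.pl 7157 7422
for my $gene_id (@ARGV) {
    print "$gene_id\t$_\n" for gene_to_refseq_mrna($gene_id);
    sleep 1;    # pause between genes to keep the request rate low
}
```

A gene with several transcripts yields several NM_ accessions, so this does not pick out the "first entry" shown on the Gene page; as noted above, the mapping is not 1-1, and the versions returned are whatever is current at NCBI.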
Re^4: How to convert the NCBI Gene ID to GenBank ID? by Anonymous Monk on Jun 25, 2018 at 12:14 UTC
Re^3: How to convert the NCBI Gene ID to GenBank ID? by supriyoch_2008 (Monk) on Jun 23, 2018 at 17:49 UTC
Anonymous Monk, Thanks for your comments. With regards,
Re^2: How to convert the NCBI Gene ID to GenBank ID? by supriyoch_2008 (Monk) on Jun 23, 2018 at 17:47 UTC
Hi Bliako, Thank you very much for your valuable comments and suggestions. I am sorry for the late reply, as internet connectivity was not available this morning. With regards,
Re: How to convert the NCBI Gene ID to GenBank ID? by Anonymous Monk on Jun 22, 2018 at 07:02 UTC
Well, what actually is there in that HASH reference? See Data::Dumper or a similar module to see the details (if possible) underneath the reference.
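To illustrate why the script printed `Bio::Seq::RichSeq=HASH(0x...)` and what Data::Dumper would show instead, here is a small self-contained sketch. `Toy::Seq` is a hypothetical stand-in class, not part of BioPerl; a real RichSeq object is far larger, which is why `Maxdepth` is set.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Data::Dumper;

# Hypothetical stand-in for a BioPerl sequence object:
# a blessed hash with two fields and one accessor.
{
    package Toy::Seq;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub seq { return $_[0]{seq} }
}

my $obj = Toy::Seq->new(id => 'toy1', seq => 'ATGGAGGAG');

# Interpolating an object gives "Class=HASH(0x...)", not its contents.
print "interpolated: $obj\n";

# Data::Dumper shows the fields stored inside the reference.
$Data::Dumper::Sortkeys = 1;    # stable key order
$Data::Dumper::Indent   = 1;    # compact indentation
$Data::Dumper::Maxdepth = 2;    # do not descend far into big objects
print Dumper($obj);

# The accessor is the supported way to get at the data.
print "via accessor: ", $obj->seq, "\n";
```

Dumping is useful for finding out what an object holds, but code should still go through the documented accessors, since the internal hash layout is not a stable interface.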

