Toolic,
I was trying to use split to parse a file of records and realized that approach was incorrect. I then read about the input record separator ($/), tried to set it, and called split afterwards, but the output did not change. Data::Dumper shows that only the first line is being parsed. Here are my code and an example file:
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
# create scalar variable to define the file that will be
# parsed.
my $genpept = "/Users/mgavibrathwaite/Desktop/proteins.gp";
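# Open the file for reading and stop with an error if that fails.
open(my $in, '<', $genpept) or die "Cannot open $genpept: $!";
# Attempt to set the global record separator to "//"
# (note: undef $/ actually enables slurp mode rather than setting "//").
undef $/;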
my @genpepts = split(/^\w{5}/,$in);
print Dumper(@genpepts);
__DATA__
LOCUS NP_644805 770 aa linear PRI 06-FEB-2011
DEFINITION signal transducer and activator of transcription 3 isoform 1 [Homo
sapiens].
ACCESSION NP_644805
VERSION NP_644805.1 GI:21618340
DBSOURCE REFSEQ: accession NM_139276.2
KEYWORDS .
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Eutele
+ostomi;
Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhin
+i;
Catarrhini; Hominidae; Homo.
REFERENCE 1 (residues 1 to 770)
AUTHORS Davidson,S.I., Liu,Y., Danoy,P.A., Wu,X., Thomas,G.P., Jia
+ng,L.,
Sun,L., Wang,N., Han,J., Han,H., Visscher,P.M., Brown,M.A.
+ and
Xu,H.
CONSRTM Australo-Anglo-American Spondyloarthritis Consortium
TITLE Association of STAT3 and TNFRSF1A with ankylosing spondyli
+tis in
Han Chinese
JOURNAL Ann. Rheum. Dis. 70 (2), 289-292 (2011)
PUBMED 21068102
REMARK GeneRIF: Observational study of gene-disease association.
+(HuGE
Navigator)
REFERENCE 2 (residues 1 to 770)
AUTHORS Budarf,M.L., Goyette,P., Boucher,G., Lian,J., Graham,R.R.,
Claudio,J.O., Hudson,T., Gladman,D., Clarke,A.E., Pope,J.E
+.,
Peschken,C., Smith,C.D., Hanly,J., Rich,E., Boire,G., Barr
+,S.G.,
Zummer,M., Fortin,P.R., Wither,J. and Rioux,J.D.
CONSRTM GenES Investigators
TITLE A targeted association study in systemic lupus erythematos
+us
identifies multiple susceptibility alleles
JOURNAL Genes Immun. 12 (1), 51-58 (2011)
PUBMED 20962850
REMARK GeneRIF: Observational study of gene-disease association.
+(HuGE
Navigator)
REFERENCE 3 (residues 1 to 770)
AUTHORS Hosur,V. and Loring,R.H.
TITLE alpha4beta2 nicotinic receptors partially mediate anti-inf
+lammatory
effects through Janus kinase 2-signal transducer and activ
+ator of
transcription 3 but not calcium or cAMP signaling
JOURNAL Mol. Pharmacol. 79 (1), 167-174 (2011)
PUBMED 20943775
REMARK GeneRIF: A role was determined for signal transducer and a
+ctivator
of transcription 3 and Janus kinase-2 transduction in alph
+a4beta2
nicotinic receptor-mediated anti-inflammatory effects.
REFERENCE 4 (residues 1 to 770)
AUTHORS Laukens,D., Georges,M., Libioulle,C., Sandor,C., Mni,M., V
+ander
Cruyssen,B., Peeters,H., Elewaut,D. and De Vos,M.
TITLE Evidence for significant overlap between common risk varia
+nts for
Crohn's disease and ankylosing spondylitis
JOURNAL PLoS ONE 5 (11), E13795 (2010)
PUBMED 21072187
REMARK GeneRIF: Observational study of gene-disease association.
+(HuGE
Navigator)
Publication Status: Online-Only
REFERENCE 5 (residues 1 to 770)
AUTHORS Shukla,S., Shishodia,G., Mahata,S., Hedau,S., Pandey,A.,
Bhambhani,S., Batra,S., Basir,S.F., Das,B.C. and Bharti,A.
+C.
TITLE Aberrant expression and constitutive activation of STAT3 i
+n
cervical carcinogenesis: implications in high-risk human
papillomavirus infection
JOURNAL Mol. Cancer 9, 282 (2010)
PUBMED 20977777
REMARK GeneRIF: Data show that in the presence of HPV16, STAT3 is
aberrantly-expressed and constitutively-activated in cervi
+cal
cancer which increases as the lesion progresses thus indic
+ating its
potential role in progression of HPV16-mediated cervical
carcinogenesis.
Publication Status: Online-Only
REFERENCE 6 (residues 1 to 770)
AUTHORS Boulton,T.G., Zhong,Z., Wen,Z., Darnell,J.E. Jr., Stahl,N.
+ and
Yancopoulos,G.D.
TITLE STAT3 activation by cytokines utilizing gp130 and related
transducers involves a secondary modification requiring an
H7-sensitive kinase
JOURNAL Proc. Natl. Acad. Sci. U.S.A. 92 (15), 6915-6919 (1995)
PUBMED 7624343
REFERENCE 7 (residues 1 to 770)
AUTHORS Lin,J.X., Migone,T.S., Tsang,M., Friedmann,M., Weatherbee,
+J.A.,
Zhou,L., Yamauchi,A., Bloom,E.T., Mietz,J., John,S. et al.
TITLE The role of shared receptor motifs and common Stat protein
+s in the
generation of cytokine pleiotropy and redundancy by IL-2,
+IL-4,
IL-7, IL-13, and IL-15
JOURNAL Immunity 2 (4), 331-339 (1995)
PUBMED 7719938
REFERENCE 8 (residues 1 to 770)
AUTHORS Zhang,X., Blenis,J., Li,H.C., Schindler,C. and Chen-Kiang,
+S.
TITLE Requirement of serine phosphorylation for formation of
STAT-promoter complexes
JOURNAL Science 267 (5206), 1990-1994 (1995)
PUBMED 7701321
REFERENCE 9 (residues 1 to 770)
AUTHORS Akira,S., Nishio,Y., Inoue,M., Wang,X.J., Wei,S., Matsusak
+a,T.,
Yoshida,K., Sudo,T., Naruto,M. and Kishimoto,T.
TITLE Molecular cloning of APRF, a novel IFN-stimulated gene fac
+tor 3
p91-related transcription factor involved in the gp130-med
+iated
signaling pathway
JOURNAL Cell 77 (1), 63-71 (1994)
PUBMED 7512451
REFERENCE 10 (residues 1 to 770)
AUTHORS DuBridge,R.B., Tang,P., Hsia,H.C., Leong,P.M., Miller,J.H.
+ and
Calos,M.P.
TITLE Analysis of mutation in human cells by using an Epstein-Ba
+rr virus
shuttle system
JOURNAL Mol. Cell. Biol. 7 (1), 379-387 (1987)
PUBMED 3031469
COMMENT REVIEWED REFSEQ: This record has been curated by NCBI staf
+f. The
reference sequence was derived from BI461226.1, BC014482.1
+,
AK092965.1, CB216860.1, BC008044.2, CF454565.1 and AI63189
+6.1.
This sequence is a reference standard in the RefSeqGene pr
+oject.
On May 7, 2004 this sequence version replaced gi:16596688.
Summary: The protein encoded by this gene is a member of t
+he STAT
protein family. In response to cytokines and growth factor
+s, STAT
family members are phosphorylated by the receptor associat
+ed
kinases, and then form homo- or heterodimers that transloc
+ate to
the cell nucleus where they act as transcription activator
+s. This
protein is activated through phosphorylation in response t
+o various
cytokines and growth factors including IFNs, EGF, IL5, IL6
+, HGF,
LIF and BMP2. This protein mediates the expression of a va
+riety of
genes in response to cell stimuli, and thus plays a key ro
+le in
many cellular processes such as cell growth and apoptosis.
+ The
small GTPase Rac1 has been shown to bind and regulate the
+activity
of this protein. PIAS3 protein is a specific inhibitor of
+this
protein. Three alternatively spliced transcript variants e
+ncoding
distinct isoforms have been described. [provided by RefSeq
+].
Transcript Variant: This variant (1) represents the longes
+t
transcript, and encodes the longest isoform (1).
Publication Note: This RefSeq record includes a subset of
+ the
publications that are available for this gene. Please see
+the
Entrez Gene record to access additional publications.
FEATURES Location/Qualifiers
source 1..770
/organism="Homo sapiens"
/db_xref="taxon:9606"
/chromosome="17"
/map="17q21.31"
Protein 1..770
/product="signal transducer and activator of tran
+scription
3 isoform 1"
/note="acute-phase response factor; DNA-binding p
+rotein
APRF"
/calculated_mol_wt=87937
Region 150..162
/region_name="Essential for nuclear import"
/experiment="experimental evidence, no additional
+ details
recorded"
/note="propagated from UniProtKB/Swiss-Prot (P407
+63.2)"
Site 539
/site_type="phosphorylation"
/experiment="experimental evidence, no additional
+ details
recorded"
/note="Phosphotyrosine; propagated from
UniProtKB/Swiss-Prot (P40763.2)"
Site 691
/site_type="phosphorylation"
/experiment="experimental evidence, no additional
+ details
recorded"
/note="Phosphoserine; propagated from UniProtKB/S
+wiss-Prot
(P40763.2)"
Site 705
/site_type="phosphorylation"
/experiment="experimental evidence, no additional
+ details
recorded"
/note="Phosphotyrosine; propagated from
UniProtKB/Swiss-Prot (P40763.2)"
Site 714
/site_type="phosphorylation"
/experiment="experimental evidence, no additional
+ details
recorded"
/note="Phosphothreonine; propagated from
UniProtKB/Swiss-Prot (P40763.2)"
Site 727
/site_type="phosphorylation"
/experiment="experimental evidence, no additional
+ details
recorded"
/note="Phosphoserine, by NLK; propagated from
UniProtKB/Swiss-Prot (P40763.2)"
CDS 1..770
/gene="STAT3"
/gene_synonym="APRF; FLJ20882; HIES; MGC16063"
/coded_by="NM_139276.2:241..2553"
/note="isoform 1 is encoded by transcript variant
+ 1"
/db_xref="CCDS:CCDS32656.1"
/db_xref="GeneID:6774"
/db_xref="HGNC:11364"
/db_xref="HPRD:00026"
/db_xref="MIM:102582"
ORIGIN
1 maqwnqlqql dtryleqlhq lysdsfpmel rqflapwies qdwayaaske shatl
+vfhnl
61 lgeidqqysr flqesnvlyq hnlrrikqfl qsrylekpme iarivarclw eesrl
+lqtaa
121 taaqqggqan hptaavvtek qqmleqhlqd vrkrvqdleq kmkvvenlqd dfdfn
+yktlk
181 sqgdmqdlng nnqsvtrqkm qqleqmltal dqmrrsivse lagllsamey vqktl
+tdeel
241 adwkrrqqia ciggppnicl drlenwitsl aesqlqtrqq ikkleelqqk vsykg
+dpivq
301 hrpmleeriv elfrnlmksa fvverqpcmp mhpdrplvik tgvqfttkvr llvkf
+pelny
361 qlkikvcidk dsgdvaalrg srkfnilgtn tkvmnmeesn ngslsaefkh ltlre
+qrcgn
421 ggrancdasl ivteelhlit fetevyhqgl kidlethslp vvvisnicqm pnawa
+silwy
481 nmltnnpknv nfftkppigt wdqvaevlsw qfssttkrgl sieqlttlae kllgp
+gvnys
541 gcqitwakfc kenmagkgfs fwvwldniid lvkkyilalw negyimgfis kerer
+ailst
601 kppgtfllrf sesskeggvt ftwvekdisg ktqiqsvepy tkqqlnnmsf aeiim
+gykim
661 datnilvspl vylypdipke eafgkycrpe sqehpeadpg saapylktkf icvtp
+ttcsn
721 tidlpmsprt ldslmqfgnn gegaepsagg qfesltfdme ltsecatspm
//
LOCUS NP_003141 769 aa linear PRI 06-FEB-2011
DEFINITION signal transducer and activator of transcription 3 isoform 2 [Homo
sapiens].
ACCESSION NP_003141 NP_444275
VERSION NP_003141.2 GI:21618338
DBSOURCE REFSEQ: accession NM_003150.3
KEYWORDS .
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Eutele
+ostomi;
Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhin
+i;
Catarrhini; Hominidae; Homo.
REFERENCE 1 (residues 1 to 769)
AUTHORS Davidson,S.I., Liu,Y., Danoy,P.A., Wu,X., Thomas,G.P., Jia
+ng,L.,
Sun,L., Wang,N., Han,J., Han,H., Visscher,P.M., Brown,M.A.
+ and
Xu,H.
CONSRTM Australo-Anglo-American Spondyloarthritis Consortium
TITLE Association of STAT3 and TNFRSF1A with ankylosing spondyli
+tis in
Han Chinese
JOURNAL Ann. Rheum. Dis. 70 (2), 289-292 (2011)
PUBMED 21068102
REMARK GeneRIF: Observational study of gene-disease association.
+(HuGE
Navigator)
REFERENCE 2 (residues 1 to 769)
AUTHORS Budarf,M.L., Goyette,P., Boucher,G., Lian,J., Graham,R.R.,
Claudio,J.O., Hudson,T., Gladman,D., Clarke,A.E., Pope,J.E
+.,
Peschken,C., Smith,C.D., Hanly,J., Rich,E., Boire,G., Barr
+,S.G.,
Zummer,M., Fortin,P.R., Wither,J. and Rioux,J.D.
CONSRTM GenES Investigators
TITLE A targeted association study in systemic lupus erythematos
+us
identifies multiple susceptibility alleles
JOURNAL Genes Immun. 12 (1), 51-58 (2011)
PUBMED 20962850
REMARK GeneRIF: Observational study of gene-disease association.
+(HuGE
Navigator)
REFERENCE 3 (residues 1 to 769)
AUTHORS Hosur,V. and Loring,R.H.
TITLE alpha4beta2 nicotinic receptors partially mediate anti-inf
+lammatory
effects through Janus kinase 2-signal transducer and activ
+ator of
transcription 3 but not calcium or cAMP signaling
JOURNAL Mol. Pharmacol. 79 (1), 167-174 (2011)
PUBMED 20943775
REMARK GeneRIF: A role was determined for signal transducer and a
+ctivator
of transcription 3 and Janus kinase-2 transduction in alph
+a4beta2
nicotinic receptor-mediated anti-inflammatory effects.
REFERENCE 4 (residues 1 to 769)
AUTHORS Laukens,D., Georges,M., Libioulle,C., Sandor,C., Mni,M., V
+ander
Cruyssen,B., Peeters,H., Elewaut,D. and De Vos,M.
TITLE Evidence for significant overlap between common risk varia
+nts for
Crohn's disease and ankylosing spondylitis
JOURNAL PLoS ONE 5 (11), E13795 (2010)
PUBMED 21072187
REMARK GeneRIF: Observational study of gene-disease association.
+(HuGE
Navigator)
Publication Status: Online-Only
REFERENCE 5 (residues 1 to 769)
AUTHORS Shukla,S., Shishodia,G., Mahata,S., Hedau,S., Pandey,A.,
Bhambhani,S., Batra,S., Basir,S.F., Das,B.C. and Bharti,A.
+C.
TITLE Aberrant expression and constitutive activation of STAT3 i
+n
cervical carcinogenesis: implications in high-risk human
papillomavirus infection
JOURNAL Mol. Cancer 9, 282 (2010)
PUBMED 20977777
REMARK GeneRIF: Data show that in the presence of HPV16, STAT3 is
aberrantly-expressed and constitutively-activated in cervi
+cal
cancer which increases as the lesion progresses thus indic
+ating its
potential role in progression of HPV16-mediated cervical
carcinogenesis.
Publication Status: Online-Only
REFERENCE 6 (residues 1 to 769)
AUTHORS Boulton,T.G., Zhong,Z., Wen,Z., Darnell,J.E. Jr., Stahl,N.
+ and
Yancopoulos,G.D.
TITLE STAT3 activation by cytokines utilizing gp130 and related
transducers involves a secondary modification requiring an
H7-sensitive kinase
JOURNAL Proc. Natl. Acad. Sci. U.S.A. 92 (15), 6915-6919 (1995)
PUBMED 7624343
REFERENCE 7 (residues 1 to 769)
AUTHORS Lin,J.X., Migone,T.S., Tsang,M., Friedmann,M., Weatherbee,
+J.A.,
Zhou,L., Yamauchi,A., Bloom,E.T., Mietz,J., John,S. et al.
TITLE The role of shared receptor motifs and common Stat protein
+s in the
generation of cytokine pleiotropy and redundancy by IL-2,
+IL-4,
IL-7, IL-13, and IL-15
JOURNAL Immunity 2 (4), 331-339 (1995)
PUBMED 7719938
REFERENCE 8 (residues 1 to 769)
AUTHORS Zhang,X., Blenis,J., Li,H.C., Schindler,C. and Chen-Kiang,
+S.
TITLE Requirement of serine phosphorylation for formation of
STAT-promoter complexes
JOURNAL Science 267 (5206), 1990-1994 (1995)
PUBMED 7701321
REFERENCE 9 (residues 1 to 769)
AUTHORS Akira,S., Nishio,Y., Inoue,M., Wang,X.J., Wei,S., Matsusak
+a,T.,
Yoshida,K., Sudo,T., Naruto,M. and Kishimoto,T.
TITLE Molecular cloning of APRF, a novel IFN-stimulated gene fac
+tor 3
p91-related transcription factor involved in the gp130-med
+iated
signaling pathway
JOURNAL Cell 77 (1), 63-71 (1994)
PUBMED 7512451
REFERENCE 10 (residues 1 to 769)
AUTHORS DuBridge,R.B., Tang,P., Hsia,H.C., Leong,P.M., Miller,J.H.
+ and
Calos,M.P.
TITLE Analysis of mutation in human cells by using an Epstein-Ba
+rr virus
shuttle system
JOURNAL Mol. Cell. Biol. 7 (1), 379-387 (1987)
PUBMED 3031469
COMMENT REVIEWED REFSEQ: This record has been curated by NCBI staf
+f. The
reference sequence was derived from BI461226.1, BC000627.2
+,
AK092965.1, CB216860.1, BC008044.2, CF454565.1 and AI63189
+6.1.
On Jun 27, 2002 this sequence version replaced gi:4507253.
Summary: The protein encoded by this gene is a member of t
+he STAT
protein family. In response to cytokines and growth factor
+s, STAT
family members are phosphorylated by the receptor associat
+ed
kinases, and then form homo- or heterodimers that transloc
+ate to
the cell nucleus where they act as transcription activator
+s. This
protein is activated through phosphorylation in response t
+o various
cytokines and growth factors including IFNs, EGF, IL5, IL6
+, HGF,
LIF and BMP2. This protein mediates the expression of a va
+riety of
genes in response to cell stimuli, and thus plays a key ro
+le in
many cellular processes such as cell growth and apoptosis.
+ The
small GTPase Rac1 has been shown to bind and regulate the
+activity
of this protein. PIAS3 protein is a specific inhibitor of
+this
protein. Three alternatively spliced transcript variants e
+ncoding
distinct isoforms have been described. [provided by RefSeq
+].
Transcript Variant: This variant (2) lacks a segment in th
+e 5' UTR
and 3 nt within the CDS, as compared to variant 1. The res
+ulting
isoform (2) lacks an amino acid compared to isoform 1.
Publication Note: This RefSeq record includes a subset of
+ the
publications that are available for this gene. Please see
+the
Entrez Gene record to access additional publications.
FEATURES Location/Qualifiers
source 1..769
/organism="Homo sapiens"
/db_xref="taxon:9606"
/chromosome="17"
/map="17q21.31"
Protein 1..769
/product="signal transducer and activator of tran
+scription
3 isoform 2"
/note="acute-phase response factor; DNA-binding p
+rotein
APRF"
/calculated_mol_wt=87850
Region 150..162
/region_name="Essential for nuclear import"
/experiment="experimental evidence, no additional
+ details
recorded"
/note="propagated from UniProtKB/Swiss-Prot (P407
+63.2)"
Site 539
/site_type="phosphorylation"
/experiment="experimental evidence, no additional
+ details
recorded"
/note="Phosphotyrosine; propagated from
UniProtKB/Swiss-Prot (P40763.2)"
Site 691
/site_type="phosphorylation"
/experiment="experimental evidence, no additional
+ details
recorded"
/note="Phosphoserine; propagated from UniProtKB/S
+wiss-Prot
(P40763.2)"
Site 704
/site_type="phosphorylation"
/experiment="experimental evidence, no additional
+ details
recorded"
/note="Phosphotyrosine; propagated from
UniProtKB/Swiss-Prot (P40763.2)"
Site 713
/site_type="phosphorylation"
/experiment="experimental evidence, no additional
+ details
recorded"
/note="Phosphothreonine; propagated from
UniProtKB/Swiss-Prot (P40763.2)"
Site 726
/site_type="phosphorylation"
/experiment="experimental evidence, no additional
+ details
recorded"
/note="Phosphoserine, by NLK; propagated from
UniProtKB/Swiss-Prot (P40763.2)"
CDS 1..769
/gene="STAT3"
/gene_synonym="APRF; FLJ20882; HIES; MGC16063"
/coded_by="NM_003150.3:219..2528"
/note="isoform 2 is encoded by transcript variant
+ 2"
/db_xref="CCDS:CCDS32657.1"
/db_xref="GeneID:6774"
/db_xref="HGNC:11364"
/db_xref="HPRD:00026"
/db_xref="MIM:102582"
ORIGIN
1 maqwnqlqql dtryleqlhq lysdsfpmel rqflapwies qdwayaaske shatl
+vfhnl
61 lgeidqqysr flqesnvlyq hnlrrikqfl qsrylekpme iarivarclw eesrl
+lqtaa
121 taaqqggqan hptaavvtek qqmleqhlqd vrkrvqdleq kmkvvenlqd dfdfn
+yktlk
181 sqgdmqdlng nnqsvtrqkm qqleqmltal dqmrrsivse lagllsamey vqktl
+tdeel
241 adwkrrqqia ciggppnicl drlenwitsl aesqlqtrqq ikkleelqqk vsykg
+dpivq
301 hrpmleeriv elfrnlmksa fvverqpcmp mhpdrplvik tgvqfttkvr llvkf
+pelny
361 qlkikvcidk dsgdvaalrg srkfnilgtn tkvmnmeesn ngslsaefkh ltlre
+qrcgn
421 ggrancdasl ivteelhlit fetevyhqgl kidlethslp vvvisnicqm pnawa
+silwy
481 nmltnnpknv nfftkppigt wdqvaevlsw qfssttkrgl sieqlttlae kllgp
+gvnys
541 gcqitwakfc kenmagkgfs fwvwldniid lvkkyilalw negyimgfis kerer
+ailst
601 kppgtfllrf sesskeggvt ftwvekdisg ktqiqsvepy tkqqlnnmsf aeiim
+gykim
661 datnilvspl vylypdipke eafgkycrpe sqehpeadpg aapylktkfi cvtpt
+tcsnt
721 idlpmsprtl dslmqfgnng egaepsaggq fesltfdmel tsecatspm
//
The desired contents of my array are the individual records, each of which starts with LOCUS and ends with //. Thanks for letting me know how to improve my post.
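To make the goal concrete, here is a minimal sketch of the result I am after, assuming every record ends with a line that contains only //. The two shortened records and the helper name read_records are made up for illustration; the real input would be proteins.gp. It sets $/ to "//\n" so that each read returns one whole record, which may or may not be the best way to do this.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical two-record sample; the real input would be proteins.gp.
my $sample = <<'END';
LOCUS NP_644805 770 aa linear PRI 06-FEB-2011
DEFINITION signal transducer and activator of transcription 3 isoform 1
//
LOCUS NP_003141 769 aa linear PRI 06-FEB-2011
DEFINITION signal transducer and activator of transcription 3 isoform 2
//
END

# read_records is an illustrative helper name, not part of any module.
sub read_records {
    my ($fh) = @_;
    # Each readline now returns everything up to and including "//\n".
    local $/ = "//\n";
    my @records = <$fh>;
    # chomp removes the current value of $/ (the "//\n" terminator).
    chomp(@records);
    # Keep only chunks that contain something other than whitespace.
    return grep { /\S/ } @records;
}

# Open the string as an in-memory filehandle so the sketch is self-contained.
open(my $in, '<', \$sample) or die "Cannot open in-memory sample: $!";
my @genpepts = read_records($in);
close($in);

# Report how many records were found and the accession of each one.
printf "%d records\n", scalar @genpepts;
for my $record (@genpepts) {
    my ($accession) = $record =~ /^LOCUS\s+(\S+)/;
    print "$accession\n";
}
```

With the real file, the in-memory open would be replaced by a three-argument open on $genpept. Each element of @genpepts would then hold one complete record, from its LOCUS line down to the line before //.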