#!/usr/bin/perl -w
use strict;
use Data::Dumper;

# Path to the GenPept file that will be parsed.
my $genpept = "/Users/mgavibrathwaite/Desktop/proteins.gp";

# Open the file for reading and stop with a useful message on failure.
open(my $in, '<', $genpept) or die "Cannot open $genpept: $!";

# Set the input record separator to "//\n" so that each read from the
# filehandle returns one complete GenPept record. The original code
# undefined $/ and then called split() on the filehandle itself rather
# than on the file contents, so no records were ever produced.
my @genpepts;
{
    local $/ = "//\n";
    @genpepts = <$in>;
    chomp(@genpepts);    # strip the trailing "//\n" terminator
}
close($in);

# Drop any whitespace-only chunk left after the last terminator.
@genpepts = grep { /\S/ } @genpepts;

print Dumper(\@genpepts);
__DATA__
LOCUS       NP_644805                770 aa            linear   PRI 06-FEB-2011
DEFINITION  signal transducer and activator of transcription 3 isoform 1 [Homo
            sapiens].
ACCESSION   NP_644805
VERSION     NP_644805.1  GI:21618340
DBSOURCE    REFSEQ: accession NM_139276.2
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 770)
  AUTHORS   Davidson,S.I., Liu,Y., Danoy,P.A., Wu,X., Thomas,G.P., Jiang,L.,
            Sun,L., Wang,N., Han,J., Han,H., Visscher,P.M., Brown,M.A. and
            Xu,H.
  CONSRTM   Australo-Anglo-American Spondyloarthritis Consortium
  TITLE     Association of STAT3 and TNFRSF1A with ankylosing spondylitis in
            Han Chinese
  JOURNAL   Ann. Rheum. Dis. 70 (2), 289-292 (2011)
   PUBMED   21068102
  REMARK    GeneRIF: Observational study of gene-disease association. (HuGE
            Navigator)
REFERENCE   2  (residues 1 to 770)
  AUTHORS   Budarf,M.L., Goyette,P., Boucher,G., Lian,J., Graham,R.R.,
            Claudio,J.O., Hudson,T., Gladman,D., Clarke,A.E., Pope,J.E.,
            Peschken,C., Smith,C.D., Hanly,J., Rich,E., Boire,G., Barr,S.G.,
            Zummer,M., Fortin,P.R., Wither,J. and Rioux,J.D.
  CONSRTM   GenES Investigators
  TITLE     A targeted association study in systemic lupus erythematosus
            identifies multiple susceptibility alleles
  JOURNAL   Genes Immun. 12 (1), 51-58 (2011)
   PUBMED   20962850
  REMARK    GeneRIF: Observational study of gene-disease association. (HuGE
            Navigator)
REFERENCE   3  (residues 1 to 770)
  AUTHORS   Hosur,V. and Loring,R.H.
  TITLE     alpha4beta2 nicotinic receptors partially mediate anti-inflammatory
            effects through Janus kinase 2-signal transducer and activator of
            transcription 3 but not calcium or cAMP signaling
  JOURNAL   Mol. Pharmacol. 79 (1), 167-174 (2011)
   PUBMED   20943775
  REMARK    GeneRIF: A role was determined for signal transducer and activator
            of transcription 3 and Janus kinase-2 transduction in alpha4beta2
            nicotinic receptor-mediated anti-inflammatory effects.
REFERENCE   4  (residues 1 to 770)
  AUTHORS   Laukens,D., Georges,M., Libioulle,C., Sandor,C., Mni,M., Vander
            Cruyssen,B., Peeters,H., Elewaut,D. and De Vos,M.
  TITLE     Evidence for significant overlap between common risk variants for
            Crohn's disease and ankylosing spondylitis
  JOURNAL   PLoS ONE 5 (11), E13795 (2010)
   PUBMED   21072187
  REMARK    GeneRIF: Observational study of gene-disease association. (HuGE
            Navigator)
            Publication Status: Online-Only
REFERENCE   5  (residues 1 to 770)
  AUTHORS   Shukla,S., Shishodia,G., Mahata,S., Hedau,S., Pandey,A.,
            Bhambhani,S., Batra,S., Basir,S.F., Das,B.C. and Bharti,A.C.
  TITLE     Aberrant expression and constitutive activation of STAT3 in
            cervical carcinogenesis: implications in high-risk human
            papillomavirus infection
  JOURNAL   Mol. Cancer 9, 282 (2010)
   PUBMED   20977777
  REMARK    GeneRIF: Data show that in the presence of HPV16, STAT3 is
            aberrantly-expressed and constitutively-activated in cervical
            cancer which increases as the lesion progresses thus indicating
            its potential role in progression of HPV16-mediated cervical
            carcinogenesis.
            Publication Status: Online-Only
REFERENCE   6  (residues 1 to 770)
  AUTHORS   Boulton,T.G., Zhong,Z., Wen,Z., Darnell,J.E. Jr., Stahl,N. and
            Yancopoulos,G.D.
  TITLE     STAT3 activation by cytokines utilizing gp130 and related
            transducers involves a secondary modification requiring an
            H7-sensitive kinase
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A.
            92 (15), 6915-6919 (1995)
   PUBMED   7624343
REFERENCE   7  (residues 1 to 770)
  AUTHORS   Lin,J.X., Migone,T.S., Tsang,M., Friedmann,M., Weatherbee,J.A.,
            Zhou,L., Yamauchi,A., Bloom,E.T., Mietz,J., John,S. et al.
  TITLE     The role of shared receptor motifs and common Stat proteins in the
            generation of cytokine pleiotropy and redundancy by IL-2, IL-4,
            IL-7, IL-13, and IL-15
  JOURNAL   Immunity 2 (4), 331-339 (1995)
   PUBMED   7719938
REFERENCE   8  (residues 1 to 770)
  AUTHORS   Zhang,X., Blenis,J., Li,H.C., Schindler,C. and Chen-Kiang,S.
  TITLE     Requirement of serine phosphorylation for formation of
            STAT-promoter complexes
  JOURNAL   Science 267 (5206), 1990-1994 (1995)
   PUBMED   7701321
REFERENCE   9  (residues 1 to 770)
  AUTHORS   Akira,S., Nishio,Y., Inoue,M., Wang,X.J., Wei,S., Matsusaka,T.,
            Yoshida,K., Sudo,T., Naruto,M. and Kishimoto,T.
  TITLE     Molecular cloning of APRF, a novel IFN-stimulated gene factor 3
            p91-related transcription factor involved in the gp130-mediated
            signaling pathway
  JOURNAL   Cell 77 (1), 63-71 (1994)
   PUBMED   7512451
REFERENCE   10 (residues 1 to 770)
  AUTHORS   DuBridge,R.B., Tang,P., Hsia,H.C., Leong,P.M., Miller,J.H. and
            Calos,M.P.
  TITLE     Analysis of mutation in human cells by using an Epstein-Barr virus
            shuttle system
  JOURNAL   Mol. Cell. Biol. 7 (1), 379-387 (1987)
   PUBMED   3031469
COMMENT     REVIEWED REFSEQ: This record has been curated by NCBI staff. The
            reference sequence was derived from BI461226.1, BC014482.1,
            AK092965.1, CB216860.1, BC008044.2, CF454565.1 and AI631896.1.
            This sequence is a reference standard in the RefSeqGene project.
            On May 7, 2004 this sequence version replaced gi:16596688.
            Summary: The protein encoded by this gene is a member of the STAT
            protein family. In response to cytokines and growth factors, STAT
            family members are phosphorylated by the receptor associated
            kinases, and then form homo- or heterodimers that translocate to
            the cell nucleus where they act as transcription activators.
            This protein is activated through phosphorylation in response to
            various cytokines and growth factors including IFNs, EGF, IL5,
            IL6, HGF, LIF and BMP2. This protein mediates the expression of a
            variety of genes in response to cell stimuli, and thus plays a key
            role in many cellular processes such as cell growth and apoptosis.
            The small GTPase Rac1 has been shown to bind and regulate the
            activity of this protein. PIAS3 protein is a specific inhibitor of
            this protein. Three alternatively spliced transcript variants
            encoding distinct isoforms have been described. [provided by
            RefSeq].
            Transcript Variant: This variant (1) represents the longest
            transcript, and encodes the longest isoform (1).
            Publication Note: This RefSeq record includes a subset of the
            publications that are available for this gene. Please see the
            Entrez Gene record to access additional publications.
FEATURES             Location/Qualifiers
     source          1..770
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="17"
                     /map="17q21.31"
     Protein         1..770
                     /product="signal transducer and activator of transcription
                     3 isoform 1"
                     /note="acute-phase response factor; DNA-binding protein
                     APRF"
                     /calculated_mol_wt=87937
     Region          150..162
                     /region_name="Essential for nuclear import"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="propagated from UniProtKB/Swiss-Prot (P40763.2)"
     Site            539
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphotyrosine; propagated from
                     UniProtKB/Swiss-Prot (P40763.2)"
     Site            691
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine; propagated from UniProtKB/Swiss-Prot
                     (P40763.2)"
     Site            705
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphotyrosine; propagated from
                     UniProtKB/Swiss-Prot (P40763.2)"
     Site            714
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphothreonine; propagated from
                     UniProtKB/Swiss-Prot (P40763.2)"
     Site            727
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine, by NLK; propagated from
                     UniProtKB/Swiss-Prot (P40763.2)"
     CDS             1..770
                     /gene="STAT3"
                     /gene_synonym="APRF; FLJ20882; HIES; MGC16063"
                     /coded_by="NM_139276.2:241..2553"
                     /note="isoform 1 is encoded by transcript variant 1"
                     /db_xref="CCDS:CCDS32656.1"
                     /db_xref="GeneID:6774"
                     /db_xref="HGNC:11364"
                     /db_xref="HPRD:00026"
                     /db_xref="MIM:102582"
ORIGIN
        1 maqwnqlqql dtryleqlhq lysdsfpmel rqflapwies qdwayaaske shatlvfhnl
       61 lgeidqqysr flqesnvlyq hnlrrikqfl qsrylekpme iarivarclw eesrllqtaa
      121 taaqqggqan hptaavvtek qqmleqhlqd vrkrvqdleq kmkvvenlqd dfdfnyktlk
      181 sqgdmqdlng nnqsvtrqkm qqleqmltal dqmrrsivse lagllsamey vqktltdeel
      241 adwkrrqqia ciggppnicl drlenwitsl aesqlqtrqq ikkleelqqk vsykgdpivq
      301 hrpmleeriv elfrnlmksa fvverqpcmp mhpdrplvik tgvqfttkvr llvkfpelny
      361 qlkikvcidk dsgdvaalrg srkfnilgtn tkvmnmeesn ngslsaefkh ltlreqrcgn
      421 ggrancdasl ivteelhlit fetevyhqgl kidlethslp vvvisnicqm pnawasilwy
      481 nmltnnpknv nfftkppigt wdqvaevlsw qfssttkrgl sieqlttlae kllgpgvnys
      541 gcqitwakfc kenmagkgfs fwvwldniid lvkkyilalw negyimgfis kererailst
      601 kppgtfllrf sesskeggvt ftwvekdisg ktqiqsvepy tkqqlnnmsf aeiimgykim
      661 datnilvspl vylypdipke eafgkycrpe sqehpeadpg saapylktkf icvtpttcsn
      721 tidlpmsprt ldslmqfgnn gegaepsagg qfesltfdme ltsecatspm
//
LOCUS       NP_003141                769 aa            linear   PRI 06-FEB-2011
DEFINITION  signal transducer and activator of transcription 3 isoform 2 [Homo
            sapiens].
ACCESSION   NP_003141 NP_444275
VERSION     NP_003141.2  GI:21618338
DBSOURCE    REFSEQ: accession NM_003150.3
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (residues 1 to 769)
  AUTHORS   Davidson,S.I., Liu,Y., Danoy,P.A., Wu,X., Thomas,G.P., Jiang,L.,
            Sun,L., Wang,N., Han,J., Han,H., Visscher,P.M., Brown,M.A. and
            Xu,H.
  CONSRTM   Australo-Anglo-American Spondyloarthritis Consortium
  TITLE     Association of STAT3 and TNFRSF1A with ankylosing spondylitis in
            Han Chinese
  JOURNAL   Ann. Rheum. Dis. 70 (2), 289-292 (2011)
   PUBMED   21068102
  REMARK    GeneRIF: Observational study of gene-disease association. (HuGE
            Navigator)
REFERENCE   2  (residues 1 to 769)
  AUTHORS   Budarf,M.L., Goyette,P., Boucher,G., Lian,J., Graham,R.R.,
            Claudio,J.O., Hudson,T., Gladman,D., Clarke,A.E., Pope,J.E.,
            Peschken,C., Smith,C.D., Hanly,J., Rich,E., Boire,G., Barr,S.G.,
            Zummer,M., Fortin,P.R., Wither,J. and Rioux,J.D.
  CONSRTM   GenES Investigators
  TITLE     A targeted association study in systemic lupus erythematosus
            identifies multiple susceptibility alleles
  JOURNAL   Genes Immun. 12 (1), 51-58 (2011)
   PUBMED   20962850
  REMARK    GeneRIF: Observational study of gene-disease association. (HuGE
            Navigator)
REFERENCE   3  (residues 1 to 769)
  AUTHORS   Hosur,V. and Loring,R.H.
  TITLE     alpha4beta2 nicotinic receptors partially mediate anti-inflammatory
            effects through Janus kinase 2-signal transducer and activator of
            transcription 3 but not calcium or cAMP signaling
  JOURNAL   Mol. Pharmacol. 79 (1), 167-174 (2011)
   PUBMED   20943775
  REMARK    GeneRIF: A role was determined for signal transducer and activator
            of transcription 3 and Janus kinase-2 transduction in alpha4beta2
            nicotinic receptor-mediated anti-inflammatory effects.
REFERENCE   4  (residues 1 to 769)
  AUTHORS   Laukens,D., Georges,M., Libioulle,C., Sandor,C., Mni,M., Vander
            Cruyssen,B., Peeters,H., Elewaut,D. and De Vos,M.
  TITLE     Evidence for significant overlap between common risk variants for
            Crohn's disease and ankylosing spondylitis
  JOURNAL   PLoS ONE 5 (11), E13795 (2010)
   PUBMED   21072187
  REMARK    GeneRIF: Observational study of gene-disease association.
            (HuGE Navigator)
            Publication Status: Online-Only
REFERENCE   5  (residues 1 to 769)
  AUTHORS   Shukla,S., Shishodia,G., Mahata,S., Hedau,S., Pandey,A.,
            Bhambhani,S., Batra,S., Basir,S.F., Das,B.C. and Bharti,A.C.
  TITLE     Aberrant expression and constitutive activation of STAT3 in
            cervical carcinogenesis: implications in high-risk human
            papillomavirus infection
  JOURNAL   Mol. Cancer 9, 282 (2010)
   PUBMED   20977777
  REMARK    GeneRIF: Data show that in the presence of HPV16, STAT3 is
            aberrantly-expressed and constitutively-activated in cervical
            cancer which increases as the lesion progresses thus indicating
            its potential role in progression of HPV16-mediated cervical
            carcinogenesis.
            Publication Status: Online-Only
REFERENCE   6  (residues 1 to 769)
  AUTHORS   Boulton,T.G., Zhong,Z., Wen,Z., Darnell,J.E. Jr., Stahl,N. and
            Yancopoulos,G.D.
  TITLE     STAT3 activation by cytokines utilizing gp130 and related
            transducers involves a secondary modification requiring an
            H7-sensitive kinase
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 92 (15), 6915-6919 (1995)
   PUBMED   7624343
REFERENCE   7  (residues 1 to 769)
  AUTHORS   Lin,J.X., Migone,T.S., Tsang,M., Friedmann,M., Weatherbee,J.A.,
            Zhou,L., Yamauchi,A., Bloom,E.T., Mietz,J., John,S. et al.
  TITLE     The role of shared receptor motifs and common Stat proteins in the
            generation of cytokine pleiotropy and redundancy by IL-2, IL-4,
            IL-7, IL-13, and IL-15
  JOURNAL   Immunity 2 (4), 331-339 (1995)
   PUBMED   7719938
REFERENCE   8  (residues 1 to 769)
  AUTHORS   Zhang,X., Blenis,J., Li,H.C., Schindler,C. and Chen-Kiang,S.
  TITLE     Requirement of serine phosphorylation for formation of
            STAT-promoter complexes
  JOURNAL   Science 267 (5206), 1990-1994 (1995)
   PUBMED   7701321
REFERENCE   9  (residues 1 to 769)
  AUTHORS   Akira,S., Nishio,Y., Inoue,M., Wang,X.J., Wei,S., Matsusaka,T.,
            Yoshida,K., Sudo,T., Naruto,M. and Kishimoto,T.
  TITLE     Molecular cloning of APRF, a novel IFN-stimulated gene factor 3
            p91-related transcription factor involved in the gp130-mediated
            signaling pathway
  JOURNAL   Cell 77 (1), 63-71 (1994)
   PUBMED   7512451
REFERENCE   10 (residues 1 to 769)
  AUTHORS   DuBridge,R.B., Tang,P., Hsia,H.C., Leong,P.M., Miller,J.H. and
            Calos,M.P.
  TITLE     Analysis of mutation in human cells by using an Epstein-Barr virus
            shuttle system
  JOURNAL   Mol. Cell. Biol. 7 (1), 379-387 (1987)
   PUBMED   3031469
COMMENT     REVIEWED REFSEQ: This record has been curated by NCBI staff. The
            reference sequence was derived from BI461226.1, BC000627.2,
            AK092965.1, CB216860.1, BC008044.2, CF454565.1 and AI631896.1.
            On Jun 27, 2002 this sequence version replaced gi:4507253.
            Summary: The protein encoded by this gene is a member of the STAT
            protein family. In response to cytokines and growth factors, STAT
            family members are phosphorylated by the receptor associated
            kinases, and then form homo- or heterodimers that translocate to
            the cell nucleus where they act as transcription activators. This
            protein is activated through phosphorylation in response to
            various cytokines and growth factors including IFNs, EGF, IL5,
            IL6, HGF, LIF and BMP2. This protein mediates the expression of a
            variety of genes in response to cell stimuli, and thus plays a key
            role in many cellular processes such as cell growth and apoptosis.
            The small GTPase Rac1 has been shown to bind and regulate the
            activity of this protein. PIAS3 protein is a specific inhibitor of
            this protein. Three alternatively spliced transcript variants
            encoding distinct isoforms have been described. [provided by
            RefSeq].
            Transcript Variant: This variant (2) lacks a segment in the 5' UTR
            and 3 nt within the CDS, as compared to variant 1. The resulting
            isoform (2) lacks an amino acid compared to isoform 1.
            Publication Note: This RefSeq record includes a subset of the
            publications that are available for this gene. Please see the
            Entrez Gene record to access additional publications.
FEATURES             Location/Qualifiers
     source          1..769
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="17"
                     /map="17q21.31"
     Protein         1..769
                     /product="signal transducer and activator of transcription
                     3 isoform 2"
                     /note="acute-phase response factor; DNA-binding protein
                     APRF"
                     /calculated_mol_wt=87850
     Region          150..162
                     /region_name="Essential for nuclear import"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="propagated from UniProtKB/Swiss-Prot (P40763.2)"
     Site            539
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphotyrosine; propagated from
                     UniProtKB/Swiss-Prot (P40763.2)"
     Site            691
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine; propagated from UniProtKB/Swiss-Prot
                     (P40763.2)"
     Site            704
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphotyrosine; propagated from
                     UniProtKB/Swiss-Prot (P40763.2)"
     Site            713
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphothreonine; propagated from
                     UniProtKB/Swiss-Prot (P40763.2)"
     Site            726
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine, by NLK; propagated from
                     UniProtKB/Swiss-Prot (P40763.2)"
     CDS             1..769
                     /gene="STAT3"
                     /gene_synonym="APRF; FLJ20882; HIES; MGC16063"
                     /coded_by="NM_003150.3:219..2528"
                     /note="isoform 2 is encoded by transcript variant 2"
                     /db_xref="CCDS:CCDS32657.1"
                     /db_xref="GeneID:6774"
                     /db_xref="HGNC:11364"
                     /db_xref="HPRD:00026"
                     /db_xref="MIM:102582"
ORIGIN
        1 maqwnqlqql dtryleqlhq lysdsfpmel rqflapwies qdwayaaske shatlvfhnl
       61 lgeidqqysr flqesnvlyq hnlrrikqfl qsrylekpme iarivarclw eesrllqtaa
      121 taaqqggqan hptaavvtek qqmleqhlqd vrkrvqdleq kmkvvenlqd dfdfnyktlk
      181 sqgdmqdlng nnqsvtrqkm qqleqmltal dqmrrsivse lagllsamey vqktltdeel
      241 adwkrrqqia ciggppnicl drlenwitsl aesqlqtrqq ikkleelqqk vsykgdpivq
      301 hrpmleeriv elfrnlmksa fvverqpcmp mhpdrplvik tgvqfttkvr llvkfpelny
      361 qlkikvcidk dsgdvaalrg srkfnilgtn tkvmnmeesn ngslsaefkh ltlreqrcgn
      421 ggrancdasl ivteelhlit fetevyhqgl kidlethslp vvvisnicqm pnawasilwy
      481 nmltnnpknv nfftkppigt wdqvaevlsw qfssttkrgl sieqlttlae kllgpgvnys
      541 gcqitwakfc kenmagkgfs fwvwldniid lvkkyilalw negyimgfis kererailst
      601 kppgtfllrf sesskeggvt ftwvekdisg ktqiqsvepy tkqqlnnmsf aeiimgykim
      661 datnilvspl vylypdipke eafgkycrpe sqehpeadpg aapylktkfi cvtpttcsnt
      721 idlpmsprtl dslmqfgnng egaepsaggq fesltfdmel tsecatspm
//