PerlMonks  

Toolic,
I was trying to use the split function to parse a file of records, then realized
that was the wrong approach. I then read about the record separator ($/) and
tried to set it before calling split.
I still get the same output. Using Data::Dumper, I can see that only
the first line is being parsed. Here is my code and an example file:
#!/usr/bin/perl -w
use strict;
use Data::Dumper;

# create scalar variable to define the file that will be
# parsed.
my $genpept = "/Users/mgavibrathwaite/Desktop/proteins.gp";

#Set the global record separator to "//"
open(my $in,"$genpept");
undef $/;
my @genpepts = split(/^\w{5}/,$in);
print Dumper(@genpepts);

__DATA__
LOCUS NP_644805 770 aa linear PRI 06 +-FEB-2011 DEFINITION signal transducer and activator of transcription 3 isoform + 1 [Homo sapiens]. ACCESSION NP_644805 VERSION NP_644805.1 GI:21618340 DBSOURCE REFSEQ: accession NM_139276.2 KEYWORDS . SOURCE Homo sapiens (human) ORGANISM Homo sapiens Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Eutele +ostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhin +i; Catarrhini; Hominidae; Homo. REFERENCE 1 (residues 1 to 770) AUTHORS Davidson,S.I., Liu,Y., Danoy,P.A., Wu,X., Thomas,G.P., Jia +ng,L., Sun,L., Wang,N., Han,J., Han,H., Visscher,P.M., Brown,M.A. + and Xu,H. CONSRTM Australo-Anglo-American Spondyloarthritis Consortium TITLE Association of STAT3 and TNFRSF1A with ankylosing spondyli +tis in Han Chinese JOURNAL Ann. Rheum. Dis. 70 (2), 289-292 (2011) PUBMED 21068102 REMARK GeneRIF: Observational study of gene-disease association. +(HuGE Navigator) REFERENCE 2 (residues 1 to 770) AUTHORS Budarf,M.L., Goyette,P., Boucher,G., Lian,J., Graham,R.R., Claudio,J.O., Hudson,T., Gladman,D., Clarke,A.E., Pope,J.E +., Peschken,C., Smith,C.D., Hanly,J., Rich,E., Boire,G., Barr +,S.G., Zummer,M., Fortin,P.R., Wither,J. and Rioux,J.D. CONSRTM GenES Investigators TITLE A targeted association study in systemic lupus erythematos +us identifies multiple susceptibility alleles JOURNAL Genes Immun. 12 (1), 51-58 (2011) PUBMED 20962850 REMARK GeneRIF: Observational study of gene-disease association. +(HuGE Navigator) REFERENCE 3 (residues 1 to 770) AUTHORS Hosur,V. and Loring,R.H. 
TITLE alpha4beta2 nicotinic receptors partially mediate anti-inf +lammatory effects through Janus kinase 2-signal transducer and activ +ator of transcription 3 but not calcium or cAMP signaling JOURNAL Mol. Pharmacol. 79 (1), 167-174 (2011) PUBMED 20943775 REMARK GeneRIF: A role was determined for signal transducer and a +ctivator of transcription 3 and Janus kinase-2 transduction in alph +a4beta2 nicotinic receptor-mediated anti-inflammatory effects. REFERENCE 4 (residues 1 to 770) AUTHORS Laukens,D., Georges,M., Libioulle,C., Sandor,C., Mni,M., V +ander Cruyssen,B., Peeters,H., Elewaut,D. and De Vos,M. TITLE Evidence for significant overlap between common risk varia +nts for Crohn's disease and ankylosing spondylitis JOURNAL PLoS ONE 5 (11), E13795 (2010) PUBMED 21072187 REMARK GeneRIF: Observational study of gene-disease association. +(HuGE Navigator) Publication Status: Online-Only REFERENCE 5 (residues 1 to 770) AUTHORS Shukla,S., Shishodia,G., Mahata,S., Hedau,S., Pandey,A., Bhambhani,S., Batra,S., Basir,S.F., Das,B.C. and Bharti,A. +C. TITLE Aberrant expression and constitutive activation of STAT3 i +n cervical carcinogenesis: implications in high-risk human papillomavirus infection JOURNAL Mol. Cancer 9, 282 (2010) PUBMED 20977777 REMARK GeneRIF: Data show that in the presence of HPV16, STAT3 is aberrantly-expressed and constitutively-activated in cervi +cal cancer which increases as the lesion progresses thus indic +ating its potential role in progression of HPV16-mediated cervical carcinogenesis. Publication Status: Online-Only REFERENCE 6 (residues 1 to 770) AUTHORS Boulton,T.G., Zhong,Z., Wen,Z., Darnell,J.E. Jr., Stahl,N. + and Yancopoulos,G.D. TITLE STAT3 activation by cytokines utilizing gp130 and related transducers involves a secondary modification requiring an H7-sensitive kinase JOURNAL Proc. Natl. Acad. Sci. U.S.A. 
92 (15), 6915-6919 (1995) PUBMED 7624343 REFERENCE 7 (residues 1 to 770) AUTHORS Lin,J.X., Migone,T.S., Tsang,M., Friedmann,M., Weatherbee, +J.A., Zhou,L., Yamauchi,A., Bloom,E.T., Mietz,J., John,S. et al. TITLE The role of shared receptor motifs and common Stat protein +s in the generation of cytokine pleiotropy and redundancy by IL-2, +IL-4, IL-7, IL-13, and IL-15 JOURNAL Immunity 2 (4), 331-339 (1995) PUBMED 7719938 REFERENCE 8 (residues 1 to 770) AUTHORS Zhang,X., Blenis,J., Li,H.C., Schindler,C. and Chen-Kiang, +S. TITLE Requirement of serine phosphorylation for formation of STAT-promoter complexes JOURNAL Science 267 (5206), 1990-1994 (1995) PUBMED 7701321 REFERENCE 9 (residues 1 to 770) AUTHORS Akira,S., Nishio,Y., Inoue,M., Wang,X.J., Wei,S., Matsusak +a,T., Yoshida,K., Sudo,T., Naruto,M. and Kishimoto,T. TITLE Molecular cloning of APRF, a novel IFN-stimulated gene fac +tor 3 p91-related transcription factor involved in the gp130-med +iated signaling pathway JOURNAL Cell 77 (1), 63-71 (1994) PUBMED 7512451 REFERENCE 10 (residues 1 to 770) AUTHORS DuBridge,R.B., Tang,P., Hsia,H.C., Leong,P.M., Miller,J.H. + and Calos,M.P. TITLE Analysis of mutation in human cells by using an Epstein-Ba +rr virus shuttle system JOURNAL Mol. Cell. Biol. 7 (1), 379-387 (1987) PUBMED 3031469 COMMENT REVIEWED REFSEQ: This record has been curated by NCBI staf +f. The reference sequence was derived from BI461226.1, BC014482.1 +, AK092965.1, CB216860.1, BC008044.2, CF454565.1 and AI63189 +6.1. This sequence is a reference standard in the RefSeqGene pr +oject. On May 7, 2004 this sequence version replaced gi:16596688. Summary: The protein encoded by this gene is a member of t +he STAT protein family. In response to cytokines and growth factor +s, STAT family members are phosphorylated by the receptor associat +ed kinases, and then form homo- or heterodimers that transloc +ate to the cell nucleus where they act as transcription activator +s. 
This protein is activated through phosphorylation in response t +o various cytokines and growth factors including IFNs, EGF, IL5, IL6 +, HGF, LIF and BMP2. This protein mediates the expression of a va +riety of genes in response to cell stimuli, and thus plays a key ro +le in many cellular processes such as cell growth and apoptosis. + The small GTPase Rac1 has been shown to bind and regulate the +activity of this protein. PIAS3 protein is a specific inhibitor of +this protein. Three alternatively spliced transcript variants e +ncoding distinct isoforms have been described. [provided by RefSeq +]. Transcript Variant: This variant (1) represents the longes +t transcript, and encodes the longest isoform (1). Publication Note: This RefSeq record includes a subset of + the publications that are available for this gene. Please see +the Entrez Gene record to access additional publications. FEATURES Location/Qualifiers source 1..770 /organism="Homo sapiens" /db_xref="taxon:9606" /chromosome="17" /map="17q21.31" Protein 1..770 /product="signal transducer and activator of tran +scription 3 isoform 1" /note="acute-phase response factor; DNA-binding p +rotein APRF" /calculated_mol_wt=87937 Region 150..162 /region_name="Essential for nuclear import" /experiment="experimental evidence, no additional + details recorded" /note="propagated from UniProtKB/Swiss-Prot (P407 +63.2)" Site 539 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphotyrosine; propagated from UniProtKB/Swiss-Prot (P40763.2)" Site 691 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphoserine; propagated from UniProtKB/S +wiss-Prot (P40763.2)" Site 705 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphotyrosine; propagated from UniProtKB/Swiss-Prot (P40763.2)" Site 714 /site_type="phosphorylation" /experiment="experimental 
evidence, no additional + details recorded" /note="Phosphothreonine; propagated from UniProtKB/Swiss-Prot (P40763.2)" Site 727 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphoserine, by NLK; propagated from UniProtKB/Swiss-Prot (P40763.2)" CDS 1..770 /gene="STAT3" /gene_synonym="APRF; FLJ20882; HIES; MGC16063" /coded_by="NM_139276.2:241..2553" /note="isoform 1 is encoded by transcript variant + 1" /db_xref="CCDS:CCDS32656.1" /db_xref="GeneID:6774" /db_xref="HGNC:11364" /db_xref="HPRD:00026" /db_xref="MIM:102582" ORIGIN 1 maqwnqlqql dtryleqlhq lysdsfpmel rqflapwies qdwayaaske shatl +vfhnl 61 lgeidqqysr flqesnvlyq hnlrrikqfl qsrylekpme iarivarclw eesrl +lqtaa 121 taaqqggqan hptaavvtek qqmleqhlqd vrkrvqdleq kmkvvenlqd dfdfn +yktlk 181 sqgdmqdlng nnqsvtrqkm qqleqmltal dqmrrsivse lagllsamey vqktl +tdeel 241 adwkrrqqia ciggppnicl drlenwitsl aesqlqtrqq ikkleelqqk vsykg +dpivq 301 hrpmleeriv elfrnlmksa fvverqpcmp mhpdrplvik tgvqfttkvr llvkf +pelny 361 qlkikvcidk dsgdvaalrg srkfnilgtn tkvmnmeesn ngslsaefkh ltlre +qrcgn 421 ggrancdasl ivteelhlit fetevyhqgl kidlethslp vvvisnicqm pnawa +silwy 481 nmltnnpknv nfftkppigt wdqvaevlsw qfssttkrgl sieqlttlae kllgp +gvnys 541 gcqitwakfc kenmagkgfs fwvwldniid lvkkyilalw negyimgfis kerer +ailst 601 kppgtfllrf sesskeggvt ftwvekdisg ktqiqsvepy tkqqlnnmsf aeiim +gykim 661 datnilvspl vylypdipke eafgkycrpe sqehpeadpg saapylktkf icvtp +ttcsn 721 tidlpmsprt ldslmqfgnn gegaepsagg qfesltfdme ltsecatspm // LOCUS NP_003141 769 aa linear PRI 06 +-FEB-2011 DEFINITION signal transducer and activator of transcription 3 isoform + 2 [Homo sapiens]. ACCESSION NP_003141 NP_444275 VERSION NP_003141.2 GI:21618338 DBSOURCE REFSEQ: accession NM_003150.3 KEYWORDS . SOURCE Homo sapiens (human) ORGANISM Homo sapiens Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Eutele +ostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhin +i; Catarrhini; Hominidae; Homo. 
REFERENCE 1 (residues 1 to 769) AUTHORS Davidson,S.I., Liu,Y., Danoy,P.A., Wu,X., Thomas,G.P., Jia +ng,L., Sun,L., Wang,N., Han,J., Han,H., Visscher,P.M., Brown,M.A. + and Xu,H. CONSRTM Australo-Anglo-American Spondyloarthritis Consortium TITLE Association of STAT3 and TNFRSF1A with ankylosing spondyli +tis in Han Chinese JOURNAL Ann. Rheum. Dis. 70 (2), 289-292 (2011) PUBMED 21068102 REMARK GeneRIF: Observational study of gene-disease association. +(HuGE Navigator) REFERENCE 2 (residues 1 to 769) AUTHORS Budarf,M.L., Goyette,P., Boucher,G., Lian,J., Graham,R.R., Claudio,J.O., Hudson,T., Gladman,D., Clarke,A.E., Pope,J.E +., Peschken,C., Smith,C.D., Hanly,J., Rich,E., Boire,G., Barr +,S.G., Zummer,M., Fortin,P.R., Wither,J. and Rioux,J.D. CONSRTM GenES Investigators TITLE A targeted association study in systemic lupus erythematos +us identifies multiple susceptibility alleles JOURNAL Genes Immun. 12 (1), 51-58 (2011) PUBMED 20962850 REMARK GeneRIF: Observational study of gene-disease association. +(HuGE Navigator) REFERENCE 3 (residues 1 to 769) AUTHORS Hosur,V. and Loring,R.H. TITLE alpha4beta2 nicotinic receptors partially mediate anti-inf +lammatory effects through Janus kinase 2-signal transducer and activ +ator of transcription 3 but not calcium or cAMP signaling JOURNAL Mol. Pharmacol. 79 (1), 167-174 (2011) PUBMED 20943775 REMARK GeneRIF: A role was determined for signal transducer and a +ctivator of transcription 3 and Janus kinase-2 transduction in alph +a4beta2 nicotinic receptor-mediated anti-inflammatory effects. REFERENCE 4 (residues 1 to 769) AUTHORS Laukens,D., Georges,M., Libioulle,C., Sandor,C., Mni,M., V +ander Cruyssen,B., Peeters,H., Elewaut,D. and De Vos,M. TITLE Evidence for significant overlap between common risk varia +nts for Crohn's disease and ankylosing spondylitis JOURNAL PLoS ONE 5 (11), E13795 (2010) PUBMED 21072187 REMARK GeneRIF: Observational study of gene-disease association. 
+(HuGE Navigator) Publication Status: Online-Only REFERENCE 5 (residues 1 to 769) AUTHORS Shukla,S., Shishodia,G., Mahata,S., Hedau,S., Pandey,A., Bhambhani,S., Batra,S., Basir,S.F., Das,B.C. and Bharti,A. +C. TITLE Aberrant expression and constitutive activation of STAT3 i +n cervical carcinogenesis: implications in high-risk human papillomavirus infection JOURNAL Mol. Cancer 9, 282 (2010) PUBMED 20977777 REMARK GeneRIF: Data show that in the presence of HPV16, STAT3 is aberrantly-expressed and constitutively-activated in cervi +cal cancer which increases as the lesion progresses thus indic +ating its potential role in progression of HPV16-mediated cervical carcinogenesis. Publication Status: Online-Only REFERENCE 6 (residues 1 to 769) AUTHORS Boulton,T.G., Zhong,Z., Wen,Z., Darnell,J.E. Jr., Stahl,N. + and Yancopoulos,G.D. TITLE STAT3 activation by cytokines utilizing gp130 and related transducers involves a secondary modification requiring an H7-sensitive kinase JOURNAL Proc. Natl. Acad. Sci. U.S.A. 92 (15), 6915-6919 (1995) PUBMED 7624343 REFERENCE 7 (residues 1 to 769) AUTHORS Lin,J.X., Migone,T.S., Tsang,M., Friedmann,M., Weatherbee, +J.A., Zhou,L., Yamauchi,A., Bloom,E.T., Mietz,J., John,S. et al. TITLE The role of shared receptor motifs and common Stat protein +s in the generation of cytokine pleiotropy and redundancy by IL-2, +IL-4, IL-7, IL-13, and IL-15 JOURNAL Immunity 2 (4), 331-339 (1995) PUBMED 7719938 REFERENCE 8 (residues 1 to 769) AUTHORS Zhang,X., Blenis,J., Li,H.C., Schindler,C. and Chen-Kiang, +S. TITLE Requirement of serine phosphorylation for formation of STAT-promoter complexes JOURNAL Science 267 (5206), 1990-1994 (1995) PUBMED 7701321 REFERENCE 9 (residues 1 to 769) AUTHORS Akira,S., Nishio,Y., Inoue,M., Wang,X.J., Wei,S., Matsusak +a,T., Yoshida,K., Sudo,T., Naruto,M. and Kishimoto,T. 
TITLE Molecular cloning of APRF, a novel IFN-stimulated gene fac +tor 3 p91-related transcription factor involved in the gp130-med +iated signaling pathway JOURNAL Cell 77 (1), 63-71 (1994) PUBMED 7512451 REFERENCE 10 (residues 1 to 769) AUTHORS DuBridge,R.B., Tang,P., Hsia,H.C., Leong,P.M., Miller,J.H. + and Calos,M.P. TITLE Analysis of mutation in human cells by using an Epstein-Ba +rr virus shuttle system JOURNAL Mol. Cell. Biol. 7 (1), 379-387 (1987) PUBMED 3031469 COMMENT REVIEWED REFSEQ: This record has been curated by NCBI staf +f. The reference sequence was derived from BI461226.1, BC000627.2 +, AK092965.1, CB216860.1, BC008044.2, CF454565.1 and AI63189 +6.1. On Jun 27, 2002 this sequence version replaced gi:4507253. Summary: The protein encoded by this gene is a member of t +he STAT protein family. In response to cytokines and growth factor +s, STAT family members are phosphorylated by the receptor associat +ed kinases, and then form homo- or heterodimers that transloc +ate to the cell nucleus where they act as transcription activator +s. This protein is activated through phosphorylation in response t +o various cytokines and growth factors including IFNs, EGF, IL5, IL6 +, HGF, LIF and BMP2. This protein mediates the expression of a va +riety of genes in response to cell stimuli, and thus plays a key ro +le in many cellular processes such as cell growth and apoptosis. + The small GTPase Rac1 has been shown to bind and regulate the +activity of this protein. PIAS3 protein is a specific inhibitor of +this protein. Three alternatively spliced transcript variants e +ncoding distinct isoforms have been described. [provided by RefSeq +]. Transcript Variant: This variant (2) lacks a segment in th +e 5' UTR and 3 nt within the CDS, as compared to variant 1. The res +ulting isoform (2) lacks an amino acid compared to isoform 1. Publication Note: This RefSeq record includes a subset of + the publications that are available for this gene. 
Please see +the Entrez Gene record to access additional publications. FEATURES Location/Qualifiers source 1..769 /organism="Homo sapiens" /db_xref="taxon:9606" /chromosome="17" /map="17q21.31" Protein 1..769 /product="signal transducer and activator of tran +scription 3 isoform 2" /note="acute-phase response factor; DNA-binding p +rotein APRF" /calculated_mol_wt=87850 Region 150..162 /region_name="Essential for nuclear import" /experiment="experimental evidence, no additional + details recorded" /note="propagated from UniProtKB/Swiss-Prot (P407 +63.2)" Site 539 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphotyrosine; propagated from UniProtKB/Swiss-Prot (P40763.2)" Site 691 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphoserine; propagated from UniProtKB/S +wiss-Prot (P40763.2)" Site 704 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphotyrosine; propagated from UniProtKB/Swiss-Prot (P40763.2)" Site 713 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphothreonine; propagated from UniProtKB/Swiss-Prot (P40763.2)" Site 726 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphoserine, by NLK; propagated from UniProtKB/Swiss-Prot (P40763.2)" CDS 1..769 /gene="STAT3" /gene_synonym="APRF; FLJ20882; HIES; MGC16063" /coded_by="NM_003150.3:219..2528" /note="isoform 2 is encoded by transcript variant + 2" /db_xref="CCDS:CCDS32657.1" /db_xref="GeneID:6774" /db_xref="HGNC:11364" /db_xref="HPRD:00026" /db_xref="MIM:102582" ORIGIN 1 maqwnqlqql dtryleqlhq lysdsfpmel rqflapwies qdwayaaske shatl +vfhnl 61 lgeidqqysr flqesnvlyq hnlrrikqfl qsrylekpme iarivarclw eesrl +lqtaa 121 taaqqggqan hptaavvtek qqmleqhlqd vrkrvqdleq kmkvvenlqd dfdfn +yktlk 181 sqgdmqdlng nnqsvtrqkm 
qqleqmltal dqmrrsivse lagllsamey vqktl +tdeel 241 adwkrrqqia ciggppnicl drlenwitsl aesqlqtrqq ikkleelqqk vsykg +dpivq 301 hrpmleeriv elfrnlmksa fvverqpcmp mhpdrplvik tgvqfttkvr llvkf +pelny 361 qlkikvcidk dsgdvaalrg srkfnilgtn tkvmnmeesn ngslsaefkh ltlre +qrcgn 421 ggrancdasl ivteelhlit fetevyhqgl kidlethslp vvvisnicqm pnawa +silwy 481 nmltnnpknv nfftkppigt wdqvaevlsw qfssttkrgl sieqlttlae kllgp +gvnys 541 gcqitwakfc kenmagkgfs fwvwldniid lvkkyilalw negyimgfis kerer +ailst 601 kppgtfllrf sesskeggvt ftwvekdisg ktqiqsvepy tkqqlnnmsf aeiim +gykim 661 datnilvspl vylypdipke eafgkycrpe sqehpeadpg aapylktkfi cvtpt +tcsnt 721 idlpmsprtl dslmqfgnng egaepsaggq fesltfdmel tsecatspm //

The desired contents of my array would be the individual records, each of which
starts with LOCUS and ends with //. Thanks for the advice on improving my posts.
Lom Space
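
One possible way to get one array element per record is sketched below. In the posted code, split is handed the filehandle $in rather than text read from it, and "undef $/" turns on slurp mode instead of setting the separator to "//". The sketch sets $/ to "//\n" and reads from the handle in a loop. It assumes every record ends with a line containing only "//" and that the file uses Unix newlines; the helper name read_records and the two-record sample are hypothetical stand-ins, not part of the original script or data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return a list of "//"-terminated records
# read from an already opened filehandle.
sub read_records {
    my ($fh) = @_;
    local $/ = "//\n";               # assumed terminator: "//" at end of a line
    my @records;
    while ( my $record = <$fh> ) {
        chomp $record;               # chomp strips the current $/, i.e. "//\n"
        next unless $record =~ /\S/; # ignore whitespace-only leftovers
        push @records, $record;
    }
    return @records;
}

# Shortened stand-in for proteins.gp (made-up sample, two records).
my $sample = <<'END';
LOCUS       NP_644805   770 aa
DEFINITION  isoform 1
//
LOCUS       NP_003141   769 aa
DEFINITION  isoform 2
//
END

# Read from an in-memory filehandle so the sketch runs without the real file.
open my $in, '<', \$sample or die "Cannot open in-memory sample: $!";
my @genpepts = read_records($in);
close $in;

printf "%d records\n", scalar @genpepts;
for my $record (@genpepts) {
    my ($accession) = $record =~ /^LOCUS\s+(\S+)/m;
    print "$accession\n";
}
```

For the real file, the in-memory open would be replaced by a three-argument open on the path with an "or die" check. To inspect the result with Data::Dumper, passing a reference, as in print Dumper(\@genpepts), dumps the array as a single structure. If the last "//" in the file has no trailing newline, the final record would keep its "//" and would need trimming separately.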

In reply to New Title: Implementing the Record separtor and split to parse a file with records;Re^2: it should be simple enough... by lomSpace
in thread it should be simple enough... by lomSpace
