PerlMonks  

Re: Replacing a pesky pair of quotes.

by Eimi Metamorphoumai (Deacon)
on Aug 11, 2005 at 20:19 UTC ( [id://483114]=note )


in reply to Replacing a pesky pair of quotes.

What do you mean it "isn't working"? Are you sure your data really is what you think it is? Note that you don't need to \ the " inside the regexp, though it doesn't hurt. Here's test code that works for me.
#!/usr/bin/perl -lw
use strict;
use warnings;

while(<DATA>){
    chomp;
    my @data = split /\t/;
    my $ddrul = $data[7];
    $ddrul =~ s/\"//g;
    print "'$ddrul'";
}
__DATA__
EST-NY 234 5-Oct Springfield MA Springfield College Townhouse Conference Room http://www.spfldcol.edu/home.nsf/welcome/visit/directionsc
EST-NY 923 18-Oct Salisbury MD Wor Wic Community College http://www.worwic.edu/campus/directions.pdf
EST-NY 886 19-Oct Frederick MD Hood College http://www.hood.edu/welcome_to_hood/index.cfm?pid=_maps.htm#a1
SW 328 19-Oct Houston TX University of Houston - Clear Lake "http://prtl.uhcl.edu/portal/page?_pageid=328,217631,328_217645&_dad=portal&_schema=PORTALP"
Prints:
'http://www.spfldcol.edu/home.nsf/welcome/visit/directionsc'
'http://www.worwic.edu/campus/directions.pdf'
'http://www.hood.edu/welcome_to_hood/index.cfm?pid=_maps.htm#a1'
'http://prtl.uhcl.edu/portal/page?_pageid=328,217631,328_217645&_dad=portal&_schema=PORTALP'
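The two points in this reply (the backslash before the " is optional, and the data may hold characters you cannot see) can be checked with a small sketch like the one below. The show_raw helper and the sample URL are made up for illustration and are not part of the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): render a string with its
# control characters spelled out, so a stray \r, \t or \n becomes visible.
sub show_raw {
    my ($s) = @_;
    my %names = ("\t" => '\t', "\r" => '\r', "\n" => '\n');
    $s =~ s/([\t\r\n])/$names{$1}/g;
    return "'$s'";
}

# Made-up sample field: a quoted URL followed by a carriage return.
my $field = qq{"http://example.com/page?a=1,b=2"\r};

(my $escaped   = $field) =~ s/\"//g;   # backslash before the quote
(my $unescaped = $field) =~ s/"//g;    # no backslash

print show_raw($field), "\n";          # raw field, \r shown literally
print show_raw($escaped), "\n";        # quotes gone, \r still there
print "same result\n" if $escaped eq $unescaped;
```

Printing each field through something like show_raw is one way to confirm whether the input really contains what you expect before blaming the substitution.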

Replies are listed 'Best First'.
Re^2: Replacing a pesky pair of quotes.
by hmbscully (Scribe) on Aug 11, 2005 at 20:32 UTC
    I don't know what else I can think my data is. The example data I supplied is what it is.

    The crux of my code is

    open(INFILE,"$ew_sites_file") || die "Cant open $ew_sites_file for reading $!\n";
    while($line = <INFILE>) {
        chomp $line;
        ($unused, $site, $date, $city, $state, $facility, $unused, $ddurl) = split(/\t/,$line); #tab-delimited file
        $ddrul =~ s/"//g;
        $ddurl =~ s/\r//; #lose that bad newline break

        #build the registration site lookup flatfile : state and javascript URL
        open(OUTFILE,">>$ew_regist_parse_file") || die "cant open $ew_regist_parse_file, $!\n";
        print OUTFILE "$state|$start_tag$site$inbetween$city$inbetween$state$inbetween$date$inbetween$ddurl$semicolon$city, $state : $date$endtag\n";
        close OUTFILE;

        #build the locations flatfile
        open(OUT2FILE,">>$ew_locate_parse_file") || die "cant open $ew_locate_parse_file, $!\n";
        print OUT2FILE "$state|$city|$facility|$date|$ddurl\n";
        close OUT2FILE;
    }
    close INFILE;

    The warnings say that the $ddrul =~ s/"//g; is using an uninitialized value, which I don't understand because it is initialized. The output is two files; an example of one (the simpler one) is:

    TX|Austin|University of Texas at Austin|11-Oct| http://www.utexas.edu/cee/tcc/forms/tcclargemap.pdf
    TX|Houston|University of Houston - Clear Lake|19-Oct|"http://prtl.uhcl.edu/portal/page?_pageid=328,217631,328_217645&_dad=portal&_schema=PORTALP"
    TX|El Paso|University of Texas - El Paso|25-Oct| http://www.utep.edu/search/campusmaplarge.html
    Still with the extra quotes.
      A use strict would have caught the misspelled variable "ddrul", which should be "ddurl".
        THANK YOU!
        I knew it was something obvious.

        As for use strict;, I am very aware that I don't use it. I have a very old perl install and the unix admin has something horked so that invoking strict means the script will never work, no matter how simple or how correctly coded it is.

        But tomorrow, I get a new shiny linux box with tasty up-to-date perl and my excuses go away!
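For anyone reading along, here is a rough sketch of why strict catches this kind of typo. Without strict, the misspelled $ddrul is silently treated as a separate, never-assigned global, which is where the "uninitialized value" warning comes from. The compile_error helper and the example.com URL are hypothetical and only serve the demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: compile a snippet under "use strict" via string eval
# and return the compile error, or an empty string if it compiled and ran.
sub compile_error {
    my ($code) = @_;
    my $ok = eval "use strict; use warnings; $code; 1";
    return $ok ? '' : $@;
}

# Same shape as the bug in the thread: $ddurl is assigned, $ddrul is used.
my $typo  = q{ my $ddurl = '"http://example.com/"'; $ddrul =~ s/"//g; };
my $fixed = q{ my $ddurl = '"http://example.com/"'; $ddurl =~ s/"//g; };

print "typo:  ", compile_error($typo)  || "compiles\n";
print "fixed: ", compile_error($fixed) || "compiles\n";
```

Under strict the first snippet should refuse to compile with a "Global symbol ... requires explicit package name" error naming $ddrul, while the second compiles cleanly.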
      I'd probably have written it this way.
      while (<INFILE>) {
          my (undef, $site, $date, $city, $state, $facility, undef, $ddurl)
              = /\G"?([^\t]*?)"?(?:\t|[\r\n]+$)/g;
      }
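A minimal sketch of that regex in action, assuming the eight-column tab-delimited layout described in the thread. The parse_line wrapper and the sample record are invented for illustration; only the pattern itself comes from the reply.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper around the regex from the reply: split one
# tab-delimited line into fields, dropping an optional pair of surrounding
# double quotes from each field and the trailing \r and/or \n.
sub parse_line {
    my ($line) = @_;
    return $line =~ /\G"?([^\t]*?)"?(?:\t|[\r\n]+$)/g;
}

# Made-up record: 8 columns, empty 7th column, quoted URL, DOS line ending.
my $line = join("\t", 'SW', '328', '19-Oct', 'Houston', 'TX',
                'University of Houston - Clear Lake', '',
                '"http://example.com/page?id=328,217631"') . "\r\n";

my (undef, $site, $date, $city, $state, $facility, undef, $ddurl)
    = parse_line($line);

print "$state|$city|$facility|$date|$ddurl\n";
```

One thing to keep in mind: as written, the pattern only accepts a field that is followed by a tab or by a trailing \r or \n, so the last field of a line with no line ending at all would not be captured.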
