PerlMonks  

Text::xSV quote problem

by cfreak (Chaplain)
on Oct 03, 2006 at 17:59 UTC

cfreak has asked for the wisdom of the Perl Monks concerning the following question:

After a long hiatus into the dark and scary world of PHP, I have returned with a question.

I have a project that involves parsing a large (30MB) tab-separated text file. I'm new to the existing code; the original author used Text::xSV for the parsing, so I don't really have the option to use something else.

The problem is that the fields are not quoted in the text file, but some of the data contains quotes. What I need is a way for Text::xSV to simply ignore them. I know I can use filtering to change them to some other character and then change them back once the fields are separated, but I'm hoping there's a better way.

I can't include the actual line due to non-disclosure agreements, but it's basically something like this:

00001     Widget     ACTIVE     widget maker's inc     15"x15"

Here is the code I'm using to parse, basically the same as the example in the POD:

my $csv = Text::xSV->new(sep=>"\t" );
$csv->open_file("xxxxx.csv");
$csv->read_header();

# Make the headers case insensitive
foreach my $field ($csv->get_fields) {
    if (lc($field) ne $field) {
        $csv->alias($field, lc($field));
    }
}

while( $csv->get_row() ) {
    # save to a db
}

Thanks!

Replies are listed 'Best First'.
Re: Text::xSV quote problem
by tilly (Archbishop) on Oct 03, 2006 at 18:30 UTC
    If you have tab-delimited data, with unquoted fields, and it is NOT a csv-like format, then Text::xSV is doing nothing useful for you. Just use split.

    my $file = "xxxx.cvs";
    open(my $fh, "<", $file) or die "Cannot read '$file': $!";
    my $line = <$fh>;
    chomp($line);
    my @fields = split /\t/, lc($line), -1;
    while ($line = <$fh>) {
        chomp($line);
        my %row;
        @row{@fields} = split /\t/, $line, -1;
        # save to a db
    }

      I would, but unfortunately the requirements are to use Text::xSV; lots of files in the project use it and they want to remain consistent. It is also possible that the format will change to something that needs it.

      I have solved the problem by using a filter function. It's kind of ugly, but it works.
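
The filter itself was not posted, so the following is only a minimal sketch of the swap-and-restore idea described in the question, not the code actually used. The names `hide_quotes`, `restore_quotes` and `$PLACEHOLDER` are hypothetical, the placeholder byte is assumed never to occur in the data, and only double quotes are hidden on the assumption that they are what confuses the parser. A plain `split` stands in for the parser so the sketch runs with core Perl alone; with Text::xSV, `hide_quotes` would presumably be the anonymous function handed to `set_filter`, as suggested elsewhere in this thread, though that wiring is not shown or verified here.

```perl
use strict;
use warnings;

# Hypothetical placeholder: a control character assumed never to occur in the data.
my $PLACEHOLDER = "\x01";

# Swap double quotes for the placeholder before the parser sees the line.
sub hide_quotes {
    my ($line) = @_;
    $line =~ s/"/$PLACEHOLDER/g;
    return $line;
}

# Put the double quotes back once the fields have been separated.
sub restore_quotes {
    my ($field) = @_;
    $field =~ s/$PLACEHOLDER/"/g;
    return $field;
}

# Sample line modelled on the one in the question, joined with real tabs.
my $line = join "\t", '00001', 'Widget', 'ACTIVE', q{widget maker's inc}, '15"x15"';

# split is only a stand-in for the real parser here.
my @fields = map { restore_quotes($_) } split /\t/, hide_quotes($line), -1;

print join(' | ', @fields), "\n";
```

The restore step has to run on every field after each row is read, which is the extra pass the original question was hoping to avoid.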

        Hmmm ... I must find some proverb or pithy saying that covers this situation .... ignoring the advice of the module's author (tilly) is probably not in your best interest. Maybe I'm being too harsh (or maybe it's the repetitiveness of always saying this to my four year old) - but you should at least say thank you.

        -derby
Re: Text::xSV quote problem
by grep (Monsignor) on Oct 03, 2006 at 18:17 UTC
    I haven't fooled around with Text::xSV much (mainly Text::CSV_XS), but the docs for Text::xSV point out set_filter, where you can pass an anon func that would handle your quoting issue.

    You did not explain exactly what you're looking to do with these quotes, but I would try '\' escaping the quotes first, to see if that works.



    grep
    One dead unjugged rabbit fish later

      Yeah, but I need the quotes; I'd have to change them back later. Not a big deal, but I was trying to avoid that.

      I tried escaping too ... it didn't make a difference.
