http://qs321.pair.com?node_id=1092419


in reply to Re^2: converting txt to csv and formatting
in thread converting txt to csv and formatting

Generating valid CSV records is much easier than parsing them. In this case, all of your data are character strings, not numbers or timestamps, so it's appropriate to always quote all three fields.

"Jacobs, April","750.467.9582","quam.quis@sedhendrerit.org" "Mays, Martena","870.348.1974","sollicitudin@nonummyFusce.org" "McNeil, Brennan","289.527.6151","lobortis@nisl.com" "Sexton, Melvin ""The Copymeister""","599.927.5337","in.felis@vari +us.com" "Blackburn, Prescott","661.589.1228","sed@egetlaoreetposuere.edu"

The most important thing to anticipate when generating CSV records in which every field is quoted is that the data may itself contain the quote character. The most common convention nowadays is to escape each literal quote character by doubling it ("").

for (@client_values) {
    s/(?=")/"/g;
    s/^/"/;
    s/$/"/;
}
my $client_record = join ',', @client_values;
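
To see the escaping step run on its own, here is a minimal sketch. The helper name csv_record is made up for illustration, and it assumes the values have already been chomped; the three substitutions and the sample values are the ones shown above.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the post):
# quote every field and join the fields into one CSV record.
sub csv_record {
    my @fields = @_;       # work on copies so the caller's data is untouched
    for (@fields) {
        s/(?=")/"/g;       # insert a quote before each literal quote
        s/^/"/;            # opening quote
        s/$/"/;            # closing quote (assumes no trailing newline)
    }
    return join ',', @fields;
}

my $record = csv_record(
    'Sexton, Melvin "The Copymeister"',
    '599.927.5337',
    'in.felis@varius.com',
);
print "$record\n";
```

This should print the fourth sample record from the list above, with the nickname's quotes doubled.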

Don't print the record piecemeal, one field at a time. Doing this is a worst practice, IMHO. Instead, generate a valid, whole CSV record and then print it.

print "$client_record\n";
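
Putting the pieces together, the following is only a sketch under a couple of assumptions: that the text file holds one value per line with three lines per client, and that an in-memory filehandle stands in for the real file so the example runs without anything on disk. The variable names other than @client_values are my own.

```perl
use strict;
use warnings;

# Stand-in for the text file (assumed layout: name, phone, email on separate lines).
my $text = <<'END';
Jacobs, April
750.467.9582
quam.quis@sedhendrerit.org
Sexton, Melvin "The Copymeister"
599.927.5337
in.felis@varius.com
END

# Read from an in-memory filehandle instead of a file on disk.
open my $in, '<', \$text or die "Cannot open in-memory file: $!";

my @records;
while (defined(my $name = <$in>)) {
    my $phone = <$in>;
    my $email = <$in>;
    last unless defined $phone && defined $email;   # skip an incomplete final group

    my @client_values = ($name, $phone, $email);
    for (@client_values) {
        chomp;             # drop the line ending first
        s/(?=")/"/g;       # double each literal quote
        s/^/"/;            # opening quote
        s/$/"/;            # closing quote
    }
    push @records, join ',', @client_values;   # one whole record at a time
}
close $in;

# Each record is printed in a single statement, never field by field.
print "$_\n" for @records;
```

The input is only read, never modified, so this works even when the text file must stay exactly as it is.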

Re^4: converting txt to csv and formatting
by csorrentini (Acolyte) on Jul 06, 2014 at 02:34 UTC
    This would be assuming that I can modify the text file? The text file is not allowed to be edited; I have to use it the way it currently is.
      This would be assuming that I can modify the text file?

      Uh… no. What makes you think that?

      This is just a snippet of the code you'd use within your line-reading while loop.

      BTW, you could do the chomping in the same for loop as the rest of the formatting.

      for (@client_values) { chomp; s/(?=")/"/g; s/^/"/; s/$/"/; }

      Alternatively…

      for (@client_values) { s/(?=")/"/g; s/^/"/; s/\n$/"/; }

      Update:  Maybe you were confused by my use of an array variable instead of three scalar variables. Let's assume you've read three lines (i.e., three client data values) into three variables as AppleFritter demonstrated here, but without chomping them as you read them.

      for ($client_name, $client_phone_number, $client_email_address) {
          s/(?=")/"/g;   # Escape quote characters...
          s/^/"/;
          s/\n$/"/;      # chomp and append quote character simultaneously...
      }
      my $client_record = join ',', $client_name, $client_phone_number, $client_email_address;
      print "$client_record\n";

      Another update:  It occurs to me now that if you're new to Perl, then you're probably not familiar with the fancy regular expression pattern in this global substitution operation:

      s/(?=")/"/g;

      It's a look-ahead assertion, which is sort of an intermediate Perl topic. (See perldoc perlre.) If it's easier to understand, then just do this instead:

      s/"/""/g;
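
      If you want to convince yourself that the two forms do the same thing, a quick check like this should do it. The variable names are just for illustration.

      ```perl
      use strict;
      use warnings;

      my $sample = 'Sexton, Melvin "The Copymeister"';

      # Copy the sample, then modify the copy with each form of the substitution.
      (my $with_lookahead = $sample) =~ s/(?=")/"/g;   # insert a quote before each quote
      (my $plain          = $sample) =~ s/"/""/g;      # replace each quote with two quotes

      print "$with_lookahead\n";
      print "$plain\n";
      print $with_lookahead eq $plain ? "same result\n" : "different results\n";
      ```

      Both lines should come out with the quotes around the nickname doubled.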
        Yes, sorry, still VERRRY new to Perl, so a lot of this is still foreign to me. I'll try to incorporate that and see how it goes. Thank you so much for the help; I really appreciate it.