Beefy Boxes and Bandwidth Generously Provided by pair Networks
Your skill will accomplish
what the force of many cannot
 
PerlMonks  

Modifying an existing Perl script to ask for input and output filenames and also remove double quotes for outfile file

by ei6eqb (Initiate)
on Jan 30, 2018 at 00:22 UTC ( #1208088=perlquestion: print w/replies, xml ) Need Help??

ei6eqb has asked for the wisdom of the Perl Monks concerning the following question:

Hi All Monks , Sorry I have already posted this question in error under Anonymous as I did not realise I was logged out, so here it is again

I have a daily task to do.

I trying to perform an automatic extraction of data from an input.csv file and save the processed data to output.csv file.

The input.csv file is created by TAS Books and is of a fixed field format which cannot be modified in any way in Tasbooks. The output.csv is then sent on to another site for further processing.

I tried to do what I required in OpenOffice, but found this had many stages to complete.

I have a Perl script which is easier to use which was written for me on a different site here;- ( https://forum.openoffice.org/en/forum/viewtopic.php?f=5&t=91255&p=432185&hilit=tony+fay&sid=091a3c41d51f30d2fb8570b53ab34218#p432185 0

This Perl script, for the most part does what I require. But there are two functions I would like to have included within the main inputfilescript.pl script file.

1. When I run the script on my input.csv file the output.csv file which is created by the script has double quotes in two of the data columns,
i.e. the Timestamp field "2018-01-23 00:00" and the ProdDescription field "Acme Widget Large 20 inch"
These double quotes need to be removed to comply with the required data format of the final csv file.

At present I remove these double quotation marks from the output.csv file by running the following Perl script in a batch file after the output.csv file is created.

PERL -pe "s/\x22//g" output.csv > output2.csv

( I found this here;- https://stackoverflow.com/questions/22674955/removing-double-quotes-from-a-csv-file )

This removes the double quotes from the output.csv file and saves the the file as output2.csv.

Note as I'm running my perl scripts in Windows10 and I cannot use the " character directly PERL -pe "s/\"//g" output.csv > output2.csv
so I use x22 = hex for double quotes character in the script.
I would like to have the double quote removal included within the main inputfilescript.pl file also

2. As the input.csv files are sent to me with different filenames each day, I am renaming these to input.csv for processing by the perl inputfilescript.pl script file. I would like to have the function of the inputfilescript.pl file asking me to type the filenames for the INPUT.csv file name and an OUTPUT.csv so that I do not have to rename filenames for processing.

==== Below are the files I'm Using ====

My Main Perl Script File inputfilescript.pl

use Text::CSV; my ($in, $out, $csvi, $csvo, $ar, $col, $hri, $hro, $yyyy, $mm, $dd, $ +code, $desc); my @hdr = qw(DeviceID PosTxnID MobileNumber Timestamp PosID PosUserID +ShopID Prodcode ProdDescription ProdDept ProdGroup1 ProdGroup2 Q +ty Value); my @e14 = ('','','','','','','','','','','','','',''); my $spacer = 0; # open $in, "<inputnoquotes.csv" or die "inputnoquotes.csv: $!\n"; open $in, "<input.csv" or die "input.csv: $!\n"; $csvi = Text::CSV->new( { binary=>1 } ); open $out, ">output.csv" or die "output.csv: $!\n"; $csvo = Text::CSV->new( { binary=>1, eol=>$/ } ); $ar = $csvi->getline($in); $col = 0; foreach (@{$ar}) { $_ = "Col$col" if $col>7; s/ //; $col++; } $csvi->column_names($ar); $csvo->column_names(@hdr); $csvo->print($out +,\@hdr); while ($hri = $csvi->getline_hr($in)) { if ($hri->{InvDate}) { ($dd,$mm,$yyyy) = split("/",$hri->{InvDate}); $hro = { DeviceID=>$hri->{CustCode}, PosTxnID=>$hri->{InvNum}, M +obileNumber=>'', Timestamp=>$yyyy.'-'.$mm.'-'.$dd.' 00:00', PosID=>'', P +osUserID=>'', ShopID=>'2345', Prodcode=>$code, ProdDescription=>$desc +, ProdDept=>'', ProdGroup1=>'', ProdGroup2=>'', Qty=>$hri->{Qty}, Value +=>$hri->{NetAmt} }; if ($spacer) { $csvo->print($out,\@e14); $spacer = 0; } $csvo->print_hr($out,$hro); } elsif ($hri->{CustCode}) { ($code,$desc) = split(' - # ',$hri->{Cus +tCode}); } elsif ($hri->{LineDisc}) { $spacer = 0; } } # original code had { $spacer = 1; } I changed this as my input +.csv file originally had a - but now contains - # between the Produc +t code and product description in the input.csv file

==================================================

Original input.csv file as outputted from TasBooks

"Cust Code","Inv Date","Inv Num","Type","Unit Price","Line Disc","Qty" +,"Net Amt","Cost Amt","Margin","%","Market Code","", "ZB101 - # Acme Widget Large 20 inch","","","","","","","","","","","" +,"", "VINT01","23/01/2018",112681,"I",23.50,"",1.00,23.50,"",23.50,"100.00% +","DEFAULT","", "RR01","23/01/2018",112683,"I",23.50,"",2.00,47.00,"",47.00,"100.00%", +"DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",3.00,70.50,"",70.50,"100.00%","","", "","","","","","","","","","","","","", "ZB1201 - # Acme Widget Small 5 inch","","","","","","","","","","","" +,"", "VINT01","23/01/2018",112681,"I",50.00,"",2.00,100.00,"",100.00,"100.0 +0%","DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",2.00,100.00,"",100.00,"100.00%","","" +, "","","","","","","","","","","","","", "ZB2101 - # Acme Widget Large 15 inch","","","","","","","","","",""," +","", "PIP01","23/01/2018",112670,"I",12.50,"",1.00,12.50,"",12.50,"100.00%" +,"DEFAULT","", "VINT01","23/01/2018",112681,"I",12.50,"",1.00,12.50,"",12.50,"100.00% +","DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",2.00,25.00,"",25.00,"100.00%","","", "","","","","","","","","","","","","", "ZB2401 - # Acme Widget Medium 10 inch","","","","","","","","","","", +"","", "VCR01","23/01/2018",112674,"I",32.50,"",2.00,65.00,"",65.00,"100.00%" +,"DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",2.00,65.00,"",65.00,"100.00%","","", "","","","","","","","","","","","","", "ZB2501 - # Acme Widget Small 6 inch","","","","","","","","","","","" +,"", "VINT01","23/01/2018",112681,"I",37.50,"",1.00,37.50,"",37.50,"100.00% +","DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",1.00,37.50,"",37.50,"100.00%","","", "","","","","","","","","","","","","", "ZB3501 - # Acme Widget GOLD","","","","","","","","","","","","", "CAPB01","23/01/2018",112672,"I",29.00,"",1.00,29.00,"",29.00,"100.00% +","DEFAULT","", "SANML","23/01/2018",112673,"I",29.00,"",1.00,29.00,"",29.00,"100.00%" +,"DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",2.00,58.00,"",58.00,"100.00%","","", "","","","","","","","","","","","","", "ZB3701 - # Acme Widget Large 24 inch","","","","","","","","","",""," +","", "RS01","23/01/2018",112657,"I",21.00,"",1.00,21.00,"",21.00,"100.00%", +"DEFAULT","", "CAPB01","23/01/2018",112672,"I",21.00,"",1.00,21.00,"",21.00,"100.00% +","DEFAULT","", "BPS01","23/01/2018",112679,"I",21.00,"",2.00,42.00,"",42.00,"100.00%" +,"DEFAULT","", "VINT01","23/01/2018",112681,"I",21.00,"",1.00,21.00,"",21.00,"100.00% +","DEFAULT","", "MC01","23/01/2018",112682,"I",21.00,"",1.00,21.00,"",21.00,"100.00%", +"DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",6.00,126.00,"",126.00,"100.00%","","" +, "","","","","","","","","","","","","", "ZB3801 - # Regular Acme Widget Large 20 inch","","","","","","","","" +,"","","","", "PIP01","23/01/2018",112670,"I",21.00,"",1.00,21.00,"",21.00,"100.00%" +,"DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",1.00,21.00,"",21.00,"100.00%","","", "","","","","","","","","","","","","", "ZB4001 - # High Type Acme Widget","","","","","","","","","","","","" +, "RE01","23/01/2018",112663,"I",30.00,"",1.00,30.00,"",30.00,"100.00%", +"DEFAULT","", "CAPB01","23/01/2018",112672,"I",30.00,"",1.00,30.00,"",30.00,"100.00% +","DEFAULT","", "APP01","23/01/2018",112685,"I",30.00,"",1.00,30.00,"",30.00,"100.00%" +,"DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",3.00,90.00,"",90.00,"100.00%","","", "","","","","","","","","","","","","", "ZB401 - # Nugget Type Acme Widget","","","","","","","","","","",""," +", "RR01","23/01/2018",112683,"I",24.00,"",2.00,48.00,"",48.00,"100.00%", +"DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",2.00,48.00,"",48.00,"100.00%","","", "","","","","","","","","","","","","", "ZB4901 - # Acme Widget Large 17 inch","","","","","","","","","",""," +","", "TB01","23/01/2018",112665,"I",38.50,"",1.00,38.50,"",38.50,"100.00%", +"DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",1.00,38.50,"",38.50,"100.00%","","", "","","","","","","","","","","","","", "ZB501 - # Acme Widget hexangle","","","","","","","","","","","","", "DD01","23/01/2018",112659,"I",33.00,"",3.00,99.00,"",99.00,"100.00%", +"DEFAULT","", "VINT01","23/01/2018",112681,"I",33.00,"",1.00,33.00,"",33.00,"100.00% +","DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",4.00,132.00,"",132.00,"100.00%","","" +, "","","","","","","","","","","","","", "ZB701 - # Acme Widget Circular 20 inch","","","","","","","","","","" +,"","", "RS01","23/01/2018",112657,"I",33.00,"",1.00,33.00,"",33.00,"100.00%", +"DEFAULT","", "PIP01","23/01/2018",112670,"I",33.00,"",1.00,33.00,"",33.00,"100.00%" +,"DEFAULT","", "BPS01","23/01/2018",112679,"I",33.00,"",1.00,33.00,"",33.00,"100.00%" +,"DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",3.00,99.00,"",99.00,"100.00%","","", "","","","","","","","","","","","","", "ZB901 - # Acme Widget Square 3 inch","","","","","","","","","","","" +,"", "RE01","23/01/2018",112663,"I",32.00,"",1.00,32.00,"",32.00,"100.00%", +"DEFAULT","", "TB01","23/01/2018",112665,"I",32.00,"",1.00,32.00,"",32.00,"100.00%", +"DEFAULT","", "","","","","","","末末末","末末末末","末末末末","末末末末","末末末 +","","", "","","","","","Product Totals:",2.00,64.00,"",64.00,"100.00%","","", "","","","","","","","","","","","","", 

=============================================

This is the format of the OUTPUT.CSV file I require

DeviceID,PosTxnID,MobileNumber,Timestamp,PosID,PosUserID,ShopID,Prodco +de,ProdDescription,ProdDept,ProdGroup1,ProdGroup2,Quantity,Value ,,,,,,,,,,,,, ,,,,,,,,,,,,, ,,,,,,,,,,,,, ,,,,,,,,,,,,, ,,,,,,,,,,,,, ,,,,,,,,,,,,, ,,,,,,,,,,,,, ,,,,,,,,,,,,,

Source Data criteria;

The only fields I need from the original input.csv file are Cust Code, Inv Date, Inv Num , Qty , Net Amt.

I also need the values from Row 2 which contains my Prodcode and my ProdDiscription values in my output.csv file,
these values also occur every time the product code changes in the file

i.e
"ZB101" = Prodcode and "- # Acme Widget Large 20 inch" = ProdDiscription
"ZB1201" = Prodcode and " - # Acme Widget Small 5 inch" = ProdDiscription
"ZB2101" = Prodcode and " - # Acme Widget Large 15 inch" = ProdDiscription etc. etc.

Also
Discard the rows with the following values;

,,,,,,末末末,末末末末,末末末末,末末末末,末末末,< ,,,,,Product Totals:,3,70.5,,70.5,100.00%, ,,,,,,,,,,, And ,,,,,,末末末,末末末末,末末末末,末末末末,末末末, ,,,,,Product Totals:,2,100,,100,100.00%, ,,,,,,,,,,,

etc, etc.

These are the fields I Need in my output.csv file with the 5 field values from the input.csv file

OUTPUT.CSV = INPUT.CSV
---------- -----------
DeviceID = Cust Code (from input.csv)
PosTxnID = Inv Num (from input.csv)
MobileNumber = , (note; these are empty fields)
Timestamp = Inv Date (from input.csv) (The output.csv format required is "YYYY-MM-DD hh:mm" Where hh:mm is always 00:00)
PosID = ,
PosUserID = ,
ShopID = 2345 (static string value of 2345 for all rows with data, no source from input.csv file)
Prodcode = ZB???? The code at beginning of Row2 and at other variable occurrences within the csv file
ProdDiscription = "- #text string " after the ZB??? Prodcode in Row 2 and at other occurences within the csv file. Note the "- #" removed when the split occurs
ProdDept = ,
ProdGroup1 = ,
ProdGroup2 = ,
Quantity = QTY
Value = Net Amt

==============================================
The output.csv file is created as below by the inputfilescript.pl
==============================================

output.csv file

DeviceID,PosTxnID,MobileNumber,Timestamp,PosID,PosUserID,ShopID,Prodco +de,ProdDescription,ProdDept,ProdGroup1,ProdGroup2,Qty,Value VINT01,112681,,"2018-01-23 00:00",,,2345,ZB101,"Acme Widget Large 20 i +nch",,,,1.00,23.50 RR01,112683,,"2018-01-23 00:00",,,2345,ZB101,"Acme Widget Large 20 inc +h",,,,2.00,47.00 VINT01,112681,,"2018-01-23 00:00",,,2345,ZB1201,"Acme Widget Small 5 i +nch",,,,2.00,100.00 PIP01,112670,,"2018-01-23 00:00",,,2345,ZB2101,"Acme Widget Large 15 i +nch",,,,1.00,12.50 VINT01,112681,,"2018-01-23 00:00",,,2345,ZB2101,"Acme Widget Large 15 +inch",,,,1.00,12.50 VCR01,112674,,"2018-01-23 00:00",,,2345,ZB2401,"Acme Widget Medium 10 +inch",,,,2.00,65.00 VINT01,112681,,"2018-01-23 00:00",,,2345,ZB2501,"Acme Widget Small 6 i +nch",,,,1.00,37.50 CAPB01,112672,,"2018-01-23 00:00",,,2345,ZB3501,"Acme Widget GOLD",,,, +1.00,29.00 SANML,112673,,"2018-01-23 00:00",,,2345,ZB3501,"Acme Widget GOLD",,,,1 +.00,29.00 RS01,112657,,"2018-01-23 00:00",,,2345,ZB3701,"Acme Widget Large 24 in +ch",,,,1.00,21.00 CAPB01,112672,,"2018-01-23 00:00",,,2345,ZB3701,"Acme Widget Large 24 +inch",,,,1.00,21.00 BPS01,112679,,"2018-01-23 00:00",,,2345,ZB3701,"Acme Widget Large 24 i +nch",,,,2.00,42.00 VINT01,112681,,"2018-01-23 00:00",,,2345,ZB3701,"Acme Widget Large 24 +inch",,,,1.00,21.00 MC01,112682,,"2018-01-23 00:00",,,2345,ZB3701,"Acme Widget Large 24 in +ch",,,,1.00,21.00 PIP01,112670,,"2018-01-23 00:00",,,2345,ZB3801,"Regular Acme Widget La +rge 20 inch",,,,1.00,21.00 RE01,112663,,"2018-01-23 00:00",,,2345,ZB4001,"High Type Acme Widget", +,,,1.00,30.00 CAPB01,112672,,"2018-01-23 00:00",,,2345,ZB4001,"High Type Acme Widget +",,,,1.00,30.00 APP01,112685,,"2018-01-23 00:00",,,2345,ZB4001,"High Type Acme Widget" +,,,,1.00,30.00 RR01,112683,,"2018-01-23 00:00",,,2345,ZB401,"Nugget Type Acme Widget" +,,,,2.00,48.00 TB01,112665,,"2018-01-23 00:00",,,2345,ZB4901,"Acme Widget Large 17 in +ch",,,,1.00,38.50 DD01,112659,,"2018-01-23 00:00",,,2345,ZB501,"Acme Widget hexangle",,, +,3.00,99.00 VINT01,112681,,"2018-01-23 00:00",,,2345,ZB501,"Acme Widget hexangle", +,,,1.00,33.00 RS01,112657,,"2018-01-23 00:00",,,2345,ZB701,"Acme Widget Circular 20 +inch",,,,1.00,33.00 PIP01,112670,,"2018-01-23 00:00",,,2345,ZB701,"Acme Widget Circular 20 + inch",,,,1.00,33.00 BPS01,112679,,"2018-01-23 00:00",,,2345,ZB701,"Acme Widget Circular 20 + inch",,,,1.00,33.00 RE01,112663,,"2018-01-23 00:00",,,2345,ZB901,"Acme Widget Square 3 inc +h",,,,1.00,32.00 TB01,112665,,"2018-01-23 00:00",,,2345,ZB901,"Acme Widget Square 3 inc +h",,,,1.00,32.00

=============================================
Note the " characters in the fields for Timestamp and ProdDescription above .

Then after running the second script to remove the double quotes;
PERL -pe "s/\x22//g" output.csv > output2.csv We get the output2.csv file below in the final file format required,
ready for sending on for the next process;-

=========================================

output2.csv file <br

DeviceID,PosTxnID,MobileNumber,Timestamp,PosID,PosUserID,ShopID,Prodco +de,ProdDescription,ProdDept,ProdGroup1,ProdGroup2,Qty,Value VINT01,112681,,2018-01-23 00:00,,,2345,ZB101,Acme Widget Large 20 inch +,,,,1.00,23.50 RR01,112683,,2018-01-23 00:00,,,2345,ZB101,Acme Widget Large 20 inch,, +,,2.00,47.00 VINT01,112681,,2018-01-23 00:00,,,2345,ZB1201,Acme Widget Small 5 inch +,,,,2.00,100.00 PIP01,112670,,2018-01-23 00:00,,,2345,ZB2101,Acme Widget Large 15 inch +,,,,1.00,12.50 VINT01,112681,,2018-01-23 00:00,,,2345,ZB2101,Acme Widget Large 15 inc +h,,,,1.00,12.50 VCR01,112674,,2018-01-23 00:00,,,2345,ZB2401,Acme Widget Medium 10 inc +h,,,,2.00,65.00 VINT01,112681,,2018-01-23 00:00,,,2345,ZB2501,Acme Widget Small 6 inch +,,,,1.00,37.50 CAPB01,112672,,2018-01-23 00:00,,,2345,ZB3501,Acme Widget GOLD,,,,1.00 +,29.00 SANML,112673,,2018-01-23 00:00,,,2345,ZB3501,Acme Widget GOLD,,,,1.00, +29.00 RS01,112657,,2018-01-23 00:00,,,2345,ZB3701,Acme Widget Large 24 inch, +,,,1.00,21.00 CAPB01,112672,,2018-01-23 00:00,,,2345,ZB3701,Acme Widget Large 24 inc +h,,,,1.00,21.00 BPS01,112679,,2018-01-23 00:00,,,2345,ZB3701,Acme Widget Large 24 inch +,,,,2.00,42.00 VINT01,112681,,2018-01-23 00:00,,,2345,ZB3701,Acme Widget Large 24 inc +h,,,,1.00,21.00 MC01,112682,,2018-01-23 00:00,,,2345,ZB3701,Acme Widget Large 24 inch, +,,,1.00,21.00 PIP01,112670,,2018-01-23 00:00,,,2345,ZB3801,Regular Acme Widget Large + 20 inch,,,,1.00,21.00 RE01,112663,,2018-01-23 00:00,,,2345,ZB4001,High Type Acme Widget,,,,1 +.00,30.00 CAPB01,112672,,2018-01-23 00:00,,,2345,ZB4001,High Type Acme Widget,,, +,1.00,30.00 APP01,112685,,2018-01-23 00:00,,,2345,ZB4001,High Type Acme Widget,,,, +1.00,30.00 RR01,112683,,2018-01-23 00:00,,,2345,ZB401,Nugget Type Acme Widget,,,, +2.00,48.00 TB01,112665,,2018-01-23 00:00,,,2345,ZB4901,Acme Widget Large 17 inch, +,,,1.00,38.50 DD01,112659,,2018-01-23 00:00,,,2345,ZB501,Acme Widget hexangle,,,,3.0 +0,99.00 VINT01,112681,,2018-01-23 00:00,,,2345,ZB501,Acme Widget hexangle,,,,1 +.00,33.00 RS01,112657,,2018-01-23 00:00,,,2345,ZB701,Acme Widget Circular 20 inc +h,,,,1.00,33.00 PIP01,112670,,2018-01-23 00:00,,,2345,ZB701,Acme Widget Circular 20 in +ch,,,,1.00,33.00 BPS01,112679,,2018-01-23 00:00,,,2345,ZB701,Acme Widget Circular 20 in +ch,,,,1.00,33.00 RE01,112663,,2018-01-23 00:00,,,2345,ZB901,Acme Widget Square 3 inch,, +,,1.00,32.00 TB01,112665,,2018-01-23 00:00,,,2345,ZB901,Acme Widget Square 3 inch,, +,,1.00,32.00

===========================================

My Knowledge of Perl is very limited; I have tried to use the following changes in the modified script file inputfilescriptmods.pl below,
but I then get errors for getline which is out of my depth.

C:\SCRIPTS-PERL>perl inputfilescriptmods.pl Enter the name of the file to read: input.csv Enter the name of the file to write: output.csv Can't locate object method "getline" via package "input.csv" (perhaps +you forgot to load "input.csv"?) at inputfilescriptmods.pl line 36, < +STDIN> line 2. C:\SCRIPTS-PERL>

================================================

These are the modifications I have added to the inputfiescript.pl to ask for filenames, which causes the above error report.

# grab user input.. print "Enter the name of the file to read: "; my $filetoread = <STDIN>; chomp($filetoread); print "Enter the name of the file to write: "; my $filetowrite = <STDIN>; chomp($filetowrite); # ****print "Enter the name of the output file: "; # ****my $fileoutput = <STDIN>; # ****chomp($fileoutput); # ****open $in, "<inputnoquotes.csv" or die "inputnoquotes.csv: $!\n"; open my $readhandle, '<', $filetoread or die "Unable to read [$filetor +ead]!"; # open $in, "<input.csv" or die "input.csv: $!\n"; $csvi = Text::CSV->new( { binary=>1 } ); open my $writehandle, '>', $filetowrite or die "Unable to write [$file +toread]!"; # ****open $out, ">output.csv" or die "output.csv: $!\n"; $csvo = Text::CSV->new( { binary=>1, eol=>$/ } ); $ar = $csvi->getline($filetoread); $col = 0; # ****$ar = $csvi->getline($in); $col = 0; foreach (@{$ar}) { $_ = "Col$col" if $col>7; s/ //; $col++; } $csvi->column_names($ar); $csvo->column_names(@hdr); $csvo->print($out +,\@hdr); while ($hri = $csvi->getline_hr($in)) { if ($hri->{InvDate}) { ($dd,$mm,$yyyy) = split("/",$hri->{InvDate}); $hro = { DeviceID=>$hri->{CustCode}, PosTxnID=>$hri->{InvNum}, M +obileNumber=>'', Timestamp=>$yyyy.'-'.$mm.'-'.$dd.' 00:00', PosID=>'', P +osUserID=>'', ShopID=>'2345', Prodcode=>$code, ProdDescription=>$desc +, ProdDept=>'', ProdGroup1=>'', ProdGroup2=>'', Qty=>$hri->{Qty}, Value +=>$hri->{NetAmt} }; if ($spacer) { $csvo->print($out,\@e14); $spacer = 0; } $csvo->print_hr($out,$hro); } elsif ($hri->{CustCode}) { ($code,$desc) = split(' - # ',$hri->{Cus +tCode}); } elsif ($hri->{LineDisc}) { $spacer = 0; } } # Original code had { $spacer = 1; } I changed this as my input +.csv file originally had a - but then started arriving with - # betw +een the Product code and product description in the input.csv file

Any help re including the removal of the double quotes characters within the inputfilescript.pl file
and also including the script asking for the inputfilename and outputfilenames to process would be greatly appreciated.

Sorry for the long post, I included all files used in the process for clarity, for anyone following the process.

Regards, Tony

  • Comment on Modifying an existing Perl script to ask for input and output filenames and also remove double quotes for outfile file
  • Select or Download Code

Replies are listed 'Best First'.
Re: Modifying an existing Perl script to ask for input and output filenames and also remove double quotes for outfile file
by Marshall (Canon) on Jan 30, 2018 at 05:36 UTC
    I am still considering your first question, but as a comment to your 2nd question:
    2. As the input.csv files are sent to me with different filenames each day, I am renaming these to input.csv for processing by the perl inputfilescript.pl script file. I would like to have the function of the inputfilescript.pl file asking me to type the filenames for the INPUT.csv file name and an OUTPUT.csv so that I do not have to rename filenames for processing.
    One very common way to deal with this is to have "inputfilescript.pl" read from STDIN and write to STDOUT. Then you wind up with:
    inputfilescript <infile.csv >outfile.csv
    Use the file system to stuff an input file into the program and use the file system to create the output file. That way "inputfilescript.pl" doesn't need to be interactive with a user and this is easier to automate. It is of course also possible to have "inputputfilescript" accept 2 parameters on the command line, an infile name and an outfile name. Or perhaps just a single parameter of the infile.csv name and the program automatically creates an output.csv with a similar name. Lot's of possibilities.... Having "inputfilescript.pl" prompt the user (you) for file names in an interactive way is the last thing I would think of.
Re: Modifying an existing Perl script to ask for input and output filenames and also remove double quotes for outfile file
by Tux (Abbot) on Jan 30, 2018 at 07:22 UTC

    See the quote_space attribute. Fields with spaces in them are quoted by default. As this is not required, there is an option to not do so.

    my $csvo = Text::CSV->new ({ binary => 1, eol => $/, quote_space => 0 });

    Enjoy, Have FUN! H.Merijn
Re: Modify an existing PERL script to ask for input and output .csv filenames , also remove double quotes charactor from output file
by Anonymous Monk on Jan 30, 2018 at 02:09 UTC

    ... Text::CSV .. Any help re including the removal of the double quotes characters ...

    Why break files? Anyway, see Text::CSV and try stuff (options controlling quoting and what not)

    ask for input and output

    Programs that ask for stuff are annoying, programs that stop and complain if they don't get what they need are much a joy :D

    see @ARGV or Getopt::Long

    @ARGV or die Usage(); my( $infile, $outfile ) = @ARGV;
Re: Modifying an existing Perl script to ask for input and output filenames and also remove double quotes for outfile file
by Cristoforo (Curate) on Jan 30, 2018 at 17:37 UTC
    Hi ei6eqb and 73 form ka3uca!

    I think I have found a program that will do what you need. The program is: Update: Input file format changed per his post

    #!/usr/bin/perl use strict; use warnings; 3 == @ARGV or die "USAGE: perl $0 filein fileout shopID\n"; my ($file_in, $file_out, $shopID) = @ARGV; # file_in == file3, file_ou +t == fout.csv shopID == 2345 open my $in, '<', $file_in or die "Unable to open $file_in: $!"; open my $out, '>', $file_out or die "Unable to open $file_out: $!"; chomp(my @hdrs = split /,/, <$in>); my @cols = split /\n/, <<EndCols; Cust Code Inv Num MobileNumber Inv Date PosId PosUserID ShopID Prodcode ProdDiscription ProdDept ProdGroup1 ProdGroup2 Qty Net Amt EndCols print $out "DeviceID,PosTxnID,MobileNumber,Timestamp,PosID,PosUserID,S +hopID,Prodcode,ProdDescription,ProdDept,ProdGroup1,ProdGroup2,Qty,Val +ue\n"; my %tmp; while (<$in>) { next unless /^\S+\s-\s#\s/ .. /^,/; chomp; if (/^,/) { %tmp = (); } elsif (/^(ZB\d+)\s-\s#\s([^,]+)/) { $tmp{Prodcode} = $1; $tmp{ProdDiscription} = $2; $tmp{ShopID} = $shopID; } else { @tmp{@hdrs} = split /,/; $tmp{'Inv Date'} = join("/", reverse split '/', $tmp{'Inv Date +'}); $tmp{'Inv Date'} .= ' 00:00'; print $out join(",", map $_ // '', @tmp{@cols}), "\n"; } } close $in or die $!; close $out or die $!
    Run from the command line like perl test2.pl file3 fout.csv 2345 where test2.pl was my program, file3 was my input file, fout.csv is the file to be written and 2345 the shopID.

    I used your input file and got the results:

    DeviceID,PosTxnID,MobileNumber,Timestamp,PosID,PosUserID,ShopID,Prodco +de,ProdDescription,ProdDept,ProdGroup1,ProdGroup2,Qty,Value VINT01,112681,,2018/01/23 00:00,,,2345,ZB101,Acme Widget Large 20 inch +,,,,1,23.5 RR01,112683,,2018/01/23 00:00,,,2345,ZB101,Acme Widget Large 20 inch,, +,,2,47 VINT01,112681,,2018/01/23 00:00,,,2345,ZB1201,Acme Widget Small 5 inch +,,,,2,100 PIP01,112670,,2018/01/23 00:00,,,2345,ZB2101,Acme Widget Large 15 inch +,,,,1,12.5 VINT01,112681,,2018/01/23 00:00,,,2345,ZB2101,Acme Widget Large 15 inc +h,,,,1,12.5 VCR01,112674,,2018/01/23 00:00,,,2345,ZB2401,Acme Widget Medium 10 inc +h,,,,2,65 VINT01,112681,,2018/01/23 00:00,,,2345,ZB2501,Acme Widget Small 6 inch +,,,,1,37.5 CAPB01,112672,,2018/01/23 00:00,,,2345,ZB3501,Acme Widget GOLD,,,,1,29 SANML,112673,,2018/01/23 00:00,,,2345,ZB3501,Acme Widget GOLD,,,,1,29 RS01,112657,,2018/01/23 00:00,,,2345,ZB3701,Acme Widget Large 24 inch, +,,,1,21 CAPB01,112672,,2018/01/23 00:00,,,2345,ZB3701,Acme Widget Large 24 inc +h,,,,1,21 BPS01,112679,,2018/01/23 00:00,,,2345,ZB3701,Acme Widget Large 24 inch +,,,,2,42 VINT01,112681,,2018/01/23 00:00,,,2345,ZB3701,Acme Widget Large 24 inc +h,,,,1,21 MC01,112682,,2018/01/23 00:00,,,2345,ZB3701,Acme Widget Large 24 inch, +,,,1,21 PIP01,112670,,2018/01/23 00:00,,,2345,ZB3801,Regular Acme Widget Large + 20 inch,,,,1,21 RE01,112663,,2018/01/23 00:00,,,2345,ZB4001,High Type Acme Widget,,,,1 +,30 CAPB01,112672,,2018/01/23 00:00,,,2345,ZB4001,High Type Acme Widget,,, +,1,30 APP01,112685,,2018/01/23 00:00,,,2345,ZB4001,High Type Acme Widget,,,, +1,30 RR01,112683,,2018/01/23 00:00,,,2345,ZB401,Nugget Type Acme Widget,,,, +2,48 TB01,112665,,2018/01/23 00:00,,,2345,ZB4901,Acme Widget Large 17 inch, +,,,1,38.5 DD01,112659,,2018/01/23 00:00,,,2345,ZB501,Acme Widget hexangle,,,,3,9 +9 VINT01,112681,,2018/01/23 00:00,,,2345,ZB501,Acme Widget hexangle,,,,1 +,33 RS01,112657,,2018/01/23 00:00,,,2345,ZB701,Acme Widget Circular 20 inc +h,,,,1,33 PIP01,112670,,2018/01/23 00:00,,,2345,ZB701,Acme Widget Circular 20 in +ch,,,,1,33 BPS01,112679,,2018/01/23 00:00,,,2345,ZB701,Acme Widget Circular 20 in +ch,,,,1,33 RE01,112663,,2018/01/23 00:00,,,2345,ZB901,Acme Widget Square 3 inch,, +,,1,32 TB01,112665,,2018/01/23 00:00,,,2345,ZB901,Acme Widget Square 3 inch,, +,,1,32
    The program first checks for 3 parameters on the input line (in @ARGV). It then gets the headers from the input file (to be used in a hash slice further down in the program @tmp{@hdrs} = split /,/;).

    Then, it gets the columns needed to retrieve the data (also in another hash slice further down @tmp{@cols}).

      Removing the use of Text::CSV is a major backward step! Consider what happens if a family of products have titles of the form:

      Nugget Type Acme Widget, blue

      Your code will now generate a badly formed csv file because the comma in the product title is seen as a field separator. With Text::CSV generating the output at least the file is generated correctly and the comma containing title is correctly quoted.

      Premature optimization is the root of all job security
        I agree much with you Grandfather, however he stated in his post-
        When I run the script on my input.csv file the output.csv file which is created by the script has double quotes in two of the data columns, i.e. the Timestamp field "2018-01-23 00:00" and the ProdDescription field "Acme Widget Large 20 inch" These double quotes need to be removed to comply with the required data format of the final csv file.
        So I don't know if he sends the result somewhere, they will be able to parse a proper CSV file.

        Needed additions/changes to the script to make it proper would be:

        Add use Text::CSV; in the header with the other use statements.

        Create a new csv object, my $csv = Text::CSV_XS->new({binary => 1, eol => $/}); before the while loop.

        Instead of the print statement, use $csv->print($out, [ @tmp{@cols} ]);.

      Hi Cristoforo ka3uca , Marshall, Tux, Dallaylaen, GrandFather, and ALL other Monks ,

      I must beg your forgiveness as in my quest to sanitize the input.csv file for publication at this site in my original question I made a Cardinel mistake and I inadvertently pasted an incorrect copy of this file.

      I have now edited my origional question to include the correct version of the INPUT.CSV file , which now means your very king efforts in scripting your code to answer my question does not now work as expected.

      My humble apologies to you and all others for your efforts, for which my mistake has wasted your time.

      Sincere Thanks to you all

      Tony

        Try

        #!/usr/bin/perl use strict; use Text::CSV; my @hdr = qw( DeviceID PosTxnID MobileNumber Timestamp PosID PosUserID ShopID Prodcode ProdDescription ProdDept ProdGroup1 ProdGroup2 Qty Value); my @e14 = ('','','','','','','','','','','','','',''); my $spacer = 0; # grab user input.. print "Enter the name of the file to read: "; my $filetoread = <STDIN>; chomp ($filetoread); print "Enter the name of the file to write: "; my $filetowrite = <STDIN>; chomp ($filetowrite); open my $fh_read, '<', $filetoread or die "Unable to read [$filetoread] : $!"; my $csvi = Text::CSV->new( { binary=>1 } ); open my $fh_write, '>', $filetowrite or die "Unable to write [$filetowrite] : $!"; my $csvo = Text::CSV->new( { binary=>1, eol=>$/, quote_space => 0} ); my $ar = $csvi->getline($fh_read); s/ //g for @$ar; $csvi->column_names($ar); $csvo->column_names(@hdr); $csvo->print($fh_write,\@hdr); my ($code,$desc); print "Reading file $filetoread\n"; while (my $hri = $csvi->getline_hr($fh_read)) { if ($hri->{InvDate}) { my ($dd,$mm,$yyyy) = split "/",$hri->{InvDate}; my $hro = { DeviceID => $hri->{CustCode}, PosTxnID => $hri->{InvNum}, MobileNumber => '', Timestamp => $yyyy.'-'.$mm.'-'.$dd.' 00:00', PosID => '', PosUserID => '', ShopID => '2345', Prodcode => $code, ProdDescription => $desc, ProdDept => '', ProdGroup1=> '', ProdGroup2=> '', Qty => $hri->{Qty}, Value => $hri->{NetAmt} }; if ($spacer) { $csvo->print($fh_write,\@e14); $spacer = 0; } $csvo->print_hr($fh_write,$hro); } elsif ($hri->{CustCode}) { ($code,$desc) = split ' - # ',$hri->{CustCode}; } elsif ($hri->{LineDisc}) { $spacer = 0; } } close $fh_read; close $fh_write; print "Created file $filetowrite\n";
        poj
Technical note - lines too long
by Dallaylaen (Chaplain) on Jan 30, 2018 at 14:05 UTC
    Could you please shorten the =========== lines? Or something in your post that makes the page a bit too wide.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1208088]
Approved by GrandFather
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others avoiding work at the Monastery: (3)
As of 2021-01-16 00:14 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    Notices?