Admittedly just for my own fun, I did it via a database (postgres 12.0). Don't do this at home, although it isn't actually too bad, and only /just/ a bit more than a one-liner... I added the missing column names, as I assume they are normally present in the real data.
#!/bin/bash
echo "CAT_HEADER,SUPPLIER_CODE,CUSTOMER_CODE,F4,F5,F6,F7,F8
CAT_LINE,0001P,ABC12345,20190924,,1,Z,3.36
CAT_LINE,0002P,ABC12345,20190924,,1,Z,3.36
CAT_LINE,0001P,ABC23456,20190924,,1,Z,2.24
CAT_LINE,0002P,ABC23456,20190924,,1,Z,2.24
CAT_LINE,0001P,ABC34567,20190924,,1,Z,2.24
CAT_LINE,0002P,ABC34567,20190924,,1,Z,2.24" > data.txt
head -n 1 data.txt | perl -ne 'chomp; print "
drop table if exists pm11107044 ;
create table pm11107044 (" . join(",", map {"\"$_\" text"} split(/,/, $_)) . ");";
' | psql -qX && < data.txt psql -qXc "copy pm11107044 from stdin with (format csv, header true);"
echo "-- unordered data.txt:"
cat data.txt
echo
echo "-- ordered data:"
echo "select * from pm11107044 order by 2, 3" | psql -qX --csv # | md5sum
echo
# older psql doesn't have --csv (introduced in postgres 12); in that case use:
# echo "copy(select * from pm11107044 order by 2, 3) to stdout with (format csv, delimiter ',', header true)" | psql -qX
Output:
./pm.pl
-- unordered data.txt:
CAT_HEADER,SUPPLIER_CODE,CUSTOMER_CODE,F4,F5,F6,F7,F8
CAT_LINE,0001P,ABC12345,20190924,,1,Z,3.36
CAT_LINE,0002P,ABC12345,20190924,,1,Z,3.36
CAT_LINE,0001P,ABC23456,20190924,,1,Z,2.24
CAT_LINE,0002P,ABC23456,20190924,,1,Z,2.24
CAT_LINE,0001P,ABC34567,20190924,,1,Z,2.24
CAT_LINE,0002P,ABC34567,20190924,,1,Z,2.24
-- ordered data:
CAT_HEADER,SUPPLIER_CODE,CUSTOMER_CODE,F4,F5,F6,F7,F8
CAT_LINE,0001P,ABC12345,20190924,,1,Z,3.36
CAT_LINE,0001P,ABC23456,20190924,,1,Z,2.24
CAT_LINE,0001P,ABC34567,20190924,,1,Z,2.24
CAT_LINE,0002P,ABC12345,20190924,,1,Z,3.36
CAT_LINE,0002P,ABC23456,20190924,,1,Z,2.24
CAT_LINE,0002P,ABC34567,20190924,,1,Z,2.24
( Another (actually more appropriate) database way would be to read the data file directly as text via file_fdw. Maybe I'll have a go at that tomorrow. )
Edit: I just realised that --csv is only available in psql from PostgreSQL 12 onwards; the alternative for earlier versions is given in the script comment ( COPY (select ...) ... )
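The commented-out md5sum in the script is handy for comparing the database result against some other way of sorting. As a rough cross-check (not the database approach, just plain head, tail and sort), something like the following should give the same ordering for this sample. It assumes no field contains a quoted comma or newline, and it uses the C locale, which may order differently from the database collation on other data. The names tmp and sorted are mine.

```shell
#!/bin/bash
# Cross-check sketch: order the sample by fields 2 and 3 with sort(1),
# keeping the header line on top.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
CAT_HEADER,SUPPLIER_CODE,CUSTOMER_CODE,F4,F5,F6,F7,F8
CAT_LINE,0001P,ABC12345,20190924,,1,Z,3.36
CAT_LINE,0002P,ABC12345,20190924,,1,Z,3.36
CAT_LINE,0001P,ABC23456,20190924,,1,Z,2.24
CAT_LINE,0002P,ABC23456,20190924,,1,Z,2.24
CAT_LINE,0001P,ABC34567,20190924,,1,Z,2.24
CAT_LINE,0002P,ABC34567,20190924,,1,Z,2.24
EOF

# Header first, then the data lines sorted on field 2, then field 3.
sorted=$( head -n 1 "$tmp"; tail -n +2 "$tmp" | LC_ALL=C sort -t, -k2,2 -k3,3 )
printf '%s\n' "$sorted"

rm -f "$tmp"
```

Piping that final printf through md5sum, and doing the same for the psql output, gives two checksums to compare.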