It would be absurd to install postgres for this functionality but if it's available it offers at least two attractive alternatives.
I'll just put my test file here:
#!/bin/sh
echo "OPTION 1: DATA FULLY IMPORTED in temp table"
time (
< INPUTFILE.csv psql -c "
drop table if exists t;
create temporary table t(i1 int, i2 int, n3 numeric);
copy t from stdin with (format csv, delimiter ',');
copy (
select i1,i2,sum(n3) from t group by i1,i2 order by i1,i2
) to stdout with(format csv, delimiter ',');
"
)
echo "-- OPTION 2: DATA read via Foreign Data Wrapper (file_fdw)"
echo " drop foreign table if exists inputfile cascade;
drop server pgcsv cascade; " | psql -qX
time (
echo "
create server pgcsv foreign data wrapper file_fdw;
create foreign table inputfile (
i1 int,
i2 int,
n3 numeric
) server pgcsv
options ( filename '/tmp/INPUTFILE.csv', format 'csv' );
copy (
select i1,i2,sum(n3) from inputfile group by i1,i2 order by i1,
+i2 )
to stdout with(format csv, delimiter ',')
" | psql
)
I thought it was interesting to see the different timings:
-- OPTION 1: DATA FULLY IMPORTED in temp table
real 0m0.048s
user 0m0.002s
sys 0m0.006s
-- OPTION 2: DATA read via file_fdw (foreign data wrapper)
real 0m0.038s
user 0m0.002s
sys 0m0.006s
-- Your program
real 0m0.051s
user 0m0.048s
sys 0m0.003s
Foreign Tables, via Foreign Data Wrapper file_fdw, are handy in the gray area between file and database. There is the next step towards full database via a materialized view (of a foreign table) which offers more real db-features (indexing, for one).