Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

Re^2: Create parallel database handles or SQL statements for multi-threaded/process access to Postgres DB using DBI, DBD::Pg and Parallel::ForkManager

by perlygapes (Sexton)
on Apr 14, 2020 at 00:55 UTC ( #11115494=note: print w/replies, xml ) Need Help??


in reply to Re: Create parallel database handles or SQL statements for multi-threaded/process access to Postgres DB using DBI, DBD::Pg and Parallel::ForkManager
in thread Create parallel database handles or SQL statements for multi-threaded/process access to Postgres DB using DBI, DBD::Pg and Parallel::ForkManager

Thanks erix.

I do not need to update, actually.
All I need to do is check if an identical record already exists and INSERT if not, else move onto the next 'candidate record' to insert.
I will have multiple CPUs all generating candidates for insertion into the table but the records must be unique.
I know Postgres will tell me if I try to create a duplicate record on a field I have set to UNIQUE, but I am not sure exactly how to check for that in Perl and make sure the script continues rather than dies.
Incorporating the 'dbh-per-thread/process' idea,
the pseudo-code would go something like:

foreach $CPU (@CPUs) { create dbh for this CPU; prepare_cached SELECT; prepare_cached INSERT; foreach combination { # criterion have affinity with CPU core # so criterion is always unique to CPU generate row-data; # check if row already exists; # what the SELECT sth is for if not then { insert new row; # what the INSERT sth is for } } }

  • Comment on Re^2: Create parallel database handles or SQL statements for multi-threaded/process access to Postgres DB using DBI, DBD::Pg and Parallel::ForkManager
  • Download Code

Replies are listed 'Best First'.
Re^3: Create parallel database handles or SQL statements for multi-threaded/process access to Postgres DB using DBI, DBD::Pg and Parallel::ForkManager
by 1nickt (Abbot) on Apr 14, 2020 at 03:19 UTC

    Hi again, maybe what you need is an no-op "upsert", implemented in Postgres with INSERT ON CONFLICT ?

    Something like

    INSERT INTO myTable (foo, bar) VALUES ('baz', 'qux') ON CONFLICT (foo) DO NOTHING;
    ... where foo is your column that has a unique key constraint.

    Hope this helps!


    The way forward always starts with a minimal test.
      Thank you for your suggestion.

      I realise I probably did not explain fully enough my goal.
      Let's say I have 10 columns in my table (excluding the PK).
      I only want to store a data row into the table if the combination of all 10 values is a unique combination set.

      Example with a 3 column table:
      INSERT Row 1: A,A,A <== OK! Value set unique INSERT Row 2: A,D,A <== OK! Value set unique INSERT Row 3: D,T,G <== OK! Value set unique INSERT Row 4: D,A,A <== OK! Value set unique INSERT Row 5: A,D,A <== COLLISION with Row 2! Set not unique; skip IN +SERT
      Thanks. I think I understand what that does, but the unique requirement is not on just one field, it is on the whole record. That is, the collection of all field values comprises a UNIQUE combination, and there must not be any other rows with that exact same combination of values.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://11115494]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others avoiding work at the Monastery: (3)
As of 2020-10-26 16:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    My favourite web site is:












    Results (252 votes). Check out past polls.

    Notices?