Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

Re: Create parallel database handles or SQL statements for multi-threaded/process access to Postgres DB using DBI, DBD::Pg and Parallel::ForkManager

by erix (Parson)
on Apr 12, 2020 at 07:33 UTC ( #11115397=note: print w/replies, xml ) Need Help??


in reply to Create parallel database handles or SQL statements for multi-threaded/process access to Postgres DB using DBI, DBD::Pg and Parallel::ForkManager

Postgres nor a DBI dbhandle do threads (as 1nickt already implied).

I don't think doing this kind of parallel access is going to help all that much to get more out of the DB. (But it might be useful for testing, I suppose.)

And it seems to me you're forgetting the hard part: UPDATE. Do you not need that?

  • Comment on Re: Create parallel database handles or SQL statements for multi-threaded/process access to Postgres DB using DBI, DBD::Pg and Parallel::ForkManager

Replies are listed 'Best First'.
Re^2: Create parallel database handles or SQL statements for multi-threaded/process access to Postgres DB using DBI, DBD::Pg and Parallel::ForkManager
by perlygapes (Sexton) on Apr 14, 2020 at 00:55 UTC
    Thanks erix.

    I do not need to update, actually.
    All I need to do is check if an identical record already exists and INSERT if not, else move onto the next 'candidate record' to insert.
    I will have multiple CPUs all generating candidates for insertion into the table but the records must be unique.
    I know Postgres will tell me if I try to create a duplicate record on a field I have set to UNIQUE, but I am not sure exactly how to check for that in Perl and make sure the script continues rather than dies.
    Incorporating the 'dbh-per-thread/process' idea,
    the pseudo-code would go something like:

    foreach $CPU (@CPUs) { create dbh for this CPU; prepare_cached SELECT; prepare_cached INSERT; foreach combination { # criterion have affinity with CPU core # so criterion is always unique to CPU generate row-data; # check if row already exists; # what the SELECT sth is for if not then { insert new row; # what the INSERT sth is for } } }

      Hi again, maybe what you need is an no-op "upsert", implemented in Postgres with INSERT ON CONFLICT ?

      Something like

      INSERT INTO myTable (foo, bar) VALUES ('baz', 'qux') ON CONFLICT (foo) DO NOTHING;
      ... where foo is your column that has a unique key constraint.

      Hope this helps!


      The way forward always starts with a minimal test.
        Thank you for your suggestion.

        I realise I probably did not explain fully enough my goal.
        Let's say I have 10 columns in my table (excluding the PK).
        I only want to store a data row into the table if the combination of all 10 values is a unique combination set.

        Example with a 3 column table:
        INSERT Row 1: A,A,A <== OK! Value set unique INSERT Row 2: A,D,A <== OK! Value set unique INSERT Row 3: D,T,G <== OK! Value set unique INSERT Row 4: D,A,A <== OK! Value set unique INSERT Row 5: A,D,A <== COLLISION with Row 2! Set not unique; skip IN +SERT
        Thanks. I think I understand what that does, but the unique requirement is not on just one field, it is on the whole record. That is, the collection of all field values comprises a UNIQUE combination, and there must not be any other rows with that exact same combination of values.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://11115397]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others perusing the Monastery: (3)
As of 2020-10-29 06:07 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    My favourite web site is:












    Results (269 votes). Check out past polls.

    Notices?