pqxx accumulating a transaction

From: Jeff Abrahamson <jeff(at)p27(dot)eu>
To: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: pqxx accumulating a transaction
Date: 2021-12-09 22:47:08
Message-ID: 104f5ce7-823d-69dd-df32-fbbc1aace857@p27.eu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I am accumulating a transaction using libpqxx and suspect I'm doing
something overly tortuous to get where I'm going.

I don't mean to make this question longer than it needs to be, but I
think context will help.  So I'll start by explaining briefly how I got
to where I am.  And then I'll show where I am, which I'm pretty sure is
abusive and not how libpqxx means for me to do things.

*Intro: the naive approach to doing thousands of inserts*

I have some code that wants to write a bunch of stuff to the database,
typically |INSERT ... ON CONFLICT UPDATE|. Roughly, this code is looping
over some container and generating the needed SQL to insert each
container object that needs inserting.

*The naive way to do this* (skipping error/exception handling for the
moment) is thus:

pqxx::work txn(conn);
ostringstream sql_stmt;
Container rows;
for (const auto& row : rows) {
    if (row.IsDirty()) {
        RowToSQL(sql_stmt, row, txn);
        txn.exec(sql_stmt.str());
        // Clear sql_stmt here.
    }
}
txn.commit();

The function RowToSQL() takes a transaction object so that it can quote
strings appropriately using txn.quote().

This is inefficient, however: calling exec() over and over turns out to
be quite slow.

So instead I build up a bunch of statements in the ostringstream, thus:

pqxx::work txn(conn);
ostringstream sql_stmt;
Container rows;
for (const auto& row : rows) {
    if (row.IsDirty()) {
        RowToSQL(sql_stmt, row, txn);
        if (++counter > kExecuteThreshold ||
sql_stmt.IsGettingRatherBig()) {
            txn.exec(sql_stmt.str());
            // Clear sql_stmt, reset counter here.
        }
        if (count > kCommitThreshold) {
            txn.commit();
            // Reset statement counter here.
        }
    }
}
// Final commit here.

I chose the two thresholds based on performance testing in our
environment.  Think 100 and 10,000 for order of magnitude.

This worked until it didn't, because re-using the transaction this way
leads to conflicts.

Attempt to activate transaction<READ COMMITTED> which is already closed.

I mostly understand all this.

This leads me to write something that has weird code smell, and I so
suspect I've misunderstood how postgresql / libpqxx intend to be used.

Adding back error/exception handling, I've now written this based on
lots of doc reading:

*Less naive but weird code smell*

*// Has weird code smell.*

pqxx::nontransaction non_txn;
ostringstream sql_stmt;
Container rows;
Vector<string> transactions;
for (const auto& row : rows) {
    if (row.IsDirty()) {
        RowToSQL(sql_stmt, row, non_txn);
        if (++counter > kExecuteThreshold ||
sql_stmt.IsGettingRatherBig()) {
            transactions.push_back(sql_stmt.str());
            // Clear sql_stmt, reset counter here.
        }
        if (count > kCommitThreshold) {
            try {
                pqxx::work txn(conn);
                for (const string& trans : transactions) {
                    txn.exec(trans);
                }
                txn.commit();
                transactions.clear();
                // Reset statement counter here.
            } catch (const exception& e) {
                txn.abort();    // Pedantic, happens on destruction.
                YellowAlert();  // Something appropriate.
            }
        }
    }
}
// Final commit here.

It seems quite wrong to me that I should build up this vector of things
to execute and maintain these custom-tuned execution/transaction
thresholds rather than using some facility of libpqxx. That is, this
pattern seems common enough to me that the complexity I'm starting to
see strikes me as my own misunderstanding.

Any pointers much appreciated.

||

--
Jeff Abrahamson
+33 6 24 40 01 57
+44 7920 594 255

http://p27.eu/jeff/
http://mobilitains.fr/

Browse pgsql-general by date

  From Date Subject
Next Message PGSQL DBA 2021-12-10 02:19:45 Need to know more about pg_test_fsync utility
Previous Message Peter J. Holzer 2021-12-09 19:52:30 Re: Detecting repeated phrase in a string