Re: Savepoints in transactions for speed?

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Savepoints in transactions for speed?
Date: 2012-11-28 23:28:51
Message-ID: 1354145331.1766.84.camel@sussancws0025
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, 2012-11-27 at 16:04 -0600, Mike Blackwell wrote:
> I need to delete about 1.5 million records from a table and reload it
> in one transaction. The usual advice when loading with inserts seems
> to be group them into transactions of around 1k records. Committing
> at that point would leave the table in an inconsistent state. Would
> issuing a savepoint every 1k or so records negate whatever downside
> there is to keeping a transaction open for all 1.5 million records, or
> just add more overhead?

A large transaction isn't really a big problem for postgres, and 1.5M
records should be processed quickly anyway.

The main problem with a long-running delete or update transaction is
that the dead tuples (deleted tuples or the old version of an updated
tuple) can't be removed until the transaction finishes. That can cause
temporary "bloat", but 1.5M records shouldn't be noticeable.

Adding subtransactions into the mix won't help, but probably won't hurt,
either. The transaction will still run just as long, and you still can't
delete the tuples ahead of time (unless you abort a subtransaction). If
you *do* use subtransactions, make sure to release them as quickly as
you create them (don't just use ROLLBACK TO, that still leaves the
savepoint there); having 1500 open subtransactions might cause
performance problems elsewhere.

Regards,
Jeff Davis

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Niels Kristian Schjødt 2012-11-29 03:32:11 Re: Optimize update query
Previous Message Jeff Davis 2012-11-28 23:18:42 Re: Savepoints in transactions for speed?