How to avoid UPDATE performance degradation in a transaction

From: Karl Düüna <karl(dot)dyyna(at)gmail(dot)com>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: How to avoid UPDATE performance degradation in a transaction
Date: 2020-02-13 10:21:17
Message-ID: CAMNADob8FVyH7WxjLqDkK4z4DFtPfHP7cyt62ze6q4Cf0pdCyA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi

I recently came across a performance problem with a big transaction block,
which doesn't make sense to me and hopefully someone more knowledgeable can
explain the reasons and point out a direction for a solution.

-- TL; DR;

UPDATE on a row takes relatively constant amount of time outside a
transaction block, but running UPDATE on a single row over and over inside
a transaction gets slower and slower as the number of UPDATE operations
increases.

Why is updating the same row large number of times progressively slower
inside a transaction? And is there a way to avoid this performance
degradation?

I set up a POC repository to demonstrate the problem:
https://github.com/DeadAlready/pg-test

-- Backstory

Needed to run a large block of operations (a mix of inserts and updates) on
a table. It took a considerable amount of time inside a transaction and was
about 10x faster without the transaction. Since I need all the operations
to run as a single block that can be rolled back this was unsatisfactory.
Thus began my quest to locate the problem. Since the actual data structure
is complex and involves a bunch of triggers, foreign keys etc it took some
time to narrow down, but in the end I found that the structure itself is
irrelevant. The issue occurs even if you have a single two column table
with a handful of rows. The only requirement seems to be that the NR of
UPDATEs per single row is large. While the update performance inside a
transaction starts out faster than outside, the performance starts to
degrade from the get go. It really isn't noticeable until about 5k UPDATEs
on a single row. At around 100k UPDATEs it is about 2.5x slower than the
same operation outside the transaction block and about 4x slower than at
the beginning of the transaction.

Thanks,
Karl

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message David G. Johnston 2020-02-13 20:50:38 Re: How to avoid UPDATE performance degradation in a transaction
Previous Message Chris Borckholder 2020-02-12 08:12:34 Re: Bad selectivity estimate when using a sub query to determine WHERE condition