Re: Delay locking partitions during INSERT and UPDATE

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Delay locking partitions during INSERT and UPDATE
Date: 2019-01-03 12:34:24
Message-ID: 0b1531d4-fce5-c7cf-1018-592fb0456124@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 11/23/18 1:14 AM, David Rowley wrote:
> As a follow-on from [1] and also discussed in [2], I'd like to propose
> that we don't obtain locks on all partitions during INSERT into a
> partitioned table and UPDATE of a partitioned key and instead, only
> lock the partition when we first route a tuple to it. This means that
> the order that the locks are obtained is no longer well defined and is
> at the mercy of the order that tuples are INSERTed or UPDATEd. It
> seems worth relaxing this a bit for gains in performance, as when a
> partitioned table contains many partitions, the overhead of locking
> all partitions when inserting a single row, or just a few rows is
> often significantly higher than the cost of doing the actual insert.
>

Yep, the locking seems like a significant bottleneck. I've done quite a
bit of testing on two machines, using a slightly modified version of
your test script with variable number of partitions (0 means not
partitioned), and the results look like this:

1) xeon e5-2620v4

partitions 0 100 1000 10000
---------------------------------------------
master 16643 6956 1039 108
patched 16398 15522 15222 13228

2) i5-2500k

partitions 0 100 1000 10000
-----------------------------------------
master 3901 2892 920 76
patched 3894 3838 3845 3522

When using UNLOGGED tables to minimize the external noise, it looks like
this:

3) xeon e5-2620v4

partitions 0 100 1000 10000
--------------------------------------------
master 30806 8740 1091 107
patched 30455 28137 27582 24985

partitions 0 100 1000 10000
--------------------------------------------
master 27662 9013 1277 79
patched 28263 26474 25794 22434

So the performance benefit is pretty clear - up to 2 orders of magnitude
with 10k partitions, and gets us fairly close to non-partitioned table.

Me gusta.

> The current behaviour was added in 54cde0c4c058073 in order to
> minimise deadlock risk. It seems that the risk there only comes from
> AELs that could be taken when a partition directly receives a TRUNCATE
> / CREATE INDEX / VACUUM FULL / CLUSTER. There's obviously no conflict
> with other DML operations since two RowExclusiveLocks don't conflict
> with each other. I think all other AEL obtaining DDL must be
> performed on the top level partitioned table, for example, ADD COLUMN
> can't be done directly on a partition, so there's no added deadlock
> risk from those. For a deadlock to occur one of the above DDL commands
> would have to be executed inside a transaction in an order opposite to
> the order rows are being INSERTed or UPDATEd in the partitioned table.
> If required, such operations could LOCK TABLE the top partitioned
> table to block the DML operation. There's already a risk of similar
> deadlocks from such operations done on multiple separate tables when
> the order they're done is not the same as the order the tables are
> written in a query, although, in that case, the window for the
> deadlock is likely to be much smaller.
>

Hmmm, yeah.

Per the discussion in [1] the locking was necessary also to ensure
partitions can't disappear while we're building the descriptors in
RelationBuildPartitionDesc(). But AFAICS 3f2393edef fixed this.

The other issue - as you note - is ensuring locking order, to prevent
(or rather reduce the risk of) deadlocks. I agree with your assessment
here, i.e. that locking the parent is a sufficient protection.

Maybe there's an alternative solution with the same benefits and not
sacrificing the lock ordering, but I fail to see how it would work.

>
> [1] https://www.postgresql.org/message-id/flat/CAKJS1f_1RJyFquuCKRFHTdcXqoPX-PYqAd7nz=GVBwvGh4a6xA(at)mail(dot)gmail(dot)com

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-01-03 12:39:26 Re: "SELECT ... FROM DUAL" is not quite as silly as it appears
Previous Message Peter Eisentraut 2019-01-03 12:03:21 Re: pg_upgrade: Pass -j down to vacuumdb