| From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
|---|---|
| To: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Delay locking partitions during INSERT and UPDATE |
| Date: | 2019-01-03 12:34:24 |
| Message-ID: | 0b1531d4-fce5-c7cf-1018-592fb0456124@2ndquadrant.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
On 11/23/18 1:14 AM, David Rowley wrote:
> As a follow-on from [1] and also discussed in [2], I'd like to propose
> that we don't obtain locks on all partitions during INSERT into a
> partitioned table and UPDATE of a partitioned key and instead, only
> lock the partition when we first route a tuple to it. This means that
> the order that the locks are obtained is no longer well defined and is
> at the mercy of the order that tuples are INSERTed or UPDATEd. It
> seems worth relaxing this a bit for gains in performance, as when a
> partitioned table contains many partitions, the overhead of locking
> all partitions when inserting a single row, or just a few rows is
> often significantly higher than the cost of doing the actual insert.
>
Yep, the locking seems like a significant bottleneck. I've done quite a
bit of testing on two machines, using a slightly modified version of
your test script with variable number of partitions (0 means not
partitioned), and the results look like this:
1) xeon e5-2620v4
partitions 0 100 1000 10000
---------------------------------------------
master 16643 6956 1039 108
patched 16398 15522 15222 13228
2) i5-2500k
partitions 0 100 1000 10000
-----------------------------------------
master 3901 2892 920 76
patched 3894 3838 3845 3522
When using UNLOGGED tables to minimize the external noise, it looks like
this:
3) xeon e5-2620v4
partitions 0 100 1000 10000
--------------------------------------------
master 30806 8740 1091 107
patched 30455 28137 27582 24985
partitions 0 100 1000 10000
--------------------------------------------
master 27662 9013 1277 79
patched 28263 26474 25794 22434
So the performance benefit is pretty clear - up to 2 orders of magnitude
with 10k partitions, and gets us fairly close to non-partitioned table.
Me gusta.
> The current behaviour was added in 54cde0c4c058073 in order to
> minimise deadlock risk. It seems that the risk there only comes from
> AELs that could be taken when a partition directly receives a TRUNCATE
> / CREATE INDEX / VACUUM FULL / CLUSTER. There's obviously no conflict
> with other DML operations since two RowExclusiveLocks don't conflict
> with each other. I think all other AEL obtaining DDL must be
> performed on the top level partitioned table, for example, ADD COLUMN
> can't be done directly on a partition, so there's no added deadlock
> risk from those. For a deadlock to occur one of the above DDL commands
> would have to be executed inside a transaction in an order opposite to
> the order rows are being INSERTed or UPDATEd in the partitioned table.
> If required, such operations could LOCK TABLE the top partitioned
> table to block the DML operation. There's already a risk of similar
> deadlocks from such operations done on multiple separate tables when
> the order they're done is not the same as the order the tables are
> written in a query, although, in that case, the window for the
> deadlock is likely to be much smaller.
>
Hmmm, yeah.
Per the discussion in [1] the locking was necessary also to ensure
partitions can't disappear while we're building the descriptors in
RelationBuildPartitionDesc(). But AFAICS 3f2393edef fixed this.
The other issue - as you note - is ensuring locking order, to prevent
(or rather reduce the risk of) deadlocks. I agree with your assessment
here, i.e. that locking the parent is a sufficient protection.
Maybe there's an alternative solution with the same benefits and not
sacrificing the lock ordering, but I fail to see how it would work.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
| From | Date | Subject | |
|---|---|---|---|
| Next Message | David Rowley | 2019-01-03 12:39:26 | Re: "SELECT ... FROM DUAL" is not quite as silly as it appears |
| Previous Message | Peter Eisentraut | 2019-01-03 12:03:21 | Re: pg_upgrade: Pass -j down to vacuumdb |