RE: Performance improvements of INSERTs to a partitioned table

From: "Kato, Sho" <kato-sho(at)jp(dot)fujitsu(dot)com>
To: 'David Rowley' <david(dot)rowley(at)2ndquadrant(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Performance improvements of INSERTs to a partitioned table
Date: 2018-11-09 09:36:41
Message-ID: 25C1C6B2E7BE044889E4FE8643A58BA963B5879C@G01JPEXMBKW03
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks David.

> I'm highly against doing 'c' since both [1] and [3] are very useful patches which scale partitioning to work well with very large numbers of partitions.

Agreed.

>'a' is not that ideal a solution either as it means that anyone doing CREATE INDEX or TRUNCATE on a partition would conflict with queries that are querying an unrelated part of the partitioned table.

If only deadlock is a problem, I think intention lock is needed.
All SQL on a partitioned tables take a intention exclusive or shared lock on the root first and take a conventional lock on target leaf next.

> While I don't really like 'b' either, I wonder how many people are going to hit deadlocks here.
>To hit that, I believe that you'd have to be doing the TRUNCATE or CREATE INDEX inside a transaction and to hit it where you couldn't hit it before you'd have had to carefully written your CREATE INDEX / TRUNCATE statements to ensure they're executed in partition order.

In case of CREATE INDEX on the leaf, users have to ensure to statements executed in partition order since similar case could occur in no partitioned table.
But, DBMS have to prevent deadlock occurred by CREATE INDEX/TRUNCATE on the root.

>Perhaps a GUC is needed to choose between 'a' and 'b'?

I agree with Amit. It is difficult to choose for users.

regards,
Sho Kato
> -----Original Message-----
> From: David Rowley [mailto:david(dot)rowley(at)2ndquadrant(dot)com]
> Sent: Friday, November 9, 2018 11:20 AM
> To: Kato, Sho/加藤 翔 <kato-sho(at)jp(dot)fujitsu(dot)com>; Amit Langote
> <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>; Robert Haas <robertmhaas(at)gmail(dot)com>
> Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
> Subject: Re: Performance improvements of INSERTs to a partitioned table
>
> On 7 November 2018 at 21:31, Kato, Sho <kato-sho(at)jp(dot)fujitsu(dot)com> wrote:
> > AFAIK, When CREATE INDEX on a partition and INSERT to a parent table
> > are executed at the same time, this patch causes deadlock.
> >
> > * partitions information
> >
> > Partition key: RANGE (a)
> > Partitions: a_1 FOR VALUES FROM (1) TO (100),
> > a_2 FOR VALUES FROM (100) TO (200)
> >
> > T1: create index a_1_ix on a_1(a);
> > T2: insert into a values(101),(1); locking a_2 and waiting releasing
> > a_1’s lock
> > T1: create index a_2_ix on a_2(a); ERROR: deadlock detected
> >
> > I think this situation does not mean unsafe because similar situation
> > will occurs at no partitioned tables and DBMS could not prevent this
> situation.
> >
> > But, I'm not sure this behavior is correct.
>
> That's one case that will deadlock with the 0002 patch in [1]. Another
> command that takes a conflicting lock on the partition only is TRUNCATE.
> Many other commands that normally take an AEL can't be run on a partition,
> for example, ALTER TABLE ADD COLUMN.
>
> The same would have deadlocked on master if you'd created the indexes
> in the reverse order, but I guess the point is that today there's some
> defined order that you can create the indexes in that won't cause a
> deadlock, but with [1] there is no defined order since the tables are
> locked in order that tuples are routed to them.
>
> Robert pointed out something interesting in [2] about UPDATEs of a
> partition key perhaps routing a tuple to a partition that's been excluded
> by constraint exclusion, so the lock is not taken initially, but would
> be taken later in ExecSetupPartitionTupleRouting(). I ran through that
> scenario today and discovered it can't happen as excluded partitions are
> still in the rangetable, so are still locked either in the planner or
> by AcquireExecutorLocks() and the order those are locked in is well
> defined from the planner. However, I believe that's something Amit plans
> to change in [3], where he proposes to only lock partitions that survive
> partition pruning (among many other things).
>
> There are probably a few things we could do here:
>
> a. Have all DDL on partitions obtain the same lock level on all ancestor
> partitioned tables taking a lock on the root first and working down to
> the leaf.
> b. Just lock what we touch and increase the likelihood of deadlocks
> occurring.
> c. Reject [1] and [3] because we don't want to increase the chances of
> deadlocks.
>
> I'm highly against doing 'c' since both [1] and [3] are very useful patches
> which scale partitioning to work well with very large numbers of
> partitions. At the moment we just can bearly do half a dozen partitions
> before the performance starts going south. 'a' is not that ideal a solution
> either as it means that anyone doing CREATE INDEX or TRUNCATE on a
> partition would conflict with queries that are querying an unrelated part
> of the partitioned table. While I don't really like 'b' either, I wonder
> how many people are going to hit deadlocks here.
> To hit that, I believe that you'd have to be doing the TRUNCATE or CREATE
> INDEX inside a transaction and to hit it where you couldn't hit it before
> you'd have had to carefully written your CREATE INDEX / TRUNCATE
> statements to ensure they're executed in partition order. I'm unsure how
> likely that is, but I certainly can't rule out that doing 'b' won't upset
> anyone.
>
> Perhaps a GUC is needed to choose between 'a' and 'b'?
>
> I'm adding Amit to this email because [3] is his and Robert because he's
> recently been talking about partition locking too.
>
> [1]
> https://www.postgresql.org/message-id/CAKJS1f-vYAHqqaU878Q4cdZYHwPcv
> 1J_C-mG%3DCs2UwhsD6cqwg%40mail.gmail.com
> [2]
> https://www.postgresql.org/message-id/CA%2BTgmoZGJsy-nRFnzurhZQJtHdD
> h5fzJKvbvhS0byN6_46pB9Q%40mail.gmail.com
> [3] https://commitfest.postgresql.org/20/1778/
>
> --
> David Rowley http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Verite 2018-11-09 10:40:44 Re: csv format for psql
Previous Message Etsuro Fujita 2018-11-09 09:27:09 Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT