Re: Parallel Inserts in CREATE TABLE AS

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Parallel Inserts in CREATE TABLE AS
Date: 2021-05-27 04:57:22
Message-ID: CAFiTN-tCDR0ZfUKBsa55FZPVLT=29g7W01zSSeAAEhdjqJgBxw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 27, 2021 at 10:16 AM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Thu, May 27, 2021 at 7:12 AM houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > I am afraid that the using the FSM seems not get a stable performance gain(at least on my machine),
> > I will take a deep look into this to figure out the difference. A naive idea it that the benefit that bulk extension
> > bring is not much greater than the cost in FSM.
> > Do you have some ideas on it ?
>
> I think, if we try what Amit and I said in [1], we should get some
> insights on whether the bulk relation extension is taking more time or
> the FSM lookup. I plan to share the testing patch adding the timings
> and the counters so that you can also test from your end. I hope
> that's fine with you.

I think some other cause of contention on relation extension locks are
1. CTAS is using a buffer strategy and due to that, it might need to
evict out the buffer frequently for getting the new block in. Maybe
we can identify by turning off the buffer strategy for CTAS and
increasing the shared buffer so that data fits in memory.

2. I think the parallel worker are scanning are producing a lot of
tuple in a short time so the demand for the new block is very high
compare to what AddExtra block is able to produce, so maybe you can
try adding more block by increasing the multiplier and see what is the
impact.

3. Also try where the underlying select query has some complex
condition and also it select fewer record say 50%, 40%...10% and see
what are the numbers.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhihong Yu 2021-05-27 04:59:56 Re: Skip partition tuple routing with constant partition key
Previous Message Bharath Rupireddy 2021-05-27 04:45:49 Re: Parallel Inserts in CREATE TABLE AS