From: | "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Cc: | "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Subject: | RE: Parallel Inserts in CREATE TABLE AS |
Date: | 2021-05-28 08:47:01 |
Message-ID: | OS0PR01MB5716511F11F6E70405F0C96694229@OS0PR01MB5716.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Sent: Thursday, May 27, 2021 10:07 PM
> On Thu, May 27, 2021 at 9:53 AM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> > > One idea to find this out could be that we have three counters for
> > > each worker which counts the number of times each worker extended
> > > the relation in bulk, the number of times each worker extended the
> > > relation by one block, the number of times each worker gets the page
> > > from FSM. It might be possible that with this we will be able to
> > > figure out why there is a difference between your and Hou-San's
> > > results.
> >
> > Yeah, that helps. And also, the time spent in
> > LockRelationForExtension, ConditionalLockRelationForExtension,
> > GetPageWithFreeSpace and RelationAddExtraBlocks too can give some
> > insight.
> >
> > My plan is to have a patch with the above info added (which I will
> > share here so that others can test and see the results too) and run
> > "case 4", where the regression is seen on my system.
>
> I captured below information with the attached patch
> 0001-test-times-and-block-counts.patch applied on top of CTAS v23 patch set.
> Testing details are attached in the file named "test".
> - Total time spent in LockRelationForExtension
> - Total time spent in GetPageWithFreeSpace
> - Total time spent in RelationAddExtraBlocks
> - Total number of times extended the relation in bulk
> - Total number of times extended the relation by one block
> - Total number of blocks added in bulk extension
> - Total number of times getting the page from FSM
>
> Here is a summary of what I observed:
> 1) The execution time with 2 workers, without TABLE_INSERT_SKIP_FSM
> (140 sec) is more than with 0 workers (112 sec)
> 2) The execution time with 2 workers, with TABLE_INSERT_SKIP_FSM (225
> sec) is more than with 2 workers, without TABLE_INSERT_SKIP_FSM (140
> sec)
> 3) Majority of the time is going into waiting for relation extension lock in
> LockRelationForExtension. With 2 workers, without TABLE_INSERT_SKIP_FSM,
> out of total execution time 140 sec, the time spent in LockRelationForExtension
> is ~40 sec and the time spent in RelationAddExtraBlocks is ~20 sec. So, ~60 sec
> are being spent in these two functions. With 2 workers, with
> TABLE_INSERT_SKIP_FSM, out of total execution time 225 sec, the time spent
> in LockRelationForExtension is ~135 sec and the time spent in
> RelationAddExtraBlocks is 0 sec (because we skip FSM, no bulk extend logic
> applies). So, most of the time is being spent in LockRelationForExtension.
>
> I'm still not sure why the execution time with 0 workers (i.e. serial
> execution, no parallelism involved) on my testing system is 112 sec,
> compared to 58 sec on Hou-San's system for the same use case. Maybe my
> testing system has an older configuration than the others.
>
> Having said that, I request others to try the same use case on their
> testing systems and see whether they make the same observations as above,
> or whether they see no regression (with just 2 workers) and not much
> difference with and without TABLE_INSERT_SKIP_FSM.
Thanks for the patch!
I attached my test results. Note that I did not change wal_level to minimal.
I only changed the following configuration:
shared_buffers = 40GB
max_worker_processes = 32
max_parallel_maintenance_workers = 24
max_parallel_workers = 32
synchronous_commit = off
checkpoint_timeout = 1d
max_wal_size = 24GB
min_wal_size = 15GB
autovacuum = off
Best regards,
houzj
Attachment | Content-Type | Size |
---|---|---|
test_results | application/octet-stream | 12.6 KB |