From: | "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com> |
---|---|
To: | "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com> |
Subject: | RE: Parallel INSERT (INTO ... SELECT ...) |
Date: | 2021-02-10 10:46:11 |
Message-ID: | b50e9012a81f410fa86f871f10551e56@G08CNEXMBPEKD05.g08.fujitsu.local |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> > > It did have performance gain, but I think it's not huge enough to
> > > ignore the extra's index cost.
> > > What do you think ?
> >
> > Yes... as you suspect, I'm afraid the benefit from parallel bitmap
> > scan may not compensate for the loss of the parallel insert operation.
> >
> > The loss is probably due to 1) more index page splits, 2) more buffer
> > writes (table and index), and 3) internal locks for things such as
> > relation extension and page content protection. To investigate 3), we
> > should want something like [1], which tells us the wait event
> > statistics (wait count and time for each wait event) per session or
> > across the instance like Oracke, MySQL and EDB provides. I want to
> continue this in the near future.
>
> What would the result look like if you turn off
> parallel_leader_participation? If the leader is freed from
> reading/writing the table and index, the index page splits and internal
> lock contention may decrease enough to recover part of the loss.
>
> https://www.postgresql.org/docs/devel/parallel-plans.html
>
> "In a parallel bitmap heap scan, one process is chosen as the leader. That
> process performs a scan of one or more indexes and builds a bitmap indicating
> which table blocks need to be visited. These blocks are then divided among
> the cooperating processes as in a parallel sequential scan. In other words,
> the heap scan is performed in parallel, but the underlying index scan is
> not."
If I disable parallel_leader_participation.
For max_parallel_workers_per_gather = 4, It still have performance degradation.
For max_parallel_workers_per_gather = 2, the performance degradation will not happen in most of the case.
There is sometimes a noise(performance degradation), but most of result(about 80%) is good.
Best regards,
houzj
Attachment | Content-Type | Size |
---|---|---|
parallel_insert.txt | text/plain | 3.6 KB |
parallel_insert_without_leader_participation.txt | text/plain | 3.0 KB |
serial_insert.txt | text/plain | 1.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2021-02-10 11:46:42 | Re: pg_get_first_normal_oid()? |
Previous Message | Etsuro Fujita | 2021-02-10 10:31:02 | Re: Asynchronous Append on postgres_fdw nodes. |