From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Greg Nancarrow <gregn4422(at)gmail(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel INSERT (INTO ... SELECT ...) |
Date: | 2020-10-09 09:42:05 |
Message-ID: | CAA4eK1LKBUQFC=UbFTFGM2iVGbcGex2D5J+x+_XKE=0N2==jrQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Oct 9, 2020 at 2:37 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
>
> Speaking of costing, I'm not sure I really agree with the current
> costing of a Gather node. Just considering a simple Parallel SeqScan
> case, the "run_cost += parallel_tuple_cost * path->path.rows;" part of
> Gather cost always completely drowns out any other path costs when a
> large number of rows are involved (at least with default
> parallel-related GUC values), such that Parallel SeqScan would never
> be the cheapest path. This linear relationship in the costing based on
> the rows and a parallel_tuple_cost doesn't make sense to me. Surely
> after a certain amount of rows, the overhead of launching workers will
> be outweighed by the benefit of their parallel work, such that the
> more rows, the more likely a Parallel SeqScan will benefit.
>
That is true for the number of rows/pages we need to scan, but not for
the number of tuples we need to return as a result. The formula here
considers the number of rows the parallel scan will return: the more
rows each parallel worker needs to pass via shared memory to the Gather
node, the more costly it will be.

We do consider the total number of pages we need to scan in
compute_parallel_worker(), where we use a logarithmic formula to
determine the number of workers.
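
To make the two relationships concrete, here is a rough standalone sketch, not the planner code itself. gather_overhead() and workers_for_heap_pages() are hypothetical names used only for illustration. The constants assume the stock values of parallel_setup_cost (1000) and parallel_tuple_cost (0.1), and the worker estimate is a simplified, heap-only approximation of the loop in compute_parallel_worker() that ignores index pages and per-relation settings.

```c
#include <limits.h>

/* Assumed defaults, mirroring the stock GUC values. */
static const double parallel_setup_cost = 1000.0;
static const double parallel_tuple_cost = 0.1;

/*
 * Extra cost a Gather node adds on top of its subpath (hypothetical
 * helper).  It grows linearly with the rows *returned* through shared
 * memory, not with the rows or pages scanned.
 */
double
gather_overhead(double rows_returned)
{
    return parallel_setup_cost + parallel_tuple_cost * rows_returned;
}

/*
 * Simplified heap-only worker estimate (hypothetical helper): below the
 * threshold no workers are used; above it, one more worker is added
 * each time the table size triples, up to max_workers.
 */
int
workers_for_heap_pages(double heap_pages, int min_scan_pages, int max_workers)
{
    int threshold = (min_scan_pages > 1) ? min_scan_pages : 1;
    int workers = 1;

    if (heap_pages < min_scan_pages)
        return 0;

    while (heap_pages >= (double) threshold * 3)
    {
        workers++;
        threshold *= 3;
        if (threshold > INT_MAX / 3)
            break;              /* avoid integer overflow */
    }

    return (workers < max_workers) ? workers : max_workers;
}
```

Under these assumptions, a scan that returns a million rows pays roughly 101000 in Gather overhead regardless of how many pages were scanned to find them, while a 9216-page table with a 1024-page threshold gets three workers before the cap is applied.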
--
With Regards,
Amit Kapila.