Re: Parallel INSERT (INTO ... SELECT ...)

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-03-03 12:20:22
Message-ID: CAJcOf-fYD9GaB_Pan7iOoLHU6z7G_e=CETG-A+oG-FutwGL_XA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 3, 2021 at 10:45 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Mon, Mar 1, 2021 at 9:08 AM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> >
> > Posting an updated set of patches that includes Amit Langote's patch
> > to the partition tracking scheme...
> > (the alternative of adding partitions to the range table needs further
> > investigation)
>
> I was reviewing your latest patch and I have a few comments.
>
> In patch 0001
> 1.
> +static bool
> +target_rel_max_parallel_hazard_recurse(Relation rel,
> + CmdType command_type,
> + max_parallel_hazard_context *context)
> +{
> + TupleDesc tupdesc;
> + int attnum;
> +
> + /* Currently only CMD_INSERT is supported */
> + Assert(command_type == CMD_INSERT);
> …….
> + /*
> + * Column default expressions and check constraints are only applicable to
> + * INSERT and UPDATE, but since only parallel INSERT is currently supported,
> + * only command_type==CMD_INSERT is checked here.
> + */
> + if (command_type == CMD_INSERT)
>
> If we have an assert at the beginning of the function, then why do we
> want to put the if check here?
>

Asserts are normally only enabled in a debug-build, so for a
release-build that Assert has no effect.
The Assert is being used as a sanity-check that the function is only
currently getting called for INSERT, because that's all it currently
supports.
Further code below specifically checks for INSERT because the block
contains code that is specific to INSERT (and actually DELETE too, but
that is not currently supported - code is commented accordingly).
In future, this function will support DELETE and UPDATE, and the
Assert serves to alert the developer ASAP that it needs updating to
support those.

> 2.
> In patch 0004, We are still charging the parallel_tuple_cost for each
> tuple, are we planning to do something about this? I mean after this
> patch tuple will not be transferred through the tuple queue, so we
> should not add that cost.
>

I believe that for Parallel INSERT, cost_modifytable() will set
path->path.rows to 0 (unless there is a RETURNING list), so, for
example, in cost_gather(), it will not add to the run_cost as
"run_cost += parallel_tuple_cost * path->path.rows;"

Regards,
Greg Nancarrow
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-03-03 12:21:10 Re: [HACKERS] logical decoding of two-phase transactions
Previous Message Hannu Krosing 2021-03-03 12:10:31 Re: We should stop telling users to "vacuum that database in single-user mode"