Re: Parallel INSERT SELECT take 2

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel INSERT SELECT take 2
Date: 2021-04-27 03:34:45
Message-ID: CALj2ACU6hoRqK6Ehc6wTbXb8cdurs8LV2vkzXSXUdtyR=L2Tbw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 27, 2021 at 7:39 AM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > I'm thinking that when users say ALTER TABLE partitioned_table SET PARALLEL
> > TO 'safe';, we check the parallel safety of all the partitions and their associated
> > objects? Only if all are parallel safe do we set partitioned_table as parallel safe.
> > What should happen if the parallel safety of any of the associated
> > objects/partitions changes after setting the partitioned_table safety?
>
> Currently, nothing happens if any of the associated objects/partitions changes after setting the partitioned_table safety,
> because we do not have a really cheap way to catch the change. The existing relcache does not help because ALTER FUNCTION
> does not invalidate the relcache entry of the table the function belongs to. And finding the table the objects belong to
> would bring some other overhead (locking, systable scan, ...).

Makes sense. So the user is responsible for such changes. If the user
gets it wrong, the executor may catch the problem at run time; if it
does not, the user will see unintended consequences.
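
To make the staleness problem concrete, here is a toy model in Python.
It is only a sketch: Func, Table, actual_safety and is_stale are names
made up for illustration, and nothing here corresponds to PostgreSQL
or patch code. It assumes a table's real safety is the most
restrictive marking among its associated functions.

```python
from dataclasses import dataclass, field
from typing import List

# Ordered from least to most restrictive (hypothetical model, not PostgreSQL code).
LEVELS = ["safe", "restricted", "unsafe"]


@dataclass
class Func:
    name: str
    parallel: str = "safe"


@dataclass
class Table:
    name: str
    # Functions reachable from defaults, constraints, triggers, index expressions, ...
    funcs: List[Func] = field(default_factory=list)
    # What the user declared for the table; nothing keeps it in sync with funcs.
    declared: str = "unsafe"


def actual_safety(table: Table) -> str:
    """Most restrictive marking among the table's associated functions."""
    return max((f.parallel for f in table.funcs), key=LEVELS.index, default="safe")


def is_stale(table: Table) -> bool:
    """True if the table is declared safe but some associated function is not."""
    return table.declared == "safe" and actual_safety(table) != "safe"


f = Func("my_default_expr_func")
t = Table("partitioned_table", funcs=[f], declared="safe")
print(is_stale(t))       # declaration matches reality
f.parallel = "unsafe"    # models a later ALTER FUNCTION ... PARALLEL UNSAFE
print(is_stale(t))       # declaration is now out of date, and nothing updated it
```

In this model nothing updates `declared` when the function changes,
which is the situation described above: the planner would trust the
declaration, and only a run-time check could notice the mismatch.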

> > My understanding was that the command ALTER TABLE ... SET PARALLEL TO
> > 'safe' will check the parallel safety of all the objects associated with the
> > table. If the objects are all parallel safe, then the table will be set to safe. If at
> > least one object is parallel unsafe or restricted, then the command will fail.
>
> I think this idea makes sense. Some details of the design can be improved.
> I agree with you that we can try to check the parallel safety of all the partitions and their associated objects when ALTER PARALLEL is run.
> Because it's a separate new command, adding some overhead to it seems not too bad.
> If there are no other objections, I plan to add the safety check in the ALTER PARALLEL command.

Maybe we can make the parallel safety check of the associated
objects/partitions optional for CREATE/ALTER DDLs, with the default
being no checks performed. Both Greg and Amit agree that we don't have
to perform any parallel safety checks during CREATE/ALTER DDLs.
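
Roughly what I have in mind, as a toy Python model rather than actual
patch code (Rel, unsafe_objects, set_parallel and the check flag are
all hypothetical names used only for illustration):

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Rel:
    name: str
    # Parallel markings of the objects associated with this relation
    # (functions behind defaults, constraints, triggers, ...). Hypothetical model.
    object_safety: List[str] = field(default_factory=list)
    partitions: List["Rel"] = field(default_factory=list)
    declared: str = "unsafe"


def unsafe_objects(rel: Rel) -> List[Tuple[str, str]]:
    """Collect (relation name, marking) pairs that are not 'safe', recursing into partitions."""
    found = [(rel.name, s) for s in rel.object_safety if s != "safe"]
    for part in rel.partitions:
        found.extend(unsafe_objects(part))
    return found


def set_parallel(rel: Rel, level: str, check: bool = False) -> None:
    """Model of the DDL: by default just record the declaration; verify only on request."""
    if check and level == "safe":
        bad = unsafe_objects(rel)
        if bad:
            raise ValueError(f"cannot mark {rel.name} parallel safe: {bad}")
    rel.declared = level


part = Rel("part_1", object_safety=["safe", "restricted"])
parent = Rel("partitioned_table", partitions=[part])

set_parallel(parent, "safe")                  # default: no checks, always succeeds
try:
    set_parallel(parent, "safe", check=True)  # opt-in check walks all partitions
except ValueError as err:
    print(err)
```

With the default, the DDL stays cheap and the user takes
responsibility; the extra scans are paid only when the user explicitly
asks for the check.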

> > I was also wondering how the design will cope with situations such as the parallel
> > safety of any of the associated objects changing after setting the table to
> > parallel safe. The planner would have relied on the outdated parallel safety of
> > the table and chosen parallel inserts, and the executor would catch such situations.
> > Looks like my understanding was wrong.
>
> Currently, we assume the user is responsible for its correctness,
> because, from our research, when the parallel safety of some of these objects is changed,
> it's costly to reflect it on the parallel safety of the tables that depend on them.
> (We need to scan pg_depend, pg_inherits, pg_index, ... to find the target table.)

Agreed.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
