From: | vignesh C <vignesh21(at)gmail(dot)com> |
---|---|
To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Cc: | "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com> |
Subject: | Re: Parallel Inserts in CREATE TABLE AS |
Date: | 2020-12-24 04:55:06 |
Message-ID: | CALDaNm1XpwEoHS9U_zZ2GPSZ_qKZAc=VSa4VTO66J6k5Gzr=8Q@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Dec 22, 2020 at 2:16 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Tue, Dec 22, 2020 at 12:32 PM Bharath Rupireddy
> Attaching v14 patch set that has above changes. Please consider this
> for further review.
>
A few comments:
In the case below, should the Create node be above Gather?
postgres=# explain create table t7 as select * from t6;
                            QUERY PLAN
-------------------------------------------------------------------
 Gather  (cost=0.00..9.17 rows=0 width=4)
   Workers Planned: 2
 ->  Create t7
   ->  Parallel Seq Scan on t6  (cost=0.00..9.17 rows=417 width=4)
(4 rows)
Can we change it to something like:
-------------------------------------------------------------------
 Create t7
   ->  Gather  (cost=0.00..9.17 rows=0 width=4)
         Workers Planned: 2
         ->  Parallel Seq Scan on t6  (cost=0.00..9.17 rows=417 width=4)
(4 rows)
You could set intoclause_len = strlen(intoclausestr) + 1 and use
intoclause_len in the remaining places. That way the + 1 does not have
to be repeated at each use.
+ /* Estimate space for into clause for CTAS. */
+ if (IS_CTAS(intoclause) && OidIsValid(objectid))
+ {
+ intoclausestr = nodeToString(intoclause);
+ intoclause_len = strlen(intoclausestr);
+ shm_toc_estimate_chunk(&pcxt->estimator, intoclause_len + 1);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+ }
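As a rough standalone illustration of the suggestion above (not code from the patch), the length can be computed once with the terminating NUL included and then reused for both the size estimate and the copy. The helpers serialized_len() and copy_serialized() are hypothetical names invented for this sketch; they do not exist in PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): length of a serialized
 * string including its terminating NUL byte, so callers never have
 * to add 1 themselves.
 */
static size_t
serialized_len(const char *str)
{
    return strlen(str) + 1;
}

/*
 * Hypothetical helper: copy the string into a destination buffer of
 * at least 'len' bytes, where 'len' came from serialized_len().
 */
static void
copy_serialized(char *dest, const char *str, size_t len)
{
    /* The caller is expected to pass the NUL-inclusive length. */
    assert(len == strlen(str) + 1);
    memcpy(dest, str, len);     /* len already covers the NUL */
}
```

In the patch itself, the same idea would mean passing intoclause_len unchanged to shm_toc_estimate_chunk() and to whatever later allocates and copies the string.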
Can we use node->nworkers_launched == 0 in place of
node->need_to_scan_locally? That way the setting and resetting of
node->need_to_scan_locally can be removed, unless need_to_scan_locally
is needed in any of the functions that get called.
+ /* Enable leader to insert in case no parallel workers were launched. */
+ if (node->nworkers_launched == 0)
+ node->need_to_scan_locally = true;
+
+ /*
+ * By now, for parallel workers (if launched any), would have started their
+ * work i.e. insertion to target table. In case the leader is chosen to
+ * participate for parallel inserts in CTAS, then finish its share before
+ * going to wait for the parallel workers to finish.
+ */
+ if (node->need_to_scan_locally)
+ {
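To make the comparison concrete, here is a small standalone sketch of the two ways of deciding whether the leader inserts. GatherSketch is a hypothetical stand-in holding only the two fields discussed above; it is not the real GatherState, and both functions are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two relevant fields of the Gather state. */
typedef struct GatherSketch
{
    int  nworkers_launched;
    bool need_to_scan_locally;
} GatherSketch;

/* As in the posted hunk: set the flag when no workers started, then test it. */
static bool
leader_inserts_with_flag(GatherSketch *node)
{
    assert(node != NULL);
    if (node->nworkers_launched == 0)
        node->need_to_scan_locally = true;
    return node->need_to_scan_locally;
}

/* As suggested: test the worker count directly and leave the flag alone. */
static bool
leader_inserts_direct(const GatherSketch *node)
{
    assert(node != NULL);
    return node->nworkers_launched == 0;
}
```

The two versions agree only when need_to_scan_locally was false on entry. If some other code path has already set the flag while workers are running, the direct check returns false where the flag-based one returns true, which is the caveat raised above.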
Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com