From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-03-09 01:52:32
Message-ID: CA+TgmoaJ3+KYFusQj6cqcsi_jSTt6V=3PKCit=c=t4nsZty0gw@mail.gmail.com
Lists: pgsql-hackers
On Wed, Mar 8, 2017 at 2:00 AM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> Yeah, that seems to be right in most of the cases. The only cases
> where your formula seems to give too few workers are those like
> (2, 8, 8). For such subplans, we should allocate at least 8 workers.
> It turns out that in most of the cases, my formula allocates just one
> worker more than the max per-subplan worker count. So in (2, 1, 1, 8),
> it will be a fraction more than 8. So in the patch, in addition to the
> log2() formula you proposed, I have made sure that it allocates at
> least max(per-subplan parallel_workers values).
Yeah, I agree with that.
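Spelling that out, the combined rule would be roughly the following (just
a sketch -- subplan_workers[] and nplans are stand-in names, and fls()
gives about log2(n) + 1):

	int		parallel_workers = 0;
	int		i;

	/* Start from the largest worker count any single subplan wants. */
	for (i = 0; i < nplans; i++)
		parallel_workers = Max(parallel_workers, subplan_workers[i]);

	/* Add log2-style headroom for the number of subplans, capped by the GUC. */
	parallel_workers = Max(parallel_workers, fls(nplans));
	parallel_workers = Min(parallel_workers, max_parallel_workers_per_gather);

For (2, 8, 8) that gives Max(8, fls(3)) = 8, as you say.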
Some review:
+typedef struct ParallelAppendDescData
+{
+ slock_t pa_mutex; /* mutual exclusion to choose next subplan */
+ ParallelAppendInfo pa_info[FLEXIBLE_ARRAY_MEMBER];
+} ParallelAppendDescData;
Instead of having ParallelAppendInfo, how about just int
pa_workers[FLEXIBLE_ARRAY_MEMBER]? The second structure seems like
overkill, at least for now.
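Something like this is what I have in mind (untested sketch):

typedef struct ParallelAppendDescData
{
	slock_t		pa_mutex;		/* mutual exclusion to choose next subplan */
	int			pa_workers[FLEXIBLE_ARRAY_MEMBER];	/* per-subplan worker counts */
} ParallelAppendDescData;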
+static inline void
+exec_append_scan_first(AppendState *appendstate)
+{
+ appendstate->as_whichplan = 0;
+}
I don't think this is buying you anything, and suggest backing it out.
+ /* Backward scan is not supported by parallel-aware plans */
+ Assert(!ScanDirectionIsBackward(appendstate->ps.state->es_direction));
I think you could assert ScanDirectionIsForward, couldn't you?
NoMovement, I assume, is right out.
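i.e. something like:

	/* Parallel append currently supports only forward scans. */
	Assert(ScanDirectionIsForward(appendstate->ps.state->es_direction));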
+ elog(DEBUG2, "ParallelAppend : pid %d : all plans already finished",
+ MyProcPid);
Please remove this (and all similar cases).
+ sizeof(*node->as_padesc->pa_info) * node->as_nplans);
I'd use the type name instead.
+ for (i = 0; i < node->as_nplans; i++)
+ {
+ /*
+ * Just setting all the number of workers to 0 is enough. The logic
+ * of choosing the next plan in workers will take care of everything
+ * else.
+ */
+ padesc->pa_info[i].pa_num_workers = 0;
+ }
Here I'd use memset.
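i.e. drop the loop and do (using the type name, per the comment above):

	/* All subplans start with zero workers; workers assign themselves later. */
	memset(padesc->pa_info, 0, sizeof(ParallelAppendInfo) * node->as_nplans);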
+ return (min_whichplan == PA_INVALID_PLAN ? false : true);
Maybe just return (min_whichplan != PA_INVALID_PLAN);
- childrel->cheapest_total_path);
+
childrel->cheapest_total_path);
Unnecessary.
+ {
partial_subpaths = accumulate_append_subpath(partial_subpaths,
linitial(childrel->partial_pathlist));
+ }
Don't need to add braces.
+ /*
+ * Extract the first unparameterized, parallel-safe one among the
+ * child paths.
+ */
Can we use get_cheapest_parallel_safe_total_inner for this, from
a71f10189dc10a2fe422158a2c9409e0f77c6b9e?
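That is, assuming childrel->pathlist is the list we care about here,
something along the lines of:

	Path	   *nppath;

	nppath = get_cheapest_parallel_safe_total_inner(childrel->pathlist);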
+ if (rel->partial_pathlist != NIL &&
+ (Path *) linitial(rel->partial_pathlist) == subpath)
+ partial_subplans_set = bms_add_member(partial_subplans_set, i);
This seems like a scary way to figure this out. What if we wanted to
build a parallel append subpath with some path other than the
cheapest, for some reason? I think you ought to record the decision
that set_append_rel_pathlist makes about whether to use a partial path
or a parallel-safe path, and then just copy it over here.
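For example (the names here are invented, just to illustrate the idea):
while set_append_rel_pathlist loops over the children, have it record the
choice it made and pass that along, rather than re-deriving it by pointer
comparison:

	/*
	 * In set_append_rel_pathlist's loop over child rels, childno being the
	 * child's index; partial_subplan_set would then be handed through to
	 * create_append_path instead of being reconstructed here.
	 */
	if (child_uses_partial_path)
		partial_subplan_set = bms_add_member(partial_subplan_set, childno);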
- create_append_path(grouped_rel,
- paths,
- NULL,
- 0);
+ create_append_path(grouped_rel, paths, NULL, 0);
Unnecessary.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company