Quick Links

Observations in Parallel Append

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Observations in Parallel Append
Date:	2017-12-22 14:18:08
Message-ID:	CAA4eK1+qcbeai3coPpRW=GFCzFeLUsuY4T-AKHqMjxpEGZBPQg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Few observations in Parallel Append commit (ab727167)

1.
+++ b/src/include/nodes/execnodes.h
@@ -21,6 +21,7 @@
#include "lib/pairingheap.h"
#include "nodes/params.h"
#include "nodes/plannodes.h"
+#include "storage/spin.h"
..

There doesn't seem to be any need for including spin.h. I think some
prior version of the patch might need it. Patch attached to remove
it.

2.
+ * choose_next_subplan_for_worker
..
+ * We start from the first plan and advance through the list;
+ * when we get back to the end, we loop back to the first
+ * nonpartial plan.
..
+choose_next_subplan_for_worker(AppendState *node)
{
..
+ if (pstate->pa_next_plan < node->as_nplans - 1)
+ {
+ /* Advance to next plan. */
+ pstate->pa_next_plan++;
+ }
+ else if (append->first_partial_plan < node->as_nplans)
+ {
+ /* Loop back to first partial plan. */
+ pstate->pa_next_plan = append->first_partial_plan;
+ }
..

The code and comment don't seem to match. The comments indicate that
after reaching the end of the list, we loop back to first nonpartial
plan whereas code indicates that we loop back to first partial plan.
I think one of those needs to be changed unless I am missing something
obvious.

3.
+cost_append(AppendPath *apath)
{
..
+ /*
+ * Apply parallel divisor to non-partial subpaths. Also add the
+ * cost of partial paths to the total cost, but ignore non-partial
+ * paths for now.
+ */
+ if (i < apath->first_partial_path)
+ apath->path.rows += subpath->rows / parallel_divisor;
+ else
+ {
+ apath->path.rows += subpath->rows;
+ apath->path.total_cost += subpath->total_cost;
+ }
..
}

I think it is better to use clamp_row_est for rows for the case where
we use parallel_divisor so that the value of rows is always sane.
Also, don't we need to use parallel_divisor for partial paths instead
of non-partial paths as those will be actually distributed among
workers?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment	Content-Type	Size
spurious_inclusion_v1.patch	application/octet-stream	405 bytes

Responses

Re: Observations in Parallel Append at 2017-12-24 06:36:08 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Oliver Ford	2017-12-22 14:28:52	Re: Add RANGE with values and exclusions clauses to the Window Functions
Previous Message	Pavel Stehule	2017-12-22 14:02:23	Re: Using ProcSignal to get memory context stats from a running backend