Re: Parallel Seq Scan

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Jeff Davis <pgsql(at)j-davis(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-09-18 10:55:12
Message-ID: CAA4eK1J5UPD=qj8vJNdCvoZ48pKWj531o_z9z2YinAN2jo1a4A@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 17, 2015 at 2:28 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, Sep 3, 2015 at 6:21 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> + /*
> + * We expect each worker to populate the BufferUsage structure
> + * allocated by master backend and then master backend will aggregate
> + * all the usage along with it's own, so account it for each worker.
> + */
>
> This also needs improvement. Especially because...
>
> + /*
> + * We expect each worker to populate the instrumentation structure
> + * allocated by master backend and then master backend will aggregate
> + * all the information, so account it for each worker.
> + */
>
> ...it's almost identical to this one.
>

I think we can combine them and have one comment.
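One possible combined wording (just a sketch, reusing the phrasing of the two existing comments):

```
/*
 * We expect each worker to populate the BufferUsage and
 * instrumentation structures allocated by the master backend;
 * the master backend will then aggregate all of the information
 * along with its own, so account for it once per worker.
 */
```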

>
> +GetParallelSupportInfo(shm_toc *toc, ParamListInfo *params,
>
> Could this be a static function? Will it really be needed outside this file?
>

It is already declared static in its prototype; I will add the static
keyword to the function definition as well.

> And is there any use case for letting some of the arguments be NULL?

In earlier versions of the patch this API was used from other places, but
there is no such use any more, so I will change it accordingly.

>
> +bool
> +ExecParallelBufferUsageAccum(Node *node)
> +{
> + if (node == NULL)
> + return false;
> +
> + switch (nodeTag(node))
> + {
> + case T_FunnelState:
> + {
> + FinishParallelSetupAndAccumStats((FunnelState*)node);
> + return true;
> + }
> + break;
> + default:
> + break;
> + }
> +
> + (void) planstate_tree_walker((Node *) ((PlanState *) node)->lefttree, NULL,
> + ExecParallelBufferUsageAccum, 0);
> + (void) planstate_tree_walker((Node *) ((PlanState *) node)->righttree, NULL,
> + ExecParallelBufferUsageAccum, 0);
> + return false;
> +}
>
> This seems wacky. I mean, isn't the point of planstate_tree_walker()
> that the callback itself doesn't have to handle recursion like this?
> And if not, then this wouldn't be adequate anyway, because some
> planstate nodes have children that are not in lefttree or righttree
> (cf. explain.c).
>

I will change this in line with the recent commit for planstate_tree_walker.

> + currentRelation = ExecOpenScanRelation(estate,
> + ((SeqScan *) node->ss.ps.plan)->scanrelid,
> + eflags);
>
> I can't see how this can possibly be remotely correct. The funnel
> node shouldn't be limited to scanning a baserel (cf. fdw_scan_tlist).
>

This is mainly used for generating the tuple descriptor, which in turn
is used to form the scan slot; the funnel node itself won't perform any
scan. However, we can eliminate the InitFunnel() function entirely and
use ExecAssignProjectionInfo() instead of ExecAssignScanProjectionInfo()
to form the projection info.

>
> + buffer_usage_worker = (BufferUsage *) (buffer_usage + (i * sizeof(BufferUsage)));
>
> Cast it to a BufferUsage * first. Then you can use &foo[i] to find
> the i'th element.
>

Do you mean to say that the code as written won't work?
The BufferUsage values for each worker are copied into the byte buffer
buffer_usage, and I believe they can be fetched in the above way.

> + /*
> + * Re-initialize the parallel context and workers to perform
> + * rescan of relation. We want to gracefully shutdown all the
> + * workers so that they should be able to propagate any error
> + * or other information to master backend before dying.
> + */
> + FinishParallelSetupAndAccumStats(node);
>
> Somehow, this makes me feel like that function is badly named.
>

I think the comment here is slightly misleading; shall we change it as
below:

Destroy the parallel context to gracefully shutdown all the
workers so that they should be able to propagate any error
or other information to master backend before dying.

> +/*
> + * _readPlanInvalItem
> + */
> +static PlanInvalItem *
> +_readPlanInvalItem(void)
> +{
> + READ_LOCALS(PlanInvalItem);
> +
> + READ_INT_FIELD(cacheId);
> + READ_UINT_FIELD(hashValue);
> +
> + READ_DONE();
> +}
>
> I don't see why we should need to be able to copy PlanInvalItems. In
> fact, it seems like a bad idea.
>

We are not copying PlanInvalItems, so I don't think this is required.
I don't exactly remember why I added it in the first place; one reason
could be that PlanInvalItems were copied in earlier versions of the
patch. Anyway, I will verify it once more and remove it if it is not
required.

> +#parallel_setup_cost = 0.0 # same scale as above
> +#define DEFAULT_PARALLEL_SETUP_COST 0.0
>
> This value is probably a bit on the low side.
>

How about keeping it as 10.0?
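If the default were raised to 10.0 as proposed, the corresponding hunks would read (a sketch):

```
#parallel_setup_cost = 10.0	# same scale as above
#define DEFAULT_PARALLEL_SETUP_COST 10.0
```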

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
