From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: initial pruning in parallel append
Date: 2023-08-08 14:16:07
Message-ID: CA+Tgmob9gZZa6BVcxYAGxRjoV55+8oVLWTUbB1fE+qv48YML2A@mail.gmail.com
Lists: pgsql-hackers
On Tue, Aug 8, 2023 at 2:58 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > That doesn't seem like a big problem because there aren't many node
> > types that do pruning, right? I think we're just talking about Append
> > and MergeAppend, or something like that, right?
>
> MergeAppend can't be parallel-aware atm, so only Append.
Well, the question isn't whether it's parallel-aware, but whether
startup-time pruning happens there.
> So change the ordering of the following code in ParallelQueryMain():
Yeah, that would be a reasonable thing to do.
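For anyone following along without the code open, ParallelQueryMain()
currently does roughly this (condensed from execParallel.c, with setup
and error handling elided):

    /* condensed sketch of ParallelQueryMain() as it stands today */
    queryDesc = ExecParallelGetQueryDesc(toc, receiver, instrument_options);
    ...
    /* ExecInitNode() runs here, so initial pruning happens now */
    ExecutorStart(queryDesc, fpes->eflags);
    /* only now does the worker attach to the per-node DSM chunks */
    ExecParallelInitializeWorker(queryDesc->planstate, &pwcxt);
    ...
    ExecutorRun(queryDesc, ForwardScanDirection, ...);

The relevant point being that ExecutorStart() reaches ExecInitNode(),
and hence any initial pruning, before ExecParallelInitializeWorker()
has attached the per-node DSM state.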
> Or we could consider something like the patch I mentioned in my 1st
> email. The idea there was to pass the pruning result via a separate
> channel, not the DSM chunk linked into the PlanState tree. To wit, on
> the leader side, ExecInitParallelPlan() puts the serialized
> List-of-Bitmapset into the shm_toc with a dedicated PARALLEL_KEY,
> alongside PlannedStmt, ParamListInfo, etc. The List-of-Bitmapset is
> initialized during the leader's ExecInitNode(). On the worker side,
> ExecParallelGetQueryDesc() reads the List-of-Bitmapset string and puts
> the resulting node into the QueryDesc, which ParallelQueryMain() then
> uses to do ExecutorStart(), which copies the pointer to
> EState.es_part_prune_results. ExecInitAppend() consults
> EState.es_part_prune_results and uses the Bitmapset from there, if
> present, instead of performing initial pruning.
This also seems reasonable.
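To make sure we're picturing the same thing, here's a rough sketch of
that flow using the existing shm_toc machinery. PARALLEL_KEY_PRUNE_RESULTS
is a name I just made up, and I'm hand-waving the estimate/keys
bookkeeping on the leader side:

    /* hypothetical key, alongside PARALLEL_KEY_PLANNEDSTMT and friends */
    #define PARALLEL_KEY_PRUNE_RESULTS  UINT64CONST(0xE00000000000000C)

    /* leader, in ExecInitParallelPlan(), once ExecInitNode() has filled
     * estate->es_part_prune_results with a List of Bitmapsets */
    char   *prune_data = nodeToString(estate->es_part_prune_results);
    char   *prune_space = shm_toc_allocate(pcxt->toc, strlen(prune_data) + 1);

    memcpy(prune_space, prune_data, strlen(prune_data) + 1);
    shm_toc_insert(pcxt->toc, PARALLEL_KEY_PRUNE_RESULTS, prune_space);

    /* worker, in ExecParallelGetQueryDesc() */
    char   *prune_str = shm_toc_lookup(toc, PARALLEL_KEY_PRUNE_RESULTS, false);
    List   *prune_results = (List *) stringToNode(prune_str);
    /* ... stash prune_results in the QueryDesc so that ExecutorStart()
     * can copy the pointer into estate->es_part_prune_results ... */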
> I'm assuming it's not
> too ugly if ExecInitAppend() uses IsParallelWorker() to decide whether
> it should be writing to EState.es_part_prune_results or reading from
> it -- the former if in the leader and the latter in a worker.
I don't think that's too ugly. I mean, you have to have an if statement
someplace.
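Something like this, presumably (with part_prune_index coming from the
un-reverted patch you mention below, and the existing pruning code that
computes validsubplans elided):

    /* hypothetical sketch of the split in ExecInitAppend() */
    if (IsParallelWorker())
    {
        /* worker: reuse the leader's initial-pruning result */
        validsubplans = (Bitmapset *)
            list_nth(estate->es_part_prune_results, node->part_prune_index);
    }
    else
    {
        /* leader: perform initial pruning as now, then publish the
         * result for ExecInitParallelPlan() to serialize */
        estate->es_part_prune_results =
            lappend(estate->es_part_prune_results, validsubplans);
    }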
> If we
> are to go with this approach we will need to un-revert ec386948948c,
> which moved PartitionPruneInfo nodes out of Append/MergeAppend nodes
> to a List in PlannedStmt (copied into EState.es_part_prune_infos),
> such that es_part_prune_results mirrors es_part_prune_infos.
The comment for the revert (which was
5472743d9e8583638a897b47558066167cc14583) points to
https://www.postgresql.org/message-id/4191508.1674157166@sss.pgh.pa.us
as the reason, but it's not very clear to me why that email led to
this being reverted. In any event, I agree that if we go with your
idea to pass this via a separate PARALLEL_KEY, unreverting that patch
seems to make sense, because otherwise I think we don't have a fast
way to find the nodes that contain the state that we care about.
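To recap the shape of that commit for anyone who hasn't paged it back
in (field names as I remember them from ec386948948c;
es_part_prune_results is the proposed addition):

    /* sketch of the layout under un-reverted ec386948948c */
    typedef struct PlannedStmt
    {
        /* ... */
        List       *partPruneInfos;    /* all PartitionPruneInfos, flattened */
    } PlannedStmt;

    typedef struct Append
    {
        /* ... */
        int         part_prune_index;  /* index into the list above, replacing
                                        * the embedded PartitionPruneInfo */
    } Append;

    typedef struct EState
    {
        /* ... */
        List       *es_part_prune_infos;   /* from PlannedStmt.partPruneInfos */
        List       *es_part_prune_results; /* proposed: one Bitmapset per
                                            * entry, in the same order */
    } EState;

With an index rather than an embedded node, both lists can be consulted
directly instead of by walking the plan tree.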
--
Robert Haas
EDB: http://www.enterprisedb.com