From: | Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Fix BUG #17335: Duplicate result rows in Gather node |
Date: | 2022-01-23 11:56:46 |
Message-ID: | 382c3b52071a087fd4357e22959b3ddb05e524b7.camel@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
В Чт, 20/01/2022 в 09:32 +1300, David Rowley пишет:
> On Fri, 31 Dec 2021 at 00:14, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote:
> > Suggested quick (and valid) fix in the patch attached:
> > - If Append has single child, then copy its parallel awareness.
>
> I've been looking at this and I've gone through changing my mind about
> what's the right fix quite a number of times.
>
> My current thoughts are that I don't really like the fact that we can
> have plans in the following shape:
>
> Finalize Aggregate
> -> Gather
> Workers Planned: 1
> -> Partial Aggregate
> -> Parallel Hash Left Join
> Hash Cond: (gather_append_1.fk = gather_append_2.fk)
> -> Index Scan using gather_append_1_ix on gather_append_1
> Index Cond: (f = true)
> -> Parallel Hash
> -> Parallel Seq Scan on gather_append_2
>
> It's only made safe by the fact that Gather will only use 1 worker.
> To me, it just seems too fragile to assume that's always going to be
> the case. I feel like this fix just relies on the fact that
> create_gather_path() and create_gather_merge_path() do
> "pathnode->num_workers = subpath->parallel_workers;". If someone
> decided that was to work a different way, then we risk this breaking
> again. Additionally, today we have Gather and GatherMerge, but we may
> one day end up with more node types that gather results from parallel
> workers, or even a completely different way of executing plans.
It seems strange parallel_aware and parallel_safe flags neither affect
execution nor are properly checked.
Except parallel_safe is checked in ExecSerializePlan which is called from
ExecInitParallelPlan, which is called from ExecGather and ExecGatherMerge.
But looks like this check doesn't affect execution as well.
>
> I think a safer way to fix this is to just not remove the
> Append/MergeAppend node if the parallel_aware flag of the only-child
> and the Append/MergeAppend don't match. I've done that in the
> attached.
>
> I believe the code at the end of add_paths_to_append_rel() can remain as is.
I found clean_up_removed_plan_level also called from set_subqueryscan_references.
Is there a need to patch there as well?
And there is strange state:
- in the loop by subpaths, pathnode->node.parallel_safe is set to AND of
all its subpath's parallel_safe
(therefore there were need to copy it in my patch version),
- that means, our AppendPath is parallel_aware but not parallel_safe.
It is ridiculous a bit.
And it is strange AppendPath could have more parallel_workers than sum of
its children parallel_workers.
So it looks like whole machinery around parallel_aware/parallel_safe has
no enough consistency.
Either way, I attach you version of fix with my tests as new patch version.
regards,
Yura Sokolov
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Fix-duplicate-result-rows-after-Append-path-remov.patch | text/x-patch | 11.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Marcos Pegoraro | 2022-01-23 14:00:12 | current_schema will not use an text index ? |
Previous Message | Trevor Gross | 2022-01-23 10:56:11 | Re: WIP: System Versioned Temporal Table |