Re: Assert failure on running a completed portal again

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Assert failure on running a completed portal again
Date: 2024-12-10 17:14:50
Message-ID: 77955.1733850890@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Dec 5, 2024 at 8:09 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> After looking at this further, I think this whole "run_once"
>> business is badly designed, worse documented, and redundant
>> (as well as buggy). It can be replaced with three self-contained
>> lines in ExecutePlan, as per the attached.

> Did you look at the commit message for
> 691b8d59281b5177f16fe80858df921f77a8e955? IMHO, that's pretty clear
> about what the goals of this were.

Well, it's not very clear about what implementation limitations we
are trying to protect.

> I haven't tested, but I'm guessing that what happens in Nagata-san's
> case with the committed fix is that we execute the plan once to
> completion in parallel mode; and then the second execute message
> executes the same plan tree again in non-parallel mode but this time
> it ends up returning zero tuples because it's already been executed to
> completion. If that is correct, then I think this is subtly changing
> the rules for parallel Plan trees: before, the rule as I understood
> it, was "only complete execution must supported" but now it's
> "complete executed must be supported and, in addition, a no-op
> re-execution after complete execution must be supported". Off the top
> of my head that seems like it should be fine.

Hmm. As committed, the re-execution will happen with
use_parallel_mode disabled, which might make that safer, or maybe it's
less safe? Do you think we need something closer to the upthread
proposal to not run the executor at all during the "re-execution"?
(I'd still be inclined to put that short-circuit into ExecutePlan
not anything higher-level.)

My recollection is that even before parallelism we had some plan node
types that didn't cope well with being called again after they'd once
returned EOF (ie a null tuple). So maybe a defense against trying to
do that would be wise. It could probably be as simple as checking
estate->es_finished.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-12-10 17:36:40 Re: FileFallocate misbehaving on XFS
Previous Message Robert Haas 2024-12-10 16:52:34 Re: Assert failure on running a completed portal again