Re: Postgres 16.1 - Bug: cache entry already complete

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amadeo Gallardo <amadeo(at)ruddr(dot)io>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Postgres 16.1 - Bug: cache entry already complete
Date: 2024-01-11 03:26:43
Message-ID: CAMbWs4-522XgUyK3BR3S6U-yiY36wLv7acP9Bf6mJS0zkXJDNw@mail.gmail.com
Lists: pgsql-bugs

On Thu, Jan 4, 2024 at 8:04 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Thu, 4 Jan 2024 at 00:01, Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> > if (extra->inner_unique &&
> >     (inner_path->param_info == NULL ||
> >      list_length(inner_path->param_info->ppi_clauses) <
> >      list_length(extra->restrictlist)))
> >     return NULL;
>
> I see. So it worked fine until Tom's "Make Vars be outer-join-aware."
> stuff.
>
> 9e215378d fixed this already and looks like I came to the same
> conclusion about the cache completion issue back then too.

Exactly.

> > But why does the code above fail to capture the problematic query? It
> > turns out that inner_path->param_info->ppi_clauses contains cloned
> > clauses, i.e. multiple versions of the same clause in order to cope with
> > outer join identity 3. And this causes the above code to give incorrect
> > conclusions about the length of 'ppi_clauses'. And this also explains
> > why this issue only occurs in v16.
> >
> > As a clause's serial number is unique within the current PlannerInfo
> > context, and multiple clones of the same clause have the same serial
> > number, it seems to me that it's more correct to calculate the length of
> > ppi_clauses by:
> >
> > bms_num_members(inner_path->param_info->ppi_serials)
> >
> > So I think maybe we can fix this issue with the attached patch.
>
> hmm, I'd need to study what Tom's been changing around this. I don't
> have a good handle on which qual lists can have duplicated quals.
>
> In [1], you can see I took the inspiration for that fix from
> generate_mergejoin_paths(). I don't yet know if it's just ppi_clauses
> that can have duplicates or if there's a similar hazard in the
> list_length() checks in generate_mergejoin_paths().

AFAIU we may have clone clauses in RelOptInfo.joininfo to cope with
commuted-left-join cases per outer join identity 3. When we build a
ParamPathInfo for a parameterized path, we may move join clauses into
ppi_clauses when possible, so ParamPathInfo.ppi_clauses may also contain
clone clauses.
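
To make that concrete: under the proposal above, the eligibility check
quoted earlier would count distinct clause serials rather than list
entries. A minimal sketch (illustrative only, not the exact patch):

if (extra->inner_unique &&
    (inner_path->param_info == NULL ||
     bms_num_members(inner_path->param_info->ppi_serials) <
     list_length(extra->restrictlist)))
    return NULL;

Since clones of the same clause share one serial number, ppi_serials
cannot over-count them the way list_length(ppi_clauses) can.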

When we construct the restrict list for a join relation from the joininfo
lists of the relations it joins, only one of the clone clauses is chosen
for the restrict list (if they are restriction clauses), thanks to the
checks against required_relids and incompatible_relids:

if (rinfo->has_clone || rinfo->is_clone)
{
    Assert(!RINFO_IS_PUSHED_DOWN(rinfo, joinrel->relids));
    if (!bms_is_subset(rinfo->required_relids, both_input_relids))
        continue;
    if (bms_overlap(rinfo->incompatible_relids, both_input_relids))
        continue;
}

The mergeclause_list examined in generate_mergejoin_paths() is extracted
from the joinrel's restrictlist, so it will not contain multiple versions
of the same clause. Therefore, I think we're fine with the list_length()
checks in generate_mergejoin_paths().
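
For reference, if one ever wanted to double-check a list_length()
comparison elsewhere against clone clauses, a clone-aware count could be
sketched like this (hypothetical helper, not in the tree):

/*
 * Hypothetical helper, for illustration only: count distinct clauses in
 * a list of RestrictInfos by serial number, so that clone clauses (which
 * share a serial) are not double-counted the way list_length() would.
 */
static int
count_distinct_clause_serials(List *clauses)
{
    Bitmapset  *serials = NULL;
    ListCell   *lc;

    foreach(lc, clauses)
    {
        RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc);

        serials = bms_add_member(serials, rinfo->rinfo_serial);
    }

    return bms_num_members(serials);
}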

> > BTW, the SingleRow in explain seems quite handy. I wonder if we can
> > keep it.
>
> If the fix is to just disable Memoize when the ppi_clauses don't cover
> the entire join condition, then the Inner Unique from the parent
> nested loop tells us this. Isn't that enough? The reason I added
> SingleRow was because in the patch I posted, I'd made it so SingleRow
> could be false in cases where Inner Unique was true.

Ah yes, you're right.

Thanks
Richard
