Re: Potential Issue with Redundant Restriction Clauses in get_parameterized_baserel_size for PARTITIONED_REL

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: huyajun <hu_yajun(at)qq(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, feichanghong(at)qq(dot)com
Subject: Re: Potential Issue with Redundant Restriction Clauses in get_parameterized_baserel_size for PARTITIONED_REL
Date: 2024-11-27 02:10:07
Message-ID: CAApHDvrSjLUB3mKZyxvgo648sBP=9N8HQKg_sOw3cq0QKAW2oA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 27 Nov 2024 at 02:20, Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> In this regard, it seems that get_parameterized_baserel_size() does
> not do anything wrong.
>
> I think it might be better to modify set_append_rel_size() to set an
> appendrel's tuples to the sum of the tuples from each live child,
> rather than to its rows. This would also help us adjust the estimate
> for the number of distinct values in estimate_num_groups() for
> appendrels using the new formula introduced in 84f9a35e3. There were
> discussions as well as a patch for this about one year ago. Please
> see [1].

It would be good to understand why get_parameterized_baserel_size()
bothers accounting for the baserestrictinfo quals and does not just do
clauselist_selectivity() on param_clauses alone and multiply by
rel->rows (which should already account for the baserestrictinfo). The
varRelId being 0 shouldn't matter as the baserestrictinfo obviously
won't contain join quals. The "set_baserel_size_estimates must have
been applied already." comment does not seem to be true, so it kinda
does look like something is wrong here. If it was possible to do it
this way, it would be a bit more efficient too as it saves making
another List and saves rechecking the selectivity of the
baserestictinfo quals.

I tried the attached quick hack to see if anything in the regression
tests came up with a different answer when the function was coded as
described above. I didn't spend the time to see which answer is
correct, but it does both tests that raise a notice are UNION ALL
subqueries. e.g:

explain (costs off)
select * from
(select 0 as z) as t1
left join
(select true as a) as t2
on true,
lateral (select true as b
union all
select a as b) as t3
where b;

which gives:

NOTICE: nrows = 1, nrows2 = 2

I didn't develop an opinion on what set_append_rel_size() sets rel->tuples to.

David

Attachment Content-Type Size
get_parameterized_baserel_size.patch application/octet-stream 1.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-11-27 02:26:15 Re: Potential Issue with Redundant Restriction Clauses in get_parameterized_baserel_size for PARTITIONED_REL
Previous Message Masahiko Sawada 2024-11-27 01:38:20 Re: Count and log pages set all-frozen by vacuum