Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser'

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser'
Date: 2025-02-19 10:00:51
Message-ID: CAPpHfduirKBHvBuy-ZAht5fxv++jCUeToHnRWxuGHmzAmcb54A@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 18, 2025 at 2:52 PM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> On 17/2/2025 02:06, Alexander Korotkov wrote:
> > On Thu, Nov 28, 2024 at 4:39 AM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> >> Here we could also count the number of scanned NULLs separately in
> >> vardata_extra and use it in the upper GROUP-BY estimation.
> >
> > What could be the type of vardata_extra? And what information could
> > it store? It still seems too sketchy for me to understand.
> It is indeed sketchy. Our estimation routines have no information
> about intermediate modifications of the data. NULLs generated by a
> left join are a good example here. So, my vague idea is to maintain
> that info and adjust the statistical estimations somehow.
> Of course, it is out of scope here.
> >
> > But I think for now we should go with the original patch. It seems
> > to be a quite straightforward extension to what 4767bc8ff2 does. I've
> > revised the commit message and applied pgindent to the sources. I'm
> > going to push this if there are no objections.
> Ok, I added one regression test to check that the feature works properly.

Andrei, thank you. I've pushed the patch, with some simplification
of the regression test.

------
Regards,
Alexander Korotkov
Supabase
