Quick Links

Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From:	David Rowley <dgrowleyml(at)gmail(dot)com>
To:	Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
Cc:	PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)
Date:	2023-04-12 22:09:39
Message-ID:	CAApHDvqFTr=vORoBFgK-=7M0SpCT2n+Y3QZJJPUB5g_SC8ijjw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

.On Thu, 13 Apr 2023 at 02:28, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
> The concept of startup_tuples for a WindowAgg looks good to me, but I
> can't follow up with the below line:
>
> + return clamp_row_est(partition_tuples * DEFAULT_INEQ_SEL);
>
> # select count(*) over() from tenk1 limit 1;
> count
> -------
> 10000 --> We need to scan all the tuples.
>
> Should we just return clamp_row_est(partition_tuples)?

For the case you've shown, it will. It's handled by this code:

if (wc->orderClause == NIL)
return clamp_row_est(partition_tuples);

It would take something like the following to hit the code you're
concerned about:

explain select count(*) over(order by unique1 rows between unbounded
preceding and 10*random() following) from tenk1;
QUERY PLAN
--------------------------------------------------------------------------------------------
WindowAgg (cost=140.23..420.29 rows=10000 width=12)
-> Index Only Scan using tenk1_unique1 on tenk1
(cost=0.29..270.29 rows=10000 width=4)
(2 rows)

You can see the startup cost is about 33% of the total cost for that,
which is from the DEFAULT_INEQ_SEL. I'm not exactly set on that
having to be DEFAULT_INEQ_SEL, but I'm not really sure what we could
put that's better. I don't really follow why assuming all rows are
required is better. That'll just mean we favour cheap startup plans
less, but there might be a case where a cheap startup plan is
favourable. I was opting for a happy medium when I thought to use
DEFAULT_INEQ_SEL.

I also see I might need to do a bit more work on this as the following
is not handled correctly:

select count(*) over(rows between unbounded preceding and 10
following) from tenk1;

it's assuming all rows due to lack of ORDER BY, but it seems like it
should be 10 rows due to the 10 FOLLOWING end bound.

David

In response to

Re: Fix incorrect start up costs for WindowAgg paths (bug #17862) at 2023-04-12 14:28:34 from Andy Fan

Responses

Re: Fix incorrect start up costs for WindowAgg paths (bug #17862) at 2023-04-13 00:16:21 from Andy Fan
Re: Fix incorrect start up costs for WindowAgg paths (bug #17862) at 2023-04-13 11:51:12 from David Rowley

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Michael Paquier	2023-04-12 22:23:48	Re: Add LZ4 compression in pg_dump
Previous Message	Michael Paquier	2023-04-12 21:49:16	Re: [PATCH] Allow Postgres to pick an unused port to listen