From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Tim Palmer <tim3sp(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17862: Overall query cost ignores window function |
Date: | 2023-03-25 01:40:00 |
Message-ID: | CAApHDvrwTh9nUfZOp2WQr8VYjKZwg6eS2g2t2K_D_c6rTS+How@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Fri, 24 Mar 2023 at 03:41, Tim Palmer <tim3sp(at)gmail(dot)com> wrote:
>
> On Wed, 22 Mar 2023 at 21:03, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > It likely would be possible to adjust cost_windowagg() to figure out a
> > startup_cost for getting the first row from a WindowFunc. Doing so
> > would require looking at the frame options and trying to figure out
> > how many rows need to be looked at. If you'd written count(*) OVER
> > (rows between current row and 10 following) then we'd only need to
> > look forward 10 rows from the current row.
>
> The original query is attempting to retrieve one page of data and
> (simultaneously) the total number of rows available to be paged through,
> like the suggestion here https://stackoverflow.com/a/28888696.
> A frame_end of '10 following' wouldn't be useful for that, but there's
> probably no point in counting the exact row count beyond some limit
> so potentially it could use something like '500 following'.
I was just using the 10 following as an example of the fact that
WindowAgg does not always look at the entire partition to get the
first row.
> In the specific case of an empty OVER () clause, could the startup cost
> be set to the total cost of the child plan node?
It would likely improve your case and may be better in general, but
that would pretty much forego the planning making use of an cheap
startup plan when the WindowAgg just needed to look at a small number
of rows to fetch the first row. I highly doubt doing that would be
acceptable alone.
I think what it would take would be to look at each WindowFunc and
then estimate how many rows we need to look at for each WindowClause
and then take the Max of that number of rows and calculate the
proportion of the subpath's total cost we need to use based on the
number of estimated rows from the subpath. I think the total subpath
rows would need to be divided by the ndistinct estimate for the
PARTITION BY clause too.
David
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2023-03-25 05:32:16 | Re: BUG #17855: Uninitialised memory used when the name type value processed in binary mode of Memoize |
Previous Message | Andres Freund | 2023-03-24 21:27:17 | Re: BUG #17863: Unable to restore dump 12.12 -> 15.2 |