From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> |
Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Fix incorrect start up costs for WindowAgg paths (bug #17862) |
Date: | 2023-04-13 11:51:12 |
Message-ID: | CAApHDvrdw8fc8GA_RD5X8u6beSe_7jfMPZ4TOf6FLRQymk3HXQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, 13 Apr 2023 at 10:09, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> I also see I might need to do a bit more work on this as the following
> is not handled correctly:
>
> select count(*) over(rows between unbounded preceding and 10
> following) from tenk1;
>
> it's assuming all rows due to lack of ORDER BY, but it seems like it
> should be 10 rows due to the 10 FOLLOWING end bound.
Well, as it turned out, it was quite a bit more work. The frame
options have had quite a few additions since I last looked in detail.
I've attached v2 of the patch. I've included a DEBUG1 message which
is useful to check what the estimate comes out as without having to
have a debugger attached all the time.
Here are a few samples of the estimator getting things right:
# select count(*) over (order by four range between unbounded
preceding and 2 following exclude current row) from tenk1 limit 1;
DEBUG: startup_tuples = 7499
count
-------
7499
# select count(*) over (order by four rows between unbounded preceding
and 4000 following) from tenk1 limit 1;
DEBUG: startup_tuples = 4001
count
-------
4001
# select count(*) over (order by four rows between unbounded preceding
and 4000 following exclude group) from tenk1 limit 1;
DEBUG: startup_tuples = 1501
count
-------
1501
You can see in each case, startup_tuples was estimated correctly as
confirmed by count(*) during execution.
I've attached some more of these in sample_tests.txt, which all are
correct with the caveat of get_windowclause_startup_tuples() never
returning 0 due to it using clamp_row_est(). In practice, that's a
non-issue due to the way the startup_tuples value is used to calculate
the startup costs.
David
Attachment | Content-Type | Size |
---|---|---|
sample_tests.txt | text/plain | 2.6 KB |
fix_bug_17862_v2.patch | application/octet-stream | 13.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2023-04-13 12:24:20 | Re: Protecting allocator headers with Valgrind |
Previous Message | Ranier Vilela | 2023-04-13 11:42:46 | Re: Bufmgr possible overflow |