From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15383: Join Filter cost estimation problem in 10.5
Date: 2020-04-24 06:26:49
Message-ID: CAApHDvq-e90W0JD1q9U9RPquXq0AZ5pUz2BN8Zxch-QBhJ5ZFQ@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers
On Sun, 1 Dec 2019 at 06:32, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Michael Paquier <michael(at)paquier(dot)xyz> writes:
> >>> One thing we could look at would be to charge an additional
> >>> cpu_tuple_cost per outer row for all joins except semi, anti and
> >>> unique joins. This would account for the additional lookup for
> >>> another matching row which won't be found and cause the planner to
> >>> slightly favour keeping the unique rel on the inner side of the join,
> >>> when everything else is equal.
>
> which'd help break ties in the right direction. It's a bit scary to
> be fixing this issue by changing the cost estimates for non-unique
> joins --- that could have side-effects we don't want. But arguably,
> the above is a correct refinement to the cost model, so maybe it's
> okay.
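(For anyone skimming the thread, here's a rough sketch of the join classes the quoted proposal distinguishes. The tables and queries below are made up purely for illustration; they're only meant to show which join shapes would and wouldn't attract the extra cpu_tuple_cost.)

-- Hypothetical tables, for illustration only.
CREATE TABLE orders    (customer_id int);
CREATE TABLE customers (customer_id int PRIMARY KEY);

-- Semi join (EXISTS): the executor can skip to the next outer row as soon
-- as one match is found, so no surcharge under the proposal.
SELECT * FROM orders o
WHERE EXISTS (SELECT 1 FROM customers c WHERE c.customer_id = o.customer_id);

-- Anti join (NOT EXISTS): likewise gives up on an outer row at the first match.
SELECT * FROM orders o
WHERE NOT EXISTS (SELECT 1 FROM customers c WHERE c.customer_id = o.customer_id);

-- Unique join: if customers ends up on the inner side, its primary key proves
-- there can be at most one match per outer row, so again no surcharge.
SELECT * FROM orders o JOIN customers c USING (customer_id);

-- Non-unique join: if orders ends up on the inner side, there may be further
-- matches after the first, so the executor must keep scanning; this is the
-- case the extra cpu_tuple_cost per outer row is meant to account for.
SELECT * FROM customers c JOIN orders o USING (customer_id);
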
I wanted to see just how much fallout there'd be in the regression
tests if we did add the small non-unique join surcharge to all joins
which are unable to skip to the next outer tuple after matching the
first inner one. I went and added a cpu_tuple_cost per row to try to
account for the additional cost during execution.

In the simple test that I showed on this thread in April last year, the
planner now does always prefer to keep the unique rel on the inner side
of the join with all 3 join types.

However, there's quite a bit of fallout in the regression tests. Most of
it is changes in join order, but I see there's also a weird failure in
join.out, on a test that claims it shouldn't allow a unique join: its
output has changed to actually performing a unique join. The test looks
to be broken, since as far as I can see the qual it uses should allow
the unique join to be detected.
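(The April 2019 test isn't reproduced here, but for anyone wanting to poke at this, a hypothetical setup of the same flavour makes it easy to see which relation the planner keeps on the inner side:)

-- Two equally sized tables; only t_unique carries a uniqueness guarantee.
-- (t_other's values happen to be distinct too, but with no constraint the
-- planner cannot prove that.)
CREATE TABLE t_unique (id int PRIMARY KEY);
CREATE TABLE t_other  (id int);
INSERT INTO t_unique SELECT generate_series(1, 10000);
INSERT INTO t_other  SELECT generate_series(1, 10000);
ANALYZE t_unique;
ANALYZE t_other;

-- With everything else equal, the surcharge should tip the planner towards
-- keeping t_unique on the inner side; the chosen plan then shows
-- "Inner Unique: true" on the join node.
EXPLAIN SELECT * FROM t_other JOIN t_unique USING (id);
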
Perhaps 1 cpu_tuple_cost per output tuple is far too big a surcharge.
For hash join with the example case from April 2019, the surcharge
adds about 18% to the overall cost of the join (552.50 up to 652.50).
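(Back-of-envelope on those numbers, assuming the default cpu_tuple_cost of 0.01 and one surcharge per output tuple:)

SELECT 652.50 - 552.50            AS added_cost,         -- 100.00
       (652.50 - 552.50) / 552.50 AS relative_increase,  -- ~0.181, i.e. ~18%
       (652.50 - 552.50) / 0.01   AS implied_tuples;     -- ~10,000 tuples surcharged
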
The attached patch is Tom's patch from March last year on this thread,
minus his regression test changes, plus my join surcharge stuff. It's
intended only as a topic of conversation at the moment and does not make
any adjustments to the expected test outputs.
David
Attachment: add_non-unique_join_surcharge_experiment.patch (application/octet-stream, 4.6 KB)