From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Richard Guo <guofenglinux(at)gmail(dot)com>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Emre Hasegeli <emre(at)hasegeli(dot)com> |
Subject: | Re: Using each rel as both outer and inner for JOIN_ANTI |
Date: | 2023-04-06 00:17:37 |
Message-ID: | CA+hUKGJPJT-bpcJjVEmmo1pSDjW_jUR9OWsnKZW1xAY6JbTn_A@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Apr 6, 2023 at 9:11 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Richard Guo <guofenglinux(at)gmail(dot)com> writes:
> > Thanks for reminding. Attached is the rebased patch, with no other
> > changes. I think the patch is ready for commit.
>
> Pushed after a little further fooling with the comments. I also had
> to rebase it over 11c2d6fdf (Parallel Hash Full Join). I think I did
> that correctly, but it's not clear to me whether any of the existing
> test cases are now doing parallelized hashed right antijoins. Might
> be worth a little more testing.

I don't see any (at least none that are EXPLAINed). I'm wondering whether
we should add some to join_hash.sql, but probably not before I figure
out how to make that whole file run faster...

> I think that Alvaro's concern about incorrect cost estimates may be
> misplaced. I couldn't find any obvious errors in the costing logic for
> this, given that we concluded that the early-exit runtime logic cannot
> apply. Also, when I try simply executing Richard's original test query
> (in a non-JIT build), the runtimes I get line up quite well ... maybe
> too well? ... with the cost estimates:
>
>                       v15           HEAD w/patch   Ratio
>
> Cost estimate         173691.19     90875.33       0.52
> Actual (best of 3)    514.200 ms    268.978 ms     0.52
>
> I think the smaller differentials you guys were seeing were all about
> EXPLAIN ANALYZE overhead.

I tried the original example from the top of this thread and saw a
decent speedup from parallelism, but only if I set
min_parallel_table_scan_size=0; otherwise the planner doesn't choose
Parallel Hash Right Anti Join. The same happens if I embiggen bar
significantly. I haven't looked yet, but I wonder if there is some
left/right confusion in the parallel degree computation or something
like that...
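
For anyone wanting to repeat the comparison, something along these lines should do it. This is only a sketch: the thread's original test query is not reproduced in this message, so the table foo, the column names, and the row counts below are assumptions made for illustration (only the name bar comes from the text above). It shows the plan with default settings and then again with the scan-size threshold lowered for the session; which plan shape you get will depend on your data and settings.

```sql
-- Hypothetical schema: columns, row counts, and the table "foo" are
-- assumptions for illustration, not the thread's actual test case.
CREATE TABLE foo (a int);
CREATE TABLE bar (b int);
INSERT INTO foo SELECT g FROM generate_series(1, 1000) AS g;
INSERT INTO bar SELECT g FROM generate_series(1, 1000000) AS g;
ANALYZE foo;
ANALYZE bar;

-- Plan chosen with default settings.
EXPLAIN (COSTS OFF)
SELECT * FROM foo
WHERE NOT EXISTS (SELECT 1 FROM bar WHERE bar.b = foo.a);

-- Lower the parallel scan threshold for this session only, then compare.
SET min_parallel_table_scan_size = 0;

EXPLAIN (COSTS OFF)
SELECT * FROM foo
WHERE NOT EXISTS (SELECT 1 FROM bar WHERE bar.b = foo.a);

-- Restore the default so later tests are unaffected.
RESET min_parallel_table_scan_size;
```

Swapping EXPLAIN (COSTS OFF) for EXPLAIN ANALYZE would give runtimes as well, with the caveat raised in the quoted text that EXPLAIN ANALYZE overhead can shrink the apparent difference.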