From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> |
Cc: | Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Hybrid Hash/Nested Loop joins and caching results from subplans |
Date: | 2020-06-02 20:54:41 |
Message-ID: | CAApHDvorfu97V1G=+WvwzdWwO7XjFERqHg55umqF6nvh2kZJQQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, 2 Jun 2020 at 21:05, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
> Today I tested the correctness & performance of this patch based on TPC-H
> workload, the environment is setup based on [1]. Correctness is tested by
> storing the result into another table when this feature is not introduced and
> then enable this feature and comparing the result with the original ones. No
> issue is found at this stage.
Thank you for testing it out.
> I also checked the performance gain for TPC-H workload, totally 4 out of the 22
> queries uses this new path, 3 of them are subplan, 1 of them is nestloop. All of
> changes gets a better result. You can check the attachments for reference.
> normal.log is the data without this feature, patched.log is the data with the
> feature. The data doesn't show the 10x performance gain, I think that's mainly
> data size related.
Thanks for running those tests. I had a quick look at the results and
I think to say that all 4 are better is not quite right. One is
actually a tiny bit slower and one is only faster due to a plan
change. Here's my full analysis.
Q2 uses a result cache for the subplan and has about a 37.5% hit ratio
which reduces the execution time of the query down to 67% of the
original.
Q17 uses a result cache for the subplan and has about a 96.5% hit
ratio which reduces the execution time of the query down to 24% of the
original time.
Q18 uses a result cache for 2 x nested loop joins and has a 0% hit
ratio. The execution time is reduced to 91% of the original time only
because the planner uses a different plan, which just happens to be
faster by chance.
Q20 uses a result cache for the subplan and has a 0% hit ratio. The
execution time is 100.27% of the original time. There are 8620 cache
misses.
All other queries use the same plan with and without the patch.
> At the code level, I mainly checked nestloop path and cost_resultcache_rescan,
> everything looks good to me. I'd like to check the other parts in the following days.
Great.
> [1] https://ankane.org/tpc-h
From | Date | Subject | |
---|---|---|---|
Next Message | Vik Fearing | 2020-06-02 21:58:51 | Re: Default gucs for EXPLAIN |
Previous Message | Bruce Momjian | 2020-06-02 20:51:16 | Re: Default gucs for EXPLAIN |