From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> |
Cc: | Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Hybrid Hash/Nested Loop joins and caching results from subplans |
Date: | 2020-08-19 02:45:55 |
Message-ID: | CAApHDvoXeVizPh_J0Ystw14Z8LpyTz_7bZxisqGS6ucHtAruZg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, 25 May 2020 at 19:53, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> I didn't quite get the LATERAL support quite done in the version I
> sent. For now, I'm not considering adding a Result Cache node if there
> are lateral vars in any location other than the inner side of the
> nested loop join. I think it'll just be a few lines to make it work
> though. I wanted to get some feedback before going to too much more
> trouble to make all cases work.
I've now changed the patch so that it supports adding a Result Cache
node to LATERAL joins.
e.g.
regression=# explain analyze select count(*) from tenk1 t1, lateral
(select x from generate_Series(1,t1.twenty) x) gs;
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=150777.53..150777.54 rows=1 width=8) (actual
time=22.191..22.191 rows=1 loops=1)
-> Nested Loop (cost=0.01..125777.53 rows=10000000 width=0)
(actual time=0.010..16.980 rows=95000 loops=1)
-> Seq Scan on tenk1 t1 (cost=0.00..445.00 rows=10000
width=4) (actual time=0.003..0.866 rows=10000 loops=1)
-> Result Cache (cost=0.01..10.01 rows=1000 width=0)
(actual time=0.000..0.001 rows=10 loops=10000)
Cache Key: t1.twenty
Hits: 9980 Misses: 20 Evictions: 0 Overflows: 0
-> Function Scan on generate_series x
(cost=0.00..10.00 rows=1000 width=0) (actual time=0.001..0.002 rows=10
loops=20)
Planning Time: 0.046 ms
Execution Time: 22.208 ms
(9 rows)
Time: 22.704 ms
regression=# set enable_resultcache=0;
SET
Time: 0.367 ms
regression=# explain analyze select count(*) from tenk1 t1, lateral
(select x from generate_Series(1,t1.twenty) x) gs;
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=225445.00..225445.01 rows=1 width=8) (actual
time=35.578..35.579 rows=1 loops=1)
-> Nested Loop (cost=0.00..200445.00 rows=10000000 width=0)
(actual time=0.008..30.196 rows=95000 loops=1)
-> Seq Scan on tenk1 t1 (cost=0.00..445.00 rows=10000
width=4) (actual time=0.002..0.905 rows=10000 loops=1)
-> Function Scan on generate_series x (cost=0.00..10.00
rows=1000 width=0) (actual time=0.001..0.002 rows=10 loops=10000)
Planning Time: 0.031 ms
Execution Time: 35.590 ms
(6 rows)
Time: 36.027 ms
v7 patch series attached.
I also modified the 0002 patch so instead of modifying simplehash.h's
SH_DELETE function to have it call SH_LOOKUP and the newly added
SH_DELETE_ITEM function, I've just added an entirely new
SH_DELETE_ITEM and left SH_DELETE untouched. Trying to remove the
code duplication without having a negative effect on performance was
tricky and it didn't save enough code to seem worthwhile enough.
I also did a round of polishing work, fixed a spelling mistake in a
comment and reworded a few other comments to make some meaning more
clear.
David
Attachment | Content-Type | Size |
---|---|---|
v7-0001-Allow-estimate_num_groups-to-pass-back-further-de.patch | application/octet-stream | 8.8 KB |
v7-0002-Allow-users-of-simplehash.h-to-perform-direct-del.patch | application/octet-stream | 3.5 KB |
v7-0003-Add-Result-Cache-executor-node.patch | application/octet-stream | 168.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Justin Pryzby | 2020-08-19 03:27:06 | Re: [PG13] Planning (time + buffers) data structure in explain plan (format text) |
Previous Message | Andres Freund | 2020-08-19 02:34:00 | Re: prepared transaction isolation tests |