From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Subject: | Re: explain HashAggregate to report bucket and memory stats |
Date: | 2020-02-16 00:02:20 |
Message-ID: | 20200216000220.GF31889@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Feb 03, 2020 at 06:53:01AM -0800, Andres Freund wrote:
> On 2020-01-03 10:19:26 -0600, Justin Pryzby wrote:
> > On Sun, Feb 17, 2019 at 11:29:56AM -0500, Jeff Janes wrote:
> > https://www.postgresql.org/message-id/CAMkU%3D1zBJNVo2DGYBgLJqpu8fyjCE_ys%2Bmsr6pOEoiwA7y5jrA%40mail.gmail.com
> > > What would I find very useful is [...] if the HashAggregate node under
> > > "explain analyze" would report memory and bucket stats; and if the Aggregate
> > > node would report...anything.
>
> Yea, that'd be amazing. It probably should be something every
> execGrouping.c using node can opt into.
Do you think it should be implemented in execGrouping/TupleHashTableData (as I
did) ? I also did an experiment moving into the higher level nodes, but I
guess that's not actually desirable. There's currently different output from
tests between the implementation using execGrouping.c and the one outside it,
so there's at least an issue with grouping sets.
> > + hashtable->hinstrument.nbuckets_original = nbuckets;
> > + hashtable->hinstrument.nbuckets = nbuckets;
> > + hashtable->hinstrument.space_peak = entrysize * hashtable->hashtab->size;
>
> That's not actually an accurate accounting of memory, because for filled
> entries a lot of memory is used to store actual tuples:
Thanks - I think I finally understood this.
I updated some existing tests to show the new output. I imagine that's a
throwaway commit, and should eventually add new tests for each of these node
types under explain analyze.
I've been testing the various nodes like:
--heapscan:
DROP TABLE t; CREATE TABLE t (i int unique) WITH(autovacuum_enabled=off); INSERT INTO t SELECT generate_series(1,99999); SET enable_seqscan=off; SET parallel_tuple_cost=0; SET parallel_setup_cost=0; SET enable_indexonlyscan=off; explain analyze verbose SELECT * FROM t WHERE i BETWEEN 999 and 99999999;
--setop:
explain( analyze,verbose) SELECT * FROM generate_series(1,999) EXCEPT (SELECT NULL UNION ALL SELECT * FROM generate_series(1,99999));
Buckets: 2048 (originally 256) Memory Usage: hashtable: 48kB, tuples: 8Kb
--recursive union:
explain analyze verbose WITH RECURSIVE t(n) AS ( SELECT 'foo' UNION SELECT n || ' bar' FROM t WHERE length(n) < 9999) SELECT n, n IS OF (text) AS is_text FROM t;
--subplan
explain analyze verbose SELECT i FROM generate_series(1,999)i WHERE (i,i) NOT IN (SELECT 1,1 UNION ALL SELECT j,j FROM generate_series(1,99999)j);
Buckets: 262144 (originally 131072) Memory Usage: hashtable: 6144kB, tuples: 782Kb
explain analyze verbose select i FROM generate_series(1,999)i WHERE(1,i) NOT in (select i,null::int from t) ;
--Agg:
explain (analyze,verbose) SELECT A,COUNT(1) FROM generate_series(1,99999)a GROUP BY 1;
Buckets: 262144 (originally 256) Memory Usage: hashtable: 6144kB, tuples: 782Kb
explain (analyze, verbose) select i FROM generate_series(1,999)i WHERE(1,1) not in (select a,null from (SELECT generate_series(1,99999) a)x) ;
explain analyze verbose select * from (SELECT a FROM generate_series(1,99)a)v left join lateral (select v.a, four, ten, count(*) from (SELECT b four, 2 ten, b FROM generate_series(1,999)b)x group by cube(four,ten)) s on true order by v.a,four,ten;
--Grouping sets:
explain analyze verbose select unique1,
count(two), count(four), count(ten),
count(hundred), count(thousand), count(twothousand),
count(*)
from tenk1 group by grouping sets (unique1,twothousand,thousand,hundred,ten,four,two);
--
Justin
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Run-some-existing-tests-with-explain-ANALYZE.patch | text/x-diff | 48.5 KB |
v2-0002-explain-to-show-tuplehash-bucket-and-memory-stats.patch | text/x-diff | 27.7 KB |
v2-0003-Gross-hack-to-put-hash-stats-of-subplans-in-the-r.patch | text/x-diff | 7.1 KB |
v2-0004-implement-hash-stats-for-bitmapHeapScan.patch | text/x-diff | 5.5 KB |
v2-0005-Refactor-for-consistency-symmetry.patch | text/x-diff | 14.7 KB |
v2-0006-TupleHashTable.entrysize-was-unused-except-for-in.patch | text/x-diff | 1.6 KB |
v2-0007-Update-comment-obsolete-since-69c3936a.patch | text/x-diff | 871 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Dave Cramer | 2020-02-16 00:50:40 | Re: New messages from Priscilla Ip |
Previous Message | Andrew Dunstan | 2020-02-15 22:40:02 | Re: Just for fun: Postgres 20? |