From: Andres Freund <andres(at)anarazel(dot)de>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Default setting for enable_hashagg_disk
Date: 2020-06-25 20:36:29
Message-ID: 20200625203629.7m6yvut7eqblgmfo@alap3.anarazel.de
Lists: pgsql-docs pgsql-hackers
Hi,
On 2020-06-25 14:25:12 -0400, Bruce Momjian wrote:
> I am still trying to get my head around why the spill is going to be so
> much more work to adjust for hash agg than for our other spillable nodes.
Aggregates are the classic way to process large amounts of data. For
larger amounts of data, sorted input (be it via an explicit sort or an
ordered index scan) isn't an attractive option; IOW hash-agg is the
common case. There are also fewer stats for even halfway-accurately
estimating the number of groups and the size of the transition state -
a sort / hash join doesn't have an equivalent to the variably-sized
transition value.
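For example (a contrived sketch): with a GROUP BY over an expression
there are no stats to use at all, and nothing estimates the per-group
array_agg state size either:

EXPLAIN (ANALYZE)
SELECT a % 100, array_agg(a)      -- no stats for the expression a % 100
FROM generate_series(1, 100000) a
GROUP BY a % 100;

The actual group count is 100, but the row estimate on the aggregate
node usually isn't anywhere close to that.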
> What are people doing for those cases already? Do we have any
> real-world queries that are a problem in PG 13 for this?
I don't know about real world, but it's pretty easy to come up with
examples.
query:

SELECT a, array_agg(b)
FROM (SELECT generate_series(1, 10000)) a(a),
     (SELECT generate_series(1, 10000)) b(b)
GROUP BY a
HAVING array_length(array_agg(b), 1) = 0;
work_mem = 4MB:

12:    18470.012 ms
HEAD:  44635.210 ms
HEAD causes ~2.8GB of file IO, 12 doesn't cause any. If you're IO
bandwidth constrained, this could be quite bad.
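For reference, the spill is visible directly in the plan on HEAD (same
query, just run under EXPLAIN):

SET work_mem = '4MB';
EXPLAIN (ANALYZE)
SELECT a, array_agg(b)
FROM (SELECT generate_series(1, 10000)) a(a),
     (SELECT generate_series(1, 10000)) b(b)
GROUP BY a
HAVING array_length(array_agg(b), 1) = 0;

The HashAggregate node should report something like "Batches: ...
Memory Usage: ...kB Disk Usage: ...kB"; on 12 the same plan simply
overruns work_mem in memory and reports no disk usage.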
Obviously this is contrived, and a pretty extreme case. But imagine
this happening on a system where disk IO isn't super fast (e.g. just
about any cloud provider) - the regression would hurt considerably more
there.
An even more extreme version of the above is this:
query:

SELECT a, array_agg(b)
FROM (SELECT generate_series(1, 50000)) a(a),
     (SELECT generate_series(1, 10000)) b(b)
GROUP BY a
HAVING array_length(array_agg(b), 1) = 0;
work_mem = 16MB:

12:    81598.965 ms
HEAD:  210772.360 ms

temporary tablespace on magnetic disk (RAID 0 of two 7.2k spinning
server disks):

12:    81136.530 ms
HEAD:  225182.560 ms
The disks are busy in some periods, but still keep up. If I however make
the transition state a bit bigger:
query:

SELECT a, array_agg(b), count(c), max(d), max(e)
FROM (SELECT generate_series(1, 10000)) a(a),
     (SELECT generate_series(1, 5000)::text,
             repeat(random()::text, 10),
             repeat(random()::text, 10),
             repeat(random()::text, 10)) b(b,c,d,e)
GROUP BY a
HAVING array_length(array_agg(b), 1) = 0;
12:               28164.865 ms
HEAD (fast SSD):  92520.680 ms
HEAD (magnetic):  183968.538 ms
(no reads - there's plenty of memory; just writes, because the age /
amount thresholds for dirty data are reached)
In the magnetic case we're IO bottlenecked nearly the whole time.
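The temp-file traffic is also visible in the cumulative stats, e.g.:

SELECT temp_files, pg_size_pretty(temp_bytes) AS temp_bytes
FROM pg_stat_database
WHERE datname = current_database();

temp_bytes should grow by roughly the spilled amount on each run.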
Just to be clear: I think this is a completely over-the-top example. But
I do think it shows the problem to some degree at least.
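FWIW, the blunt workarounds for now - neither of which fixes the
underlying estimation problem - are forcing the sort-based path or
giving the hash table enough memory (the 256MB below is arbitrary):

SET enable_hashagg = off;  -- fall back to sort + GroupAggregate
SET work_mem = '256MB';    -- or: big enough for the hash table to fit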
Greetings,
Andres Freund