From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Cc: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Subject: | Re: Trouble with hashagg spill I/O pattern and costing |
Date: | 2020-05-21 14:30:40 |
Message-ID: | 20200521143040.zgn7ealdasfkmfcb@development |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, May 21, 2020 at 03:41:22PM +0200, Tomas Vondra wrote:
>On Tue, May 19, 2020 at 05:12:02PM +0200, Tomas Vondra wrote:
>>
>>...
>>
>>The problem is that the hashagg plan runs in ~1400 seconds, while the
>>groupagg only takes ~360. And per explain analyze, the difference really
>>is in the aggregation - if we subtract the seqscan, the sort+groupagg
>>takes about 310s:
>>
>> -> GroupAggregate (cost=41772791.17..43305665.51 rows=6206695 width=36) (actual time=283378.004..335611.192 rows=6398981 loops=1)
>> Group Key: lineitem_1.l_partkey
>> -> Sort (cost=41772791.17..42252715.81 rows=191969856 width=9) (actual time=283377.977..306182.393 rows=191969841 loops=1)
>> Sort Key: lineitem_1.l_partkey
>> Sort Method: external merge Disk: 3569544kB
>> -> Seq Scan on lineitem lineitem_1 (cost=0.00..5519079.56 rows=191969856 width=9) (actual time=0.019..28253.076 rows=192000551 loops=1)
>>
>>while the hashagg takes ~1330s:
>>
>> -> HashAggregate (cost=13977751.34..15945557.39 rows=6206695 width=36) (actual time=202952.170..1354546.897 rows=6400000 loops=1)
>> Group Key: lineitem_1.l_partkey
>> Planned Partitions: 128
>> Peak Memory Usage: 4249 kB
>> Disk Usage: 26321840 kB
>> HashAgg Batches: 16512
>> -> Seq Scan on lineitem lineitem_1 (cost=0.00..5519079.56 rows=191969856 width=9) (actual time=0.007..22205.617 rows=192000551 loops=1)
>>
>>And that's while only writing 26GB, compared to 35GB in the sorted plan,
>>and with cost being ~16M vs. ~43M (so roughly inverse).
>>
>
>I've noticed I've actually made a mistake here - it's not 26GB vs. 35GB
>in hash vs. sort, it's 26GB vs. 3.5GB. That is, the sort-based plan
>writes out *way less* data to the temp file.
>
>The reason is revealed by explain verbose:
>
> -> GroupAggregate
> Output: lineitem_1.l_partkey, (0.2 * avg(lineitem_1.l_quantity))
> Group Key: lineitem_1.l_partkey
> -> Sort
> Output: lineitem_1.l_partkey, lineitem_1.l_quantity
> Sort Key: lineitem_1.l_partkey
> -> Seq Scan on public.lineitem lineitem_1
> Output: lineitem_1.l_partkey, lineitem_1.l_quantity
>
> -> HashAggregate
> Output: lineitem_1.l_partkey, (0.2 * avg(lineitem_1.l_quantity))
> Group Key: lineitem_1.l_partkey
> -> Seq Scan on public.lineitem lineitem_1
> Output: lineitem_1.l_orderkey, lineitem_1.l_partkey,
> lineitem_1.l_suppkey, lineitem_1.l_linenumber,
> lineitem_1.l_quantity, lineitem_1.l_extendedprice,
> lineitem_1.l_discount, lineitem_1.l_tax,
> lineitem_1.l_returnflag, lineitem_1.l_linestatus,
> lineitem_1.l_shipdate, lineitem_1.l_commitdate,
> lineitem_1.l_receiptdate, lineitem_1.l_shipinstruct,
> lineitem_1.l_shipmode, lineitem_1.l_comment
>
>It seems that in the hashagg case we're not applying projection in the
>seqscan, forcing us to serialize way much data (the whole lineitem
>table, essentially).
>
>It's probably still worth tweaking the I/O pattern, I think.
>
OK, it seems the attached trivial fix (simply changing CP_LABEL_TLIST to
CP_SMALL_TLIST) addresses this for me. I've only tried it on the patched
version that pre-allocates 128 blocks, and the results seem pretty nice:
sort hash hash+tlist
------------------------------------------
4MB 331 478 188
128MB 222 434 210
which I guess is what we wanted ...
I'll give it a try on the other machine (temp on SATA), but I don't see
why would it not behave similarly nicely.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment | Content-Type | Size |
---|---|---|
agg-tlist-fix.patch | text/plain | 638 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2020-05-21 14:30:50 | Re: Behaviour of failed Primary |
Previous Message | Pavel Borisov | 2020-05-21 14:25:46 | Re: [PATCH] fix GIN index search sometimes losing results |