Re: Extremely slow HashAggregate in simple UNION query

From: Felix Geisendörfer <felix(at)felixge(dot)de>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Extremely slow HashAggregate in simple UNION query
Date: 2019-08-20 17:55:56
Message-ID: 87FEADEA-C74D-470B-9EC8-4C1CED040C25@felixge.de
Lists: pgsql-performance

Hi,

> On 20. Aug 2019, at 19:32, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2019-08-20 17:11:58 +0200, Felix Geisendörfer wrote:
>>
>> HashAggregate (cost=80020.01..100020.01 rows=2000000 width=8) (actual time=19.349..23.123 rows=1 loops=1)
>
> FWIW, that's not a mis-estimate I'm getting on master ;). Obviously
> that doesn't actually address your concern...

I suppose this is thanks to the new optimizer support functions
mentioned by Michael and Pavel?

Of course I'm very excited about those improvements, but yeah, my
real query is mis-estimating things for totally different reasons not
involving any SRFs.

>> I'm certainly a novice when it comes to PostgreSQL internals, but I'm
>> wondering if this could be fixed by taking a more dynamic approach for
>> allocating HashAggregate hash tables?
>
> Under-sizing the hashtable just out of caution will add overhead to
> a lot more common cases. That requires copying data around during
> growth, which is far far from free. Or you can use hashtables that don't
> need to copy, but they're also considerably slower in the more common
> cases.

How does PostgreSQL currently handle the case where the initial hash
table is under-sized due to the planner having underestimated things?
Are the growth costs getting amortized by using an exponential growth
function?
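
To make the amortization question concrete, here is a minimal sketch in Python. It is a toy model of a growable table, not PostgreSQL's actual hash table code, and the function names and parameters are made up for illustration. It only counts how many existing entries would have to be copied when the table grows by doubling versus growing by a fixed step.

```python
def copies_with_doubling(n_inserts, initial_capacity=1):
    """Count entries copied by resizes when capacity doubles on overflow.

    Toy model only: every resize is assumed to move all existing entries.
    """
    capacity = initial_capacity
    size = 0
    copied = 0
    for _ in range(n_inserts):
        if size == capacity:
            copied += size      # move every existing entry to the new table
            capacity *= 2       # exponential growth
        size += 1
    return copied


def copies_with_fixed_step(n_inserts, initial_capacity=1024, step=1024):
    """Same toy model, but capacity grows by a constant number of slots."""
    capacity = initial_capacity
    size = 0
    copied = 0
    for _ in range(n_inserts):
        if size == capacity:
            copied += size      # move every existing entry to the new table
            capacity += step    # linear growth
        size += 1
    return copied


if __name__ == "__main__":
    n = 10240
    print("doubling:  ", copies_with_doubling(n, initial_capacity=1024))
    print("fixed step:", copies_with_fixed_step(n, initial_capacity=1024, step=1024))
```

In this model, doubling keeps the total number of copied entries below twice the number of inserts, so the growth cost per insert is amortized to a constant, while a fixed step makes the total grow quadratically. Even with doubling, the copying is still extra work compared to a correctly sized table, which I take to be Andres' point.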

Anyway, I can accept that my situation is an edge case that doesn't justify
making things more complicated.

>> 3. Somehow EXPLAIN gets confused by this and only ends up tracking 23ms of the query execution instead of 45ms [5].
>
> Well, there's plenty of work that's not attributed to nodes. IIRC we don't
> track executor startup/shutdown overhead on a per-node basis.

I didn't know that, thanks for clarifying :).
