Re: Postgres12 looking for possible HashAggregate issue workarounds?

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: João Paulo Luís <joao(dot)luis(at)pdmfc(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Postgres12 looking for possible HashAggregate issue workarounds?
Date: 2022-12-18 11:06:30
Message-ID: CAApHDvogcaqa3JU9u06SP3Eh155D0wGtzy+d-x6XfN07bJ270g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sun, 18 Dec 2022 at 23:44, João Paulo Luís <joao(dot)luis(at)pdmfc(dot)com> wrote:
> Meanwhile, as a one-time workaround I've disabled the hashagg algorithm,

The way the query planner determines if Hash Aggregate's hash table
will fit in work_mem or not is based on the n_distinct estimate of the
columns being grouped on. You may want to review what analyze set
n_distinct to on this table. That can be done by looking at:

select attname,n_distinct from pg_Stats where tablename =
'sentencesource' and attname = 'sentence';

If what that's set to does not seem realistic, then you can overwrite this with:

ALTER TABLE sentencesource ALTER COLUMN sentence SET (n_distinct = N);

Please see the paragraph in [1] about n_distinct. Using an absolute
value is likely not a great idea if the table is going to grow. You
could maybe give it a better estimate about how many times values are
repeated by setting some negative value, as described in the
documents. You'll need to analyze the table again after changing this
setting.

David

[1] https://www.postgresql.org/docs/12/sql-altertable.html

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Frits Jalvingh 2022-12-18 11:48:03 temp_file_limit?
Previous Message João Paulo Luís 2022-12-16 17:47:07 RE: Postgres12 looking for possible HashAggregate issue workarounds?