From: Andres Freund <andres(at)anarazel(dot)de>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Spilling hashed SetOps and aggregates to disk
Date: 2018-06-05 17:52:09
Message-ID: 20180605175209.vavuqe4idovcpeie@alap3.anarazel.de
Lists: pgsql-hackers
Hi,
On 2018-06-05 10:47:49 -0700, Jeff Davis wrote:
> The thing I don't like about it is that it requires running two memory-
> hungry operations at once. How much of work_mem do we use for sorted
> runs, and how much do we use for the hash table?
Is that necessarily true? I'd assume that we'd use only a small amount of
memory for the tuplesort, enough to avoid an unnecessary disk spill for
each tuple. A few kB should be enough - I think it's fine to spill to disk
aggressively, since we have already handled the case of a smaller number
of input rows. Then, at the end of the run, we empty out the hashtable
and free it. Only then do we do the sort.
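A minimal standalone sketch of that scheme, to make the phases concrete.
This is not PostgreSQL code: the fixed-size hash table, the plain array
standing in for the spill-side tuplesort, and all names and sizes here are
hypothetical stand-ins.

/*
 * Sketch: hash-aggregate until the in-memory table is full, route every
 * later tuple whose group is not already in the table to an overflow
 * area (the real scheme would feed a small tuplesort that spills to disk
 * aggressively), then emit and free the hash table before sorting.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HASH_SLOTS   8          /* stand-in for a work_mem-sized hash table */
#define MAX_OVERFLOW 1024       /* stand-in for the spilled tuplesort input */

typedef struct { int key; long count; int used; } Group;

static Group hashtab[HASH_SLOTS];
static int   overflow[MAX_OVERFLOW];
static int   noverflow;

/* Aggregate `key` in the hash table; return 0 if the table is full. */
static int
hash_agg(int key)
{
    int h = (unsigned) key % HASH_SLOTS;

    for (int i = 0; i < HASH_SLOTS; i++)
    {
        Group *g = &hashtab[(h + i) % HASH_SLOTS];

        if (g->used && g->key == key) { g->count++; return 1; }
        if (!g->used) { g->used = 1; g->key = key; g->count = 1; return 1; }
    }
    return 0;                   /* full and key absent: caller spills it */
}

static int
cmp_int(const void *a, const void *b)
{
    return (*(const int *) a > *(const int *) b) -
           (*(const int *) a < *(const int *) b);
}

int
main(void)
{
    int input[] = {3, 1, 3, 7, 9, 11, 13, 15, 17, 19, 1, 21, 23, 3};
    int n = sizeof(input) / sizeof(input[0]);

    /* Phase 1: hash-aggregate; overflow goes to the sort side. */
    for (int i = 0; i < n; i++)
        if (!hash_agg(input[i]))
            overflow[noverflow++] = input[i];

    /* Phase 2: emit the finished groups, then free the hash table. */
    for (int i = 0; i < HASH_SLOTS; i++)
        if (hashtab[i].used)
            printf("hash  key=%d count=%ld\n", hashtab[i].key, hashtab[i].count);
    memset(hashtab, 0, sizeof(hashtab));

    /* Phase 3: only now sort, and aggregate adjacent equal keys. */
    qsort(overflow, noverflow, sizeof(int), cmp_int);
    for (int i = 0; i < noverflow;)
    {
        int j = i;

        while (j < noverflow && overflow[j] == overflow[i])
            j++;
        printf("sort  key=%d count=%d\n", overflow[i], j - i);
        i = j;
    }
    return 0;
}

The point being that the hash table is emptied and freed before the sort
does any real work, so the two memory-hungry operations never compete for
work_mem at full size at the same time.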
One thing this wouldn't handle is datatypes that support hashing but not
sorting. Not exactly common, though.
Greetings,
Andres Freund