From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Spilling hashed SetOps and aggregates to disk
Date: 2018-06-11 18:13:45
Message-ID: 1528740825.8818.52.camel@j-davis.com
Lists: pgsql-hackers
On Mon, 2018-06-11 at 19:33 +0200, Tomas Vondra wrote:
> For example we hit the work_mem limit after processing 10% tuples,
> switching to sort would mean spill+sort of 900GB of data. Or we might
> say - hmm, we're 10% through, so we expect hitting the limit 10x, so
> let's spill the hash table and then do sort on that, writing and
> sorting only 10GB of data. (Or merging it in some hash-based way, per
> Robert's earlier message.)
Your example depends on large groups and a high degree of group
clustering. That's fine, but it's a special case, and complexity does
have a cost, too.
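For context, the spill-then-sort idea being debated can be sketched as follows. This is a toy illustration, not PostgreSQL code: the memory budget, input data, and function name are all hypothetical. The point is that what gets spilled is the (much smaller) set of partial per-group aggregates, which are then sorted and merged to finalize each group.

```python
from itertools import groupby

WORK_MEM_GROUPS = 3  # pretend memory budget: max groups held in the hash table

def hash_agg_with_spill(tuples):
    """Hash-aggregate (key, value) pairs, spilling partial sums when the
    hash table would exceed the memory budget, then sort+merge the spill."""
    table = {}
    spill = []  # stands in for a spill file of partial aggregates
    for key, value in tuples:
        if key not in table and len(table) >= WORK_MEM_GROUPS:
            spill.extend(table.items())  # spill partials, not raw input
            table = {}
        table[key] = table.get(key, 0) + value
    spill.extend(table.items())
    # Finish with a sort + merge over the spilled partial aggregates.
    spill.sort(key=lambda kv: kv[0])
    return {k: sum(v for _, v in grp)
            for k, grp in groupby(spill, key=lambda kv: kv[0])}

data = [('a', 1), ('b', 1), ('c', 1), ('d', 1), ('a', 1), ('d', 1)]
print(hash_agg_with_spill(data))  # {'a': 2, 'b': 1, 'c': 1, 'd': 2}
```

Whether this wins over simply sorting the raw input depends, as noted above, on how much the partial aggregation shrinks the data before the spill.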
Regards,
Jeff Davis