From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Spilling hashed SetOps and aggregates to disk |
Date: | 2018-06-05 17:47:49 |
Message-ID: | 1528220869.2742.58.camel@j-davis.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, 2018-06-05 at 05:42 -0700, Andres Freund wrote:
> That's an interesting idea, but it seems simpler to stick to
> > hashing
> > rather than using a combination strategy. It also seems like it
> > would
> > take less CPU effort.
> Isn't the locality of access going to considerably better with the
> sort
> based approach?
I think I see what you mean, but without measuring it's hard to say.
You may be able to achieve similar locality in the hashing approach by
using a smaller hash table -- an effect that I think I observed
previously but I'd have to rerun the numbers.
> >
> > What advantages do you have in mind? My patch partitions the
> > spilled
> > data, so it should have similar disk costs as a sort approach.
> I think one part of it is that I think the amount of code is going to
> be
> lower - we essentially have already all the code to handle sort based
> aggs, and to have both sort and hash based aggs in the same query.
> We'd
> mostly need a way to scan the hashtable and stuff it into a
> tuplesort,
> that's not hard. nodeAgg.c is already more than complex enough, I'm
> not
> sure that full blown partitioning is worth the cost.
The thing I like about your idea is that we wouldn't need to make a
choice at plan time, we just always do the hybrid hashing/sorting
unless we know that it's already sorted.
The thing I don't like about it is that it requires running two memory-
hungry operations at once. How much of work_mem do we use for sorted
runs, and how much do we use for the hash table?
Regards,
Jeff Davis
From | Date | Subject | |
---|---|---|---|
Next Message | David Fetter | 2018-06-05 17:51:14 | Re: [PATCH] Trim trailing whitespace in vim and emacs |
Previous Message | Andres Freund | 2018-06-05 17:46:12 | Re: Variable-length FunctionCallInfoData |