From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)heroku(dot)com> |
Cc: | Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Greg S <stark(at)mit(dot)edu> |
Subject: | Re: Using quicksort for every external sort run |
Date: | 2015-12-23 21:16:09 |
Message-ID: | CA+TgmoYPKjZcOPdwyjTik0iCBDO5Bd7rNKn3tNzYxVfEgyKvGQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Dec 23, 2015 at 3:31 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Wed, Dec 23, 2015 at 9:37 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> The point is, nobody can tell WHAT effects this is modeling.
>> Increasing the tuple size makes the crossover go up. Or down.
>
> There are multiple, competing considerations.
Please explain what they are and how they lead you to believe that the
cost factors you have chosen are good ones.
My point here is: even if I were to concede that your cost model
yields perfect answers in every case, the patch needs to give at least
some hint as to why. Right now, it really doesn't.
>>> Another factor is that the heap could be useful for other stuff in the
>>> future. As Simon Riggs pointed out, for deduplicating values as
>>> they're read in by tuplesort. (Okay, that's really the only other
>>> thing, but it's a good one).
>>
>> Not sure how that would work?
>
> Tuplesort would have license to discard tuples with matching existing
> values, because the caller gave it permission to. This is something
> that you can easily imagine occurring with ordered set aggregates, for
> example. It would work in a way not unlike a top-N heapsort does
> today. This would work well when it can substantially lower the use of
> memory (initially heapification when the threshold is crossed would
> probably measure the number of duplicates, and proceed only when it
> looked like a promising strategy).
It's not clear to me how having a heap helps with that.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2015-12-23 23:15:38 | Re: parallel joins, and better parallel explain |
Previous Message | Peter Geoghegan | 2015-12-23 21:08:29 | Re: Using quicksort for every external sort run |