From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | Andreas Karlsson <andreas(at)proxel(dot)se> |
Cc: | Jeremy Harris <jgh(at)wizmail(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PoC: Partial sort |
Date: | 2014-01-20 12:43:27 |
Message-ID: | CAPpHfdsiRPaqn8DTty2DywkuOrXJJcJBQUiNy9Ossm1LDfjXwQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, Jan 19, 2014 at 5:57 AM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
> On 01/18/2014 08:13 PM, Jeremy Harris wrote:
>
>> On 31/12/13 01:41, Andreas Karlsson wrote:
>>
>>> On 12/29/2013 08:24 AM, David Rowley wrote:
>>>
>>>> If it was possible to devise some way to reuse any
>>>> previous tuplesortstate perhaps just inventing a reset method which
>>>> clears out tuples, then we could see performance exceed the standard
>>>> seqscan -> sort. The code the way it is seems to lookup the sort
>>>> functions from the syscache for each group then allocate some sort
>>>> space, so quite a bit of time is also spent in palloc0() and pfree()
>>>>
>>>> If it was not possible to do this then maybe adding a cost to the number
>>>> of sort groups would be better so that the optimization is skipped if
>>>> there are too many sort groups.
>>>>
>>>
>>> It should be possible. I have hacked a quick proof of concept for
>>> reusing the tuplesort state. Can you try it and see if the performance
>>> regression is fixed by this?
>>>
>>> One thing which have to be fixed with my patch is that we probably want
>>> to close the tuplesort once we have returned the last tuple from
>>> ExecSort().
>>>
>>> I have attached my patch and the incremental patch on Alexander's patch.
>>>
>>
>> How does this work in combination with randomAccess ?
>>
>
> As far as I can tell randomAccess was broken by the partial sort patch
> even before my change since it would not iterate over multiple tuplesorts
> anyway.
>
> Alexander: Is this true or am I missing something?

Yes, I decided that the Sort node shouldn't provide randomAccess when
skipCols != 0. See the assert at the beginning of ExecInitSort. I decided that
it would be better to add an explicit Materialize node rather than store extra
tuples in the tuplesortstate each time.

I also adjusted ExecSupportsMarkRestore and ExecMaterializesOutput to make
the planner believe so. I found path->pathtype to be absolutely never T_Sort.
Correct me if I'm wrong.

Other changes in this version of the patch:
1) Applied Marti Raudsepp's patch that avoids comparing skipCols in tuplesort.
2) The sort bound is now adjusted after processing buckets.
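
To make the two changes above concrete, here is a minimal standalone sketch of the idea. It is not code from the patch. The Row type, cmp_b and partial_sort are hypothetical names invented for illustration, and plain qsort() stands in for tuplesort. The input is assumed to be already ordered by column a (the skipped column), so each bucket of equal a values is sorted by b alone, and the remaining bound shrinks after every bucket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical row: "a" is the presorted (skipped) column, "b" still needs sorting. */
typedef struct Row
{
    int a;
    int b;
} Row;

/* Compare on "b" only; "a" is equal within a bucket, so comparing it is wasted work. */
static int
cmp_b(const void *x, const void *y)
{
    const Row *l = x;
    const Row *r = y;

    return (l->b > r->b) - (l->b < r->b);
}

/*
 * Order rows by (a, b), assuming they already arrive ordered by a.
 * bound is the maximum number of rows wanted (like LIMIT); 0 means unbounded.
 * Returns how many rows at the front of the array are in final order.
 */
static size_t
partial_sort(Row *rows, size_t n, size_t bound)
{
    size_t start = 0;
    size_t remaining = bound;
    size_t emitted = 0;

    while (start < n)
    {
        size_t end = start + 1;
        size_t len;

        /* Find the end of the current bucket (run of equal "a"). */
        while (end < n && rows[end].a == rows[start].a)
            end++;
        len = end - start;

        /* Sort just this bucket, in place, so nothing is allocated per bucket. */
        qsort(rows + start, len, sizeof(Row), cmp_b);

        if (bound != 0)
        {
            /* This bucket satisfies the rest of the bound: stop early. */
            if (len >= remaining)
                return emitted + remaining;
            /* Otherwise adjust the bound for the buckets still to come. */
            remaining -= len;
        }

        emitted += len;
        start = end;
    }

    return emitted;
}
```

For example, with the rows (1,3) (1,1) (1,2) (2,9) (2,4) (3,0) and a bound of 4, the first bucket is sorted and uses up three rows, the second bucket supplies the fourth, and the third bucket is never sorted. A real bounded tuplesort would keep only the top rows of the last bucket instead of sorting all of it. The sketch sorts the whole bucket to stay short.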
------
With best regards,
Alexander Korotkov.
Attachment | Content-Type | Size |
---|---|---|
partial-sort-6.patch.gz | application/x-gzip | 17.0 KB |