Re: PoC: Partial sort

From:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To:	Andreas Karlsson <andreas(at)proxel(dot)se>
Cc:	Jeremy Harris <jgh(at)wizmail(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: PoC: Partial sort
Date:	2014-01-20 12:43:27
Message-ID:	CAPpHfdsiRPaqn8DTty2DywkuOrXJJcJBQUiNy9Ossm1LDfjXwQ@mail.gmail.com
Lists:	pgsql-hackers

On Sun, Jan 19, 2014 at 5:57 AM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:

> On 01/18/2014 08:13 PM, Jeremy Harris wrote:
>
>> On 31/12/13 01:41, Andreas Karlsson wrote:
>>
>>> On 12/29/2013 08:24 AM, David Rowley wrote:
>>>
>>>> If it was possible to devise some way to reuse any
>>>> previous tuplesortstate perhaps just inventing a reset method which
>>>> clears out tuples, then we could see performance exceed the standard
>>>> seqscan -> sort. The code the way it is seems to lookup the sort
>>>> functions from the syscache for each group then allocate some sort
>>>> space, so quite a bit of time is also spent in palloc0() and pfree()
>>>>
>>>> If it was not possible to do this then maybe adding a cost to the number
>>>> of sort groups would be better so that the optimization is skipped if
>>>> there are too many sort groups.
>>>>
>>>
>>> It should be possible. I have hacked a quick proof of concept for
>>> reusing the tuplesort state. Can you try it and see if the performance
>>> regression is fixed by this?
>>>
>>> One thing which have to be fixed with my patch is that we probably want
>>> to close the tuplesort once we have returned the last tuple from
>>> ExecSort().
>>>
>>> I have attached my patch and the incremental patch on Alexander's patch.
>>>
>>
>> How does this work in combination with randomAccess ?
>>
>
> As far as I can tell randomAccess was broken by the partial sort patch
> even before my change since it would not iterate over multiple tuplesorts
> anyway.
>
> Alexander: Is this true or am I missing something?

Yes, I decided that the Sort node shouldn't provide randomAccess when
skipCols != 0. See the assert at the beginning of ExecInitSort. I decided that
it would be better to add an explicit Materialize node rather than store extra
tuples in the tuplesortstate each time.
I also adjusted ExecSupportsMarkRestore and ExecMaterializesOutput to make
the planner believe so. I found that path->pathtype is never T_Sort.
Correct me if I'm wrong.

Other changes in this version of the patch:
1) Applied Marti Raudsepp's patch to avoid comparing skipCols in tuplesort.
2) The sort bound is now adjusted after processing buckets.
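
To illustrate the idea behind these two changes, here is a minimal sketch in Python. It is not the patch's C code; the function name `partial_sort` and its parameters are hypothetical, and it assumes the input is already ordered by the first `skip_cols` columns. Each group of equal prefixes is sorted only by the remaining columns, and the bound (a LIMIT, if any) is reduced after each group is emitted.

```python
from itertools import groupby


def partial_sort(rows, skip_cols, sort_cols, limit=None):
    """Yield rows ordered by their first `sort_cols` columns.

    Assumes `rows` is already ordered by the first `skip_cols` columns
    (hypothetical stand-in for the presorted input of the Sort node).
    """
    def prefix(row):
        # Columns the input is already sorted by; used only to find groups.
        return row[:skip_cols]

    def suffix(row):
        # Only these columns are compared inside a group, because the
        # prefix is equal for every row of the group.
        return row[skip_cols:sort_cols]

    remaining = limit  # rows still needed; None means unbounded
    for _, group in groupby(rows, key=prefix):
        if remaining is not None and remaining <= 0:
            break
        batch = sorted(group, key=suffix)
        if remaining is not None:
            # Adjust the bound after each processed group.
            batch = batch[:remaining]
            remaining -= len(batch)
        yield from batch


rows = [(1, 3), (1, 1), (2, 2), (2, 0), (3, 5)]
print(list(partial_sort(rows, skip_cols=1, sort_cols=2, limit=3)))
```

The generator is forward-only, which mirrors the decision above not to offer randomAccess when skipCols != 0: a caller that needs to rescan would have to materialize the output itself, for example with `list(...)`.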

------
With best regards,
Alexander Korotkov.

Attachment	Content-Type	Size
partial-sort-6.patch.gz	application/x-gzip	17.0 KB

In response to

Re: PoC: Partial sort at 2014-01-19 01:57:03 from Andreas Karlsson

Responses

Re: PoC: Partial sort at 2014-01-26 19:03:55 from Marti Raudsepp
