From: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
---|---|
To: | Sam Mason <sam(at)samason(dot)me(dot)uk> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Sort Refinement |
Date: | 2008-03-22 13:01:07 |
Message-ID: | 1206190867.4285.758.camel@ebony.site |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, 2008-03-20 at 21:34 +0000, Sam Mason wrote:
> On Thu, Mar 20, 2008 at 05:17:22PM +0000, Simon Riggs wrote:
> > Currently, our sort algorithm assumes that its input is unsorted. So if
> > your data is sorted on (a) and you would like it to be sorted on (a,b)
> > then we need to perform the full sort of (a,b).
> >
> > For small sorts this doesn't matter much. For larger sorts the heap sort
> > algorithm will typically result in just a single run being written to
> > disk which must then be read back in. Number of I/Os required is twice
> > the total volume of data to be sorted.
> >
> > If we assume we use heap sort, then if we *know* that the data is
> > presorted on (a) then we should be able to emit tuples directly that the
> > value of (a) changes and keep emitting them until the heap is empty,
> > since they will exit the heap in (a,b) order.
>
> We also have stats to help decide when this will be a win. For example
> if "a" has a small range (i.e. a boolean) and "b" has a large range
> (i.e. some sequence) then this probably isn't going to be a win and
> you're better off using the existing infrastructure. If it's the other
> way around then this is going to be a big win.
Yep, sounds sensible.
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk
From | Date | Subject | |
---|---|---|---|
Next Message | Mihai Criveti | 2008-03-22 20:40:15 | Building PostgreSQL 8.3.1 on OpenVMS 8.3 AXP |
Previous Message | Simon Riggs | 2008-03-22 12:59:19 | Re: Sort Refinement |