Re: count of occurences PLUS optimisation

From: "Thurstan R(dot) McDougle" <trmcdougle(at)my-deja(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: count of occurences PLUS optimisation
Date: 2001-09-13 16:38:56
Message-ID: 3BA0E120.244F64A9@my-deja.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Martijn van Oosterhout wrote:
>
> On Thu, Sep 13, 2001 at 01:03:54PM +0100, Thurstan R. McDougle wrote:
> > It would depend on how efficient the ORDER BY and LIMIT work together.
> > (The ORDER BY could build a list of LIMIT n items and just replace items
> > in that list...a lot more efficient both of memory and comparisons than
> > building the full list and then keeping the top n)
>
> There is some of this already. In the output of EXPLAIN you see two numbers.
> The first is the estimated time toget the first tuple, the second is to get
> all the tuples.
>
> When LIMIT is applied, the estimated total cost is adjusted based on the
> number of rows. So with a small number of tuples the planner will favour
> plans that get tuples early even if the total cost would be larger.
>
> > HACKERS: If it does not do this it might be a usefull optimisation.
> > There would probably need to be a cutoff limit on whether to apply this
> > method or sort and keep n. Also for LIMIT plus OFFSET it would need to
> > build a list of the the total of the LIMIT and OFFSET figures.
>
> The problem is that it sometimes doesn't help as much as you'd expect. If
> you see a Sort stage in the plan, that means that everything below that has
> to be completly calculated.
>
> The only solution is to use a sorted index to avoid the sort step, if
> possible.

What I am talking about is WHEN the sort is required we could make the
sort more efficient as inserting into a SHORT ordered list should be
better than building a BIG list and sorting it, then only keeping a
small part of the list.

In the example in question there would be perhaps 400 records, but only
10 are needed. From the questions on these lists it seems quite common
for only a very low proportion of the records to be required (less then
10%/upto 100 typically), in these cases it would seem to be a usefull
optimisation.

>
> HTH,
> --
> Martijn van Oosterhout <kleptog(at)svana(dot)org>
> http://svana.org/kleptog/
> > Magnetism, electricity and motion are like a three-for-two special offer:
> > if you have two of them, the third one comes free.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org

--
This is the identity that I use for NewsGroups. Email to
this will just sit there. If you wish to email me replace
the domain with knightpiesold . co . uk (no spaces).

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Krog, Kenneth 2001-09-13 16:45:30 Working with dates and queries
Previous Message Peter Eisentraut 2001-09-13 16:21:24 Re: business perspective

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2001-09-13 17:00:26 Re: Escaping strings for inclusion into SQL queriesh
Previous Message Stephan Szabo 2001-09-13 16:32:30 Re: Need feedback: GeneXus will support PostgreSQL