From: | "Craig A(dot) James" <cjames(at)modgraph-usa(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Steve Atkins <steve(at)blighty(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Performance of count(*) |
Date: | 2007-03-22 19:18:24 |
Message-ID: | 4602D680.2020106@modgraph-usa.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Tom Lane wrote:
> "Craig A. James" <cjames(at)modgraph-usa(dot)com> writes:
>> Steve Atkins wrote:
>>> As long as you're ordering by some row in the table then you can do that in
>>> straight SQL.
>>>
>>> select a, b, ts from foo where (stuff) and foo > X order by foo limit 10
>>>
>>> Then, record the last value of foo you read, and plug it in as X the next
>>> time around.
>
>> We've been over this before in this forum: It doesn't work as advertised.
>> Look for postings by me regarding the fact that there is no way to tell
>> the optimizer the cost of executing a function. There's one, for example,
>> on Oct 18, 2006.
>
> You mean
> http://archives.postgresql.org/pgsql-performance/2006-10/msg00283.php
> ? I don't see anything there that bears on Steve's suggestion.
> (The complaint is obsolete as of CVS HEAD anyway.)
Mea culpa, it's October 8, not October 18:
http://archives.postgresql.org/pgsql-performance/2006-10/msg00143.php
The relevant part is this:
"My example, discussed previously in this forum, is a classic. I have a VERY expensive function (it's in the class of NP-complete problems, so there is no faster way to do it). There is no circumstance when my function should be used as a filter, and no circumstance when it should be done before a join. But PG has no way of knowing the cost of a function, and so the optimizer assigns the same cost to every function. Big disaster.
"The result? I can't use my function in any WHERE clause that involves any other conditions or joins. Only by itself. PG will occasionally decide to use my function as a filter instead of doing the join or the other WHERE conditions first, and I'm dead.
"The interesting thing is that PG works pretty well for me on big tables -- it does the join first, then applies my expensive functions. But with a SMALL (like 50K rows) table, it applies my function first, then does the join. A search that completes in 1 second on a 5,000,000 row database can take a minute or more on a 50,000 row database."
Craig
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2007-03-22 19:40:33 | Re: Performance of count(*) |
Previous Message | Michael Stone | 2007-03-22 18:46:15 | Re: Parallel Vacuum |