Re: Performance pb vs SQLServer.

From:	John Arbash Meinel <john(at)arbash-meinel(dot)com>
To:	"Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Performance pb vs SQLServer.
Date:	2005-08-15 01:05:58
Message-ID:	42FFEA76.6070109@arbash-meinel.com
Lists:	pgsql-performance

Steinar H. Gunderson wrote:

>On Sun, Aug 14, 2005 at 07:27:38PM -0500, John Arbash Meinel wrote:
>
>
>>My guess is that this is part of a larger query. There isn't really much
>>you can do. If you want all 3.2M rows, then you have to wait for them to
>>be pulled in.
>>
>>
>
>To me, it looks like he'll get 88 rows, not 3.2M. Surely we must be able to
>do something better than a full sequential scan in this case?
>
>test=# create table foo ( bar char(4) );
>CREATE TABLE
>test=# insert into foo values ('0000');
>INSERT 24773320 1
>test=# insert into foo values ('0000');
>INSERT 24773321 1
>test=# insert into foo values ('1111');
>INSERT 24773322 1
>test=# select * from foo group by bar;
> bar
>------
> 1111
> 0000
>(2 rows)
>
>I considered doing some odd magic with generate_series() and subqueries with
>LIMIT 1, but it was a bit too weird in the end :-)
>
>/* Steinar */
>
>
I think a plain "GROUP BY" is not smart enough to detect that it doesn't
need all rows (it is generally used because you want aggregate values of
other columns).
I think you would want something like SELECT DISTINCT, possibly with an
ORDER BY, rather than a GROUP BY (which was my final suggestion).
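
To make the comparison concrete, here is a minimal sketch that runs both query forms against Steinar's three-row test table and checks that they return the same set of values. It uses Python's built-in sqlite3 module purely as an assumption to keep the example self-contained; it is not PostgreSQL, so it only illustrates that the two queries are interchangeable for this purpose and says nothing about which plan the PostgreSQL planner would pick. The helper name distinct_values is hypothetical.

```python
import sqlite3


def distinct_values(rows):
    """Return the results of the GROUP BY form and the SELECT DISTINCT form.

    SQLite is only a stand-in here (an assumption); the thread is about
    PostgreSQL, whose planner may treat the two forms differently.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE foo (bar CHAR(4))")
    conn.executemany("INSERT INTO foo VALUES (?)", [(r,) for r in rows])

    # Form from the quoted test case, with an ORDER BY for stable output.
    grouped = [r[0] for r in conn.execute(
        "SELECT bar FROM foo GROUP BY bar ORDER BY bar")]

    # Suggested alternative: SELECT DISTINCT, possibly with an ORDER BY.
    distinct = [r[0] for r in conn.execute(
        "SELECT DISTINCT bar FROM foo ORDER BY bar")]

    conn.close()
    return grouped, distinct


grouped, distinct = distinct_values(["0000", "0000", "1111"])
print(grouped)
print(distinct)
```

On the real table, running EXPLAIN ANALYZE on each form would be the way to see whether one of them actually avoids work the other does.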

John
=:->

In response to

Re: Performance pb vs SQLServer. at 2005-08-15 01:01:59 from Steinar H. Gunderson
