Quick Links

Re: How slow is DISTINCT?

From:	"Josh Berkus" <josh(at)agliodbs(dot)com>
To:	wweng(at)kencast(dot)com, pgsql-sql(at)postgresql(dot)org
Subject:	Re: How slow is DISTINCT?
Date:	2002-02-27 21:45:16
Message-ID:	web-812107@davinci.ethosmedia.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-sql

Wei Wang,

> How exactly slow is DISTINCT being processed in SQL engines? (not
> limited to postgresql, though comments on postgresql would be most
> relevant)

I can only give you a relative result, based exlusively on my anecdotal
experience with 7.1:

Fast: SELECT ...
Slower: SELECT ... GROUP BY x,y,z
or: SELECT DISCTINCT ON (x) ... (Postgres non-standard extension)
SLowest: SELECT DISTINCT ...

The reason for this is that SELECT DISTINCT is effectively a GROUP BY
on all result fields of the query, and if a few of the aren't indexed
that requires a seq scan.

If performance is an issue, you may wish to consider restructuring your
queries and/or data model to eliminate the actual duplicate rows.

-Josh

In response to

How slow is DISTINCT? at 2002-02-27 18:27:54 from Wei Weng

Responses

Re: How slow is DISTINCT? at 2002-02-27 22:57:46 from Wei Weng

Browse pgsql-sql by date

	From	Date	Subject
Next Message	Josh Berkus	2002-02-27 22:54:39	Re: How slow is DISTINCT?
Previous Message	Dan Langille	2002-02-27 21:30:58	Re: How slow is DISTINCT?