Quick Links

Re: fast DISTINCT or EXIST

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Tilo Buschmann <mailinglist(dot)postgresql(dot)performance(at)b-n-w(dot)org>
Cc:	Arjen van der Meijden <acmmailing(at)tweakers(dot)net>, PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: fast DISTINCT or EXIST
Date:	2007-04-07 16:39:38
Message-ID:	9915.1175963978@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Tilo Buschmann <mailinglist(dot)postgresql(dot)performance(at)b-n-w(dot)org> writes:
>> Arjen van der Meijden <acmmailing(at)tweakers(dot)net> writes:
>>> SELECT ... FROM cd
>>> JOIN tracks ...
>>> WHERE cd.id IN (SELECT DISTINCT cd_id FROM (SELECT t.cd_id FROM tracks t
>>> WHERE t.tstitle @@ plainto_tsquery('simple','education') LIMIT 30)
>>> as foo LIMIT 10)

> Unfortunately, the query above will definitely not work correctly, if
> someone searches for "a" or "the".

Well, the "incorrectness" is only that it might deliver fewer than the
hoped-for ten CDs ... but that was a completely arbitrary cutoff anyway,
no? I think in practice this'd give perfectly acceptable results.

> Actually, I hoped to find an alternative, that does not involve
> DISTINCT.

You could try playing around with GROUP BY rather than DISTINCT; those
are separate code paths and will probably give you different plans.
But I don't think you'll find that GROUP BY does any better on this
particular measure of yielding rows before the full input has been
scanned.

regards, tom lane

In response to

Re: fast DISTINCT or EXIST at 2007-04-07 16:24:07 from Tilo Buschmann

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Arjen van der Meijden	2007-04-07 17:28:52	Re: fast DISTINCT or EXIST
Previous Message	Tilo Buschmann	2007-04-07 16:24:07	Re: fast DISTINCT or EXIST