join question

From: Grzegorz Jaśkiewicz <gryzman(at)gmail(dot)com>
To: "Postgres General List" <pgsql-general(at)postgresql(dot)org>
Subject: join question
Date: 2008-10-22 21:52:04
Message-ID: 2f4958ff0810221452qabafe52p4089278804a8cfcb@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hey folks,
I am trying to rewrite a query here, that takes 1.5m atm to finish. I got it
down to 20s, and still trying to pin it down.

basically, a query looks something like that atm:

select a.*, b.*
from a
join b on a.id = b.a_id and a.banned <> true
where
a.start <= now()
and
b.end > now();

that's 20s query, and now I got it down to 10s , by using something - which
in my eyes would be always wrong - and against all logic. So if someone
could please explain to me why is it faster:

select a.*, b.*
from foo a
join bar b on a.id = b.a_id
where
not exists (
select id from foo where foo.id = b.a_id and foo.banned <> true
)
and
a.start <= now()
and
b.end > now();

plans differ, obviously - second one uses index to lookup .banned in
foo, whilst first one goes for seq scan.
result is the same, but I was actually expecting quite opposite. So is join
on 1-2M rows a bad idea ?
The effect can be seen on both 8.1 and cvs head.

I would be grateful for someone clarifying that to me.

--
GJ

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Guillaume Lelarge 2008-10-22 21:56:16 Re: triggers problems whit function
Previous Message Tom Lane 2008-10-22 21:36:00 Re: how to split coordinates from point